python matplotlib

How to plot cdf in matplotlib in Python?


I have an unsorted list named d that looks like:

[0.0000, 123.9877, 0.0000, 9870.9876, ...]

I simply want to plot a CDF graph from this list using Matplotlib in Python, but I don't know whether there is a function I can use.

d = []
d_sorted = []
for line in fd.readlines():
    (addr, videoid, userag, usertp, timeinterval) = line.split()
    d.append(float(timeinterval))

d_sorted = sorted(d)

class discrete_cdf:
    def __init__(data):
        self._data = data # must be sorted
        self._data_len = float(len(data))

    def __call__(point):
        return (len(self._data[:bisect_left(self._data, point)]) / 
               self._data_len)

cdf = discrete_cdf(d_sorted)
xvalues = range(0, max(d_sorted))
yvalues = [cdf(point) for point in xvalues]
plt.plot(xvalues, yvalues)

This is the code I am using now, but it fails with this error:

Traceback (most recent call last):
  File "hitratioparea_0117.py", line 43, in <module>
    cdf = discrete_cdf(d_sorted)
TypeError: __init__() takes exactly 1 argument (2 given)

Solution

  • cumsum from numpy works well for this. Make sure that your data is a proper PDF (i.e., it integrates to one), otherwise the CDF won't end at unity as it should. Here is a minimal working example:

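    import numpy as np
    import matplotlib.pyplot as plt

    # Create some test data: an unnormalized Gaussian sampled on a regular grid
    dx = 0.01
    X = np.arange(-2, 2, dx)
    Y = np.exp(-X ** 2)

    # Normalize so the density integrates to one (Riemann sum with step dx)
    Y /= (dx * Y).sum()

    # Compute the CDF as the running integral of the PDF
    CY = np.cumsum(Y * dx)

    # Plot both: PDF as a solid line, CDF as a red dashed line
    plt.plot(X, Y)
    plt.plot(X, CY, 'r--')

    plt.show()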
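
The example above starts from a density sampled on a regular grid. For a raw list of samples such as d in the question, one possible approach is the empirical CDF: sort the values and pair each one with its rank divided by the sample size. This is only a sketch. The helper names `empirical_cdf` and `plot_empirical_cdf` are made up for illustration, and the sample list is just the four values visible in the question, not the asker's real data.

```python
def empirical_cdf(samples):
    """Return (xs, ys): sorted samples and the fraction of samples <= each x."""
    xs = sorted(samples)
    n = len(xs)
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys


def plot_empirical_cdf(samples):
    """Draw the empirical CDF as a step function (hypothetical helper)."""
    # Imported here so that empirical_cdf itself has no dependencies.
    import matplotlib.pyplot as plt

    xs, ys = empirical_cdf(samples)
    plt.step(xs, ys, where="post")
    plt.xlabel("value")
    plt.ylabel("fraction of samples <= value")
    plt.show()


# Illustrative data only: the four values shown in the question.
d = [0.0, 123.9877, 0.0, 9870.9876]
xs, ys = empirical_cdf(d)
```

Calling `plot_empirical_cdf(d)` would then draw the curve. No normalization step is needed here, because the ranks already run from 1/n up to 1.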
    

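
As for the TypeError in the question: Python passes the instance as the first argument to every method, so `__init__(data)` receives two arguments when called as `discrete_cdf(d_sorted)`. Adding `self` to both methods removes the error. A second problem would show up right after: `range()` does not accept a float upper bound, so `max(d_sorted)` has to be converted to an integer first. The sketch below keeps the asker's class with those fixes. Switching to `bisect_right` (so that values equal to the point are counted) and the sample values are my own assumptions, not part of the original post.

```python
from bisect import bisect_right


class discrete_cdf:
    """Empirical CDF over a list of numbers that is already sorted."""

    def __init__(self, data):
        self._data = data  # must be sorted
        self._data_len = float(len(data))

    def __call__(self, point):
        # Count of samples <= point, divided by the sample size.
        return bisect_right(self._data, point) / self._data_len


# Illustrative data only: the four values shown in the question.
d_sorted = sorted([0.0, 123.9877, 0.0, 9870.9876])

cdf = discrete_cdf(d_sorted)
# range() needs integers; go one step past the maximum so the curve reaches 1.
xvalues = range(0, int(max(d_sorted)) + 2)
yvalues = [cdf(point) for point in xvalues]
```

From here, `plt.plot(xvalues, yvalues)` followed by `plt.show()` (with `import matplotlib.pyplot as plt`) should draw the curve as the question intended.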