numpymatplotlibscipyestimationprobability-distribution

Plotting an exponential function given one parameter


I'm fairly new to python so bare with me. I have plotted a histogram using some generated data. This data has many many points. I have defined it with the variable vals. I have then plotted a histogram with these values, though I have limited it so that only values between 104 and 155 are taken into account. This has been done as follows:

bin_heights, bin_edges = np.histogram(vals, range=[104, 155], bins=30)
bin_centres = (bin_edges[:-1] + bin_edges[1:])/2.
plt.errorbar(bin_centres, bin_heights, np.sqrt(bin_heights), fmt=',', capsize=2)

plt.xlabel("$m_{\gamma\gamma} (GeV)$")
plt.ylabel("Number of entries")
plt.show()

Giving the above plot: enter image description here

My next step is to take into account values from vals which are less than 120. I have done this as follows:

background_data=[j for j in vals if j <= 120] #to avoid taking the signal bump, upper limit of 120 MeV set

I need to plot a curve on the same plot as the histogram, which follows the form B(x) = Ae^(-x/λ) I then estimated a value of λ using the maximum likelihood estimator formula:

background_data=[j for j in vals if j <= 120] #to avoid taking the signal bump, upper limit of 120 MeV set
#print(background_data)
N_background=len(background_data)
print(N_background)
sigma_background_data=sum(background_data)
print(sigma_background_data)
lamb = (sigma_background_data)/(N_background) #maximum likelihood estimator for lambda
print('lambda estimate is', lamb)

where lamb = λ. I got a value of roughly lamb = 27.75, which I know is correct. I now need to get an estimate for A.

I have been advised to do this as follows:

Given a value of λ, find A by scaling the PDF to the data such that the area beneath
the scaled PDF has equal area to the data

I'm not quite sure what this means, or how I'd go about trying to do this. PDF means probability density function. I assume an integration will have to take place, so to get the area under the data (vals), I have done this:

data_area= integrate.cumtrapz(background_data, x=None, dx=1.0)
print(data_area)
plt.plot(background_data, data_area)

However, this gives me an error

ValueError: x and y must have same first dimension, but have shapes (981555,) and (981554,)

I'm not sure how to fix it. The end result should be something like:

enter image description here


Solution

  • See the cumtrapz docs:

    Returns: ... If initial is None, the shape is such that the axis of integration has one less value than y. If initial is given, the shape is equal to that of y.

    So you are either to pass an initial value like

    data_area = integrate.cumtrapz(background_data, x=None, dx=1.0, initial = 0.0)
    

    or discard the first value of the background_data:

    plt.plot(background_data[1:], data_area)