In my data, the first column is the x
values, and the second column is the y
values. I want to fit the data with a three-parameter Weibull function to describe the distribution.
I tried to find solutions by many searches, but a similar post on Stack Overflow seems to be only for one-column data. I also tried with the libraries scipy.stats.exponweib
and scipy.stats.weibull_min
, but failed due to the same problem.
So I tried to define the function and to fit the data with scipy.optimize.curve_fit
. The following is the code I used. The data.txt
file can be downloaded by this link.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
# Load data
poresize, psd, psd_std = np.loadtxt("data.txt", unpack=True)
# Define the Weibull distribution function with the requested form
def weibull_func(x, a, b, c):
return a * b * (x - c) ** (b - 1) * np.exp(-a * (x - c) ** b)
# Perform curve fitting
popt, pcov = curve_fit(weibull_func, poresize, psd, p0=[1, 1, 1])
# Plot the original data and the fitted curve
plt.scatter(poresize, psd, label='Data')
x_range = np.linspace(min(poresize), max(poresize), 100)
plt.plot(x_range, weibull_func(x_range, *popt), 'r-', label='Fitted curve (Weibull)')
plt.xlabel('Particle Size')
plt.ylabel('PSD')
plt.title('Fitting Weibull Distribution')
plt.legend()
plt.grid(True)
plt.show()
# Display the optimized parameters
a_opt, b_opt, c_opt = popt
print("Optimized Parameters:")
print("a:", a_opt)
print("b:", b_opt)
print("c:", c_opt)
The function form is adapted from the Weibull distribution from Matlab, but x
is replaced with x-c
.
The problem is that I tried many combinations of the initial guess values for p0
, but none of them could give me a satisfactory result.
Could you please tell me whether there is something fundamentally wrong with the code? Or how can I get the proper initial guess fast?
Two pieces of advice when fitting a function:
x < c
when b
is not an integer. (This is because you have (x-c)^(b-1)
. If x < c
then x - c < 0
. If b
is not an integer then you will be taking a fractional power of a negative number, which produces a complex number.)x = 2
and x = 5
and it spikes up to y = 1
. Knowing that, I made this Desmos model and played with the variables until I got something decent.Here is the fixed function that can handle x < c
:
def weibull_func(x, a, b, c):
res = np.zeros_like(x)
cond = x > c
xcond = x[cond]
res[cond] = a*b*(xcond - c)**(b - 1)*np.exp(-a*(xcond - c)**b)
return res
I provided the curve_fit
call with the guess I got using Desmos:
popt, pcov = curve_fit(weibull_func, poresize, psd, p0=[1.4, 1.93, 2.4])
The tuned parameters are:
a = 1.1172264863332277
b = 2.2231297860177226
c = 2.301372954142873
Resulting plot: