Dear all: I have a dataset that contains multiple data points in 3D, and I'm trying to find the best-fit curve for this dataset. I can do this in 2D as follows.
import numpy as np
import os
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
df_reg1=pd.read_csv('C:\\Users\\wilso\\python\\datasets\\PCD\\df_clean_rail.csv')
df_reg=df_reg1[['x','y','z']]
data = df_reg.values
x, y, z = data[:, 0], data[:, 1], data[:, 2]
def objective(x, a, b, c):
    return a * x + b * x**2 + c
popt, _ = curve_fit(objective, x, y)
a, b, c = popt
print('y = %.5f * x + %.5f * x^2 + %.5f' % (a, b, c))
plt.scatter(x, y)
x_line = np.arange(min(x), max(x), 1)
y_line = objective(x_line, a, b, c)
plt.plot(x_line, y_line, '--', color='red')
plt.xlim(-100,10)
plt.ylim(-10,10)
plt.show()
However, if I try the same method in 3D, I get a traceback:
def objective(x, y, a, b, c):
    return a * x + b * y**2 + c
popt, _ = curve_fit(objective, x, y, z)
TypeError: objective() takes 5 positional arguments but 14884 were given
Could anyone kindly offer me some hints? Thanks a lot for your time and great support.
Sincerely
Wilson
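The function curve_fit can be used for fits with several independent variables, but it cannot be called the way you did. All of the independent variables should be passed together in xdata. In your call, z ends up as the fourth positional argument, which curve_fit treats as p0 (the initial parameter guess), so it tries to call objective with one parameter per data point; hence the TypeError.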
For example:
In [121]: data.shape
Out[121]: (1000, 3)
In [122]: f = lambda x,a,b,c,d: a*x[0]**2 +b*x[1]**2 + c*x[0]*x[1]+d
In [123]: curve_fit(f, data[:, :2].T, ydata=data[:, 2])
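Here all my data is in one array. I pass the independent variables (the first two columns, transposed so that x[0] and x[1] are the two coordinate arrays) as xdata, and the dependent variable (the third column) as ydata.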
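Applied to the model from the question, the same idea might look like the sketch below. It is only an illustration: the x, y, z arrays are synthetic stand-ins (the real CSV is not available here), and the "true" coefficients 0.5, -0.2 and 3.0 are made up so that the fit has something to recover. Note that with two independent variables this fits a surface z = f(x, y).

```python
import numpy as np
from scipy.optimize import curve_fit

def objective(X, a, b, c):
    # X packs both independent variables: X[0] is x, X[1] is y.
    x, y = X
    return a * x + b * y**2 + c

# Synthetic stand-in for the x, y, z columns (assumed values, not the real dataset).
rng = np.random.default_rng(0)
x = rng.uniform(-100, 10, 500)
y = rng.uniform(-10, 10, 500)
z = objective((x, y), 0.5, -0.2, 3.0) + rng.normal(0, 0.01, 500)

# Stack x and y into a (2, N) array and pass it as xdata; z is ydata.
popt, pcov = curve_fit(objective, np.vstack((x, y)), z)
a, b, c = popt
print('z = %.5f * x + %.5f * y^2 + %.5f' % (a, b, c))
```

With the real data, the x, y, z arrays taken from df_reg.values would replace the synthetic ones; the call to curve_fit stays the same.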