import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

data = pd.read_csv('student-mat.csv', sep=';')
predict = 'Markup'
original = 'OriginalPrice'
y = np.array(data[predict])
x = np.array(data[original])

# Fit polynomials of degree 1, 2 and 3
p1 = np.polyfit(x, y, 1)
p2 = np.polyfit(x, y, 2)
p3 = np.polyfit(x, y, 3)
print(p1, p2, p3)

# Evaluate the fits on sorted x so each curve is drawn left to right
xs = np.sort(x)
plt.plot(x, y, 'o')
plt.plot(xs, np.polyval(p1, xs), 'r-')
plt.plot(xs, np.polyval(p2, xs), 'b-')
plt.plot(xs, np.polyval(p3, xs), 'm-')
plt.show()
I am trying to represent a set of data with a line of best fit. Originally I used a polynomial, but it seems I need a rational function for this data set. I'm not sure what function to use to generate a rational best-fit model. Ideally, I would be able to simply replace my polyfit function with a rational one. Thanks in advance, any help is welcome :).
You could write your own model function and minimize its error using least squares. For example, for arbitrary exponential-looking data in the variables X and Y:
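from scipy import optimize

def exp(args):
    # Exponential model a*b**(c*x - d) + e, evaluated at every x in X
    a, b, c, d, e = args
    curve = [a*b**(c*x-d)+e for x in X]
    # Root of the summed squared errors (same minimizer as the RMSE)
    rmse = sum((y-pt)**2 for y,pt in zip(Y,curve))**0.5
    return rmse

fit = optimize.minimize(exp, [2, 2.8, -1, 0, 1]).x  # initial guess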
(Plot: random points with the fitted curve.)
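The same least-squares idea can be applied to a rational model. The sketch below uses scipy.optimize.curve_fit. The model form (a*x + b) / (1 + c*x), the helper name `rational`, the synthetic data, and the starting guess are all assumptions made for illustration, since the question does not say which rational form suits the Markup data.

```python
import numpy as np
from scipy.optimize import curve_fit

def rational(x, a, b, c):
    """Hypothetical first-order rational model: (a*x + b) / (1 + c*x)."""
    return (a * x + b) / (1.0 + c * x)

# Synthetic stand-in data (assumption: replace with the real x and y arrays)
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 50)
true_params = (2.0, 1.0, 0.5)
y = rational(x, *true_params) + rng.normal(scale=0.01, size=x.size)

# Least-squares fit; p0 is the initial guess for (a, b, c)
params, _ = curve_fit(rational, x, y, p0=[1.0, 1.0, 0.1])

# Evaluate the fitted curve, analogous to polyval(p, x) for a polynomial
fitted = rational(x, *params)
rms_residual = np.sqrt(np.mean((fitted - y) ** 2))
print(params, rms_residual)
```

Unlike polyfit, this needs a starting guess, and a poor one can push the denominator 1 + c*x through zero inside the data range, so it is worth plotting the result before trusting it.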