The cost function appears to work correctly, in that it isn't returning unusually large or small values. The gradient descent function, however, makes the parameters suddenly jump to ridiculous numbers.
m = len(x)  # Training examples

def cost(w, b, x, y):
    j_wb = 0.
    for i in range(m):
        f_wb = w * x[i] + b
        err = f_wb - y[i]
        j_wb += err**2
    j_wb = j_wb / (m * 2)
    return j_wb
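As a quick sanity check of the cost function, it can be verified by hand on toy data (the data here is made up for illustration, and the function is restated so the snippet runs on its own): a perfect fit should give a cost of exactly 0, and predictions that are all off by 1 should give (1/(2m)) * m * 1 = 0.5.

```python
# Sanity check of the cost function on made-up toy data.
x = [1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0]  # exactly y = 1*x + 0
m = len(x)  # training examples (cost() reads this global)

def cost(w, b, x, y):
    j_wb = 0.
    for i in range(m):
        f_wb = w * x[i] + b       # model prediction
        err = f_wb - y[i]         # prediction error
        j_wb += err**2
    return j_wb / (m * 2)         # mean squared error over 2

print(cost(1.0, 0.0, x, y))  # perfect fit -> 0.0
print(cost(1.0, 1.0, x, y))  # every prediction off by 1 -> 0.5
```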
Here's a test of it:
at itteration 0: COST = 0.0
at itteration 1: COST = 11.906281776347848
at itteration 2: COST = 24.406303301163156
at itteration 3: COST = 25.89142657618535
at itteration 4: COST = 31.71690104577324
at itteration 5: COST = 32.222444954452776
at itteration 6: COST = 52.79887560513525
at itteration 7: COST = 57.723294484239304
at itteration 8: COST = 59.252477506721256
at itteration 9: COST = 61.178601048944415
Final cost: 4.5292025547813575
def gradient(w, b, x, y, itterations, alphar):
    # Initialize
    dj_dw = 0
    dj_db = 0
    # Gradient descent
    for i in range(itterations):
        j_wb = cost(w, b, x, y)
        dj_dw = j_wb * x[i]
        dj_db = j_wb
        w = w - alphar * dj_dw
        b = b - alphar * dj_db
    return dj_dw, dj_db
And here is where the main problem shows up:
> Itteration 0 || Cost = 4.5292025547813575 || w = -0.08700861314752584
> Itteration 1 || Cost = 1919.5314293706836 || w = -959.8527232984893
> Itteration 2 || Cost = 1540639463.935084 || w = -231096879.44298592
> Itteration 3 || Cost = 8.924767986122691e+19 || w = -3.3914118347497325e+19
> Itteration 4 || Cost = 1.9197504655865147e+42 || w = -1.6701829050602676e+42
> Itteration 5 || Cost = 4.653919574489622e+87 || w = -1.675411046816264e+87
> Itteration 6 || Cost = 4.685387293902403e+177 || w = -5.622464752682884e+176
> Itteration 7 || Cost = inf || w = -inf
> Itteration 8 || Cost = nan || w = nan
> Itteration 9 || Cost = nan || w = nan
> Itteration 10 || Cost = nan || w = nan
These are the first 10 iterations. At iteration 1 the cost suddenly shoots up to roughly 2000, and by iteration 3 the parameter w has become an enormously negative number.
It peaks at iteration 7, where both overflow to infinity, and after that they are NaN.
Can someone please point me to where I'm going wrong? Any help or advice would be greatly appreciated.
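If it helps to see why the update diverges: the true partial derivative of the cost with respect to w is the average of (f_wb - y) * x over all examples, whereas the loop above multiplies the (always non-negative) cost by a single x[i]. A finite-difference check makes the mismatch concrete; the toy numbers below are my own, chosen only for illustration.

```python
# Compare the correct analytic gradient with a numerical (finite-difference)
# gradient, and with the "cost * x[i]" quantity computed in the loop above.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]  # y = 2*x
m = len(x)

def cost(w, b):
    return sum((w * xi + b - yi) ** 2 for xi, yi in zip(x, y)) / (2 * m)

w, b = 0.5, 0.0
# Correct gradient: mean of (prediction - target) * x
analytic = sum((w * xi + b - yi) * xi for xi, yi in zip(x, y)) / m
# Numerical gradient via central differences
eps = 1e-6
numeric = (cost(w + eps, b) - cost(w - eps, b)) / (2 * eps)
# What the buggy loop computes on i = 0: cost times x[0]
wrong = cost(w, b) * x[0]

print(analytic, numeric, wrong)  # analytic and numeric agree; "wrong" does not
```

Here the analytic and numerical gradients agree (both are -7.0), while the cost-times-x quantity is positive, so the buggy update can even push w in the wrong direction.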
I think this version of the gradient function is better:
def gradient_descent(w, b, x, y, iterations, alpha):
    m = len(x)
    for i in range(iterations):
        dj_dw = 0
        dj_db = 0
        for j in range(m):
            f_wb = w * x[j] + b
            error = f_wb - y[j]
            dj_dw += error * x[j]
            dj_db += error
        dj_dw /= m
        dj_db /= m
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
        cost_value = cost(w, b, x, y)
        print(f"Iteration {i} || Cost = {cost_value} || w = {w} || b = {b}")
    return w, b
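To check that this version actually behaves, here is a minimal self-contained run (the toy data, alpha=0.05, and 5000 iterations are my own choices) fitting y = 2x + 1. With proper per-example gradients and a modest learning rate, w and b settle near 2 and 1 instead of exploding.

```python
# End-to-end check of the corrected gradient descent on toy data.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]  # y = 2*x + 1

def gradient_descent(w, b, x, y, iterations, alpha):
    m = len(x)
    for i in range(iterations):
        dj_dw = 0.0
        dj_db = 0.0
        for j in range(m):
            error = w * x[j] + b - y[j]  # (prediction - target)
            dj_dw += error * x[j]
            dj_db += error
        w -= alpha * dj_dw / m  # average gradient step for w
        b -= alpha * dj_db / m  # average gradient step for b
    return w, b

w, b = gradient_descent(0.0, 0.0, x, y, iterations=5000, alpha=0.05)
print(w, b)  # close to 2 and 1
```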