python machine-learning gradient-descent

Gradient Descent blowing up in linear regression


I am writing linear regression code in Python. I used the formulas I learnt and double-checked them, and I also tried normalising the dataset. What happened then is that the exponential increase of the weight and bias values went away, but there is still something wrong, because the bias is still in the range of 10^-18. I am using the real estate dataset from Kaggle, here is the link: https://www.kaggle.com/datasets/nitinsharma05/real-estate-analysis-dataset

Here is the code I wrote:

m -> weight

c -> bias
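
For reference, the gradients computed in gradient_descent below are the partial derivatives of the mean squared error loss (written out in LaTeX only for clarity, this is what the code implements):

    L(m, c) = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - (m x_i + c)\bigr)^2

    \frac{\partial L}{\partial m} = -\frac{2}{n}\sum_{i=1}^{n} x_i\bigl(y_i - (m x_i + c)\bigr)

    \frac{\partial L}{\partial c} = -\frac{2}{n}\sum_{i=1}^{n}\bigl(y_i - (m x_i + c)\bigr)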

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def gradient_descent(x, y, m, c, n):
    # partial derivatives of the MSE loss with respect to m (weight) and c (bias)
    delta_m = (-2/n) * np.sum(x * (y - (m*x + c)))
    delta_c = (-2/n) * np.sum(y - (m*x + c))
    return delta_m, delta_c

def linear_regression(epochs, m, c, learning_rate, x, y):
    n = len(x)
    for i in range(epochs):
        delta_m, delta_c = gradient_descent(x, y, m, c, n)
        m -= learning_rate * delta_m
        c -= learning_rate * delta_c
        if (i+1) % 1000 == 0:
            print(f"After {i+1} epochs: m = {m}, c = {c}")
    return m, c

df = pd.read_csv("archive/real_estate.csv")

df = df[['apartment_total_area', 'price_in_USD']].dropna()

# Clean apartment_total_area
df['apartment_total_area'] = (
    df['apartment_total_area']
    .astype(str)
    .str.replace(' m²', '', regex=False)
    .str.replace(' ', '', regex=False)
    .str.replace(',', '.', regex=False)
    .astype(float)
)

# Clean price_in_USD
df['price_in_USD'] = (
    df['price_in_USD']
    .astype(str)
    .str.replace('$', '', regex=False)
    .str.replace(',', '', regex=False)
    .str.strip()
    .astype(float)
)

x = df['apartment_total_area'].to_numpy()
y = df['price_in_USD'].to_numpy()

# Standardize x and y to zero mean and unit variance
x_mean, x_std = x.mean(), x.std()
y_mean, y_std = y.mean(), y.std()
x_s = (x - x_mean)/x_std
y_s = (y - y_mean)/y_std

m, c = 0, 0
epochs = 20000
learning_rate = 0.0001  


m_s, c_s = linear_regression(epochs, m, c, learning_rate, x_s, y_s)

# Convert the parameters learned on standardized data back to the original scale
m_orig = m_s * (y_std / x_std)
c_orig = y_mean + c_s * y_std - m_orig * x_mean

print(f"Final slope (m): {m_orig}")
print(f"Final intercept (c): {c_orig}")

plt.scatter(x, y, color="blue")
plt.plot(x, m_orig*x + c_orig, color="red")
plt.xlabel("Apartment Total Area (m²)")
plt.ylabel("Price in USD")
plt.show()

Can someone help me figure out what the problem is?

I also tried changing the learning rate from 1 all the way down to 10^-18, but nothing worked, and in particular the values of the weights and biases didn't change. The image below shows the regression line my code currently produces.


Solution

  • Your implementation of gradient descent is basically correct — the main issues come from feature scaling and the learning rate. A few key points:

    1. Normalization:
      You standardized both x and y (x_s, y_s), which is fine for training. The bias you see in the range of 1e-18 is the intercept learned on the standardized data (c_s): it is expected to be essentially zero, because in standardized space the regression line passes almost exactly through the origin. That's expected, not a bug; the meaningful intercept is the denormalized c_orig.

    2. Learning rate:
      0.0001 may still be too small for standardized data, so after 20,000 epochs you have not fully converged. Try 0.01 or 0.1 (see the sketch after this list). On the other hand, with unscaled data, large rates will blow up. So:

      • If you scale → use a larger learning rate.

      • If you don’t scale → use a smaller one.

    3. Intercept near zero:
      That’s normal after scaling. If you train on (x_s, y_s), the model is y_s = m_s * x_s + c_s. When you transform back, c_orig is reconstructed from c_s together with y_mean, x_mean, and the standard deviations. So even if c_s ≈ 0, your denormalized model is fine.

    4. Check against sklearn:
      Always validate your implementation by comparing with scikit-learn’s LinearRegression:

      from sklearn.linear_model import LinearRegression
      
      model = LinearRegression()
      model.fit(x.reshape(-1, 1), y)
      print(model.coef_, model.intercept_)

      If your slope and intercept match sklearn's (up to small numerical differences), your implementation is working correctly.
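
  • Putting points 2–4 together, here is a minimal sketch of the suggested fix. It reuses x, y, x_s, y_s, the scaling statistics, and the linear_regression function from your code; the 0.01 learning rate is only a starting point, not a tuned value:

      from sklearn.linear_model import LinearRegression

      # train on the standardized data with a larger learning rate
      m_s, c_s = linear_regression(epochs=20000, m=0.0, c=0.0,
                                   learning_rate=0.01, x=x_s, y=y_s)

      # convert the parameters back to the original scale
      m_orig = m_s * (y_std / x_std)
      c_orig = y_mean + c_s * y_std - m_orig * x_mean

      # reference fit for comparison
      model = LinearRegression().fit(x.reshape(-1, 1), y)
      print("gradient descent:", m_orig, c_orig)
      print("sklearn:         ", model.coef_[0], model.intercept_)

    Here c_s should come out close to zero, while m_orig and c_orig should line up with sklearn's slope and intercept.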