Tags: scikit-learn, linear-regression

Why does the y-intercept change when all features are used in a linear regression model vs only one feature?


I'm training a linear regression model on the "advertising sales dataset".

When I train on all of the features (columns) of the dataset together, I get a different y-intercept than when I train the model on one column only (TV).

Shouldn't the y-intercept stay the same?

The weight of the TV column stays roughly the same.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

data = pd.read_csv('../datasets/Advertising Budget and Sales.csv')

data = data.rename(columns={
    'TV Ad Budget ($)': 'TV',
    'Radio Ad Budget ($)': 'Radio',
    'Newspaper Ad Budget ($)': 'Newspaper',
    'Sales ($)': 'Sales',
    })

data = data.drop(columns=['Unnamed: 0'])

Use all columns together:

X = data[['TV', 'Radio', 'Newspaper']]
y = data['Sales']

X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, test_size=0.3, shuffle=True, random_state=100)


lr = LinearRegression().fit(X_train, y_train)

coeff = lr.coef_
intercept = lr.intercept_

print('coefficients of TV, Radio, and Newspaper:', coeff)
print('y intercept:', intercept)

coefficients of TV, Radio, and Newspaper: [0.0454256 0.18975773 0.00460308]

y intercept: 2.652789668879496

# Plot Sales vs. TV with the line implied by the multivariate model's
# TV coefficient and intercept (Radio and Newspaper contributions are left out)
plt.scatter(data['TV'], data.Sales)
plt.plot(data.TV, intercept + coeff[0] * data.TV);

[plot: Sales vs. TV scatter with the line from the multivariate model]

Use the TV feature only:

# Note: y is passed as a one-column DataFrame here, so coef_ comes back 2-D
# and intercept_ comes back as a 1-element array (unlike the first run)
X_train, X_test, y_train, y_test = train_test_split(data['TV'].values.reshape(-1, 1),
                                                    data[['Sales']],
                                                    train_size=0.7, test_size=0.3, shuffle=True, random_state=100)

lr = LinearRegression().fit(X_train, y_train)

coeff_2 = lr.coef_
intercept_2 = lr.intercept_

print('coefficient of TV:', coeff_2)
print('y intercept:', intercept_2)

coefficient of TV: [[0.04649736]]

y intercept: [6.98966586]

plt.scatter(data['TV'], data.Sales)
plt.plot(data.TV, intercept_2 + coeff_2[0][0] * data.TV);

[plot: Sales vs. TV scatter with the line from the single-feature model]
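
For a direct comparison, both fitted lines can be drawn on one scatter plot (a minimal sketch reusing coeff/intercept and coeff_2/intercept_2 from the two runs above):

# Overlay the multivariate model's TV slice and the single-feature fit
plt.scatter(data['TV'], data.Sales, alpha=0.5)
plt.plot(data.TV, intercept + coeff[0] * data.TV, label='all features (TV slice)')
plt.plot(data.TV, intercept_2 + coeff_2[0][0] * data.TV, label='TV only')
plt.legend()
plt.show()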


Solution

  • No, the models will be different, including:

    1. A different number of columns results in different model weights (coefficients).

    2. The intercepts will usually differ.

    3. Prediction results and model explanatory power (e.g. R²) will also differ.

    When you add more features, the model readjusts the contribution of all variables to minimize the overall error, which also changes the optimal solution for the intercept.

    LinearRegression() fits an ordinary least squares model, whose prediction formula is

    y_hat = w0 + w1 * x1 + w2 * x2 + ... + wn * xn

    Therefore, when you change the number of columns in X (that is, the number of features fed to the model), you change the optimization problem itself, and the learned weights and intercept change with it.
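
    In fact, for ordinary least squares with fit_intercept=True, the intercept is fully determined by the training means and the fitted weights:

    w0 = mean(y) - w1 * mean(x1) - ... - wn * mean(xn)

    so whenever the coefficients change, the intercept changes with them. A minimal sketch to check this identity on a fitted model (assuming the multivariate lr, X_train, and y_train from the question):

    import numpy as np

    # OLS identity: intercept_ == mean(y_train) - coef_ . mean(X_train)
    reconstructed = y_train.mean() - np.dot(lr.coef_, X_train.mean(axis=0))
    print(np.isclose(lr.intercept_, reconstructed))  # expect: True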

    Example

    import pandas as pd
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import r2_score
    
    # Create a sample data set (simulated marketing budget and sales)
    data = pd.DataFrame({
        'TV': [230.1, 44.5, 17.2, 151.5, 180.8, 8.7, 57.5, 120.2, 8.6, 199.8],
        'Radio': [37.8, 39.3, 45.9, 41.3, 10.8, 48.9, 32.8, 19.6, 2.1, 2.6],
        'Newspaper': [69.2, 45.1, 69.3, 58.5, 58.4, 75.0, 23.5, 11.6, 1.0, 21.2],
        'Sales': [22.1, 10.4, 9.3, 18.5, 12.9, 7.2, 11.8, 13.2, 4.8, 10.6]
    })
    
    # Prepare X, y separately (single feature vs multiple features)
    X1 = data[['TV']]
    X3 = data[['TV', 'Radio', 'Newspaper']]
    y = data['Sales']
    
    # Split the data (the same random_state keeps train/test rows aligned across both splits)
    X1_train, X1_test, y_train, y_test = train_test_split(X1, y, test_size=0.3, random_state=42)
    X3_train, X3_test, _, _ = train_test_split(X3, y, test_size=0.3, random_state=42)
    
    # Build and train the model
    model1 = LinearRegression().fit(X1_train, y_train)
    model3 = LinearRegression().fit(X3_train, y_train)
    
    # predict
    y_pred1 = model1.predict(X1_test)
    y_pred3 = model3.predict(X3_test)
    
    # Output comparison
    print("Univariate Model:")
    print(f"  Intercept: {model1.intercept_:.4f}")
    print(f"  TV Coefficient: {model1.coef_[0]:.4f}")
    print(f"  R² : {r2_score(y_test, y_pred1):.4f}")
    
    print("\nMultivariate Model:")
    print(f"  Intercept: {model3.intercept_:.4f}")
    print(f"  Coefficients (TV, Radio, Newspaper): {model3.coef_}")
    print(f"  R²: {r2_score(y_test, y_pred3):.4f}")