I'm training a linear regression on the "Advertising Sales" dataset.
When I train on all of the features (columns) of the dataset together, I get a different y-intercept than when I train the model on one column only (TV).
Shouldn't the y-intercept stay the same? The weight of the TV column stays roughly the same.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
data = pd.read_csv('../datasets/Advertising Budget and Sales.csv')
data = data.rename(columns={
    'TV Ad Budget ($)': 'TV',
    'Radio Ad Budget ($)': 'Radio',
    'Newspaper Ad Budget ($)': 'Newspaper',
    'Sales ($)': 'Sales',
})
data = data.drop(columns=['Unnamed: 0'])
X = data[['TV', 'Radio', 'Newspaper']]
y = data['Sales']
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, test_size=0.3, shuffle=True, random_state=100)
lr = LinearRegression().fit(X_train, y_train)
coeff = lr.coef_
intercept = lr.intercept_
print('coefficients of TV, Radio, and Newspaper:', coeff)
print('y intercept: ',intercept)
coefficients of TV, Radio, and Newspaper: [0.0454256 0.18975773 0.00460308]
y intercept: 2.652789668879496
plt.scatter(data['TV'], data.Sales)
plt.plot(data.TV, intercept + coeff[0] * data.TV);
X_train, X_test, y_train, y_test = train_test_split(
    data['TV'].values.reshape(-1, 1),
    data[['Sales']],
    train_size=0.7, test_size=0.3, shuffle=True, random_state=100)
lr = LinearRegression().fit(X_train, y_train)
coeff_2 = lr.coef_
intercept_2 = lr.intercept_
print('coefficient of TV:', coeff_2)
print('y intercept: ', intercept_2)
coefficient of TV: [[0.04649736]]
y intercept: [6.98966586]
plt.scatter(data['TV'], data.Sales)
plt.plot(data.TV, intercept_2 + coeff_2[0][0] * data.TV);
No, the models will be different:

- A different number of columns results in different model weights (coefficients).
- The intercepts will usually be different.
- The predictions and the model's explanatory power (e.g. R²) will also be different.

This is because, when you add more features, the model readjusts the contribution of every variable to minimize the overall error, which also changes the optimal value of the intercept.
LinearRegression() fits a linear model of the form

y_hat = w0 + w1 * x1 + w2 * x2 + ... + wn * xn

where w0 is the intercept, w1 ... wn are the weights of the feature columns, and x1 ... xn are the features (e.g. TV, Radio).
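For ordinary least squares, the fitted intercept is simply mean(y_train) minus the sum of each weight times its feature's training mean, so every column you add or remove shifts it. A minimal sketch (on made-up data, not your CSV) that checks this identity against a fitted model:

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(50, 3))   # three made-up "budget" columns
y = 3 + 0.05 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(0, 1, size=50)

lr = LinearRegression().fit(X, y)

# OLS sets the intercept to mean(y) minus the weighted feature means,
# so each extra column shifts the intercept by its w_i * mean(x_i) term.
manual_intercept = y.mean() - lr.coef_ @ X.mean(axis=0)
print(lr.intercept_, manual_intercept)   # the two values agree up to floating-point error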
Therefore, when you change the number of columns in X (that is, the number of features fed into the model), you change the learning result of the entire linear regression model.

Example:
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
# Create a sample data set (simulated marketing budgets and sales)
data = pd.DataFrame({
    'TV': [230.1, 44.5, 17.2, 151.5, 180.8, 8.7, 57.5, 120.2, 8.6, 199.8],
    'Radio': [37.8, 39.3, 45.9, 41.3, 10.8, 48.9, 32.8, 19.6, 2.1, 2.6],
    'Newspaper': [69.2, 45.1, 69.3, 58.5, 58.4, 75.0, 23.5, 11.6, 1.0, 21.2],
    'Sales': [22.1, 10.4, 9.3, 18.5, 12.9, 7.2, 11.8, 13.2, 4.8, 10.6]
})
# Prepare X, y separately (single feature vs multiple features)
X1 = data[['TV']]
X3 = data[['TV', 'Radio', 'Newspaper']]
y = data['Sales']
# Split the data (same random_state so both models train and test on the same rows)
X1_train, X1_test, y_train, y_test = train_test_split(X1, y, test_size=0.3, random_state=42)
X3_train, X3_test, _, _ = train_test_split(X3, y, test_size=0.3, random_state=42)
# Build and train the model
model1 = LinearRegression().fit(X1_train, y_train)
model3 = LinearRegression().fit(X3_train, y_train)
# predict
y_pred1 = model1.predict(X1_test)
y_pred3 = model3.predict(X3_test)
# Output comparison
print("Univariate Model:")
print(f" Intercept: {model1.intercept_:.4f}")
print(f" TV Coefficient: {model1.coef_[0]:.4f}")
print(f" R² : {r2_score(y_test, y_pred1):.4f}")
print("\nMultivariate Model:")
print(f" Intercept: {model3.intercept_:.4f}")
print(f" Coefficients (TV, Radio, Newspaper): {model3.coef_}")
print(f" R²: {r2_score(y_test, y_pred3):.4f}")