I have a dataframe which represents features of a linear regression model.
df1 = pd.DataFrame({'yyyyww': ['2022-01','2022-02','2022-03', '2022-04','2022-05','2022-06','2022-07','2022-08','2022-09','2022-10'],
'feature1': [1000,2000,4000,3000,5000,2000,8000,2000,4000,3000],
'feature2': [9000,7000,3000,1000,2000,3000,6000,8000,1000,1000],
'feature3': [3000,1000,2000,5000,9000,7000,2000,3000,5000,9000]})
I run the model and calculate the coefficients which produces another dataframe, below.
df2 = pd.DataFrame({'feature': ['feature1','feature2','feature3'],
'coefficient': [-1,2,0.5]})
I then want to produce a third dataframe where the contents are the product of the values from df1 and the corresponding coefficients from df2. Desired output below.
df3 = pd.DataFrame({'yyyyww': ['2022-01','2022-02','2022-03', '2022-04','2022-05','2022-06','2022-07','2022-08','2022-09','2022-10'],
'feature1': [-1000,-2000,-4000,-3000,-5000,-2000,-8000,-2000,-4000,-3000],
'feature2': [18000,14000,6000,2000,4000,6000,12000,16000,2000,2000],
'feature3': [1500,500,1000,2500,4500,3500,1000,1500,2500,4500]})
I have tried to achieve this using mul
and multiply
in the following manner, however this does not produce the desired result.
features = [['feature1', 'feature2', 'feature3']]
results = pd.DataFrame()
for cols in features:
results[cols] = df1[cols]
results = df1.mul(df2['coefficient'], axis =0)
results
Try this using pandas intrinsic data alignment tenet:
df1.set_index('yyyyww').mul(df2.set_index('feature')['coefficient'])
Output:
feature1 feature2 feature3
yyyyww
2022-01 -1000.0 18000.0 1500.0
2022-02 -2000.0 14000.0 500.0
2022-03 -4000.0 6000.0 1000.0
2022-04 -3000.0 2000.0 2500.0
2022-05 -5000.0 4000.0 4500.0
2022-06 -2000.0 6000.0 3500.0
2022-07 -8000.0 12000.0 1000.0
2022-08 -2000.0 16000.0 1500.0
2022-09 -4000.0 2000.0 2500.0
2022-10 -3000.0 2000.0 4500.0