I'm new to pandas and Python, and I'm struggling to implement loops in my code. I hope that someone can help me.
I have the following DataFrame:
import pandas as pd
from numpy import nan
from pandas import Timestamp
df = pd.DataFrame({'DateTime': {0: Timestamp('2021-06-13 00:00:00'),
1: Timestamp('2021-06-13 02:00:00'),
2: Timestamp('2021-06-13 05:00:00'),
3: Timestamp('2021-06-13 07:00:00'),
4: Timestamp('2021-06-13 10:00:00')},
'actual_value': {0: 180.0949105082311,
1: 183.93185469787613,
2: 191.48399886639095,
3: 188.31358023933768,
4: 159.32768035801615},
'forecast_0': {0: nan,
1: 185.0,
2: 206.0,
3: 193.0,
4: 130.0},
'forecast_1': {0: 187.0,
1: 185.0,
2: 206.0,
3: 192.0,
4: 130.0},
'forecast_2': {0: 186.0,
1: nan,
2: 200.0,
3: 192.0,
4: nan},
'forecast_3': {0: 186.0,
1: 185.0,
2: 200.0,
3: 192.0,
4: 130.0},
'forecast_4': {0: 186.0,
1: 183.0,
2: 200.0,
3: 188.0,
4: 130.0}})
DateTime actual_value forecast_0 forecast_1 forecast_2 \
0 2021-06-13 00:00:00 180.094911 NaN 187.0 186.0
1 2021-06-13 02:00:00 183.931855 185.0 185.0 NaN
2 2021-06-13 05:00:00 191.483999 206.0 206.0 200.0
3 2021-06-13 07:00:00 188.313580 193.0 192.0 192.0
4 2021-06-13 10:00:00 159.327680 130.0 130.0 NaN
forecast_3 forecast_4
0 186.0 186.0
1 185.0 183.0
2 200.0 200.0
3 192.0 188.0
4 130.0 130.0
I want to create a new DataFrame, or replace the numbers in the existing one, using a simple calculation: the deviation of each forecast value relative to the actual value in the second column. Since there are over 40 such forecast columns, it is too time-consuming to write out the calculation for every column, so I would like to use a loop. I tried the following code, which didn't work:
for i, col in enumerate(df.columns, -2):
    df[col] = (df[col]-df['actual_value'])/df['actual_value']
I get the error that 'subtract' cannot use operands with types dtype('<M8[ns]') and dtype('float64'). Does anyone have an idea how to solve this issue? I'm thankful for every message.
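The error 'subtract' cannot use operands with types dtype('<M8[ns]') and dtype('float64') occurs because your loop iterates over every column, including the first one, DateTime, and tries to subtract the float actual_value column from that datetime column. The -2 passed to enumerate only changes the starting value of the counter i; it does not skip any columns.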
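To do this correctly, you could change your loop header to for col in df.columns[2:]: so that it skips the first two columns (DateTime and actual_value) and only touches the forecast columns.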
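A minimal sketch of the corrected loop follows. The small df here is a stand-in with made-up round numbers and only two forecast columns, not the data from the question; it assumes, as in the question, that DateTime and actual_value are the first two columns.

```python
import numpy as np
import pandas as pd

# Stand-in frame (illustrative values, not the question's data).
df = pd.DataFrame({
    "DateTime": pd.to_datetime(["2021-06-13 00:00:00", "2021-06-13 02:00:00"]),
    "actual_value": [180.0, 200.0],
    "forecast_0": [np.nan, 190.0],
    "forecast_1": [189.0, 210.0],
})

# Skip the first two columns (DateTime, actual_value) by position,
# then overwrite each forecast column with its relative deviation.
for col in df.columns[2:]:
    df[col] = (df[col] - df["actual_value"]) / df["actual_value"]

print(df)
```

NaN forecasts stay NaN, since arithmetic with NaN propagates it. Note that slicing by position relies on the column order; if that order might change, selecting the columns by name is safer.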
That said, I agree with the other solutions posted here: it is more elegant to do this without a loop.
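One possible loop-free version is sketched below. It assumes every forecast column name contains "forecast_" (true for the sample in the question, but an assumption for the remaining 40+ columns), and it again uses a small stand-in df with made-up values.

```python
import numpy as np
import pandas as pd

# Stand-in frame (illustrative values, not the question's data).
df = pd.DataFrame({
    "DateTime": pd.to_datetime(["2021-06-13 00:00:00", "2021-06-13 02:00:00"]),
    "actual_value": [180.0, 200.0],
    "forecast_0": [np.nan, 190.0],
    "forecast_1": [189.0, 210.0],
})

# Select the forecast columns by name rather than by position.
forecasts = df.filter(like="forecast_")

# Subtract and divide row-wise: axis=0 aligns the Series on the row index.
deviation = forecasts.sub(df["actual_value"], axis=0).div(df["actual_value"], axis=0)

# Write the result back into the original frame
# (or keep `deviation` as a separate DataFrame instead).
df[deviation.columns] = deviation

print(df)
```

Passing axis=0 matters here: without it, pandas would try to align the Series index with the column labels instead of the rows.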