I have a dataframe:
df = pd.DataFrame({
'BU': ['AA', 'AA', 'AA', 'BB', 'BB', 'BB'],
'Line_Item': ['Revenues','EBT', 'Expenses', 'Revenues', 'EBT', 'Expenses'],
'201901': [100, 120, 130, 200, 190, 210],
'201902': [100, 120, 130, 200, 190, 210],
'201903': [200, 250, 450, 120, 180, 190],
'202001': [200, 250, 450, 120, 180, 190],
'202002': [200, 250, 450, 120, 180, 190],
'202003': [200, 250, 450, 120, 180, 190]
})
The columns represent years and months respectively. I would like to sum the columns for months into a new columns for the year. The result should look like the following:
df = pd.DataFrame({
'BU': ['AA', 'AA', 'AA', 'BB', 'BB', 'BB'],
'Line_Item': ['Revenues','EBT', 'Expenses', 'Revenues', 'EBT', 'Expenses'],
'201901': [100, 120, 130, 200, 190, 210],
'201902': [100, 120, 130, 200, 190, 210],
'201903': [200, 250, 450, 120, 180, 190],
'202001': [200, 250, 450, 120, 180, 190],
'202002': [200, 250, 450, 120, 180, 190],
'202003': [200, 250, 450, 120, 180, 190],
'2019': [400, 490, 710, 520, 560, 610],
'2020': [600, 750, 1350, 360, 540, 570]
})
My actual dataset has a number of years and has 12 months for each year. Hoping not to have to add the columns manually.
Try creating a DataFrame that contains the year columns and convert the column names to_datetime
:
data_df = df.iloc[:, 2:]
data_df.columns = pd.to_datetime(data_df.columns, format='%Y%m')
2019-01-01 2019-02-01 2019-03-01 2020-01-01 2020-02-01 2020-03-01
0 100 100 200 200 200 200
1 120 120 250 250 250 250
2 130 130 450 450 450 450
3 200 200 120 120 120 120
4 190 190 180 180 180 180
5 210 210 190 190 190 190
resample sum
the columns by Year and rename
columns to just the year values:
data_df = (
data_df.resample('Y', axis=1).sum().rename(columns=lambda c: c.year)
)
2019 2020
0 400 600
1 490 750
2 710 1350
3 520 360
4 560 540
5 610 570
Then join
back to the original DataFrame:
new_df = df.join(data_df)
new_df
:
BU Line_Item 201901 201902 201903 202001 202002 202003 2019 2020
0 AA Revenues 100 100 200 200 200 200 400 600
1 AA EBT 120 120 250 250 250 250 490 750
2 AA Expenses 130 130 450 450 450 450 710 1350
3 BB Revenues 200 200 120 120 120 120 520 360
4 BB EBT 190 190 180 180 180 180 560 540
5 BB Expenses 210 210 190 190 190 190 610 570