pythonpandasdateweatherdata

How to transform wide date data to long format in Pandas


I have a large weather dataset in this format:

'Daily Mean Temp for place name 2015'   # One table title per year
'Day'  'JAN'  'FEB'  'MAR'  ...  'DEC'
 1      23     26     21    ...   14
 2      20     30     22    ...   12
 3      26     27     22    ...   16
 ...    ...    ...    ...   ...   ...
 31     28     -      19    ...   11

And I want to get it into this format:

'date'         'mean_temp'
2015-01-01      23
2015-01-02      20
2015-01-03      26

I have been unable to find a solution and would appreciate any ideas?


Solution

  • First map your months and then unpivot your df

    import padas as pd
    
    df.columns = [col.title() for col in df.columns]
    df_unpivot = df.melt(id_vars=["Day"], var_name="month", value_name="mean_temp")
    

    Then add new column with your date (if date is not exists it is naT)

    df_unpivot['date'] = pd.to_datetime(df_unpivot["Day"].map(str) + "-" + df_unpivot["month"] + "-2015", format='%d-%b-%Y', errors="coerce")
    

    Drop unneeded columns & invalid dates

    df_unpivot.drop(["Day", "month"], axis=1, inplace=True)
    df_unpivot.dropna(inplace=True)
    

    Set date as index

    df_unpivot.set_index("date", inplace=True)