
Python Pandas - Get the rows of first and last day of particular months


My data set df looks as follows:

Date         Value
...
2012-07-31   61.9443
2012-07-30   62.1551
2012-07-27   62.3328
...          ... 
2011-10-04   48.3923
2011-10-03   48.5939
2011-09-30   50.0327
2011-09-29   51.8350
2011-09-28   50.5555
2011-09-27   51.8470
2011-09-26   49.6350
...          ...
2011-08-03   61.3948
2011-08-02   61.5476
2011-08-01   64.1407
2011-07-29   65.0364
2011-07-28   65.7065
2011-07-27   66.3463
2011-07-26   67.1508
2011-07-25   67.5577
...          ...
2010-10-05   57.3674
2010-10-04   56.3687
2010-10-01   57.6022
2010-09-30   58.0993
2010-09-29   57.9934

Below are the data types of the two columns:

Type                 Column Name              Example Value
-----------------------------------------------------------------
datetime64[ns]       Date                     2020-06-19 00:00:00
float64              Value                    108.82

I would like a subset of df that contains only the first entry in October and the last entry in July of each year:

Date         Value
...
2012-07-31   61.9443
2011-10-03   48.5939
2011-07-29   65.0364
2010-10-01   57.6022

Any idea how to do that?


Solution

  • An elegant solution without groupby, working directly on the sorted DataFrame:

       import pandas as pd
       # Convert Date to datetime (a no-op if it already is) and sort ascending
       df['Date'] = pd.to_datetime(df['Date'])
       df = df.sort_values(by='Date')
       month = df['Date'].dt.month
       # Last July row of each year: month is 7 and the next row is not in July
       # (assumes rows from other months lie between one July and the next)
       jul = df[(month == 7) & (month.shift(-1) != 7)]
       # First October row of each year: month is 10 and the previous row is not in October
       oct_rows = df[(month == 10) & (month.shift(1) != 10)]
       # Finally concatenate both, newest first as in the question
       pd.concat([jul, oct_rows]).sort_values(by='Date', ascending=False).reset_index(drop=True)
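
For comparison, here is a minimal groupby-based sketch of the per-year selection the question asks for (last July row and first October row in each year). The DataFrame is rebuilt from a handful of the rows shown in the question, and `first_oct_last_jul` is a hypothetical helper name chosen for illustration, not something from the original answer.

```python
import pandas as pd

# A few rows copied from the question's sample data (illustration only)
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2012-07-31", "2012-07-30", "2012-07-27",
        "2011-10-04", "2011-10-03", "2011-09-30",
        "2011-08-01", "2011-07-29", "2011-07-28",
        "2010-10-05", "2010-10-04", "2010-10-01", "2010-09-30",
    ]),
    "Value": [61.9443, 62.1551, 62.3328,
              48.3923, 48.5939, 50.0327,
              64.1407, 65.0364, 65.7065,
              57.3674, 56.3687, 57.6022, 58.0993],
})


def first_oct_last_jul(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the last July row and the first October row of every year.

    Hypothetical helper: assumes a datetime64 'Date' column.
    """
    frame = frame.sort_values("Date")
    jul = frame[frame["Date"].dt.month == 7]
    octo = frame[frame["Date"].dt.month == 10]
    # tail(1) / head(1) keep the last / first row of each year group
    last_jul = jul.groupby(jul["Date"].dt.year).tail(1)
    first_oct = octo.groupby(octo["Date"].dt.year).head(1)
    # Newest first, matching the ordering used in the question
    return (
        pd.concat([last_jul, first_oct])
        .sort_values("Date", ascending=False)
        .reset_index(drop=True)
    )


result = first_oct_last_jul(df)
print(result)
```

Grouping by year makes the per-year intent explicit and does not depend on rows from other months separating one July (or October) from the next.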