
Month start and end dates for every month within a date range in Python


The input is a date range. For every month that the range touches, we need the first and the last date of that month that fall inside the range. An example is given below.

input:

  start date: 2018-6-15
  end date: 2019-3-20

desired output:

[ 
  ["month starting date","month ending date"],
  ["2018-6-15","2018-6-30"],
  ["2018-7-1","2018-7-31"],
  ["2018-8-1","2018-8-31"],
  ["2018-9-1","2018-9-30"],
  ["2018-10-1","2018-10-31"],
  ["2018-11-1","2018-11-30"],
  ["2018-12-1","2018-12-31"],
  ["2019-1-1","2019-1-31"],
  ["2019-2-1","2019-2-28"],
  ["2019-3-1","2019-3-20"]
]

Solution

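  • An option using pandas: create a date_range from the start date to the end date and extract the month numbers from it as a pandas.Series. Comparing that Series (!=) with a copy of itself shifted 1 element forward gives a boolean mask that is True on the first day of each month within the range; comparing it with a copy shifted 1 element backward gives the mask for the last days. With these two masks you can create a DataFrame to work with, or a list of lists if you like.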

    Ex:

    import pandas as pd
    
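    start_date, end_date = '2018-6-15', '2019-3-20'
    # one timestamp per day, both endpoints included
    dtrange = pd.date_range(start=start_date, end=end_date, freq='d')
    months = pd.Series(dtrange.month)
    
    # True where the month differs from the previous day (starts) / the next day (ends);
    # the first and last element always qualify because shift() fills with NaN
    starts, ends = months.ne(months.shift(1)), months.ne(months.shift(-1))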
    df = pd.DataFrame({'month_starting_date': dtrange[starts].strftime('%Y-%m-%d'),
                       'month_ending_date': dtrange[ends].strftime('%Y-%m-%d')})
    # df
    #   month_starting_date month_ending_date
    # 0          2018-06-15        2018-06-30
    # 1          2018-07-01        2018-07-31
    # 2          2018-08-01        2018-08-31
    # 3          2018-09-01        2018-09-30
    # 4          2018-10-01        2018-10-31
    # 5          2018-11-01        2018-11-30
    # 6          2018-12-01        2018-12-31
    # 7          2019-01-01        2019-01-31
    # 8          2019-02-01        2019-02-28
    # 9          2019-03-01        2019-03-20
    
    # as a list of lists:
    rows = [df.columns.values.tolist()] + df.values.tolist()
    # rows
    # [['month_starting_date', 'month_ending_date'],
    #  ['2018-06-15', '2018-06-30'],
    #  ['2018-07-01', '2018-07-31'],
    #  ['2018-08-01', '2018-08-31'],
    #  ['2018-09-01', '2018-09-30'],
    #  ['2018-10-01', '2018-10-31'],
    #  ['2018-11-01', '2018-11-30'],
    #  ['2018-12-01', '2018-12-31'],
    #  ['2019-01-01', '2019-01-31'],
    #  ['2019-02-01', '2019-02-28'],
    #  ['2019-03-01', '2019-03-20']]
    

    Note that I use strftime when I create the DataFrame. Do this if you want the output to be strings. If you want to continue working with datetime objects (timestamps), leave out the strftime calls.
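
For comparison, the same ranges can probably be produced without pandas. The following is only a minimal sketch using the standard library (datetime and calendar.monthrange), not the method of the answer above; the helper name month_ranges is hypothetical, and it assumes both endpoints are datetime.date objects with start not after end.

```python
import calendar
from datetime import date


def month_ranges(start, end):
    """Return [month_start, month_end] pairs covering start..end (inclusive).

    Hypothetical helper for illustration; assumes start <= end.
    """
    if start > end:
        raise ValueError("start must not be after end")
    ranges = []
    current = start
    while current <= end:
        # monthrange returns (weekday of the 1st, number of days in the month)
        last_day = calendar.monthrange(current.year, current.month)[1]
        # clip the month end to the end of the requested range
        month_end = min(date(current.year, current.month, last_day), end)
        ranges.append([current, month_end])
        # continue with the first day of the following month
        if current.month == 12:
            current = date(current.year + 1, 1, 1)
        else:
            current = date(current.year, current.month + 1, 1)
    return ranges


result = month_ranges(date(2018, 6, 15), date(2019, 3, 20))
# zero-padded ISO strings, e.g. '2018-06-15'
as_strings = [[s.isoformat(), e.isoformat()] for s, e in result]
```

This walks month by month instead of day by day, so it does less work for long ranges, but it returns plain date objects rather than a DataFrame.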