The input is a date range. For every month that the range touches, I need the first and the last date of that month that fall inside the range. An example is given below.
input:
start date: 2018-6-15
end date: 2019-3-20
desired output:
[
["month starting date","month ending date"],
["2018-6-15","2018-6-30"],
["2018-7-1","2018-7-31"],
["2018-8-1","2018-8-31"],
["2018-9-1","2018-9-30"],
["2018-10-1","2018-10-31"],
["2018-11-1","2018-11-30"],
["2018-12-1","2018-12-31"],
["2019-1-1","2019-1-31"],
["2019-2-1","2019-2-28"],
["2019-3-1","2019-3-20"]
]
An option using pandas: create a date_range from the start date to the end date, extract the month numbers from it as a pandas.Series, and compare that Series with copies of itself shifted 1 element forward and 1 element backward. The two comparisons (!=) give boolean masks that are True wherever the month changes, i.e. on the first and on the last day of each month segment. With these masks you can create a DataFrame to work with, or a list of lists if you like.
Ex:
import pandas as pd
start_date, end_date = '2018-6-15', '2019-3-20'
dtrange = pd.date_range(start=start_date, end=end_date, freq='D')
months = pd.Series(dtrange.month)
starts, ends = months.ne(months.shift(1)), months.ne(months.shift(-1))
df = pd.DataFrame({'month_starting_date': dtrange[starts].strftime('%Y-%m-%d'),
'month_ending_date': dtrange[ends].strftime('%Y-%m-%d')})
# df
# month_starting_date month_ending_date
# 0 2018-06-15 2018-06-30
# 1 2018-07-01 2018-07-31
# 2 2018-08-01 2018-08-31
# 3 2018-09-01 2018-09-30
# 4 2018-10-01 2018-10-31
# 5 2018-11-01 2018-11-30
# 6 2018-12-01 2018-12-31
# 7 2019-01-01 2019-01-31
# 8 2019-02-01 2019-02-28
# 9 2019-03-01 2019-03-20
# as a list of lists:
l = [df.columns.values.tolist()] + df.values.tolist()
# l
# [['month_starting_date', 'month_ending_date'],
# ['2018-06-15', '2018-06-30'],
# ['2018-07-01', '2018-07-31'],
# ['2018-08-01', '2018-08-31'],
# ['2018-09-01', '2018-09-30'],
# ['2018-10-01', '2018-10-31'],
# ['2018-11-01', '2018-11-30'],
# ['2018-12-01', '2018-12-31'],
# ['2019-01-01', '2019-01-31'],
# ['2019-02-01', '2019-02-28'],
# ['2019-03-01', '2019-03-20']]
Note that I use strftime when I create the DataFrame. Do this if you want the output to be of dtype string. If you want to continue working with datetime objects (timestamps), don't apply strftime.
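As a rough sketch of the variant without strftime: the columns stay as timestamps, so date arithmetic still works, and strings are only produced at the very end. The helper names month_bounds and fmt and the extra days column are made up for illustration and are not part of the code above. The fmt helper is one possible way to get the unpadded form used in the question ("2018-6-15" rather than "2018-06-15").

```python
import pandas as pd


def month_bounds(start_date, end_date):
    """Hypothetical helper: one row of Timestamps per month segment in the range."""
    dtrange = pd.date_range(start=start_date, end=end_date, freq='D')
    months = pd.Series(dtrange.month)
    # True where the month differs from the previous day / the next day
    starts = months.ne(months.shift(1)).to_numpy()
    ends = months.ne(months.shift(-1)).to_numpy()
    return pd.DataFrame({'month_starting_date': dtrange[starts],
                         'month_ending_date': dtrange[ends]})


df_ts = month_bounds('2018-6-15', '2019-3-20')

# the columns are datetime64, so arithmetic works, e.g. days covered per row
df_ts['days'] = (df_ts['month_ending_date']
                 - df_ts['month_starting_date']).dt.days + 1


def fmt(ts):
    """Hypothetical helper: format a Timestamp without zero padding."""
    return f"{ts.year}-{ts.month}-{ts.day}"


# convert to strings only at the end, in the style of the question's output
rows = [[fmt(s), fmt(e)]
        for s, e in zip(df_ts['month_starting_date'],
                        df_ts['month_ending_date'])]
```

Here rows holds one [start, end] pair of strings per month segment, and df_ts keeps the timestamps for any further date calculations.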