I am trying to generate last day of the month for a date field in my dataframe
:
so there is field start date, I want to add say 5 columns like
nep_0
, nep_1
, nep_2
, nep_3
, nep_4
.
So for nep_0
logic is as below:
last day ( policy start date+ relative months (0))
For nep_1
:
last day(policy start date+ relative months (1)).
Was using this:
date_after_month =pol_start_date + relativedelta(months=i) nep_date=datetime.datetime(date_after_month.year,date_after_month.month,calendar.monthrange(date_after_month.year, date_after_month.month)[1])
But this used to iterate through each row of the data frame. My data frame has 1.5 million rows and it is taking a lot of time. can you tell me a better way to achieve the same. may be using apply or map.
I would also like these columns (nep_0
, nep_1
) to be created dynamically as in not passing relative months argument of 0,1,2,. they should also be looped through.
You can use MonthEnd
and rollforward
:
from pandas.tseries.offsets import MonthEnd
from datetime import datetime
month_count = 5 #define count of months here
cols = ["nep_" + str(i) for i in range(month_count)] #['nep_0', 'nep_1', 'nep_2', 'nep_3', 'nep_4']
for n,i in enumerate(cols):
if i == 'nep_0':
df[i] = df["dates"].apply(lambda x: MonthEnd().rollforward(x)) #get current month's last day.
else:
df[i] = df["nep_0"] + MonthEnd(n)
Testing:
df = pd.DataFrame({"dates":[datetime(2023,10,31),datetime(2023,10,30)]})
'''
dates
0 2023-10-31
1 2023-10-30
'''
Now let's use Monthend:
for n,i in enumerate(cols):
if i == 'nep_0':
df[i] = df["dates"].apply(lambda x: MonthEnd().rollforward(x)) #get current month's last day.
else:
df[i] = df["nep_0"] + MonthEnd(n)
Out::
dates nep_0 nep_1 nep_2 nep_3 nep_4
0 2023-10-31 2023-10-31 2023-11-30 2023-12-31 2024-01-31 2024-02-29
1 2023-10-30 2023-10-31 2023-11-30 2023-12-31 2024-01-31 2024-02-29