Below is my DataFrame:
import pandas as pd
from dateutil.relativedelta import relativedelta
df = pd.DataFrame({'Date': ['2014-03-27', '2014-03-28', '2014-03-31', '2014-04-01', '2014-04-02', '2014-04-03', '2014-04-04', '2014-04-07','2014-04-08', '2014-04-09']})
I can create a column holding the date two months prior to the date in each row:
df['Date1'] = df.Date.apply(lambda x: pd.to_datetime(x) + relativedelta(months=-2))
Is there a much faster way to perform the same task (vectorization, for instance)? My real DataFrame is much larger, around 30-40k rows.
You can use pandas' vectorized datetime functions:
df['Date1'] = pd.to_datetime(df.Date) - pd.DateOffset(months=2)
df
Out:
         Date      Date1
0  2014-03-27 2014-01-27
1  2014-03-28 2014-01-28
2  2014-03-31 2014-01-31
3  2014-04-01 2014-02-01
4  2014-04-02 2014-02-02
5  2014-04-03 2014-02-03
6  2014-04-04 2014-02-04
7  2014-04-07 2014-02-07
8  2014-04-08 2014-02-08
9  2014-04-09 2014-02-09
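
If you want to convince yourself that the vectorized version matches the row-by-row one, a quick check along these lines may help. This is only a sketch: the sample dates and the names dates, slow and fast are made up for illustration, and the month-end rows are there to show how dates with no counterpart in the target month are handled.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

# Hypothetical sample, including month-end dates
df = pd.DataFrame({'Date': ['2014-03-27', '2014-03-31', '2014-04-30', '2014-05-31']})

# Parse the whole column once instead of once per row
dates = pd.to_datetime(df['Date'])

# Row-by-row approach from the question
slow = dates.apply(lambda x: x + relativedelta(months=-2))

# Vectorized approach from the answer
fast = dates - pd.DateOffset(months=2)

# Both approaches should give the same dates
assert (slow == fast).all()

# 2014-04-30 minus two months has no Feb 30, so it lands on 2014-02-28
print(fast.dt.strftime('%Y-%m-%d').tolist())
```

Both approaches clamp to the last valid day of the target month, so swapping the apply call for DateOffset should not change your results, only the runtime.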