I have a pandas dataframe with id and date as the 2 columns - the date column has all the way to seconds.
data = {'id':[17,17,17,17,17,18,18,18,18],'date':['2018-01-16','2018-01-26','2018-01-27','2018-02-11',
'2018-03-14','2018-01-28','2018-02-12','2018-02-25','2018-03-04'],
}
df1 = pd.DataFrame(data)
I would like to have a new column - (tslt) - 'time_since_last_transaction'. The first transaction for each unique user_id could be a number say 1. Each subsequent transaction for that user should measure the difference between the 1st time stamp for that user and its current time stamp to generate a time difference in seconds.
I used the datetime and timedelta etc. but did not have too much of luck. Any help would be appreciated.
You can try groupby().transform()
:
df1['date'] = pd.to_datetime(df1['date'])
df1['diff'] = df1['date'].sub(df1.groupby('id').date.transform('min')).dt.total_seconds()
Output:
id date diff
0 17 2018-01-16 0.0
1 17 2018-01-26 864000.0
2 17 2018-01-27 950400.0
3 17 2018-02-11 2246400.0
4 17 2018-03-14 4924800.0
5 18 2018-01-28 0.0
6 18 2018-02-12 1296000.0
7 18 2018-02-25 2419200.0
8 18 2018-03-04 3024000.0