I have a pandas dataframe with a column of timestamps and a column of values, and I want to linearly interpolate the values onto a different set of timestamps. The dataframe looks like this:
                  timestamp            c0
0  2014-01-01T00:00:03.500Z  38605.324219
2  2014-01-01T00:00:21.500Z  37872.890625
4  2014-01-01T00:00:39.600Z  38124.664062
6  2014-01-01T00:00:57.600Z  38185.699219
8  2014-01-01T00:01:15.700Z  38460.367188
I wrote this function, which takes the original dataframe and should return the interpolated one:
def interp18to9(df):
    dates = pd.date_range(pd.to_datetime(df.iloc[0]['timestamp']),
                          pd.to_datetime(df.iloc[-1]['timestamp']), freq='9S')
    new_df = pd.DataFrame()
    new_df['timestamp'] = pd.to_datetime(dates)
    new_df['c0'] = np.interp(x=dates,
                             xp=pd.to_datetime(df.iloc[:]['timestamp']),
                             fp=df.iloc[:]['c0'])
    return new_df
I get an error which says:
TypeError: Cannot cast array data from dtype('<M8[ns]') to dtype('float64') according to the rule 'safe'
I couldn't find a solution by searching for previous cases. Thank you in advance.
How about using pandas' built-in functions instead:
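# parse the strings, drop the UTC marker, and 'floor' each timestamp to the whole second
df['timestamp'] = pd.to_datetime(df['timestamp']).dt.tz_localize(None).dt.floor('s')
# new 9-second range spanning the data
new_range = pd.date_range(df['timestamp'].iloc[0], df['timestamp'].iloc[-1], freq='9s')
# reindex onto the new range and interpolate the gaps
# (this works here because the floored timestamps all fall on the 9-second grid)
df.set_index('timestamp').reindex(new_range).interpolate().reset_index()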
Output:
+----+----------------------+--------------+
| | index | c0 |
+----+----------------------+--------------+
| 0 | 2014-01-01 00:00:03 | 38605.324219 |
| 1 | 2014-01-01 00:00:12 | 38239.107422 |
| 2 | 2014-01-01 00:00:21 | 37872.890625 |
| 3 | 2014-01-01 00:00:30 | 37998.777343 |
| 4 | 2014-01-01 00:00:39 | 38124.664062 |
| 5 | 2014-01-01 00:00:48 | 38155.181640 |
| 6 | 2014-01-01 00:00:57 | 38185.699219 |
| 7 | 2014-01-01 00:01:06 | 38323.033204 |
| 8 | 2014-01-01 00:01:15 | 38460.367188 |
+----+----------------------+--------------+
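If you would rather keep np.interp and the original (unfloored) timestamps, the TypeError can be avoided by handing np.interp plain numbers instead of datetime64 values. Below is a minimal sketch of that idea, not a drop-in replacement. The helper name interp_to_grid and its step parameter are made up for illustration. It assumes the timestamp column parses with pd.to_datetime and is already sorted in increasing order, which np.interp requires of xp.

```python
import numpy as np
import pandas as pd


def interp_to_grid(df, step=pd.Timedelta(seconds=9)):
    """Linearly interpolate 'c0' onto a regular time grid (hypothetical helper)."""
    ts = pd.to_datetime(df["timestamp"])
    t0 = ts.iloc[0]
    # Regular grid from the first timestamp up to (at most) the last one.
    grid = pd.date_range(t0, ts.iloc[-1], freq=step)
    # np.interp only accepts numbers, so express both time axes
    # as seconds elapsed since the first sample.
    xp = (ts - t0).dt.total_seconds().to_numpy()
    x = (grid - t0).total_seconds().to_numpy()
    c0 = np.interp(x, xp, df["c0"].to_numpy())
    return pd.DataFrame({"timestamp": grid, "c0": c0})


# Sample data copied from the question.
df = pd.DataFrame(
    {
        "timestamp": [
            "2014-01-01T00:00:03.500Z",
            "2014-01-01T00:00:21.500Z",
            "2014-01-01T00:00:39.600Z",
            "2014-01-01T00:00:57.600Z",
            "2014-01-01T00:01:15.700Z",
        ],
        "c0": [38605.324219, 37872.890625, 38124.664062,
               38185.699219, 38460.367188],
    },
    index=[0, 2, 4, 6, 8],
)

out = interp_to_grid(df)
print(out)
```

The grid here starts at the first real timestamp (00:00:03.500) rather than at a floored second, so later samples such as 00:00:39.600 no longer sit exactly on a grid point and the interpolated values differ slightly from the table above.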