python dataframe datetime

Find the timestamp from which a dataset becomes monotonic


I have a dataframe which contains an altitude column. At first the altitude stays roughly constant, apart from GPS signal fluctuations. At some point it starts to change, when the GPS receiver is carried up a mountain. How can I find the timestamp at which the altitude starts to rise continuously? A simple 'larger than' comparison does not work, as the altitude jumps around a little at the beginning during warm-up.

An example time series:

import pandas as pd
import numpy as np
 
df_GPS = pd.DataFrame([['2024-06-21 06:22:38', 605.968389],
                     ['2024-06-21 06:22:39', 606.009398],
                     ['2024-06-21 06:22:40', 605.630573],
                     ['2024-06-21 06:22:41', 605.476367 ],
                     ['2024-06-21 06:22:42', 605.322161],
                     ['2024-06-21 06:22:43', 605.268389],
                     ['2024-06-21 06:22:44', 605.559398],
                     ['2024-06-21 06:22:45', 606.630573],
                     ['2024-06-21 06:22:46', 607.476367 ],
                     ['2024-06-21 06:22:47', 609.322161],
                    ], columns=['time', 'Altitude'])


Ideally the result would be time = 2024-06-21 06:22:43, as the altitude rises monotonically from there.

My code is still faulty:

start_move = df_GPS.loc[df_GPS['Altitude'] == df_GPS["Altitude"].is_monotonic_increasing, 'time'] 
print(start_move)    

Where is my mistake?


Solution
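
As for why the attempt in the question selects no rows: `is_monotonic_increasing` is a property that returns a single boolean for the whole column, not one value per row. Comparing the float altitudes with that boolean is therefore false everywhere. A minimal sketch of this, using a shortened copy of the question's data (the names `flag` and `mask` are only illustrative):

```python
import pandas as pd

# Shortened copy of the data from the question
df_GPS = pd.DataFrame(
    [['2024-06-21 06:22:42', 605.322161],
     ['2024-06-21 06:22:43', 605.268389],
     ['2024-06-21 06:22:44', 605.559398],
     ['2024-06-21 06:22:45', 606.630573]],
    columns=['time', 'Altitude'])

# One boolean for the whole column, not one value per row
flag = df_GPS['Altitude'].is_monotonic_increasing
print(flag)  # False here, because the altitude drops between the first two rows

# Comparing each float altitude with that single boolean never matches,
# so the boolean mask is False for every row and .loc returns nothing
mask = df_GPS['Altitude'] == flag
print(mask.any())
```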

  • Assuming time has been converted to datetime / timestamp, you can simply look backwards from the end to find the point where the data stops being monotonic. If the time data still needs to be sorted, first sort it in descending order:

    df_GPS_rev = df_GPS.sort_values(by = 'time', ascending = False)
    

    Alternatively, if the data is already in time order, the only requirement is to reverse the existing order, so reindexing is enough:

    df_GPS_rev = df_GPS.reindex(index = df_GPS.index[::-1])
    

    then

    start_move = df_GPS_rev[df_GPS_rev['Altitude'] < df_GPS_rev['Altitude'].shift(-1)].iloc[0]
    
    print(start_move['time'])
    

    gives:

    2024-06-21 06:22:43
    

    If (as the OP later clarified in comments) the data does NOT keep increasing until the end but, for example, decreases again later on, rolling windows can be checked for being monotonically increasing instead:

    # define window size
    win = 3
    
    # flag windows which are monotonic increasing (1.0 = yes, 0.0 = no, NaN = incomplete window)
    df_GPS['win_inc'] = df_GPS.rolling(win)['Altitude'].apply(lambda x: x.is_monotonic_increasing)
    
    # get index of end of first increasing window
    idx = df_GPS[df_GPS['win_inc'].eq(True)].index[0]
    
    # get time at start of first increasing window (assumes the default 0..n-1 index)
    start_move = df_GPS.loc[idx - win + 1, 'time']
    print(start_move)
    

    which again gives:

    2024-06-21 06:22:43
    

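    A faster but less concise approach could use NumPy. Note that this version checks for strictly increasing values, whereas `is_monotonic_increasing` also accepts equal neighbours: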

    # define window size
    win = 3
    # create numpy array
    alt = df_GPS['Altitude'].to_numpy()
    
    # search through np array looking for first strictly increasing window
    start_idx = None
    for idx in range(len(alt) - win + 1):
        if np.all(np.diff(alt[idx:idx + win]) > 0):
            start_idx = idx
            break
    
    if start_idx is not None:
        # start_idx is a position; it equals the label only with the default index
        start_move = df_GPS.loc[start_idx, 'time']
        print(start_move)
    else:
        print('no increasing windows')
    

    which again gives the same result.
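
The window search can also be written without a Python loop. The following is a minimal sketch rather than part of the original answer: it assumes NumPy 1.20 or later (for `sliding_window_view`), and the helper name `first_rising_window` is made up for illustration. Like the loop version, it requires strictly increasing values inside the window.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

# Data from the question, with the time column converted to timestamps
df_GPS = pd.DataFrame(
    [['2024-06-21 06:22:38', 605.968389],
     ['2024-06-21 06:22:39', 606.009398],
     ['2024-06-21 06:22:40', 605.630573],
     ['2024-06-21 06:22:41', 605.476367],
     ['2024-06-21 06:22:42', 605.322161],
     ['2024-06-21 06:22:43', 605.268389],
     ['2024-06-21 06:22:44', 605.559398],
     ['2024-06-21 06:22:45', 606.630573],
     ['2024-06-21 06:22:46', 607.476367],
     ['2024-06-21 06:22:47', 609.322161]],
    columns=['time', 'Altitude'])
df_GPS['time'] = pd.to_datetime(df_GPS['time'])


def first_rising_window(values, win=3):
    """Hypothetical helper: position of the first window of `win`
    strictly increasing values, or None if there is no such window."""
    steps = np.diff(values) > 0  # True where a sample is higher than the previous one
    if win < 2 or len(steps) < win - 1:
        return None
    # A window of `win` samples rises when all of its win-1 steps are positive
    rising = sliding_window_view(steps, win - 1).all(axis=1)
    return int(rising.argmax()) if rising.any() else None


pos = first_rising_window(df_GPS['Altitude'].to_numpy(), win=3)
# Use iloc because `pos` is a position, not an index label
start_move = df_GPS['time'].iloc[pos] if pos is not None else None
print(start_move)
```

On the example data this selects the row at 2024-06-21 06:22:43, matching the approaches above. A larger `win` makes the search less sensitive to short upward jumps during warm-up, at the cost of needing a longer uninterrupted rise.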