python pandas data-science

pandas calculate delta time


Here's some code that generates random data and plots it, with dashed lines marking the 30th and 90th percentiles.

import pandas as pd
import numpy as np
from numpy.random import randint
import matplotlib.pyplot as plt
%matplotlib inline

np.random.seed(10)  # added for reproducibility

rng = pd.date_range('10/9/2018 00:00', periods=10, freq='1H')
df = pd.DataFrame({'Random_Number':randint(1, 100, 10)}, index=rng)
df.plot()

plt.axhline(df.quantile(0.3)[0], linestyle="--", color="g")
plt.axhline(df.quantile(0.90)[0], linestyle="--", color="r")

plt.show()

Outputs: (minus the highlighted part of the chart)

[image: line chart of Random_Number with dashed green (30th percentile) and red (90th percentile) lines]

I'm trying to figure out whether it's possible to calculate how long the data takes to go from the green line to the red line (the part highlighted in yellow).

I can do it by entering the threshold values manually:

minStart = df.loc[df['Random_Number'] < 18].index[0]

maxStart = df.loc[df['Random_Number'] > 90].index[0]

hours = maxStart - minStart
hours

Which will output:

Timedelta('0 days 05:00:00')

But if I attempt to use:

minStart = df.loc[df['Random_Number'] < df.quantile(0.3)].index[0]

maxStart = df.loc[df['Random_Number'] > df.quantile(0.90)].index[0]

hours = maxStart - minStart
hours

This throws a ValueError: Can only compare identically-labeled Series objects

Is there a better method to this madness? Ideally I'd like an algorithm that calculates the time delta it takes to go from the 30th to the 90th percentile, and then the delta back from the 90th to the 30th, but I may have to put some more thought into how that could be accomplished.


Solution

  • minStart = df.loc[df['Random_Number'] < df.quantile(0.3)[0]].index[0]
    
    maxStart = df.loc[df['Random_Number'] > df.quantile(0.90)[0]].index[0]
    
    hours = maxStart - minStart
    hours
    

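    Called on a DataFrame, df.quantile(q) returns a Series with one value per column rather than a single number. Comparing df['Random_Number'] against that Series is what raises the label-mismatch error, so you need to take its first entry to get a scalar threshold.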
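
    As an alternative sketch (not part of the answer above), calling quantile on the column itself returns a scalar directly, so no [0] is needed. The data below is hand-picked for illustration and is not the question's random data. Restricting the search for the upper crossing to timestamps at or after the lower one is an assumption about what "from green to red" should mean.

```python
import pandas as pd

# Hypothetical hourly readings, hand-picked for illustration
index = pd.date_range("2018-10-09 00:00", periods=10, freq=pd.Timedelta(hours=1))
df = pd.DataFrame(
    {"Random_Number": [40, 10, 20, 55, 60, 70, 95, 30, 50, 80]},
    index=index,
)

s = df["Random_Number"]

# Series.quantile with a single q returns a scalar threshold
low = s.quantile(0.3)
high = s.quantile(0.9)

# First timestamp where the value is below the 30th percentile
min_start = s[s < low].index[0]

# First timestamp at or after min_start where the value is above
# the 90th percentile (assumption: only look forward in time)
after = s.loc[min_start:]
max_start = after[after > high].index[0]

hours = max_start - min_start
print(hours)
```

    With this sample data the thresholds work out to 37 and 81.5, and the delta is five hours. Note that index[0] raises an IndexError if no row satisfies the condition, so real data may need a guard for that case.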