
Datetime objects with pandas mean function


I am new to programming, so I apologize in advance if this question does not make any sense. I noticed that when I try to calculate the mean of a pandas DataFrame that has a column of datetime objects, such as datetime.datetime(2014, 7, 10), the mean is not calculated for that column. However, the minimum and maximum of the same DataFrame are calculated without a problem.

import datetime
import pandas as pd
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']), 'two': pd.Series([datetime.datetime(2014, 7, 9), datetime.datetime(2014, 7, 10), datetime.datetime(2014, 7, 11)], index=['a', 'b', 'c'])}
df = pd.DataFrame(d)

df
Out[18]: 
   one        two
a    1 2014-07-09
b    2 2014-07-10
c    3 2014-07-11

df.min()
Out[19]: 
   one             1
   two    2014-07-09
dtype: object

df.mean()
Out[20]: 
   one    2
dtype: float64

I did notice that the min and max functions converted all the columns to objects, whereas the mean function only outputs floats. Could anyone explain to me why the mean function can only handle floats? Is there another way to get the mean values of a data frame with a datetime column? I can work around it by using epoch time (as an integer), but it would be very convenient if there were a direct way. I use Python 2.7.

I am grateful for any hints.


Solution

  • You can average the offsets from the earliest date, which are datetime.timedelta values, and add the result back to that date:

    import functools
    import operator
    import datetime
    
    import pandas as pd
    
    d={'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']), 'two' :pd.Series([datetime.datetime(2014, 7, 9) , datetime.datetime(2014, 7, 10) , datetime.datetime(2014, 7, 11) ], index=['a', 'b', 'c'])}
    df = pd.DataFrame(d)
    
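    def avg_datetime(series):
        # Datetimes cannot be added together, but their differences can.
        dt_min = series.min()
        # Offset of each value from the earliest date (timedelta-like).
        deltas = [x - dt_min for x in series]
        # Mean offset, added back to the earliest date.
        return dt_min + functools.reduce(operator.add, deltas) / len(deltas)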
    
    print(avg_datetime(df['two']))
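
The same idea can probably be written without the explicit loop, since subtracting a single timestamp from a datetime Series gives a Series of timedeltas, and pandas can take the mean of those. The following is only a sketch under that assumption; the name avg_datetime_vectorized is made up for illustration and is not part of pandas.

```python
import datetime

import pandas as pd


def avg_datetime_vectorized(series):
    """Mean of a datetime Series via offsets from its earliest value.

    Hypothetical helper for illustration; not a pandas function.
    """
    # Make sure the values are pandas datetimes, not generic objects.
    series = pd.to_datetime(series)
    dt_min = series.min()
    # (series - dt_min) is a timedelta Series, which supports .mean().
    return dt_min + (series - dt_min).mean()


dates = pd.Series(
    [
        datetime.datetime(2014, 7, 9),
        datetime.datetime(2014, 7, 10),
        datetime.datetime(2014, 7, 11),
    ],
    index=['a', 'b', 'c'],
)
result = avg_datetime_vectorized(dates)
print(result)
```

For the three dates from the question, this should give the middle date, 2014-07-10, the same as the loop-based version above.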