python pandas row variance deviation

Calculate variance and standard deviation of each 15 rows in Python


I have a dataframe that contains 300 height values, and I want to calculate the standard deviation and variance of each block of 15 rows, so I should end up with 20 variances and 20 standard deviations. So far I have written the script below, but it doesn't work. I think my problem is in how I call the standard deviation and variance functions, because the same script works fine when I calculate the mean and median. How can I fix it in Python? Thank you.

import statistics
grouper = df.groupby(df.index // 15)
df_var = grouper.agg( 
        statistics.pstdev(df["height"])
       ,statistics.stdev(df["height"])
)

Solution

  • Here is one way to approach the problem. First, I created a dummy dataset with a single column and 300 rows, populated with random integers:

    import pandas as pd
    import numpy as np
    
    df = pd.DataFrame(np.random.randint(0,100, size=(300,1)))
    

    If all you need is the standard deviation and variance of every 15 rows, and your dataset is always 300 rows long, you can use the method below.

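    std_val = []
    var_val = []
    
    # step through the rows in non-overlapping blocks of 15 (0, 15, 30, ...)
    for i in range(0, len(df), 15):
        df_sub = df.iloc[i:i+15]  # positional slice: rows i .. i+14
        std = df_sub.std(axis=0)  # sample standard deviation (ddof=1)
        std_val.append(std)
        var = df_sub.var(axis=0)  # sample variance (ddof=1)
        var_val.append(var)
    
    print(std_val, var_val) # print list of all values (20 of each)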
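
As an alternative to the explicit loop, the `groupby` idea from the question can be kept by passing aggregation methods instead of calling `statistics.pstdev` on the whole column up front. The following is a minimal sketch, not the asker's final script: the column name `height` is taken from the question, while the random data, the seed, and the output column names are assumptions made for illustration. Note that pandas uses the sample formulas (`ddof=1`) by default, which correspond to `statistics.stdev` and `statistics.variance`; passing `ddof=0` gives the population versions, which correspond to `statistics.pstdev` and `statistics.pvariance`.

```python
import numpy as np
import pandas as pd

# Dummy data (assumption): 300 random heights in a column named "height"
rng = np.random.default_rng(0)
df = pd.DataFrame({"height": rng.integers(0, 100, size=300)})

# Label each row with its block number: rows 0-14 -> 0, rows 15-29 -> 1, ...
# Using the positional row number avoids relying on the index being 0..299.
block = np.arange(len(df)) // 15

grouped = df.groupby(block)["height"]

# One row per block of 15, with sample and population statistics side by side
result = pd.DataFrame({
    "std_sample": grouped.std(),             # ddof=1, like statistics.stdev
    "var_sample": grouped.var(),             # ddof=1, like statistics.variance
    "std_population": grouped.std(ddof=0),   # like statistics.pstdev
    "var_population": grouped.var(ddof=0),   # like statistics.pvariance
})

print(result)
```

With 300 rows this produces a table of 20 rows, one per block, which is usually easier to work with than two Python lists.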