python csv pandas

Python: store new values in a CSV column with pandas


I'm new to Python and still learning it. I have CSV data with 5 columns (TC, TD, TG, P, CTTR), and I want to store new values in the CTTR column.

The new formula for the CTTR column is

    new_CTTR[i] = (CTTR[i-2] + 2*CTTR[i-1] + 4*CTTR[i] + 2*CTTR[i+1] + CTTR[i+2]) / 10

where i is the row number.

I do not know how to write this formula in Python, though. I do know how to read the original CTTR values with pandas:

    import pandas as pd

    data_df = pd.read_csv("output1.csv")
    cttr = data_df['CTTR']
    print(cttr)

I have 12 rows, and I would like the output to have the same number of rows as the input. At index 0, cttr[i-2] and cttr[i-1] should be treated as 0, and at index 11, cttr[i+1] and cttr[i+2] should be treated as 0 (any neighbour that falls outside the data counts as 0). I would then like to replace the original CTTR values with the new ones in the CSV.

The original CTTR values:

0
2
2
23
18
28
27
58
41
12
35
20

The output I expect:

0.6  -----> (0 + 2*0 + 4*0 + 2*2 + 2)/10
3.5  -----> (0 + 2*0 + 4*2 + 2*2 + 23)/10
7.6
16.2
20.3
28.3
33.9
40.8
36.6
27.8
24.5 -----> (41 + 2*12 + 4*35 + 2*20 + 0)/10
16.2 -----> (12 + 2*35 + 4*20 + 2*0 + 0)/10

Thank You!


Solution

  • It's not entirely clear what output you're looking for, but one way you could do it is:

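    import numpy as np

    vals = data_df["CTTR"].values

    # Weighted 5-point average for the interior rows only (indices 2 to len(vals)-3)
    np.array([vals[i-2] + 2*vals[i-1] + 4*vals[i] + 2*vals[i+1] + vals[i+2] for i in range(2,len(vals)-2)])/10.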
    

    That said, this is basically a weighted rolling (moving) average with weights (1, 2, 4, 2, 1)/10, for which ready-made implementations already exist (have a look at pandas rolling windows or numpy.convolve).

    If you want zeros in the first and last two positions, where not all neighbouring observations are available, you can do:

    np.append(np.append([0,0],np.array([vals[i-2] + 2*vals[i-1] + 4*vals[i] + 2*vals[i+1] + vals[i+2] for i in range(2,len(vals)-2)])/10.),[0,0])
    

    Although this is getting a bit unwieldy, so you might want to wrap this in a function like so:

    def calc_new_cttr(series):
        vals = series.values
        cttrs = np.array([vals[i-2] + 2*vals[i-1] + 4*vals[i] + 2*vals[i+1] + vals[i+2] for i in range(2,len(vals)-2)])/10.
        return np.append(np.append([0,0], cttrs),[0,0])
    

    Which you can then call directly on your series like this:

    calc_new_cttr(data_df["CTTR"])
    

    To write to csv, you can assign the transformed column to your dataframe and export to csv using pandas csv writer:

    data_df["transformed_cttr"] = calc_new_cttr(data_df["CTTR"])
    data_df.to_csv("/path/to/your/file.csv")
    

    ------------EDIT-------------

    The boundary behavior described in your latest comment can be achieved by simply padding out the series before applying the transformation and then returning the resulting array without the first and last two entries:

    def cttr(series):
        vals = np.append(np.append([0,0], series.values),[0,0])
        cttrs = np.array([vals[i-2] + 2*vals[i-1] + 4*vals[i] + 2*vals[i+1] + vals[i+2] for i in range(2,len(vals)-2)])/10.
        return cttrs
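
As a vectorized alternative to the padded loop above, the same zero-padded weighted average can be sketched with `numpy.convolve`. This is only a minimal sketch, not the method used in the answer: the names `WEIGHTS` and `smooth_cttr` are made up for illustration, the CSV is written to an in-memory buffer rather than a real file, and it assumes the column has at least five rows (with fewer, `mode="same"` returns five values instead of one per row).

```python
import io

import numpy as np
import pandas as pd

# Symmetric weights from the question's formula: (1, 2, 4, 2, 1) / 10.
WEIGHTS = np.array([1, 2, 4, 2, 1]) / 10.0


def smooth_cttr(series):
    """Weighted 5-point average; neighbours outside the data count as 0."""
    # mode="same" zero-pads both ends and, for inputs of at least
    # len(WEIGHTS) rows, returns exactly one value per input row.
    return np.convolve(series.to_numpy(dtype=float), WEIGHTS, mode="same")


# Sample CTTR values from the question; the other columns are omitted here.
data_df = pd.DataFrame({"CTTR": [0, 2, 2, 23, 18, 28, 27, 58, 41, 12, 35, 20]})

# Overwrite the original column, as the question asks.
data_df["CTTR"] = smooth_cttr(data_df["CTTR"])

# index=False keeps the row index out of the CSV; float_format trims
# floating-point noise. A file path can be passed instead of the buffer.
buffer = io.StringIO()
data_df.to_csv(buffer, index=False, float_format="%.1f")
print(buffer.getvalue())
```

With the question's data this reproduces the expected values listed there (0.6, 3.5, 7.6, and so on). Because the weights are symmetric, the kernel flip that convolution performs does not change the result.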