pythondataframedynamic-columns

Dynamic columns to existing Dataframe and value assignment from previous record in python


i am trying to insert new columns to my existing dataframe dynamically and assign values from the previous records that Current Index-5,Index-10,Index-15,Index-20----Index-300. Basically I need to add 180 columns(3 columns from one of the previous records) and its value to each row of the dataframe.

existing DF output:

index----Price----Bid------ask

1        70.15    70.10    70.20
2        70.18    70.13    70.16
3    70.09    70.09    70.10
4    70.14    70.11    70.14 
5    70.13    70.12    70.13
6    70.16    70.16    70.19
7    70.14    70.12    70.16
:
:
n 

New DF should look like :-

index----Price----Bid-----Ask----5_Price--5_Bid--5_Ask--10_Price--10_Bid--10_Ask.....300_Price--300_Bid--300_Ask
1        70.15    70.10    70.20  70.03   70.04  70.05  70.05     70.06   70.07      70.14       70.14    70.16
2        70.18    70.13    70.16  70.01   70.02  70.03  70.09     70.09   70.10      70.17       70.16    70.17
3    70.09    70.09    70.10  70.05   70.03  70.05  70.06     70.04   70.06      70.20       70.18    70.19
4    70.14    70.11    70.14  70.07   70.09  70.10  70.05     70.06   70.07      70.14       70.14    70.16
5    70.13    70.12    70.13  70.03   70.04  70.05  70.05     70.06   70.07      70.14       70.14    70.16
6    70.16    70.16    70.19  70.18   70.16  70.19  70.13     70.11   70.12      70.17       70.18    70.19
7    70.14    70.12    70.16  70.20   70.21  70.19  70.21     70.19   70.21      70.08       70.07    70.09
:
:
n 

Code: Works but it takes very long time even for 50000 records(several hours)

t = 5
p = 305

last = '_Last'
cbid = '_Cbid'
cask = '_Cask'

for i in range(p, df.shape[0]):
    n = i - 300
    for j in range(5,p,5):
        in_last = str(j) + last
        in_cbid = str(j) + cbid
        in_cask = str(j) + cask
        df.loc[i, in_last] = df.loc[n+j, "Nusdinr Last"]
        df.loc[i, in_cbid] = df.loc[n+j, "Nusdinr Close Bid"]
        df.loc[i, in_cask] = df.loc[n+j, "Nusdinr Close Ask"]...

Solution

  • Here's a more effective way to do it, using 'shift'.

    for shift_days in range(0, 300, 5):
        col_names = [f"{col}_{shift_days}" for col in ["Price", "Bid", "ask"]]
        df[col_names] = df[["Price", "Bid", "ask"]].shift(shift_days)