pandasdataframe

Add column with missing values by position to timeseries


I have a dataframe that I like to add a column of values from array of tuples. The tuple contains the coordinates (position, value). An example:

import pandas as pd
import numpy as np

alpha = [chr(i) for i in range(ord('A'), ord('K')+1)]

dt = pd.date_range(start='2025-1-1', freq='1h', periods = len(alpha))

df = pd.DataFrame ( alpha , index = dt )
df.index.name = 'timestamp'
df.columns = ['item']

c = np.array( [(1, 100), (2, 202), (6, 772)] )

which gives:

timestamp item
2025-01-01 00:00:00 A
2025-01-01 01:00:00 B
2025-01-01 02:00:00 C
2025-01-01 03:00:00 D
2025-01-01 04:00:00 E
2025-01-01 05:00:00 F
2025-01-01 06:00:00 G
2025-01-01 07:00:00 H
2025-01-01 08:00:00 I
2025-01-01 09:00:00 J
2025-01-01 10:00:00 K

I am trying to join column c, in such a way that ROW[1] contains [B and 100].

I have accomplished what I want with the following:

df.reset_index(inplace = True) 
df.index.name = 'pos'

for x,y in c:
    df.loc[ int(x) , 'price'] = y

df.set_index("timestamp", inplace=True)

This gave me the desired results:

timestamp item price
2025-01-01 00:00:00 A nan
2025-01-01 01:00:00 B 100
2025-01-01 02:00:00 C 202
2025-01-01 03:00:00 D nan
2025-01-01 04:00:00 E nan
2025-01-01 05:00:00 F nan
2025-01-01 06:00:00 G 772
2025-01-01 07:00:00 H nan
2025-01-01 08:00:00 I nan
2025-01-01 09:00:00 J nan
2025-01-01 10:00:00 K nan

However, the idea of dropping and recreating the index for this feels a bit awkward, especially if I have multiple indexes.

My question, is there a better way that dropping and recreating an index to add a column with missing values, using position ?


Solution

  • Index.take returns the index of your dataframe based on the position and we can use the first column of your array to get the index.

    df.loc[df.index.take(c[:, 0]), 'price'] = c[:, 1]
    

    You can also use a combination of loc and iloc.

    
    df.loc[df.iloc[c[:, 0]].index, 'price'] = c[:, 1]
    

    End result:

                        item  price
    2025-01-01 00:00:00    A    NaN
    2025-01-01 01:00:00    B  100.0
    2025-01-01 02:00:00    C  202.0
    2025-01-01 03:00:00    D    NaN
    2025-01-01 04:00:00    E    NaN
    2025-01-01 05:00:00    F    NaN
    2025-01-01 06:00:00    G  772.0
    2025-01-01 07:00:00    H    NaN
    2025-01-01 08:00:00    I    NaN
    2025-01-01 09:00:00    J    NaN
    2025-01-01 10:00:00    K    NaN