I have a dataframe that I like to add a column of values from array of tuples. The tuple contains the coordinates (position, value). An example:
import pandas as pd
import numpy as np
alpha = [chr(i) for i in range(ord('A'), ord('K')+1)]
dt = pd.date_range(start='2025-1-1', freq='1h', periods = len(alpha))
df = pd.DataFrame ( alpha , index = dt )
df.index.name = 'timestamp'
df.columns = ['item']
c = np.array( [(1, 100), (2, 202), (6, 772)] )
which gives:
timestamp | item |
---|---|
2025-01-01 00:00:00 | A |
2025-01-01 01:00:00 | B |
2025-01-01 02:00:00 | C |
2025-01-01 03:00:00 | D |
2025-01-01 04:00:00 | E |
2025-01-01 05:00:00 | F |
2025-01-01 06:00:00 | G |
2025-01-01 07:00:00 | H |
2025-01-01 08:00:00 | I |
2025-01-01 09:00:00 | J |
2025-01-01 10:00:00 | K |
I am trying to join column c, in such a way that ROW[1] contains [B and 100].
I have accomplished what I want with the following:
df.reset_index(inplace = True)
df.index.name = 'pos'
for x,y in c:
df.loc[ int(x) , 'price'] = y
df.set_index("timestamp", inplace=True)
This gave me the desired results:
timestamp | item | price |
---|---|---|
2025-01-01 00:00:00 | A | nan |
2025-01-01 01:00:00 | B | 100 |
2025-01-01 02:00:00 | C | 202 |
2025-01-01 03:00:00 | D | nan |
2025-01-01 04:00:00 | E | nan |
2025-01-01 05:00:00 | F | nan |
2025-01-01 06:00:00 | G | 772 |
2025-01-01 07:00:00 | H | nan |
2025-01-01 08:00:00 | I | nan |
2025-01-01 09:00:00 | J | nan |
2025-01-01 10:00:00 | K | nan |
However, the idea of dropping and recreating the index for this feels a bit awkward, especially if I have multiple indexes.
My question, is there a better way that dropping and recreating an index to add a column with missing values, using position ?
Index.take returns the index of your dataframe based on the position and we can use the first column of your array to get the index.
df.loc[df.index.take(c[:, 0]), 'price'] = c[:, 1]
You can also use a combination of loc and iloc.
df.loc[df.iloc[c[:, 0]].index, 'price'] = c[:, 1]
End result:
item price
2025-01-01 00:00:00 A NaN
2025-01-01 01:00:00 B 100.0
2025-01-01 02:00:00 C 202.0
2025-01-01 03:00:00 D NaN
2025-01-01 04:00:00 E NaN
2025-01-01 05:00:00 F NaN
2025-01-01 06:00:00 G 772.0
2025-01-01 07:00:00 H NaN
2025-01-01 08:00:00 I NaN
2025-01-01 09:00:00 J NaN
2025-01-01 10:00:00 K NaN