I want to add columns to my df whose values are looked up from the sample list stored in one column, with the new column headers numbered by position in that list. In detail: based on the 11 column, I want to add 3 columns named 11_1, 11_2 and 11_3, with values taken according to the sample list in 11, and then the same for 00.
A small part of my input data:
import numpy as np
import pandas as pd

df_matrix_data = {'11': [['P4-1', 'P4-2', 'P4-3'], ['P4-1', 'P4-3', 'P4-4']],
                  '00': [['P4-4', 'P4-6', 'P4-7'], ['P4-2', 'P4-5', 'P4-7']],
                  'P4-1': [1, 2], 'P4-2': [6, 8], 'P4-3': [5, 2], 'P4-4': [2, 3],
                  'P4-5': [np.nan, 2], 'P4-6': [6, np.nan], 'P4-7': [3, 2]}
df_matrix = pd.DataFrame.from_dict(df_matrix_data)
The resulting DataFrame looks like this:
11 00 P4-1 P4-2 P4-3 P4-4 P4-5 P4-6 P4-7
0 [P4-1, P4-2, P4-3] [P4-4, P4-6, P4-7] 1 6 5 2 NaN 6.0 3
1 [P4-1, P4-3, P4-4] [P4-2, P4-5, P4-7] 2 8 2 3 2.0 NaN 2
and desired output should look like this:
11 00 P4-1 P4-2 P4-3 P4-4 P4-5 P4-6 P4-7 11_1 11_2 11_3 00_1 00_2 00_3
0 [P4-1, P4-2, P4-3] [P4-4, P4-6, P4-7] 1 6 5 2 NaN 6.0 3 1 6 5 2 6 3
1 [P4-1, P4-3, P4-4] [P4-2, P4-5, P4-7] 2 8 2 3 2.0 NaN 2 2 2 3 8 2 2
Any ideas on how to do this?
Another possible solution:
df_matrix.assign(
    **{f"{k}_{i+1}": df_matrix.apply(
           lambda row: row[row[k][i]], axis=1)
       for k in ['11', '00'] for i in range(3)})
It uses a dictionary comprehension within assign, iterating over each key (e.g., '11') and list index (0-2), and generates columns like 11_1 by mapping each list element (e.g., row['11'][0]) to its corresponding value in the same row via a lambda.
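To make the per-row lookup concrete, here is a minimal sketch of what the lambda does for a single row. The row below is a hand-built Series mirroring the first row of the example data (only the columns needed for the '11' list are included, which is an assumption for brevity).

```python
import pandas as pd

# One row of the example data, reduced to the columns used by the '11' list.
row = pd.Series({'11': ['P4-1', 'P4-2', 'P4-3'],
                 'P4-1': 1, 'P4-2': 6, 'P4-3': 5})

# Step 1: take the i-th sample name from the list stored in column '11'.
name = row['11'][1]

# Step 2: use that name as a label to fetch the value from the same row.
value = row[name]

print(name, value)
```

With axis=1, apply repeats these two steps for every row, which is why it is easy to read but slow on large frames.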
To avoid the inefficient apply:
df_matrix.assign(
    **{f"{k}_{i+1}": df_matrix.values[
           np.arange(len(df_matrix)),
           df_matrix.columns.get_indexer(df_matrix[k].str[i])]
       for k in ['11', '00'] for i in range(3)})
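It uses Index.get_indexer to convert the column names found at each list position into integer column positions. These are paired with the row positions from np.arange to pick one value per row from the underlying array.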
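The row/column pairing can be seen in isolation on a toy frame. This is only a sketch: the frame and the column names pick, a, b and c are made up for illustration and are not part of the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: 'pick' names the column to read from in each row.
df = pd.DataFrame({'pick': ['b', 'a', 'c'],
                   'a': [1, 2, 3],
                   'b': [10, 20, 30],
                   'c': [100, 200, 300]})

# Translate the column labels in 'pick' into integer column positions.
col_pos = df.columns.get_indexer(df['pick'])

# One row position per row, so each (row, column) pair selects a single cell.
row_pos = np.arange(len(df))

# NumPy integer-array indexing pairs row_pos[j] with col_pos[j].
picked = df.to_numpy()[row_pos, col_pos]

print(col_pos, picked)
```

Note that get_indexer returns -1 for a label that is not among the columns, and NumPy would then silently read from the last column, so it may be worth checking for -1 if the lists can contain unknown sample names.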
Output:
11 00 P4-1 P4-2 P4-3 P4-4 P4-5 P4-6 \
0 [P4-1, P4-2, P4-3] [P4-4, P4-6, P4-7] 1 6 5 2 NaN 6.0
1 [P4-1, P4-3, P4-4] [P4-2, P4-5, P4-7] 2 8 2 3 2.0 NaN
P4-7 11_1 11_2 11_3 00_1 00_2 00_3
0 3 1 6 5 2 6.0 3
1 2 2 2 3 8 2.0 2