I have the following reference dataframe (ref_df) and a dataframe_1 (df1)
ref_df
| Sample_Name | Value |
| -------- | -------- |
| CTRL | 18.17 |
| E-1 | 19.06 |
| FIB | 17.78 |
| 10nM Benz | 17.89 |
df1
| Sample_Name | Value |
| -------- | -------- |
| CTRL | 27.49 |
| E-1 | 28.89 |
| FIB | 24.81 |
| 10nM Benz | 24.24 |
| CTRL | 27.36 |
| E-1 | 28.87 |
| FIB | 24.66 |
| 10nM Benz | 24.32 |```
I would like to take make the following happen: df1 - ref_df to obtain the df_result dataframe.
Notice how each Sample id is subtracted by it corresponding id in the ref_df.
The problem here seems to be the difference in size of the dfs and making sure
each value is subtracted by the correct corresponding ref_df ID.
``` df_result
| Sample_Name | Value |
| -------- | -------- |
| CTRL | 9.32 |
| E-1 | 9.83 |
| FIB | 7.03 |
| 10nM Benz | 6.35 |
| CTRL | 9.19 |
| E-1 | 9.81 |
| FIB | 6.88 |
| 10nM Benz | 6.43 |
How do I subtract the ref_df from df1 while making sure the correct things get subtracted according to Sample_name. On top of that, I'm unsure how to address the mismatch in the number of values (since ref_df has 4 values and df1 has 8). I'd like to keep the duplicate values as they are replicates and will need them for additional arithmetic later on.
I tried duplicating the ref_df to account for the mismatch in shape between the two dataframes, however that results in subtracting taking place without taking into account Sample_Name
Try:
df_result = (
df1.set_index("Sample_Name")["Value"] - ref_df.set_index("Sample_Name")["Value"]
).reset_index()
Prints:
Sample_Name Value
0 10nM Benz 6.35
1 10nM Benz 6.43
2 CTRL 9.32
3 CTRL 9.19
4 E-1 9.83
5 E-1 9.81
6 FIB 7.03
7 FIB 6.88
OR: Put result into df1
:
df1["Result"] = df1["Value"] - df1["Sample_Name"].map(
ref_df.set_index("Sample_Name")["Value"]
)
print(df1)
Prints:
Sample_Name Value Result
0 CTRL 27.49 9.32
1 E-1 28.89 9.83
2 FIB 24.81 7.03
3 10nM Benz 24.24 6.35
4 CTRL 27.36 9.19
5 E-1 28.87 9.81
6 FIB 24.66 6.88
7 10nM Benz 24.32 6.43