I have a dataframe containing rows which describe financial stocks. The following is a simplified version:
import pandas as pd

df = pd.DataFrame(
    {
        "stockprice": [100, 103, 240],
        "Characteristic1": [1, 3, 3],
        "Characteristic2": [5, 7, 1],
        "Characteristic3": [1, 4, 6],
    },
    index=["Company A", "Company B", "Company C"],
)
#            stockprice  Characteristic1  Characteristic2  Characteristic3
# Company A         100                1                5                1
# Company B         103                3                7                4
# Company C         240                3                1                6
I would like to add a column in which each cell holds a long dictionary, a series of cashflows, generated from some of these characteristics. Later I will want to run calculations on these dictionaries.
Here is a sample function which generates the dictionary, followed by the assign call that should put it into my dataframe:
def cashflow_series(ch1=1, ch2=2):
    return {0: ch1, 0.5: ch2, 1: 7, 2: 8, 3: 9}

df.assign(
    cashflows=lambda x: cashflow_series(
        ch1=x["Characteristic1"], ch2=x["Characteristic3"]
    )
)
This returns:
           stockprice  Characteristic1  Characteristic2  Characteristic3  cashflows
Company A         100                1                5                1        NaN
Company B         103                3                7                4        NaN
Company C         240                3                1                6        NaN
How can I fix this?
I want the new column 'cashflows' to contain a dictionary for each row, not a NaN.
I want something like this:
           stockprice  Characteristic1  Characteristic2  Characteristic3    cashflows
Company A         100                1                5                1  {0:1,..3:9}
Company B         103                3                7                4  {0:3,..3:9}
Company C         240                3                1                6  {0:3,..3:9}
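assign passes the whole DataFrame to the lambda, so cashflow_series is called once with entire columns and returns a single dictionary. pandas then tries to align that dictionary's keys (0, 0.5, 1, ...) with the index, which gives NaN. To get one dictionary per row, you'll need to use a list comprehension: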
def cashflow_series(ch1=1, ch2=2):
    return {0: ch1, 0.5: ch2, 1: 7, 2: 8, 3: 9}

df.assign(cashflows=lambda x: [cashflow_series(ch1=x.loc[i, 'Characteristic1'],
                                               ch2=x.loc[i, 'Characteristic3'])
                               for i in x.index])
Or with parameter unpacking:
df.assign(cashflows=lambda x: [cashflow_series(*params)
                               for params in
                               x[['Characteristic1', 'Characteristic3']].values])
Output:
           stockprice  Characteristic1  Characteristic2  Characteristic3                         cashflows
Company A         100                1                5                1  {0: 1, 0.5: 1, 1: 7, 2: 8, 3: 9}
Company B         103                3                7                4  {0: 3, 0.5: 4, 1: 7, 2: 8, 3: 9}
Company C         240                3                1                6  {0: 3, 0.5: 6, 1: 7, 2: 8, 3: 9}
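
Since the question mentions doing calculations on the dictionaries later, here is a self-contained sketch that puts the pieces together. It uses zip over the two columns as another way to write the comprehension, and then maps a function over the new column. The total_cashflow column and the choice of a plain sum are only illustrative assumptions; the real calculation would depend on what the cashflows mean.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "stockprice": [100, 103, 240],
        "Characteristic1": [1, 3, 3],
        "Characteristic2": [5, 7, 1],
        "Characteristic3": [1, 4, 6],
    },
    index=["Company A", "Company B", "Company C"],
)


def cashflow_series(ch1=1, ch2=2):
    return {0: ch1, 0.5: ch2, 1: 7, 2: 8, 3: 9}


# Build one dictionary per row by iterating over the two columns in parallel.
result = df.assign(
    cashflows=lambda x: [
        cashflow_series(ch1=c1, ch2=c3)
        for c1, c3 in zip(x["Characteristic1"], x["Characteristic3"])
    ]
)

# Hypothetical follow-up calculation: sum the values of each row's dictionary.
result["total_cashflow"] = result["cashflows"].map(lambda cf: sum(cf.values()))

print(result[["cashflows", "total_cashflow"]])
```

The cashflows column ends up with object dtype, so calculations on it go through Python-level calls such as map rather than vectorized pandas operations.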