Tags: arrays, list, numpy, dictionary, data-wrangling

How to add a new column to a list of dictionaries?


Based on the data below, I want to calculate the BMI for each row and the average across all rows. The BMI formula is 'berat' / 'tinggi'.

data = [{'nama': 'Senpai', 'tinggi': 1.55, 'berat': 63.41},
 {'nama': 'Yui Rio', 'tinggi': 1.53, 'berat': 61.17},
 {'nama': 'Yuna Hina', 'tinggi': 1.62, 'berat': 70.98},
 {'nama': 'Koharu Hinata', 'tinggi': 1.77, 'berat': 53.45},
 {'nama': 'Mei Mio', 'tinggi': 1.58, 'berat': 67.81},
 {'nama': 'Saki Miyu', 'tinggi': 1.57, 'berat': 68.12},
 {'nama': 'Kokona Haruka', 'tinggi': 1.76, 'berat': 61.96},
 {'nama': 'Haruto Yuto', 'tinggi': 1.52, 'berat': 64.89},
 {'nama': 'Sota Yuki', 'tinggi': 1.62, 'berat': 56.73},
 {'nama': 'Hayato Haruki', 'tinggi': 1.68, 'berat': 69.07},
 {'nama': 'Ryusei Koki', 'tinggi': 1.66, 'berat': 53.02},
 {'nama': 'Sora Sosuke', 'tinggi': 1.5, 'berat': 55.89},
 {'nama': 'Riku Soma', 'tinggi': 1.62, 'berat': 78.24}]

The BMI should appear in a new column to the right of the 'berat' key.

I tried to calculate the BMI first and then join it to the data, but it didn't work. Here's my code:

data_array = pd.DataFrame(data)

# for i in data_array:
#     print(i)

imt = data_array['berat']/data_array['tinggi']
tes = list(imt)

join_list = data + tes
join_list_array = pd.DataFrame(join_list)
print(join_list_array)

Do you have any thoughts about it? I'm still learning data wrangling, so I appreciate any help you can provide.


Solution

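  • Here is a method that uses an "element-wise" calculation to add a new column to the DataFrame.

    Assigning to data_array['imt'] creates the new column, and the expression on the right of the equals sign is evaluated for every row at once, which is what "element-wise" means. Note that the code below uses the standard BMI formula, weight divided by height squared ('berat' / 'tinggi' ** 2), rather than 'berat' / 'tinggi'. Your original attempt did not work because data + tes only appends the BMI values to the end of the list as extra items; it does not attach them to the existing rows.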

    import pandas as pd
    
    data = [{'nama': 'Senpai', 'tinggi': 1.55, 'berat': 63.41},
     {'nama': 'Yui Rio', 'tinggi': 1.53, 'berat': 61.17},
     {'nama': 'Yuna Hina', 'tinggi': 1.62, 'berat': 70.98},
     {'nama': 'Koharu Hinata', 'tinggi': 1.77, 'berat': 53.45},
     {'nama': 'Mei Mio', 'tinggi': 1.58, 'berat': 67.81},
     {'nama': 'Saki Miyu', 'tinggi': 1.57, 'berat': 68.12},
     {'nama': 'Kokona Haruka', 'tinggi': 1.76, 'berat': 61.96},
     {'nama': 'Haruto Yuto', 'tinggi': 1.52, 'berat': 64.89},
     {'nama': 'Sota Yuki', 'tinggi': 1.62, 'berat': 56.73},
     {'nama': 'Hayato Haruki', 'tinggi': 1.68, 'berat': 69.07},
     {'nama': 'Ryusei Koki', 'tinggi': 1.66, 'berat': 53.02},
     {'nama': 'Sora Sosuke', 'tinggi': 1.5, 'berat': 55.89},
     {'nama': 'Riku Soma', 'tinggi': 1.62, 'berat': 78.24}]
    
    # create dataframe
    data_array = pd.DataFrame(data)
    
    # create new column and add values
    data_array['imt'] = data_array['berat'] / (data_array['tinggi'] ** 2)
    
    print(data_array)
    

    I see you had a commented-out for loop. Iterating over a DataFrame directly yields its column labels rather than its rows, but you can loop over the original list of dictionaries to calculate the new values instead:

    for row in data:
        bmi = row['berat'] / (row['tinggi']**2)
        row['imt'] = round(bmi, 2)
    data_array = pd.DataFrame(data)
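
The question also asks for the average, which neither snippet above computes. Below is a minimal sketch that extends the loop approach, assuming the average is simply the mean of the per-row 'imt' values. It uses only three rows copied from the question for brevity, and the name average_bmi is illustrative rather than anything from the original code.

```python
from statistics import mean

# Three rows copied from the question's data, for brevity
data = [
    {'nama': 'Senpai', 'tinggi': 1.55, 'berat': 63.41},
    {'nama': 'Yui Rio', 'tinggi': 1.53, 'berat': 61.17},
    {'nama': 'Yuna Hina', 'tinggi': 1.62, 'berat': 70.98},
]

# Add the BMI to each row, as in the loop above
for row in data:
    row['imt'] = round(row['berat'] / (row['tinggi'] ** 2), 2)

# Mean of the per-row BMI values (illustrative variable name)
average_bmi = mean(row['imt'] for row in data)
print(round(average_bmi, 2))
```

With the DataFrame version, data_array['imt'].mean() should give the corresponding average for the (unrounded) column.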