python pandas csv charts xls

Python Pandas - cannot draw a correct chart from two columns: column 1 holds several values of type A and several of type B, column 2 holds numbers


I have a table in an xls file that I need to visualise in a chart. I'm using Python Pandas. The table has this structure:

ORP Produkce_obyv._kg
a 289,77
a 333,31
... ...
b 198,69
b 214,71
... ...

I'm using this code:

group1 = data[data['ORP']=='A']  
group2 = data[data['ORP']=='B']

graf = group1.groupby('ORP')['Produkce_obyv._kg'].mean().to_frame(name='A')
graf['B'] = group2.groupby('ORP')['Produkce_obyv._kg'].mean()

graf.plot.bar()
plt.show()

The chart shows only one bar (A).

I need to see the data for B as a second bar. I am also using both groups for statistics:

ttest_ind_result = ttest_ind(group1['Produkce_obyv._kg'], group2['Produkce_obyv._kg'], equal_var=False)
print(ttest_ind_result)

That works fine. What is the problem with the chart? Thanks!

I tried a different plotting library, and I also ran these checks:

print(graf.isna().sum())
and
print(len(group2))
print(group2['ORP'].unique())

Everything seems to be correct.


Solution

  • Using the data from the question, you can plot this quite quickly with matplotlib.

    The main difference from your code is that df.groupby('ORP').mean() is called once on the whole DataFrame, so both groups end up in the same result, as you will see below.

    Here is an example:

    import pandas as pd
    import matplotlib.pyplot as plt
    
    
    # example data
    data = {'ORP': ['a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b'],
            'Produkce_obyv._kg': [289.77, 333.31, 309.42, 271.53, 367.32, 351.37, 198.69, 214.71, 244.35, 224.15]}
    df = pd.DataFrame(data)
    
    
    
    # do the plot
    grouped = df.groupby('ORP').mean()
    grouped.plot(kind='bar')
    
    plt.xlabel('ORP')
    plt.ylabel('Produkce_obyv._kg')
    plt.title('Average Produkce_obyv._kg by ORP')
    
    plt.show()
    

    The result is a bar chart with one bar for each ORP value (a and b).
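
As for why the code in the question draws only one bar, a likely cause is index alignment. This is a minimal sketch, assuming the data look like the example above and that the ORP labels are 'a' and 'b' (adjust the labels to match the real file). When a Series is assigned as a new DataFrame column, pandas matches values by index label, not by position. Here `graf` has the single index label 'a', while the mean of `group2` is indexed by 'b', so nothing lines up and column B is filled with NaN, which the bar plot cannot draw.

```python
import pandas as pd

# Example data taken from the answer above
df = pd.DataFrame({
    'ORP': ['a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b'],
    'Produkce_obyv._kg': [289.77, 333.31, 309.42, 271.53, 367.32, 351.37,
                          198.69, 214.71, 244.35, 224.15],
})

group1 = df[df['ORP'] == 'a']
group2 = df[df['ORP'] == 'b']

# Approach from the question: this frame has one row, labelled 'a'
graf = group1.groupby('ORP')['Produkce_obyv._kg'].mean().to_frame(name='A')

# The Series on the right has one row, labelled 'b'.
# Column assignment aligns on index labels, so row 'a' gets no value.
graf['B'] = group2.groupby('ORP')['Produkce_obyv._kg'].mean()
print(graf)  # column B is NaN

# A single groupby over the whole frame keeps both groups in one Series,
# indexed by 'a' and 'b', which plots as two bars.
means = df.groupby('ORP')['Produkce_obyv._kg'].mean()
print(means)
```

Calling `means.plot.bar()` followed by `plt.show()` then draws both bars, and `group1` and `group2` can still be passed to `ttest_ind` exactly as in the question.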