Tags: python, pandas, dataframe, scatter-plot, scatter

How to get all values in scatter plot?


I have a dataframe containing days and wind strengths. The problem is that my scatter plot shows only the first value for each day, not both wind strengths for Monday or all three wind strengths for Tuesday. My code is:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.transforms as transforms
from matplotlib.lines import Line2D

df = pd.DataFrame(
    {
        'Day': ['Monday'] * 2
        + ['Tuesday'] * 3
        + ['Wednesday'] * 3
        + ['Thursday'] * 3
        + ['Friday'] * 3
        + ['Saturday'] * 2
        + ['Sunday'] * 2,
        'WindStrength': [1, 5, 4, 7, 3, 6, 8, 4, 2, 9, 8, 5, 2, 6, 7, 3, 8, 1],
    }
)

plt.figure()

x = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']


for i, [person, pgroup] in enumerate(df.groupby('Day')):
    monday = df.loc[df.Day == 'Monday', 'WindStrength'].values[0]
    tuesday = df.loc[df.Day == 'Tuesday', 'WindStrength'].values[0]
    wednesday = df.loc[df.Day == 'Wednesday', 'WindStrength'].values[0]
    thursday = df.loc[df.Day == 'Thursday', 'WindStrength'].values[0]
    friday = df.loc[df.Day == 'Friday', 'WindStrength'].values[0]
    saturday = df.loc[df.Day == 'Saturday', 'WindStrength'].values[0]
    sunday = df.loc[df.Day == 'Sunday', 'WindStrength'].values[0]
 
    plt.scatter(
        x,
        [monday, tuesday, wednesday, thursday, friday, saturday, sunday],
        s=25,
        marker='o',
    )   

plt.ylabel('WindStrength')
plt.xlabel('Day')
#plt.margins(x=0.5)
plt.show()

The output


Solution

  • Only one point is shown per day because values[0] always selects the first value for that day, so every pass through the loop plots the same points again. If I understand correctly, you want to see all 2 or 3 points for each day. matplotlib handles this on its own: you don't need to group the data or use a for loop. Once the df is ready, pass its columns directly to scatter:

    plt.scatter(x=df.Day, y=df.WindStrength, s=25, marker='o')
    plt.ylabel('WindStrength')
    plt.xlabel('Day')
    plt.show() 
    

    You can add color='orange' or color='red' if you don't like the default color. The plot then shows every wind strength value for each day. Hope that is what you were trying to achieve.
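To make the difference concrete, here is a minimal, self-contained sketch of the approach described above. It rebuilds the question's data, compares values[0] with values for one day, and plots both columns in a single scatter call. The `days` and `counts` helper lists, the variable names, and the choice of the off-screen Agg backend are assumptions made for illustration; they are not part of the original post.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen rendering; remove to use an interactive backend
import matplotlib.pyplot as plt
import pandas as pd

# Same data as in the question: 2 or 3 readings per day, 18 in total
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
counts = [2, 3, 3, 3, 3, 2, 2]
df = pd.DataFrame(
    {
        'Day': [day for day, n in zip(days, counts) for _ in range(n)],
        'WindStrength': [1, 5, 4, 7, 3, 6, 8, 4, 2, 9, 8, 5, 2, 6, 7, 3, 8, 1],
    }
)

# .values[0] keeps only the first reading of a day; .values keeps all of them
first_tuesday = df.loc[df.Day == 'Tuesday', 'WindStrength'].values[0]
all_tuesday = df.loc[df.Day == 'Tuesday', 'WindStrength'].values

# One scatter call plots every row; repeated day labels share an x position
fig, ax = plt.subplots()
points = ax.scatter(df['Day'], df['WindStrength'], s=25, marker='o')
ax.set_xlabel('Day')
ax.set_ylabel('WindStrength')

n_points = len(points.get_offsets())  # number of markers actually drawn
```

In an interactive session, drop the matplotlib.use("Agg") line and finish with plt.show() to view the figure. Because the rows are already sorted from Monday to Sunday, the categories should appear on the x axis in that order.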