python date matplotlib

Scatter plot with multiple values for the same date


I need to plot multiple y-values for the same date. My data is in a format like this:

import pandas as pd
import matplotlib.pyplot as plt

dates = ['2015-03-12', '2015-03-12', '2015-03-20', '2015-03-20']
values = [80, 55, 1, 100]
dates = [pd.to_datetime(d) for d in dates]
plt.scatter(dates, values)

When I plot, the x-axis shows extra dates in between that are not in my data, and I don't need them. Is there a way to avoid this? Because of it, the spacing between the points also looks off: there is no margin to the left of the first point or to the right of the last one.

I tried different date formats, converting the dates to str and to float, plotting only the unique dates, and forcing the x ticks. In the plot I get, the points sit at the far left and far right edges.

Edit: I figured it out. It didn't work until I passed datetime objects as the x-limits, but it works like this. Note that I no longer use plt.scatter; I call scatter on the Axes object (ax.scatter) because I have multiple subplots.

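from datetime import datetime

import matplotlib.pyplot as plt
import pandas as pd

dates = ["2015-03-12", "2015-03-12", "2015-03-20", "2015-03-20"]
values = [80, 55, 1, 100]
dates = [pd.to_datetime(d) for d in dates]

fig, ax = plt.subplots()
ax.scatter(dates, values)

# The limits must be datetime objects, not strings
ax.set_xlim([datetime(2015, 3, 6), datetime(2015, 3, 25)])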
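
For the extra dates on the axis, the limits alone may not be enough, since the tick locator can still place ticks on dates that have no data. Below is a minimal sketch of one way to combine both fixes on an Axes object. The helper name scatter_on_dates and the pad_days parameter are made up for illustration; they are not part of matplotlib.

```python
from datetime import datetime, timedelta

import matplotlib.dates as mdates
import matplotlib.pyplot as plt


def scatter_on_dates(ax, date_strings, values, pad_days=6):
    """Scatter values against dates and tick only the dates that occur.

    Hypothetical helper: assumes ISO 'YYYY-MM-DD' strings as input.
    """
    dates = [datetime.strptime(d, "%Y-%m-%d") for d in date_strings]
    ax.scatter(dates, values)

    # One tick per distinct date, so no in-between dates are labelled
    unique_dates = sorted(set(dates))
    ax.set_xticks(unique_dates)
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))

    # Pad the limits so the points do not sit on the plot edges
    pad = timedelta(days=pad_days)
    ax.set_xlim(unique_dates[0] - pad, unique_dates[-1] + pad)
    return unique_dates


fig, ax = plt.subplots()
ticks = scatter_on_dates(
    ax,
    ["2015-03-12", "2015-03-12", "2015-03-20", "2015-03-20"],
    [80, 55, 1, 100],
)
# Call plt.show() or fig.savefig(...) to view the result
```

Because the helper takes an Axes, it can be called once per subplot, e.g. for each entry of the array returned by plt.subplots(2, 2).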

Solution

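  • Try this solution. It groups the values by date, plots each group, puts ticks only on the dates in the dataset, and pads the x-axis on both sides: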

    import pandas as pd
    import matplotlib.pyplot as plt
    
    
    dates = ['2015-03-12', '2015-03-12', '2015-03-20', '2015-03-20']
    values = [80, 55, 1, 100]
    df = pd.DataFrame({'date': pd.to_datetime(dates), 'value': values})
    grouped = df.groupby('date')['value'].apply(list)   # Group by date
    
    
    plt.figure(figsize=(10, 6))
    # Plot one scatter series per date
    for date, vals in grouped.items():   # 'vals' avoids shadowing the 'values' list above
        plt.scatter(x=[date] * len(vals), y=vals, label=date.strftime('%Y-%m-%d'))
    # Ensure only the dates from the dataset are on the x-axis
    plt.xticks(grouped.index, rotation=45)   # use rotation=90 for vertical labels
    # Add padding to the x-axis
    padding = pd.Timedelta(days=7)  # Adjust padding as needed
    min_date = grouped.index.min() - padding
    max_date = grouped.index.max() + padding
    plt.xlim(min_date, max_date)
    
    plt.xlabel("Date")
    plt.ylabel("Values")
    plt.title("Values Over Time (Multiple per Date)")
    # plt.grid(True)   # Uncomment to show the grid
    plt.tight_layout()
    plt.show()
    

    With this I got a scatter plot whose x-axis ticks are only 2015-03-12 and 2015-03-20 (image not included).
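
If the real time gap between the dates does not matter and the points should simply be evenly spaced, one alternative is to plot against integer positions and use the dates only as tick labels. This is a sketch under that assumption, not what the answer above does. The function name scatter_evenly_spaced is hypothetical, and the sort relies on ISO 'YYYY-MM-DD' strings ordering chronologically.

```python
import matplotlib.pyplot as plt


def scatter_evenly_spaced(ax, date_strings, values):
    """Plot each distinct date at an integer slot (0, 1, 2, ...).

    Hypothetical helper: the x-axis is no longer a true time axis.
    """
    # ISO date strings sort chronologically, so no parsing is needed here
    unique = sorted(set(date_strings))
    slot = {d: i for i, d in enumerate(unique)}
    positions = [slot[d] for d in date_strings]

    ax.scatter(positions, values)
    ax.set_xticks(list(range(len(unique))))
    ax.set_xticklabels(unique, rotation=45)
    # Half a slot of margin on each side keeps points off the edges
    ax.set_xlim(-0.5, len(unique) - 0.5)
    return positions


fig, ax = plt.subplots()
positions = scatter_evenly_spaced(
    ax,
    ["2015-03-12", "2015-03-12", "2015-03-20", "2015-03-20"],
    [80, 55, 1, 100],
)
```

The trade-off is that distances on the x-axis stop carrying meaning: dates eight days apart and dates one day apart are drawn the same distance from each other.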