
Weird behavior of barplot from python matplotlib with datetime


import matplotlib.pyplot as plt
import datetime
x = [datetime.datetime(1943,3, 13,12,0,0),
     datetime.datetime(1943,3, 13,12,5,0),
     datetime.datetime(1943,3, 13,12,10,0),
     datetime.datetime(1943,3, 13,12,15,0),
     datetime.datetime(1943,3, 13,12,20,0),
     datetime.datetime(1943,3, 13,12,25,0),
     datetime.datetime(1943,3, 13,12,30,0),
     datetime.datetime(1943,3, 13,12,35,0)]
y = [1,2,3,4,2,1,3,4]

# plots the data, but does not show sufficient detail for the lower values
plt.figure()
plt.bar(x,y)

# plots the data, but omits the datetime information
plt.figure()
plt.bar(range(0,len(x)),y)

Hello guys, I am just starting with matplotlib as I move from MATLAB to Python. I ran into behavior I find weird: matplotlib does not seem able to display the data properly along with the datetime values. My question is why the two bar plots above yield two different results.


The first one converts the x values directly into some kind of continuous data, whereas the second one treats them more like categorical data. Has anyone encountered a similar problem and wouldn't mind sharing how they approached it?

P.S. I tried seaborn and it works, but it somehow does not play well with dual-axis plotting. I also googled for similar issues but could not find any.


Solution

  • I'm not sure I would call the observed behaviour unexpected. In the first case you pass dates as the x variable of the bar plot, so the bars are placed at those dates. In the second case you pass plain numbers as the x variable, so the bars are placed at those numbers.

    Since you didn't say which of the two you actually prefer, one solution is to make them look the same. The underlying concepts are still different, though.

    import matplotlib.pyplot as plt
    from matplotlib.dates import DateFormatter
    import datetime
    x = [datetime.datetime(1943,3, 13,12,0,0),
         datetime.datetime(1943,3, 13,12,5,0),
         datetime.datetime(1943,3, 13,12,10,0),
         datetime.datetime(1943,3, 13,12,15,0),
         datetime.datetime(1943,3, 13,12,20,0),
         datetime.datetime(1943,3, 13,12,25,0),
         datetime.datetime(1943,3, 13,12,30,0),
         datetime.datetime(1943,3, 13,12,35,0)]
    y = [1,2,3,4,2,1,3,4]
    
    # plot numeric plot
    plt.figure()
    plt.bar(x,y, width=4./24/60) # 4 minutes wide bars
    plt.gca().xaxis.set_major_formatter(DateFormatter("%H:%M"))
    
    # Plot categorical plot
    plt.figure()
    plt.bar(range(0,len(x)),y, width=0.8) # 0.8 units wide bars
    plt.xticks(range(0,len(x)), [d.strftime("%H:%M") for d in x])
    
    plt.show()
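
Why the explicit `width` matters in the date case: matplotlib stores dates as floating-point day counts, so the default bar width of 0.8 means 0.8 days on a date axis. The following is a minimal sketch, not part of the original answer, that uses `matplotlib.dates.date2num` to compare that default with the 5-minute spacing of the data. The variable names are illustrative.

```python
import datetime
from matplotlib.dates import date2num

# Same 5-minute grid as in the question, built programmatically.
start = datetime.datetime(1943, 3, 13, 12, 0, 0)
x = [start + datetime.timedelta(minutes=5 * i) for i in range(8)]

# date2num converts datetimes to matplotlib's internal unit: days as floats.
x_num = date2num(x)
spacing_days = x_num[1] - x_num[0]       # distance between neighbouring bars

default_width = 0.8                      # plt.bar default, in axis units (days here)
four_minutes_in_days = 4.0 / 24 / 60     # the width used in the answer above

print("spacing between bars (days):", spacing_days)
print("default width / spacing:", default_width / spacing_days)
print("4-minute width / spacing:", four_minutes_in_days / spacing_days)
```

With the default width each bar is a couple of hundred times wider than the gap to its neighbour, so the bars overlap almost completely. A width of 4 minutes, expressed in days, fits inside the 5-minute spacing.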
    


    The difference between the two concepts becomes clearer with different data, for example with unevenly spaced dates:

    x = [datetime.datetime(1943,3, 13,12,0,0),
         datetime.datetime(1943,3, 13,12,5,0),
         datetime.datetime(1943,3, 13,12,15,0),
         datetime.datetime(1943,3, 13,12,25,0),
         datetime.datetime(1943,3, 13,12,30,0),
         datetime.datetime(1943,3, 13,12,35,0),
         datetime.datetime(1943,3, 13,12,45,0),
         datetime.datetime(1943,3, 13,12,50,0)]
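
To see the effect of these unevenly spaced dates, both plots can be drawn side by side and the distances between neighbouring bars compared. This is a sketch under a few assumptions: it reuses the `y` values from above, draws on two subplots instead of two figures, and selects the non-interactive `Agg` backend so it runs without a display (drop that line and call `plt.show()` to view the figure interactively).

```python
import datetime
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter

# Unevenly spaced timestamps (minutes past 12:00), same y values as before.
minutes = [0, 5, 15, 25, 30, 35, 45, 50]
x = [datetime.datetime(1943, 3, 13, 12, m, 0) for m in minutes]
y = [1, 2, 3, 4, 2, 1, 3, 4]

fig, (ax_time, ax_cat) = plt.subplots(1, 2, figsize=(10, 3))

# Dates as x: bars sit at their true position in time, so gaps stay visible.
bars_time = ax_time.bar(x, y, width=4.0 / 24 / 60)  # 4-minute-wide bars
ax_time.xaxis.set_major_formatter(DateFormatter("%H:%M"))

# Integer positions as x: bars are evenly spaced, dates are only tick labels.
positions = list(range(len(x)))
bars_cat = ax_cat.bar(positions, y, width=0.8)
ax_cat.set_xticks(positions)
ax_cat.set_xticklabels([d.strftime("%H:%M") for d in x])

def centers(bars):
    """Return the x coordinate of the centre of each bar."""
    return [b.get_x() + b.get_width() / 2 for b in bars]

# Distance between neighbouring bars: minutes on the date axis, units otherwise.
t = centers(bars_time)
c = centers(bars_cat)
time_gaps_min = [round((b - a) * 24 * 60) for a, b in zip(t, t[1:])]
cat_gaps = [round(b - a, 6) for a, b in zip(c, c[1:])]

print("gaps on the date axis (minutes):", time_gaps_min)
print("gaps on the categorical axis:", cat_gaps)
```

On the date axis the gaps follow the actual time differences in the data, while on the categorical axis every bar is exactly one unit from the next, regardless of the timestamps behind the labels.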
    
