I have a pandas Series with a DatetimeIndex and many zero values. I want to plot it showing only some of the zeros, with the datetime spacing handled properly.
For example, take the series below. A simple plot shows too many zeros. I only want to show a few zeros before and after the non-zero values and, at the same time, hide the dates in between.
import numpy as np
import pandas as pd
ser_tmp = pd.Series(0, index=pd.date_range('2020-01-01', '2020-01-30'))
ser_tmp.loc[[pd.Timestamp('2020-01-03'), pd.Timestamp('2020-01-04'), pd.Timestamp('2020-01-23'), pd.Timestamp('2020-01-24')]] = 1
ser_tmp.plot()
Plotting after replacing all zeros with NaN is not what I want either.
ser_tmp.replace(0, np.nan).plot()
What I want is something like ser_tmp2 below. But plotting ser_tmp2 did not hide the date gap between 2020-01-06 and 2020-01-21. Moreover, I would like to detect these dates automatically rather than setting the zeros in ser_tmp2 manually.
ser_tmp2 = ser_tmp.replace(0, np.nan).copy()
ser_tmp2.loc[[pd.Timestamp('2020-01-01'), pd.Timestamp('2020-01-02'), pd.Timestamp('2020-01-05'), pd.Timestamp('2020-01-06'), pd.Timestamp('2020-01-21'), pd.Timestamp('2020-01-22'), pd.Timestamp('2020-01-25'), pd.Timestamp('2020-01-26')]] = 0
ser_tmp2 = ser_tmp2.dropna()
ser_tmp2
ser_tmp2.plot()
If I understood you correctly, you want to drop the zeros that sit inside a run of consecutive zeros and keep only the ones next to non-zero values.
My solution:
import pandas as pd
ser_tmp = pd.Series(0, index=pd.date_range('2020-01-01', '2020-01-30'))
ser_tmp.loc[[pd.Timestamp('2020-01-03'), pd.Timestamp('2020-01-04'), pd.Timestamp('2020-01-23'), pd.Timestamp('2020-01-24')]] = 1
# Convert the index to strings so the plot does not automatically fill in the missing dates
ser_tmp.index = ser_tmp.index.strftime('%Y-%m-%d')
# Drop every zero whose previous and next values are both zero
ser_tmp.drop(ser_tmp[(ser_tmp == 0) & (ser_tmp.shift(1) == 0) & (ser_tmp.shift(-1) == 0)].index, inplace=True)
ser_tmp.plot(rot=45)
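Result: (plot image not included) the x-axis is evenly spaced and only 2020-01-01 through 2020-01-05, 2020-01-22 through 2020-01-25, and 2020-01-30 remain.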
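The drop above keeps one zero on each side of the non-zero values (plus the first and last points), while ser_tmp2 in the question keeps two. If you need a configurable number of neighboring zeros, here is a minimal sketch using a centered rolling maximum. The helper name keep_zeros_near_nonzero and its parameter n are made up for illustration and are not part of pandas; it also assumes the series has no NaN values and that "near" means position in the series, not calendar distance.

```python
import pandas as pd


def keep_zeros_near_nonzero(ser: pd.Series, n: int = 2) -> pd.Series:
    """Hypothetical helper: keep non-zero values plus zeros within n positions of one."""
    # 1 where the value is non-zero, 0 elsewhere
    nonzero = ser.ne(0).astype(int)
    # A centered window of 2n+1 points has max 1 if any non-zero lies within n steps
    near = nonzero.rolling(2 * n + 1, center=True, min_periods=1).max().astype(bool)
    return ser[near]


ser_tmp = pd.Series(0, index=pd.date_range('2020-01-01', '2020-01-30'))
ser_tmp.loc[pd.to_datetime(['2020-01-03', '2020-01-04', '2020-01-23', '2020-01-24'])] = 1

trimmed = keep_zeros_near_nonzero(ser_tmp, n=2)
# String labels make the x-axis categorical, so the dropped dates leave no gap
trimmed.index = trimmed.index.strftime('%Y-%m-%d')
# trimmed.plot(rot=45)  # uncomment to plot (needs matplotlib)
```

With n=2 this should select the same dates as ser_tmp2 in the question; n=1 keeps a single zero on each side of every non-zero value.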