python, matplotlib

matplotlib: fill_between function, how to use 'interpolate' and 'where' correctly?


The value arrays for my matplotlib line chart can have irregular lengths, for example:

[18,17,0,0,2]
[6,4,0,0]
[3]

These are values on a timeline, the last being the latest, with each value corresponding to one week. So the second user joined a week later than the first, and the third only joined in the last week.

In order to draw a line chart, I pad the shorter arrays with None values (not really relevant to fill_between).

[18, 17, 0, 0, 2]
[None, None, None, None, 3]
[None, 6, 4, 0, 0]

I am using interpolate=True because I need the fill to follow the curve when values go to zero. However, this has the unintended effect that for people starting later, the fill is also interpolated, from 0 up to the first actual value.

[Figure: how the fill_between output looks now]

I need it to only fill the space if there are actual values:

[Figure: how it should look]

Can this be done? I looked at the docs, and they mention a where array.

Here's the structure of my where array for this given chart:

[True, True, False, False, True, False]
[False, True, True, False, False, False]
[False, False, False, False, True, False]

As you can see, I set the first where element to False for the second value row, but because of interpolate=True, it will start filling...

(Color codes: first row blue, second row green, third red)

Here's the actual code for the fill:

for i, values in enumerate(time_series):
    ax.plot(x, y[i], ',-', label=label, color=linecolor, markersize=8, fillstyle="full",linewidth=1.5, clip_on=True, zorder=200)
    ax.fill_between(x, 0, zero_y, where=where, color=fillcolors[i], alpha=0.8, edgecolor="white", zorder=zorder[i], interpolate=True)

Note that fill_between actually crashed when fed None values, so I need to "translate" None values to 0, which may be complicating things further.


Solution

  • Internally, matplotlib converts your lists to np.ndarray. The proper value for 'not a number' (which can be used for missing data) is np.nan. Replacing every None with np.nan will fix the problem.
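
A minimal sketch of that fix, using the three series from the question. The helper name pad_left_with_nan, the colour list and the Agg backend are assumptions made for illustration, not part of the original code. Each shorter series is left-padded with np.nan rather than None or 0, and the where mask is built from ~np.isnan(row), so nothing is filled for the weeks before a user joined:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Ragged weekly series from the question; the last value is the latest week.
raw_series = [[18, 17, 0, 0, 2], [6, 4, 0, 0], [3]]
n_weeks = max(len(s) for s in raw_series)


def pad_left_with_nan(values, length):
    """Hypothetical helper: left-pad `values` with np.nan up to `length` entries."""
    padded = np.full(length, np.nan)
    padded[length - len(values):] = values
    return padded


# A float array with NaN marking "no data yet" (None would give an object array).
y = np.array([pad_left_with_nan(s, n_weeks) for s in raw_series])
x = np.arange(n_weeks)

fig, ax = plt.subplots()
colors = ["blue", "green", "red"]  # assumed to match the colour codes in the question
for row, color in zip(y, colors):
    ax.plot(x, row, "-", color=color, linewidth=1.5)
    # Only fill where real data exists; NaN weeks stay empty.
    ax.fill_between(x, 0, row, where=~np.isnan(row),
                    color=color, alpha=0.8, interpolate=True)

fig.canvas.draw()  # force a render; swap for plt.show() or fig.savefig(...) as needed
```

With this layout the second series only begins filling at week 1, where its first real value is, rather than ramping up from 0. The third series has a single point, so it has no area to fill.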