I am plotting a time series where x is a series of datetime.datetime objects and y is a series of doubles.
I'd like to map the marker size to a third series z (and possibly also map marker color to a fourth series w), which in most cases could be accomplished with:
scatter(x, y, s=z, c=w)
except that scatter() does not permit x being a series of datetime.datetime objects.
plot(x, y, marker='o', linestyle='None')
on the other hand works with x being datetime.datetime (with proper tick labels), but marker size and color can only be set for all points at once, so there is no way to map them to extra series.
Seeing that scatter and plot can each do half of what I need, is there a way to do both?
UPDATE: following @tcaswell's question, I realized scatter raised a KeyError deep in default_units() in matplotlib/dates.py on the line:
x = x[0]
and sure enough my x and y are both Series taken from a pandas DataFrame which has no '0' in its index. I then tried two things (both feel somewhat hacky):
First, I tried changing the DataFrame index to 0..len(x), which led to a different error inside matplotlib/axes/_axes.py at:
offsets = np.dstack((x,y))
dstack doesn't play nice with pandas Series. So I then tried converting x and y to numpy.array:
scatter(numpy.array(x), numpy.array(y), s=numpy.array(z))
This almost worked, except that scatter seemed to have trouble auto-scaling the x axis and collapsed everything into a straight line, so I had to reset xlim explicitly to see the plot.
All of this is to say that scatter could do the job, albeit in a somewhat convoluted way. I had always thought matplotlib could take any array-like input, but apparently that's not quite true when the data is not simple numbers and requires some internal gymnastics.
UPDATE2: I also tried to follow @user3666197's suggestion (thanks for the editing tips btw). If I understood correctly, I first converted x into a series of 'matplotlib style days':
mx = mPlotDATEs.date2num(list(x))
which then allows me to directly call:
scatter(mx, y, s=z)
Then, to label the axis properly, I call:
gca().xaxis.set_major_formatter( DateFormatter('%Y-%m-%d %H:%M'))
(call show() to update the axis labels if in interactive mode)
It worked quite nicely and feels to me a more 'proper' way of doing things, so I'm going to accept that as the best answer.
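The steps from UPDATE2 can be put together as a minimal sketch. The sample data (start, x, y, z) is made up for illustration, and the Agg backend is only chosen so the snippet runs without a display; neither comes from the original post.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
from matplotlib import dates as mPlotDATEs
from matplotlib.dates import DateFormatter

# Made-up sample data: one point per hour.
start = dt.datetime(2014, 5, 1, 0, 0)
x = [start + dt.timedelta(hours=h) for h in range(6)]
y = [1.0, 2.5, 2.0, 3.5, 3.0, 4.0]
z = [20, 40, 60, 80, 100, 120]        # marker sizes, one per point

mx = mPlotDATEs.date2num(x)           # datetimes -> matplotlib float days

fig, ax = plt.subplots()
points = ax.scatter(mx, y, s=z)       # size is mapped per point
ax.xaxis.set_major_formatter(DateFormatter('%Y-%m-%d %H:%M'))
fig.autofmt_xdate()                   # slant the labels so they do not overlap
fig.canvas.draw()                     # render once; use plt.show() interactively
```

A fourth series could be passed as c=w in the same scatter() call.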
However, let's work by example:
step A: from a datetime to a matplotlib convention-compatible float for dates/times
step B: adding 3D | 4D | 5D capabilities (using additional { color | size | alpha }-coded dimensionality of information)
As usual, the devil is hidden in the details.
matplotlib dates are almost equal, but not equal:
# mPlotDATEs.date2num.__doc__
#
# *d* is either a class `datetime` instance or a sequence of datetimes.
#
# Return value is a floating point number (or sequence of floats)
# which gives the number of days (fraction part represents hours,
# minutes, seconds) since 0001-01-01 00:00:00 UTC, *plus* *one*.
# The addition of one here is a historical artifact. Also, note
# that the Gregorian calendar is assumed; this is not universal
# practice. For details, see the module docstring.
So it is highly recommended to re-use matplotlib's "own" tool:
from matplotlib import dates as mPlotDATEs # helper functions num2date()
# # and date2num()
# # to convert to/from.
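As a small illustration of step A, the sketch below converts two datetimes with date2num() and back with num2date(). The timestamps t0 and t1 are made up. It deliberately relies only on differences and round trips, because the absolute float depends on the epoch matplotlib uses: the docstring quoted above describes the older convention, and newer releases, as far as I know, default to a 1970-01-01 epoch.

```python
import datetime as dt

from matplotlib import dates as mPlotDATEs

# Made-up, timezone-aware timestamps (UTC).
t0 = dt.datetime(2014, 5, 1, 12, 0, tzinfo=dt.timezone.utc)
t1 = t0 + dt.timedelta(days=1, hours=6)

# A sequence of datetimes goes in, an array of float days comes out.
n0, n1 = mPlotDATEs.date2num([t0, t1])

# The unit is days, so 1 day + 6 hours is 1.25 (up to float rounding),
# whichever epoch the installed matplotlib uses.
span_days = n1 - n0

# num2date() converts back to a timezone-aware datetime.
back = mPlotDATEs.num2date(n0)
```

Working with differences like span_days keeps code independent of the epoch convention.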
matplotlib also equips you with tools for the axis formatting part:
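from matplotlib.dates import (  # date-aware tick formatter and locators
    DateFormatter,
    AutoDateLocator,
    HourLocator,
    MinuteLocator,
)
# epoch2num is not imported here: it is unused below and, as far as I know,
# is no longer available in recent matplotlib releases.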
from matplotlib.ticker import ScalarFormatter, FuncFormatter
and you may, for example, do:
aPlotAX.set_xlim( x_min, x_MAX )                                        # X-AXIS LIMITs
#plt.gca().xaxis.set_major_locator( matplotlib.ticker.FixedLocator( secs ) )
#plt.gca().xaxis.set_major_formatter( matplotlib.ticker.FuncFormatter( lambda pos, _: time.strftime( "%d-%m-%Y %H:%M:%S", time.localtime( pos ) ) ) )
aPlotAX.xaxis.set_major_locator(   AutoDateLocator() )
aPlotAX.xaxis.set_major_formatter( DateFormatter( '%Y-%m-%d %H:%M' ) )  # X-FORMAT
# 90-deg x-tick-LABELs
plt.setp( plt.gca().get_xticklabels(), rotation            = 90,
                                       horizontalalignment = 'right'
                                       )
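The fragment above assumes that aPlotAX, x_min and x_MAX already exist. A self-contained sketch of the same axis setup could look like this; the sample data and the choice of taking the limits from the data's min/max are assumptions made only for illustration.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
from matplotlib.dates import AutoDateLocator, DateFormatter, date2num

# Made-up sample data: one point every 30 minutes over one day.
start = dt.datetime(2014, 5, 1)
x = [start + dt.timedelta(minutes=30 * i) for i in range(48)]
y = [(i % 12) * 0.5 for i in range(48)]

mx = date2num(x)                       # float days, as in step A
x_min, x_MAX = mx.min(), mx.max()      # assumption: limits taken from the data

fig, aPlotAX = plt.subplots()
aPlotAX.scatter(mx, y, s=15)
aPlotAX.set_xlim(x_min, x_MAX)         # explicit limits avoid auto-scaling surprises
aPlotAX.xaxis.set_major_locator(AutoDateLocator())
aPlotAX.xaxis.set_major_formatter(DateFormatter('%Y-%m-%d %H:%M'))
plt.setp(aPlotAX.get_xticklabels(), rotation=90, horizontalalignment='right')
fig.canvas.draw()                      # render once; use plt.show() interactively
```

Setting the limits explicitly also addresses the collapsed x axis mentioned in the question.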
{ 3D | 4D | 5D } transcoding
To get a feel for the approach, check this example, in which additional dimensions of information were coded, using different tools, into { color | size | alpha }. Whereas { size | alpha } are properties of the scatter points themselves, for color matplotlib includes additional tools: a set of colormaps scaled for various domain-specific or human-vision / perception-adapted colour scales. A nice explanation of the color-scale / normalisation scaler is presented here.
You may have noticed that this 4D example still has a constant alpha (it is not used as a 5th DOF, as it would be in a true 5D visualisation).
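For illustration, here is one possible way to use alpha as the 5th dimension as well. This is a sketch under stated assumptions rather than the method of the linked example: the random data, the "viridis" colormap, and the trick of writing per-point alpha into the RGBA colours (instead of passing an alpha array to scatter) are all choices made here.

```python
import datetime as dt

import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
from matplotlib.dates import DateFormatter, date2num

# Made-up data: x (time) and y, plus three extra series to encode.
rng = np.random.default_rng(0)
n = 50
start = dt.datetime(2014, 5, 1)
x = [start + dt.timedelta(hours=i) for i in range(n)]
y = rng.normal(size=n)
z = rng.uniform(10, 200, size=n)       # 3rd dimension -> marker size
w = rng.uniform(-1, 1, size=n)         # 4th dimension -> colour
a = rng.uniform(0.2, 1.0, size=n)      # 5th dimension -> alpha

# Map w onto a colormap, then overwrite the alpha channel per point.
norm = Normalize(vmin=w.min(), vmax=w.max())
rgba = plt.get_cmap("viridis")(norm(w))  # array of shape (n, 4)
rgba[:, 3] = a

fig, ax = plt.subplots()
pts = ax.scatter(date2num(x), y, s=z, c=rgba)
ax.xaxis.set_major_formatter(DateFormatter('%Y-%m-%d %H:%M'))
fig.autofmt_xdate()
fig.canvas.draw()                      # render once; use plt.show() interactively
```

Because the colours are passed as explicit RGBA rows, a colorbar would need its own ScalarMappable built from the same norm and colormap.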