When I create a simple matplotlib plot with numpy arrays, the tick labels are well-behaved: chosen intelligently so they do not overlap, and spaced to span the data range evenly.
However, when I import data into numpy arrays, the tick labels are a mess. It appears that a tick label has been added for each datapoint, rather than a sensible scale being auto-generated.
Why is my data causing it to not be automatic?
How do I get MPL to do this automatically for real-world data with irregularly-spaced X/Y data?
"Normal" behavior:
import matplotlib.pyplot as plt
import numpy as np
import numpy.random as rnd
x = np.array( range(1000000) )
y = rnd.rand(1,1000000)[0]
fig, ax = plt.subplots()
ax.plot(x,y)
Real-world data with non-equally-spaced X-axis data, imported from file.
Snippet of data file:
-1900.209922,-106.022
-1900.176409,-103.902
-1900.142897,-112.337
-1900.109384,-109.252
...
Plotting script:
import numpy as np
import matplotlib.pyplot as plt
import csv
# Read CSV file
with open(r"graph.csv", encoding='utf-8-sig') as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    data_read = [row for row in reader]
#end with file
d = np.array(data_read).T # transpose
x = d[0][0:10]
y = d[1][0:10]
fig, ax = plt.subplots()
ax.plot( x, y, "." )
fig.show()
Zoomed in, you can see it added ticks at exactly my data points:
If I change the X-data to a linear array, then it auto-ticks the x-axis, putting labels at intuitive locations (not at datapoints):
y = d[1][0:100]
x = range( len(y) ) # integer x-axis points
fig, ax = plt.subplots()
ax.plot( x, y, "." )
fig.show()
By the way, even if I load 20,000 data points, such that the y axis spans from -106 --> -88 (in case the values were too closely spaced), the y-axis labels still collide:
y[-1]
Out[31]: '-88.109'
y[0]
Out[32]: '-106.022'
Ultimately I'll be loading a large number of datapoints (200,000), so need this solved.
This is an extremely common issue and it occurs when plotting string data rather than numeric. Essentially, the csv package is reading in the data as strings rather than numbers. You can fix this by following this answer, but you have two other options.
Convert d to be a numpy array of floats as shown below.
# when creating d
d = np.array(data_read, dtype=float).T
# if d is already created
d = d.astype(float)
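To confirm that this is what is happening, it may help to inspect the array's dtype before plotting. The sketch below is only an illustration: the inline `sample` text is a stand-in for graph.csv (assuming two comma-separated numeric columns, as in the snippet from the question), and the names `d_str` and `d_num` are made up for the comparison.

```python
import csv
import io

import numpy as np

# Stand-in for graph.csv (assumption: two comma-separated numeric columns)
sample = """-1900.209922,-106.022
-1900.176409,-103.902
-1900.142897,-112.337
-1900.109384,-109.252
"""

rows = list(csv.reader(io.StringIO(sample)))

d_str = np.array(rows).T               # csv.reader yields strings
d_num = np.array(rows, dtype=float).T  # parse the strings as floats

# dtype.kind is 'U' for unicode strings (plotted as categories, one tick
# per value) and 'f' for floats (plotted on an automatic numeric scale)
print(d_str.dtype.kind)
print(d_num.dtype.kind)
```

If the file really is purely numeric, np.loadtxt("graph.csv", delimiter=",", encoding="utf-8-sig") would probably return floats directly, though I have not tried that against your actual file.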
P.S. You can save figures using plt.savefig("filename.png") rather than taking screenshots of them.
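As a rough sketch of saving instead of screenshotting, the snippet below plots a few float values copied from the question's data snippet and writes the figure to disk. The Agg backend and the output path `ticks_demo.png` in the temp directory are assumptions made so it runs without a display; fig.savefig is used here, which does the same job as plt.savefig for the current figure.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

# Float values copied from the data snippet in the question
x = np.array([-1900.209922, -1900.176409, -1900.142897, -1900.109384])
y = np.array([-106.022, -103.902, -112.337, -109.252])

fig, ax = plt.subplots()
ax.plot(x, y, ".")  # numeric data, so the ticks are chosen automatically

# Hypothetical output path; any writable location works
out_path = os.path.join(tempfile.gettempdir(), "ticks_demo.png")
fig.savefig(out_path)
plt.close(fig)
```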