I am reading a csv file containing 1 column with dates, 1 column with times and 120 columns of integers. The file may contain many rows (could be over 100,000).
The data points are converted to color indexes between 0 and 1 and I plot them to a heatmap.
My problem is with displaying the times column as Y-axis labels.
Whatever I tried, the labels are unreadable in the full view. [Image: full view, labels are unreadable]
If I zoom in to see more color detail, the labels become readable. [Image: zoomed in, labels are readable]
I would like the labels to be readable in any view, including the full view. It is acceptable for some of the labels to be hidden in the zoomed-out views and only appear when zooming in.
I tried to create the y ticks as shown below, where data is a pandas DataFrame and case_times is a DataFrame holding the times as datetime objects.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
data = pd.read_csv(csvfile, delimiter=r',|\s|\|', header=None)
dsa_data = np.split(data, [1, 2, 62, 122], axis=1)
case_times = dsa_data[1]
case_times[1] = pd.to_datetime(case_times[1], format='%H:%M:%S').dt.time
case_times = np.squeeze(case_times)
fig, axs = plt.subplots(1,2)
fig.tight_layout()
im1 = axs[0].imshow(data, cmap='DSAColorMap')
axs[0].set_yticks(ticks=np.arange(1, len(case_times)+1, 1), labels=case_times)
axs[0].set_autoscale_on(True)
plt.show()
I then tried to start from a column of strings instead of the dataframe, but I get the same results:
with open(fileName, newline='') as csvfile:
    # Advance 3 rows to skip the headers and extract the case start date
    # without the need to iterate over the file
    row = next(csvfile)
    row = next(csvfile)
    row = next(csvfile)
    text_to_split = re.split(r',|\s|\|', row)
    case_start_date = text_to_split[0]
    # Remove blank items created after split by '|' delimiter
    text_to_split = text_to_split[:len(text_to_split)-3]
    # Create a list of time stamps to be later used as y-axis labels for the plot
    case_times.append(text_to_split.pop(1))
    # Create the first data row
    data.append(text_to_split[1:61])
    spamreader = csv.reader(csvfile, delimiter='|')
    # Iterate over all remaining rows
    # to create the data rows
    for row in spamreader:
        text_to_split = ' '.join(row)  # Join all elements of the row into one string
        text_to_split = re.split(r',|\s', text_to_split)  # Split row by delimiters
        text_to_split.pop(len(text_to_split)-1)  # Remove last element of the row, which is blank
        text_to_split.pop(0)  # Remove date from row
        # Create a list of time stamps to be later used as axis labels for the plot
        case_times.append(text_to_split.pop(0))
        # Add data row to each hemisphere data list
        data.append(text_to_split[:60])
converted_case_times = []  # transform from string to datetime
for i in range(len(case_times)):
    converted_case_times.append(datetime.strptime(case_times[i], '%H:%M:%S'))
fig, axs = plt.subplots(1,2)
fig.tight_layout()
im1 = axs[0].imshow(dsa_data_left, cmap='DSAColorMap')
axs[0].set_yticks(ticks=range(1, len(converted_case_times)+1), labels=converted_case_times)
plt.show()
I also tried the following formatting change to overcome the problem, but it also didn't work:
from matplotlib.dates import DateFormatter
axs[0].yaxis.set_major_formatter(DateFormatter("%H:%M:%S"))
plt.setp(axs[0].get_yticklabels(), rotation=45, ha="right",
         rotation_mode="anchor")
I assume the problem lies in how I create the ticks (as a fixed range with one tick per row), but I do not know how to resolve this, and I could not find an explanation or example that works in my case.
EDIT: Following the response I was given, I added this to my code:
def generate_yticks(ax, case_times, ticks):
    ylims = ax.get_ylim()
    ticks_in_window = ticks[ticks>ylims[1]][ticks<ylims[0]]
    shown_ticks = ticks_in_window[np.arange(10)*int(len(ticks_in_window)/10)]
    shown_labels = case_times[shown_ticks-1]
    ax.set_yticks(ticks=shown_ticks, labels=shown_labels)
    print(ylims)

ticks = np.arange(1, len(case_times)+1)
axs[0].callbacks.connect('ylim_changed', generate_yticks(axs[0],
                         case_times, ticks))
This seems to call the function only one time, meaning it is not called again when I zoom in. I tried changing the 'ylim_changed' to another event, but I get an error of unsupported event.
If the issue is the way I use generate_yticks, i.e. that it has more than one parameter, I don't know how to pass ticks and case_times into the function, since it is called by another function. I prefer not to make ticks a global variable, and I definitely cannot make case_times a global since it is generated within the code.
I tried to nest the generate_yticks function within the calling function, but that triggered the "boolean index did not match indexed array along axis 0; size of axis is 268 but size of corresponding boolean axis is 503" error again.
The easiest way to achieve this is to just remove a part of your labels.
Change the third-to-last line in the first block of code to:
indices = np.arange(0,len(case_times), 100)
axs[0].set_yticks(ticks=indices+1, labels=case_times[indices])
Here I set it to take only one element every 100 using the optional step parameter of arange. You might need to increase this step depending on the number of labels you have; an idea might be to just use
indices = np.arange(n) * len(case_times) // n
which will just show n labels, equally spaced. The integer division matters: indices must be integers to be usable for indexing.
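As a quick check of the n-label idea, here is a minimal sketch. The helper name evenly_spaced_indices and the generated time strings are made up for illustration; they are not taken from the question's data.

```python
import numpy as np


def evenly_spaced_indices(length, n):
    """Return at most n integer indices spread evenly over range(length)."""
    n = min(n, length)                  # never ask for more labels than rows
    return np.arange(n) * length // n   # integer arithmetic keeps an int dtype


# Made-up stand-in for the times column: one label per second, 1000 rows.
case_times = np.array([f"00:{i // 60:02d}:{i % 60:02d}" for i in range(1000)])

indices = evenly_spaced_indices(len(case_times), 5)
print(indices)              # positions of the rows that get a label
print(case_times[indices])  # the labels shown at those positions
```

With the 1-based ticks used in the question, these indices would be passed as `ticks=indices+1`, as in the snippet above.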
EDIT: clarification after comment
I believe you should also take a look at axs[0].callbacks.connect and create a function that sets the indices as needed. It should look something like this:
ticks = np.arange(1, len(case_times)+1)
n = 10  # number of labels to show; adjust to taste
def generate_yticks(ax):
    # imshow inverts the y-axis, so sort the limits before comparing
    lo, hi = sorted(ax.get_ylim())
    ticks_in_window = ticks[np.logical_and(ticks>lo, ticks<hi)]
    if len(ticks_in_window) == 0:
        return
    # integer indices, at most n of them, spread evenly over the window
    picks = np.unique(np.arange(n) * len(ticks_in_window) // n)
    shown_ticks = ticks_in_window[picks]
    shown_labels = case_times[shown_ticks-1]
    ax.set_yticks(ticks=shown_ticks, labels=shown_labels)
axs[0].callbacks.connect("ylim_changed", generate_yticks)
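On the follow-up about passing case_times without globals: connect expects the function object itself, and matplotlib then calls it with the Axes as its only argument. Writing generate_yticks(axs[0], case_times, ticks) inside connect runs the function once, immediately, and registers its return value instead, which would explain why it is never called again on zoom. One possible way around this is a closure that captures the labels. The sketch below is only an illustration under some assumptions: pick_rows and attach_dynamic_yticks are hypothetical names, and rows are 0-based (imshow places row i at y = i) rather than the 1-based ticks used above.

```python
import math


def pick_rows(n_rows, ylim, max_labels=10):
    """Return at most max_labels evenly spaced row indices visible within ylim."""
    lo, hi = sorted(ylim)                    # imshow inverts the y-axis
    first = max(0, math.ceil(lo))            # first whole row inside the view
    last = min(n_rows - 1, math.floor(hi))   # last whole row inside the view
    count = last - first + 1
    if count <= 0:
        return []
    step = max(1, math.ceil(count / max_labels))
    return list(range(first, last + 1, step))


def attach_dynamic_yticks(ax, labels, max_labels=10):
    """Re-label the y-axis of ax every time its y-limits change."""
    labels = [str(label) for label in labels]  # captured by the closure below

    def on_ylim_changed(changed_ax):
        # matplotlib passes the Axes whose limits changed as the only argument
        rows = pick_rows(len(labels), changed_ax.get_ylim(), max_labels)
        changed_ax.set_yticks(rows, labels=[labels[r] for r in rows])

    on_ylim_changed(ax)  # label the initial, fully zoomed-out view
    return ax.callbacks.connect("ylim_changed", on_ylim_changed)
```

After the imshow call, something like `attach_dynamic_yticks(axs[0], case_times)` would wire this up. `functools.partial(generate_yticks, case_times=case_times, ticks=ticks)` should work as well, provided the function takes the Axes as its first parameter.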