python heatmap axis-labels

How to display dynamic (scalable) time labels on a y-axis of a heatmap in python


I am reading a csv file containing 1 column with dates, 1 column with times and 120 columns of integers. The file may contain many rows (could be over 100,000).

The data points are converted to color indexes between 0 and 1 and I plot them to a heatmap.

My problem is with displaying the times column as Y-axis labels.

Whatever I try, the labels are unreadable in the full view. [Image: full view, labels are unreadable]

If I zoom in to see more color detail, the labels become readable. [Image: zoomed in, labels are readable]

I would like to have the labels readable in any view, even when in full view. It is acceptable that some of the labels will not be displayed in the zoom-out views and will only be displayed when zooming in.

I tried to create y ticks like below, where 'data' is a pandas dataframe and case_times is a dataframe with datetime objects.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime

data = pd.read_csv(csvfile, delimiter=r',|\s|\|', header=None)
dsa_data = np.split(data, [1, 2, 62, 122], axis=1)
case_times = dsa_data[1]
case_times[1] = pd.to_datetime(case_times[1], format='%H:%M:%S').dt.time
case_times = np.squeeze(case_times)

fig, axs = plt.subplots(1,2)
fig.tight_layout()
im1 = axs[0].imshow(data, cmap='DSAColorMap')
    
    
axs[0].set_yticks(ticks=np.arange(1, len(case_times)+1, 1), labels=case_times)
axs[0].set_autoscale_on(True)
plt.show()

I then tried to start from a column of strings instead of the dataframe, but I get the same results:

import csv
import re

case_times = []
data = []

with open(fileName, newline='') as csvfile:
    # Advance three rows to skip the headers and extract the case start date
    # without the need to iterate over the file
    row = next(csvfile)
    row = next(csvfile)
    row = next(csvfile)
    text_to_split = re.split(r',|\s|\|', row)
    case_start_date = text_to_split[0]
    # Remove blank items created after split by '|' delimiter
    text_to_split = text_to_split[:len(text_to_split)-3]
    # Create a list of time stamps to be used later as axis labels for the plot
    case_times.append(text_to_split.pop(1))
    # Create the first data row
    data.append(text_to_split[1:61])
    spamreader = csv.reader(csvfile, delimiter='|')
    # Iterate over all remaining rows
    # to create the data rows
    for row in spamreader:
        text_to_split = ' '.join(row)  # Join all elements of the row into one string
        text_to_split = re.split(r',|\s', text_to_split)  # Split row by delimiters
        text_to_split.pop()  # Remove last element of the row, which is blank
        text_to_split.pop(0)  # Remove date from row
        # Create a list of time stamps to be used later as axis labels for the plot
        case_times.append(text_to_split.pop(0))
        # Add data row to each hemisphere data list
        data.append(text_to_split[:60])

converted_case_times = [] # transform from string to datetime
for i in range(len(case_times)):
    converted_case_times.append(datetime.strptime(case_times[i], '%H:%M:%S'))
    
fig, axs = plt.subplots(1,2)
fig.tight_layout()
im1 = axs[0].imshow(dsa_data_left, cmap='DSAColorMap')
axs[0].set_yticks(ticks=range(1, len(converted_case_times)+1), labels=converted_case_times)
plt.show()

I also tried the following formatting change to overcome the problem, but it also didn't work:

from matplotlib.dates import DateFormatter

axs[0].yaxis.set_major_formatter(DateFormatter("%H:%M:%S"))
plt.setp(axs[0].get_yticklabels(), rotation=45, ha="right",
         rotation_mode="anchor")

I assume the problem lies in the creation of the ticks (as a fixed range), but I do not know how to resolve this, and I could not find an explanation or example that works in my case.

EDIT: Following the response I was given, I added this to my code:

    def generate_yticks(ax, case_times, ticks):
        ylims = ax.get_ylim()
        ticks_in_window = ticks[ticks>ylims[1]][ticks<ylims[0]]
        shown_ticks = ticks_in_window[np.arange(10)*int(len(ticks_in_window)/10)]
        shown_labels = case_times[shown_ticks-1]
        ax.set_yticks(ticks=shown_ticks, labels=shown_labels)
        print(ylims)

    ticks = np.arange(1, len(case_times)+1)
    axs[0].callbacks.connect('ylim_changed', generate_yticks(axs[0], case_times, ticks))

This seems to call the function only once, meaning it is not called again when I zoom in. I tried changing 'ylim_changed' to another event, but I got an "unsupported event" error.

If the issue is the way I use generate_yticks, i.e. that it takes more than one parameter, I don't know how to pass ticks and case_times into the function, since it is called by another function. I prefer not to make ticks a global variable, and I definitely cannot make case_times a global since it is generated within the code.

I tried to nest the generate_yticks function within the calling function, but that triggered the "boolean index did not match indexed array along axis 0; size of axis is 268 but size of corresponding boolean axis is 503" error again.


Solution

  • The easiest way to achieve this is to show only a subset of your labels.

    Change the third-to-last line in the first block of code to:

    indices = np.arange(0,len(case_times), 100)
    axs[0].set_yticks(ticks=indices+1, labels=case_times[indices])
    

    Here I set it to take only one element every 100 using the optional step parameter of arange. You might need to increase this step depending on the number of labels you have; an alternative is to use

    indices = (np.arange(n)/n * len(case_times)).astype(int)
    

    which will show just n labels, equally spaced. The cast to int is needed because the division produces floats, which cannot be used as indices.
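As a rough illustration of the "n equally spaced labels" idea, here is a minimal sketch in plain Python. The helper name `evenly_spaced_indices` and the sample time stamps are made up for this example; they are not part of NumPy or matplotlib. Unlike the arange-based expression, this variant always includes the last row.

```python
def evenly_spaced_indices(length, n):
    """Return at most n integer indices spread evenly over range(length).

    Hypothetical helper for illustration (not a NumPy or matplotlib function).
    """
    if length <= 0 or n <= 0:
        return []
    if n >= length:
        return list(range(length))
    if n == 1:
        return [0]
    # Integer arithmetic keeps every index valid for direct indexing
    return [i * (length - 1) // (n - 1) for i in range(n)]


# Made-up time stamps standing in for case_times
labels = [f"00:00:{s:02d}" for s in range(60)]
idx = evenly_spaced_indices(len(labels), 4)
shown = [labels[i] for i in idx]
```

With the names from the question, the result could probably be used as `axs[0].set_yticks(ticks=[i + 1 for i in idx], labels=[case_times[i] for i in idx])`, assuming case_times supports positional integer indexing.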

    EDIT: clarification after comment

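    I believe you should also take a look at axs[0].callbacks.connect and create a function that sets the indices as needed. Pass the function itself to connect (do not call it), and keep in mind that imshow inverts the y-axis, so the limits have to be sorted before comparing. It should look something like this:

    n = 10  # maximum number of labels shown at once
    ticks = np.arange(1, len(case_times)+1)
    def generate_yticks(ax):
        # imshow inverts the y-axis, so sort the limits before comparing
        lo, hi = sorted(ax.get_ylim())
        ticks_in_window = ticks[np.logical_and(ticks>lo, ticks<hi)]
        if len(ticks_in_window) == 0:
            return
        # Pick at most n evenly spaced positions (must be integers)
        idx = np.unique((np.arange(n)/n * len(ticks_in_window)).astype(int))
        shown_ticks = ticks_in_window[idx]
        shown_labels = np.asarray(case_times)[shown_ticks-1]
        ax.set_yticks(ticks=shown_ticks, labels=shown_labels)
    
    axs[0].callbacks.connect("ylim_changed", generate_yticks)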
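
To pass the labels into the callback without globals, one option is `functools.partial`. The sketch below is only an illustration under a few assumptions: the names `visible_rows`, `update_yticks` and `attach_dynamic_yticks` are hypothetical, and ticks are placed at 0-based row positions, which is where imshow centers row i with its default extent (the snippets above use a +1 offset instead). matplotlib calls a 'ylim_changed' callback with the Axes as its only argument, so the extra parameters are bound in advance.

```python
import math
from functools import partial


def visible_rows(ylim, n_rows, max_labels=10):
    """Pick at most max_labels row positions inside the current y-limits."""
    lo, hi = sorted(ylim)  # imshow inverts the y-axis, so sort first
    first = max(0, math.ceil(lo))
    last = min(n_rows - 1, math.floor(hi))
    if last < first or max_labels <= 0:
        return []
    # Step large enough that no more than max_labels rows are selected
    step = max(1, math.ceil((last - first + 1) / max_labels))
    return list(range(first, last + 1, step))


def update_yticks(ax, labels, max_labels=10):
    """Callback body: relabel the y-axis for the rows currently in view."""
    rows = visible_rows(ax.get_ylim(), len(labels), max_labels)
    ax.set_yticks(rows)
    ax.set_yticklabels([str(labels[r]) for r in rows])


def attach_dynamic_yticks(ax, labels, max_labels=10):
    """Bind the extra arguments and register the callback (not a call result)."""
    callback = partial(update_yticks, labels=labels, max_labels=max_labels)
    callback(ax)  # label the initial view once
    return ax.callbacks.connect("ylim_changed", callback)
```

Usage would be something like `attach_dynamic_yticks(axs[0], list(case_times))` after the imshow call. Converting case_times to a list is an assumption made here so that `labels[r]` is plain positional indexing.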