python matplotlib labeling

Coloring in matplotlib scatter plot does not obey the predefined color sequence of my ListedColormap(cmap)


I have an issue when trying to use a predefined color sequence for the labels of my data. In detail, I pass the labels to the parameter c of the scatter plot and set cmap=ListedColormap(km_colors) to color them according to my list of colors. However, the colormap seems to decide for itself how to color the labeled data: with two classes, label=1 is colored black (which is also in my list of colors), and label=0 gets the lightest(?) color of my list. So it does not obey the order of the colors I set.

For example, in the code below you can see that even though km_colors[1]='cyan', it chooses the black color for label=1.

Thanks a lot for any help in advance.

import matplotlib.pyplot as plt
import matplotlib.colors as mcolors

km_colors = ['green', 'cyan', 'brown', 'darkorange', 'purple', 'black']
fig, ax = plt.subplots(3,3, sharex='col',figsize = (10,8))

for i in range(len(data_list)):
    for j in range(len(n_Enm_clusters)):
        ### c = [km_colors[int(l)] for k,l in enumerate(km_Enm_labels_list[i][j])]
        data_PCA = ax[j,i].scatter(PCA_bold[i][:,0],
                                  PCA_bold[i][:,1],
                                  c=km_Enm_labels_list[i][j], s=15,
                                  cmap = mcolors.ListedColormap(km_colors), 
                                  alpha = 0.5)

        # produce a legend with the unique colors from the scatter
        if i == len(data_list)-1:
            legend1 = ax[j,i].legend(*data_PCA.legend_elements(),
                                    loc="lower right", title="edge \n classes", prop={'size': 6})
            ax[j,i].add_artist(legend1)

plt.tight_layout()
plt.show()


[Image: pca data]


Solution

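  • When a list of colors is passed to ListedColormap() and the colormap is used with numeric c values, the colors in that list are NOT picked up serially. By default, scatter normalizes the c values linearly to the range [0, 1], using their minimum and maximum, and the normalized values are then spread across the whole colormap: the smallest label always gets the first color and the largest label the last one. So, if there were 6 colors in the list and...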

    and so on...

    To have the colors chosen in the order of your list, you need to restrict the km_colors list to the number of colors actually required, i.e. the number of distinct labels (assuming the labels are the consecutive integers 0, 1, 2, ...). Below is a sample scatter plot with random data to show how this can be done. Note that I am restricting the colors picked up by the scatter plot using cmap = ListedColormap(km_colors[0:(i*3+j+1)]), which provides the scatter plot with just the first (i*3 + j + 1) colors.

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.colors import ListedColormap

    # Random sample data
    x = np.random.rand(100)
    y = np.random.rand(100)
    km_colors = ['green', 'cyan', 'brown', 'darkorange', 'purple', 'black']
    fig, ax = plt.subplots(2, 3, sharex='col', figsize=(10, 8))
    for i in range(2):
        for j in range(3):
            # Random integer labels in 0 .. i*3+j (one more class per subplot)
            clr_col = np.random.randint(i*3+j+1, size=100)
            # Keep only as many colors as there are classes,
            # so that label k is drawn with km_colors[k]
            data_PCA = ax[i, j].scatter(x, y, s=55, c=clr_col,
                                        cmap=ListedColormap(km_colors[0:(i*3+j+1)]),
                                        alpha=0.5)
            print(i*3+j+1)
            print(np.unique(clr_col))
            print(km_colors[0:(i*3+j+1)])

    plt.tight_layout()
    plt.show()
    

    [Image: output plot]
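
As an aside, the effect described in the question can be reproduced without plotting anything. The sketch below is not part of the original answer: it assumes integer labels starting at 0, and the names labels, auto_norm and fixed_norm are made up for illustration. It applies a Normalize that autoscales to the data (which is what scatter does by default) and compares it with one whose limits are pinned to the full length of the color list.

```python
import numpy as np
from matplotlib.colors import ListedColormap, Normalize, to_rgba

km_colors = ['green', 'cyan', 'brown', 'darkorange', 'purple', 'black']
cmap = ListedColormap(km_colors)

# Hypothetical two-class labels, as in the question
labels = np.array([0, 1, 1, 0])

# Limits taken from the data: 0 -> 0.0 and 1 -> 1.0,
# i.e. the first and the last entry of the colormap
auto_norm = Normalize(vmin=labels.min(), vmax=labels.max())
auto_colors = cmap(auto_norm(labels))

# Limits pinned to the whole color list: label k lands in the middle
# of the k-th color bin, whichever labels are actually present
fixed_norm = Normalize(vmin=-0.5, vmax=len(km_colors) - 0.5)
fixed_colors = cmap(fixed_norm(labels))

print(np.allclose(auto_colors[1], to_rgba('black')))   # label 1 with data limits
print(np.allclose(fixed_colors[1], to_rgba('cyan')))   # label 1 with fixed limits
```

If this reasoning holds for your data, the same limits can be given to the plot directly, e.g. scatter(..., c=labels, cmap=ListedColormap(km_colors), vmin=-0.5, vmax=len(km_colors)-0.5), as an alternative to slicing km_colors for every subplot.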