python, matplotlib, kernel-density, density-plot

How can I make a scatter plot colored by density?


I'd like to make a scatter plot where each point is colored by the spatial density of nearby points.

I've come across a very similar question, which shows an example of this using R:

R Scatter Plot: symbol color represents number of overlapping points

What's the best way to accomplish something similar in Python using matplotlib?


Solution

  • In addition to hist2d or hexbin, as @askewchan suggested, you can use the same method as the accepted answer to the question you linked.

    To do that, estimate the density at each point with a Gaussian kernel density estimate and pass the result as the point colors:

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import gaussian_kde
    
    # Generate fake data
    x = np.random.normal(size=1000)
    y = x * 3 + np.random.normal(size=1000)
    
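    # Calculate the point density: fit a Gaussian KDE to the data and
    # evaluate it at every point, so z[i] is the density at (x[i], y[i])
    xy = np.vstack([x, y])
    z = gaussian_kde(xy)(xy)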
    
    fig, ax = plt.subplots()
    ax.scatter(x, y, c=z, s=100)
    plt.show()
    


    If you'd like the densest points to always be on top (similar to the linked example), sort the points by their z-values before plotting. scatter draws points in array order, so the last ones end up on top. I'm also going to use a smaller marker size here, as it looks a bit better:

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import gaussian_kde
    
    # Generate fake data
    x = np.random.normal(size=1000)
    y = x * 3 + np.random.normal(size=1000)
    
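    # Calculate the point density: fit a Gaussian KDE to the data and
    # evaluate it at every point, so z[i] is the density at (x[i], y[i])
    xy = np.vstack([x, y])
    z = gaussian_kde(xy)(xy)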
    
    # Sort the points by density, so that the densest points are plotted last
    idx = z.argsort()
    x, y, z = x[idx], y[idx], z[idx]
    
    fig, ax = plt.subplots()
    ax.scatter(x, y, c=z, s=50)
    plt.show()
    

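
For the hist2d and hexbin alternatives mentioned at the start of this answer, here is a minimal sketch rather than a definitive recipe. Instead of coloring individual points, it bins the points and colors each bin by its count. The helper name plot_binned_density and its bins and gridsize defaults are my own choices for illustration; hist2d, hexbin and mincnt are standard matplotlib.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_binned_density(x, y, bins=40, gridsize=30):
    """Draw hist2d and hexbin views of the same points side by side.

    The function name and the default bin counts are illustrative
    assumptions, not part of matplotlib.
    """
    fig, (ax_hist, ax_hex) = plt.subplots(ncols=2, figsize=(10, 4))

    # hist2d counts points on a rectangular grid and returns the counts
    counts, xedges, yedges, image = ax_hist.hist2d(x, y, bins=bins)
    fig.colorbar(image, ax=ax_hist, label="points per bin")
    ax_hist.set_title("hist2d")

    # hexbin counts points in hexagonal cells; mincnt=1 hides empty cells
    hexes = ax_hex.hexbin(x, y, gridsize=gridsize, mincnt=1)
    fig.colorbar(hexes, ax=ax_hex, label="points per hexagon")
    ax_hex.set_title("hexbin")

    return fig, counts, hexes


if __name__ == "__main__":
    # Same kind of fake data as in the examples above
    x = np.random.normal(size=1000)
    y = x * 3 + np.random.normal(size=1000)

    plot_binned_density(x, y)
    plt.show()
```

Binning tends to scale better than evaluating gaussian_kde at every point, so this may be the more practical option for very large datasets, at the cost of losing the individual points.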