gnuplothistogram2d

Plot with 2D histogram bins


I have seen in the post "Normalizing histogram bins in Gnuplot" that it is possible to bin some x samples and plot a histogram with

binwidth=5
bin(x,width)=width*floor(x/width) + width/2.0
plot 'file.dat' using (bin($1, binwidth)):(1.0/(binwidth*num_points)) smooth freq with boxes

I would like to achieve the same result as in that post, but with a 2D dataset ((x,y) points), and plot a kind of heat map of the data, for example with the color indicating the probability or the intensity (i.e. (number of samples)/(bin area)).

How can I compute such a 2D binned plot with gnuplot?

Thank you very much for your help


Solution

  • Binning 2D data follows the same principle as binning 1D data. The difference is that the option smooth freq, which is used for 1D binning, accepts only a single value as the bin key (not two, x and y). Hence, you simply enumerate your bins from 0 to BinCountX * BinCountY - 1 and define the functions BinNoToX() and BinNoToY() to get back from the bin number to the x and y values of the bin.
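
    As a small illustration of this enumeration, here is a sketch with a made-up 4 x 3 grid of unit-sized bins (the grid values are assumptions chosen for the example; the function names follow the script in this answer). A point is mapped to its bin number and back to the lower-left corner of its bin:

    ```gnuplot
    # toy grid (hypothetical values): 4 x 3 bins of size 1 x 1, origin at (0,0)
    Xmin = 0.0;      Ymin = 0.0
    BinWidthX = 1.0; BinWidthY = 1.0
    BinCountX = 4;   BinCountY = 3

    # forward mapping: bin number = row * BinCountX + column
    XYtoBinNo(x,y) = floor((y-Ymin)/BinWidthY)*BinCountX + floor((x-Xmin)/BinWidthX)
    # inverse mapping: column = n mod BinCountX, row = n div BinCountX (integer division)
    BinNoToX(n) = Xmin + (int(n)%BinCountX)*BinWidthX
    BinNoToY(n) = Ymin + (int(n)/BinCountX)*BinWidthY

    n = XYtoBinNo(2.5, 1.5)             # row 1, column 2 gives bin number 1*4 + 2 = 6
    print n, BinNoToX(n), BinNoToY(n)   # bin number and lower-left bin corner (x = 2, y = 1)
    ```

    Both inverse functions use BinCountX, because the forward mapping counts BinCountX bins per row.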

    The test data section of the script below creates random x, y and z values. The z values that fall into the same x,y bin are summed during the binning process.

    Alternatively, depending on the data, a density plot could also be of interest.

    Script: (works with gnuplot>=5.0.0)

    ### 2D binning of data
    reset session
    
    # create some random test data
    set table $Data
        set samples 5000
        plot '+' u (invnorm(rand(0))):(invnorm(rand(0))):(int(rand(0)*10+1)) w table
        set samples 1000
        plot '+' u (invnorm(rand(0))+2):(invnorm(rand(0))+2):(int(rand(0)*10+1)) w table
    unset table
    
    BinWidthX = 0.25
    BinWidthY = 0.25
    
    # get data range min, max
    stats $Data u 1:2 nooutput
    Xmin = floor(STATS_min_x/BinWidthX)*BinWidthX
    Ymin = floor(STATS_min_y/BinWidthY)*BinWidthY
    Xmax = ceil(STATS_max_x/BinWidthX)*BinWidthX
    Ymax = ceil(STATS_max_y/BinWidthY)*BinWidthY
    
    BinCountX      = int((Xmax-Xmin)/BinWidthX)
    BinCountY      = int((Ymax-Ymin)/BinWidthY)
    XYtoBinNo(x,y) = (floor((y-Ymin)/BinWidthY))*BinCountX + floor((x-Xmin)/BinWidthX)
    BinNoToX(n)    = Xmin + (int(n)%BinCountX)*BinWidthX
    BinNoToY(n)    = Ymin + (int(n)/BinCountX)*BinWidthY   # integer division by the row length BinCountX!
    
    # get data into bins
    set table $Bins
        plot [*:*][*:*] $Data u (XYtoBinNo($1,$2)):3 smooth freq
    unset table
    
    set size ratio -1
    set xrange [Xmin:Xmax]
    set yrange [Ymin:Ymax]
    set key noautotitle
    set style fill solid 1.0
    set grid x,y
    
    set multiplot layout 1,2
    
        set title "Raw data"
        plot $Data u 1:2:3 w p pt 7 ps 0.2 lc palette
        
        set title "2D binned data"
        dx=BinWidthX*0.5
        dy=BinWidthY*0.5
        plot $Bins u (BinNoToX($1)+dx):(BinNoToY($1)+dy):(dx):(dy):2 w boxxy fc palette z
    unset multiplot
    ### end of script
    

    Result:

    (image: two panels side by side, "Raw data" on the left and "2D binned data" on the right)
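
    If the color should show a density rather than a sum of z values, as asked in the question, one possible variation is to let every point contribute 1/(N * bin area) to its bin, so that the bin values approximate a probability density. The following is a minimal sketch under that assumption and not part of the original script. The names N, Area, IX0 and IY0 are introduced here for illustration, and the bin range is derived from integer bin indices so that the largest data value always falls inside the last bin:

    ```gnuplot
    ### 2D histogram normalized to a probability density (sketch)
    reset session

    # random test data (x,y only), generated in the same way as in the script above
    set table $Data
        set samples 5000
        plot '+' u (invnorm(rand(0))):(invnorm(rand(0))) w table
    unset table

    BinWidthX = 0.25
    BinWidthY = 0.25

    # integer bin indices of the smallest and largest data values
    stats $Data u 1:2 nooutput
    N   = STATS_records
    IX0 = floor(STATS_min_x/BinWidthX)
    IY0 = floor(STATS_min_y/BinWidthY)
    BinCountX = floor(STATS_max_x/BinWidthX) - IX0 + 1
    BinCountY = floor(STATS_max_y/BinWidthY) - IY0 + 1
    Xmin = IX0*BinWidthX
    Ymin = IY0*BinWidthY
    Xmax = Xmin + BinCountX*BinWidthX
    Ymax = Ymin + BinCountY*BinWidthY

    XYtoBinNo(x,y) = (floor(y/BinWidthY)-IY0)*BinCountX + (floor(x/BinWidthX)-IX0)
    BinNoToX(n)    = Xmin + (int(n)%BinCountX)*BinWidthX
    BinNoToY(n)    = Ymin + (int(n)/BinCountX)*BinWidthY   # integer division

    # each point adds 1/(N*Area) to its bin, so that sum(bin value * Area) = 1
    Area = BinWidthX*BinWidthY
    set table $Bins
        plot [*:*][*:*] $Data u (XYtoBinNo($1,$2)):(1.0/(N*Area)) smooth freq
    unset table

    set size ratio -1
    set xrange [Xmin:Xmax]
    set yrange [Ymin:Ymax]
    set key noautotitle
    set style fill solid 1.0
    set cblabel "probability density"
    set title "2D binned density"

    dx = BinWidthX*0.5
    dy = BinWidthY*0.5
    plot $Bins u (BinNoToX($1)+dx):(BinNoToY($1)+dy):(dx):(dy):2 w boxxy fc palette z
    ### end of sketch
    ```

    Using 1.0/Area instead of 1.0/(N*Area) would give the number of samples per bin area. Bins without any samples do not appear in $Bins and are therefore left blank in the plot.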