Tags: plot, julia, histogram, gadfly

Gadfly xticks number in Julia histogram


Suppose you have a dataframe with a column called c_lab that contains values from 0 to 100 and resembles a log-normal distribution.

I plotted the histogram with this code, using the Gadfly package in Julia:

plot(df_plot, x = "c_lab", Geom.histogram(bincount = 100), 
    Guide.title("Pre-Tax Labor Income = h x w"),
    Guide.xlabel("Gross Laboral Income"),
    Guide.ylabel("Frequency"), 
    Theme(background_color = "white"))

It produces this plot:

[image: histogram of c_lab]

The problem is that the x-axis shows only 3 tick values. I want to increase the number of breaks to, say, 10 (10, 20, 30, ..., 100) or 4 (25, 50, 75, 100).

In R I'd do this with ggplot2 by adding + scale_x_continuous(n.breaks = 10).

How can I do the same with the Gadfly package in Julia?

Thanks in Advance!!


Solution

  • Short answer: This feature doesn't appear to exist directly in Gadfly; you may have to create something equivalent using Guide.xticks(ticks = ticks).

    Long answer: I had a look at the source code for the scales in Gadfly. It has minticks and maxticks, so in theory setting both to the same number would force a fixed number of ticks, but it's unclear whether they are accessible through the user-facing functions.

    In any case, it's straightforward to get a similar result by building an evenly spaced range from min to max with i points, where i is the number of ticks you want and min and max are the minimum and maximum values on the x-axis.

    using DataFrames
    using Distributions
    using Gadfly
    
    # Generate log-normal distributed values
    dist = LogNormal(0, 1)
    values = quantile.(dist, range(0, stop=1, length=101))
    
    # Create DataFrame
    df_plot = DataFrame(c_lab = values)
    
    # using a basic vector
    ticks = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    
    # Plot the DataFrame using Gadfly
    p = plot(df_plot, x = "c_lab", Geom.histogram(bincount = 100),
             Guide.title("Pre-Tax Labor Income = h x w"),
             Guide.xlabel("Gross Laboral Income"),
             Guide.ylabel("Frequency"),
             Theme(background_color = "white"),
             Guide.xticks(ticks = ticks))
    
    # other option: a function 
    
    function tick_vector_creator(df::DataFrame, column::Symbol, num_ticks::Int)
        # pull the column out into a vector, keeping only finite values
        # (the sample data above contains Inf at quantile 1)
        finite_values = filter(isfinite, df[!, column])
        # evenly spaced ticks from the smallest to the largest finite value
        return range(minimum(finite_values), stop=maximum(finite_values), length=num_ticks)
    end
    
    # then use this in place of the other ticks variable
    ticks = tick_vector_creator(df_plot, :c_lab, 10)
    

    [image: resulting graph]

    Update: I asked on the GitHub page, and a maintainer confirmed that the way to do it is with Guide.xticks. Hope this helps!
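
For the specific tick positions mentioned in the question (10, 20, ..., 100 or 25, 50, 75, 100), the tick vector can be built from plain ranges before passing it to Guide.xticks. This is a minimal sketch in Base Julia only. The names ticks_10, ticks_4 and even_ticks are illustrative and not part of Gadfly, and the plot call is shown as a comment so the snippet runs without Gadfly loaded.

```julia
# Tick positions asked for in the question, built from plain ranges.
# `ticks_10` and `ticks_4` are illustrative names, not part of Gadfly.
ticks_10 = collect(10:10:100)   # 10, 20, ..., 100
ticks_4  = collect(25:25:100)   # 25, 50, 75, 100

# Hypothetical helper: n evenly spaced ticks from lo to hi (both included).
even_ticks(lo, hi, n) = collect(range(lo, stop=hi, length=n))

# With Gadfly loaded, any of these vectors can be passed to the plot call, e.g.
#   plot(df_plot, x = "c_lab", Geom.histogram(bincount = 100),
#        Guide.xticks(ticks = ticks_10))
```

Fixed step ranges like these give round tick labels, whereas ticks derived from the data's minimum and maximum usually land on non-round values.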