kotlin, plot, lets-plot

Multiple axis scales in Lets-Plot Kotlin


I'm learning some data science related topics and oh boy, this is a jungle of different libraries for everything 😅

Because of things, I went with Lets-plot, which has a nice Kotlin API that I'm using combined with Kotlin kernel for Jupyter notebooks

Overall, things are going pretty well. Most tutorials & docs I see online use different libraries for plotting (e.g. Seaborn, Matplotlib, Plotly), so most of the time I have to read the Lets-Plot-Kotlin reference and go by trial and error until I find the equivalent code for my graphs

Currently, I'm trying to graph the distribution of differences between two values. Overall, this looks pretty good. I can just do something like

(letsPlot(df)
    + geomHistogram { x = "some-column" }
).show()

which gives a nice graph

result of geomHistogram

It would be interesting to see the density estimator as well, geomDensity to the rescue!

(letsPlot(df)
    + geomDensity(color = "red") { x = "some-column" }
).show()

result of geomDensity

Nice! Now let's watch them both together

(letsPlot(df)
    + geomDensity(color = "red") { x = "some-column" }
    + geomHistogram() { x = "some-column" }
).show()

result of both graphs

As you can see, there's a small red line at the bottom (the geomDensity!). The problem here (I would say) is that both layers are using the same Y scale. The histogram is working with values in the 0-20 range and the density with 0-0.02, so when plotted together the density is just a line at the bottom
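The gap between the two ranges follows from what each layer puts on the Y axis: the histogram shows how many rows fall in each bin, while a density is scaled so that the total area is 1. A bar's density height is its count divided by (number of rows × bin width). Here is a minimal sketch of that arithmetic in plain Kotlin, with made-up values and a made-up bin width (neither comes from my real data):

```kotlin
import kotlin.math.floor

// Hypothetical sample: 10 differences, binned with a bin width of 10.
val values = listOf(-12.0, -7.0, -3.0, -1.0, 0.0, 2.0, 4.0, 6.0, 9.0, 15.0)
val binWidth = 10.0

// Count how many values fall into each bin [k * binWidth, (k + 1) * binWidth).
val counts: Map<Int, Int> = values
    .groupingBy { floor(it / binWidth).toInt() }
    .eachCount()
    .toSortedMap()

// Density height of a bar = count / (n * binWidth), so the bar areas sum to 1.
val densities: Map<Int, Double> =
    counts.mapValues { (_, count) -> count / (values.size * binWidth) }

fun main() {
    for ((bin, count) in counts) {
        println("bin starting at ${bin * binWidth}: count=$count, density=${densities[bin]}")
    }
}
```

With more rows or wider bins the divisor grows, which is why counts in the tens can sit next to densities in the hundredths.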

Is there any way to add several layers to the same plot so that each one uses its own scale? I've read some blog posts that claim you should not go for it (this seems to be pretty much accepted by the community).

My target is to achieve something similar to what you can do with Seaborn by doing

plt.figure(figsize=(10,4),dpi=200)
sns.histplot(data=df,x='some_column',kde=True,bins=25)

graph with seaborn

(yes I know I took the lets plot screenshot without the bins configured. Not relevant, I'd say ¯_(ツ)_/¯ )

Maybe I'm just approaching the problem with a mindset I should not? As mentioned, I'm still learning so every alternative will be highly welcomed 😃

Just, please, don't go with the "Switch to Python". I'm exploring and I'd prefer to go one topic at a time


Solution

  • In order for histogram and density layers to share the same y-scale you need to map variable "..density.." to aesthetic "y" in the histogram layer (by default histogram maps "..count.." to "y").

    You will find an example of it in cell [4] in this notebook: https://nbviewer.org/github/JetBrains/lets-plot-kotlin/blob/master/docs/examples/jupyter-notebooks/distributions.ipynb

    BTW, many of the pages in the Lets-Plot Kotlin API Reference include links to demo notebooks in their "Examples" section: geomHistogram().

    And of course you can find a lot of info online on the R ggplot2 package which is largely applicable to Lets-Plot as well. For example: Histogram with kernel density estimation.

    Finally :) , calling show() is not necessary - the Jupyter Kotlin kernel will render the plot automatically if the plot expression is the last one in the cell, which is often the case.
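Applied to the plot from the question, the mapping described above could look roughly like the notebook cell below. This is a sketch, not a copy of the linked notebook: the random data is a hypothetical stand-in for the question's df, the alpha value is arbitrary, and the `%use lets-plot` line assumes the library is loaded through the Kotlin kernel's magic command.

```kotlin
%use lets-plot

// Hypothetical data standing in for the question's df.
val rand = java.util.Random(42)
val df = mapOf("some-column" to List(200) { rand.nextGaussian() * 20 })

// Map "..density.." to y so the histogram bars use the same units as the density curve.
letsPlot(df) +
    geomHistogram(alpha = 0.5) { x = "some-column"; y = "..density.." } +
    geomDensity(color = "red") { x = "some-column" }
```

Because the plot expression is the last one in the cell, no show() call is needed here.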