python plot altair

altair mark_area() with unexpected behaviour of y2


I'm trying to make an area chart of a CDF (which should always be increasing), but instead I get the result in the image below.

import altair as alt
import polars as pl

_data = pl.DataFrame({"outcomes": [16950, 17050, 18750, 18750, 20950]})
(
    alt.Chart(
        _data,
    )
    .transform_quantile("outcomes", step=0.1)
    .mark_area(line=True, opacity=0.5)
    .encode(
        alt.X("value:Q"),
        alt.Y("prob:Q").title("Prob"),
        # alt.Y2(alt.datum(0))
        # y2=alt.datum(0),
    )
)

[Image: error1]

If I uncomment the alt.Y2(...) line I still don't get what I'm looking for (namely, there's no area being plotted, although it's at least the actual CDF).

[Image: error2]

When I use the last formulation y2=alt.datum(0) it does work.

Questions

  1. I expected Y2 to be 0 by default (I recall having read it somewhere but I cannot find it now; maybe I made it up?). It does seem to be so most of the time. When is it not? Why?
  2. I expected the alt.Y2(...) API to behave exactly the same as y2=... but it does not seem to be the case (I've had problems with this in other plots involving different channels too). When are they different? Why? Or am I doing something wrong?

Regards.

Solution

  • Because there are so few outcomes relative to the step size, the quantile transform returns the same value for several probabilities. You can see this in the online editor (Open the Chart in the Vega Editor), where you can view the intermediate data tables:

    [Image: online editor table]

    Note how 0.55, 0.65, and 0.75 have the same value.

    When this happens, the area chart tries to stack the points that share the same value, which causes the weird behavior. This will probably not occur with real data, but if it does, the stacking can be disabled as shown below. You could also increase the step size.

    import altair as alt
    import polars as pl
    
    _data = pl.DataFrame({"outcomes": [16950, 17050, 18750, 18750, 20950]})
    (
        alt.Chart(
            _data,
        )
        .transform_quantile("outcomes", step=0.1)
        .mark_area(line=True, opacity=0.5)
        .encode(
            alt.X("value:Q"),
            alt.Y("prob:Q", stack=None).title("Prob")
        )
    )
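If you want to check for the duplicated quantile values without opening the Vega editor, here is a minimal pure-Python sketch. It rests on two assumptions that I have not confirmed against the Vega source. The first is that the quantile transform samples probabilities from half the step size up to (but excluding) 1. The second is that it interpolates linearly between sorted values. The helper `quantile_linear` is a hypothetical name used only for illustration and is not part of Altair or Vega.

```python
from collections import defaultdict


def quantile_linear(sorted_values, p):
    """Quantile by linear interpolation between order statistics.

    Assumption: this mirrors what the quantile transform computes.
    """
    h = (len(sorted_values) - 1) * p      # fractional index into the sorted data
    lo = int(h)                           # floor, since h is non-negative
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (h - lo)


outcomes = sorted([16950, 17050, 18750, 18750, 20950])
step = 0.1

# Assumption: probabilities run from step/2 up to (but excluding) 1.
probs = [round(step / 2 + i * step, 2) for i in range(round(1 / step))]

# Group the sampled probabilities by their (rounded) quantile value.
probs_by_value = defaultdict(list)
for p in probs:
    probs_by_value[round(quantile_linear(outcomes, p), 2)].append(p)

# Values shared by more than one probability are the ones that get stacked.
duplicates = {v: ps for v, ps in probs_by_value.items() if len(ps) > 1}
print(duplicates)  # {18750.0: [0.55, 0.65, 0.75]}
```

Under these assumptions the three probabilities that share the value 18750 match the rows in the editor table. If `duplicates` comes back empty for your real data, the default stacking should have nothing to pile up and `stack=None` is likely unnecessary.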