
How to prevent ggplot2 from prioritizing ordering alphabetically?


I can't get this plot to show the points ordered by the Weight column. It orders the Breed column alphabetically instead, even though I made Weight a categorical column and set ordered to True. When I print the dataframe before plotting, it is sorted by the Weight value.

sorted_weight = traits_n_weights.sort_values(by=["Weight"])

top_weight = sorted_weight.head(70)
bins = sorted(top_weight["Weight"].unique())

top_weight = top_weight.assign(
    Weight=pd.Categorical(top_weight["Weight"], categories=bins, ordered=True),
) 

plot = ggplot(top_weight, aes(x="factor(Weight)", y="Breed")) + geom_point() + labs(y="")

plot

The Weight column contains float values and the Breed column contains string values. I have already tried removing the factor part of x="factor(Weight)". Can someone help me?

These are the imports used by the code:

import numpy as np
import pandas as pd
from plotnine import *

Solution

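  • The y-axis comes out alphabetical because Breed is a plain string column: on a discrete axis, plotnine sorts string values alphabetically, and the row order of the dataframe does not affect it. Instead of using factor, plotnine provides another internal function, reorder, that orders the first variable by the values of the second.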

    In your case, that would be

    ggplot(top_weight, aes(x="Weight", y="reorder(Breed, Weight)"))
    

    See the notes section of the aes() documentation for more.
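An alternative that does not depend on reorder is to fix the category order of Breed in pandas before plotting, since a discrete axis should follow the category order of a Categorical column. The sketch below uses a small made-up dataframe as a stand-in for top_weight (the breed names and weights are illustrative only) and covers just the pandas step:

```python
import pandas as pd

# Hypothetical stand-in for top_weight; these breeds and weights are made up.
df = pd.DataFrame({
    "Breed": ["Beagle", "Akita", "Corgi", "Dachshund"],
    "Weight": [10.0, 45.0, 12.5, 9.0],
})

# Breed names in ascending order of weight, without duplicates.
breed_order = df.sort_values("Weight")["Breed"].drop_duplicates().tolist()

# Make Breed an ordered categorical whose categories follow the weight order.
df["Breed"] = pd.Categorical(df["Breed"], categories=breed_order, ordered=True)

print(list(df["Breed"].cat.categories))
```

Passing this dataframe to ggplot(df, aes(x="Weight", y="Breed")) + geom_point() should then lay out the breeds by weight rather than alphabetically. If a breed appears with more than one weight, its position is determined by its lowest weight, which may or may not be what you want.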