r, ggplot2, ggalluvial

Aesthetics must be either length 1 or the same as the data: fill, y and axis1


I'm getting this error: Aesthetics must be either length 1 or the same as the data (5): fill, y and axis1. Any suggestions would be appreciated. Thank you!

Here's the background:

dim(fly)
Rows: 21,000
Columns: 4

head(fly,5)

Date        Airport    Count  Type
2022-01-02  Brussels     256  Arrival
2022-01-24  Charleroi     84  Departure
2022-02-03  Berlin       148  Departure
2022-03-18  Dresden       95  Arrival
2022-03-19  Erfurt        29  Departure
2022-04-01  Frankfurt    391  Departure

structure(list(Date = structure(c(1640995200, 1640995200, 
1640995200, 1640995200, 1640995200, 1640995200, 1640995200, 1640995200, 
1640995200, 1640995200), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), Airport = c("Brussels", "Charleroi", "Berlin - Brandenburg", 
"Dresden", "Erfurt", "Frankfurt", "Muenster-Osnabrueck", "Hamburg", 
"Cologne-Bonn", "Dusseldorf"), Count = c(148, 54, 148, 
5, 0, 391, 6, 78, 60, 103), Type = c("Departure", "Departure", 
"Arrival", "Departure", "Arrival", "Arrival", "Departure", 
"Arrival", "Departure", "Arrival")), row.names = c(NA, -10L
), class = c("tbl_df", "tbl", "data.frame"))

The requirement is to compare arrival flight counts for only three airports (Brussels, London and Dresden) using an alluvial plot.

The code below works, but it produces the total count over all 5 months instead of a total for each month and airport.

df_fly <- filter(fly, Airport %in% c("Brussels", "Dresden", "London"), Type =="Arrival") %>% 
  group_by(Airport) %>% 
  summarise(Count = sum(Count))
df_fly<- as.data.frame(df_fly) 
ggplot(df_fly,
aes(y = Count, axis1 = Airport, axis2 =Count)) + geom_alluvium(aes(fill = Airport), width = 1/8) +
geom_stratum(width = 1/8, fill = "black", color = "grey") + geom_label(stat = "stratum", aes(label = after_stat(stratum))) + scale_x_discrete(limits = c("Airport", "Count"),
expand = c(.05, .05)) +
scale_fill_brewer(type = "qual", palette = "Set2") +
ggtitle("Arrival Flight Comparison")

I then tried the following to get monthly totals per airport, but it produced an error:

Aesthetics must be either length 1 or the same as the data (5): fill, y and axis1

df_fly <- filter(fly, Airport %in% c("Brussels", "Dresden", "London"), Type =="Arrival") %>%
  group_by(month = lubridate::floor_date(Date, 'month')) %>%
    summarize(Count = sum(Count))

df_fly<- as.data.frame(df_fly) 
ggplot(df_fly,
aes(y = Count, axis1 = Airport, axis2 =Count)) + geom_alluvium(aes(fill = Airport), width = 1/8) +
geom_stratum(width = 1/8, fill = "black", color = "grey") + geom_label(stat = "stratum", aes(label = after_stat(stratum))) + scale_x_discrete(limits = c("Airport", "Count"),
expand = c(.05, .05)) +
scale_fill_brewer(type = "qual", palette = "Set2") +
ggtitle("Arrival Flight Comparison")

Solution

  • So it's gonna be a bit hard to debug without a reprex. My first suggestion would be to try formatting your code so it's a bit easier to read; vertical space is free! I had a go at rewriting it the way I would below.

    I think the main issue is the group_by() call. You want to group by each combination of airport and month, so it should be group_by(Airport, month). summarise() keeps only the grouping columns and the summaries you compute, so without Airport in the grouping the Airport column is gone afterwards, and ggplot then fails because it can't find Airport in the data. Let me know if the code below works:

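    library(dplyr)
    library(ggplot2)
    library(ggalluvial)

    df_fly = fly %>%
        filter(
            Airport %in% c("Brussels", "Dresden", "London"),
            Type == "Arrival"
            ) %>%
        # group by airport AND month so both columns survive summarize()
        group_by(
            Airport,
            month = lubridate::floor_date(Date, 'month')
            ) %>%
        summarize(Count = sum(Count), .groups = "drop")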
    
    df_fly %>%
        ggplot(aes(y = Count, axis1 = Airport, axis2 = Count)) +
        geom_alluvium(
            aes(fill = Airport),
            width = 1/8
            ) +
        geom_stratum(
            width = 1/8,
            fill = "black",
            color = "grey"
            ) +
        geom_label(
            stat = "stratum",
            aes(label = after_stat(stratum))
            ) +
        scale_x_discrete(
            limits = c("Airport", "Count"),
            expand = c(.05, .05)
            ) +
        scale_fill_brewer(
            type = "qual",
            palette = "Set2"
            ) +
        ggtitle("Arrival Flight Comparison")
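
    If it helps to see the grouping difference in isolation, here is a minimal sketch on made-up data. The toy tibble and its values are hypothetical (they are not taken from fly); only the column names match the question. It just compares which columns are left after summarise() with and without Airport in group_by():

    ```r
    library(dplyr)

    # Hypothetical toy data with the same column names as `fly` (values are made up)
    toy <- tibble::tibble(
      Date    = as.POSIXct(c("2022-01-02", "2022-01-20", "2022-02-03", "2022-02-10"), tz = "UTC"),
      Airport = c("Brussels", "Dresden", "Brussels", "Dresden"),
      Count   = c(10, 20, 30, 40),
      Type    = "Arrival"
    )

    # Grouping by month only: Airport is dropped by summarise()
    by_month <- toy %>%
      group_by(month = lubridate::floor_date(Date, "month")) %>%
      summarise(Count = sum(Count))

    # Grouping by Airport and month: one row per airport/month combination
    by_airport_month <- toy %>%
      group_by(Airport, month = lubridate::floor_date(Date, "month")) %>%
      summarise(Count = sum(Count), .groups = "drop")

    names(by_month)          # "month" "Count"
    names(by_airport_month)  # "Airport" "month" "Count"
    ```

    The first result has no Airport column left to map to axis1 or fill, while the second keeps one row per airport and month.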