Tags: r, dataframe, plot, dplyr

Is there a way to rearrange columns uniformly between multiple tables and plot?


I have three data frames describing the frequencies of certain tags by year. They all share the same column headings, except that one is missing one or more columns (tags with zero frequency) and one row (a year with no recorded frequencies).

df1 <- data.frame(Year = c("2000", "2001", "2002", "2003", "2004"),
             Country = c(1, 4, 5, 2, 26), 
             Flag = c(23, 2, 4, 2, 5),
             Anthem = c(3, 7, 8, 2, 3)
             )

df2 <- data.frame(Year = c("2000", "2001", "2002", "2003", "2004"),
             Country = c(1, 4, 5, 2, 26), 
             Anthem = c(23, 2, 4, 2, 5),
             Flag = c(3, 7, 8, 2, 3)
             )

df3missing <- data.frame(Year = c("2001", "2002", "2003", "2004"),
             Anthem = c(4, 5, 2, 26), 
             Country = c(2, 4, 2, 5)
             )

If I plot them as they are, a given color does not represent the same column in every plot, because the column order differs between the tables. One table also has a different number of columns and rows.

library(reshape2)
library(ggplot2)

df11 <- melt(df1, id.vars="Year")
df22 <- melt(df2, id.vars="Year")
df3missing2 <- melt(df3missing, id.vars="Year")


ggplot(df11, aes(x = Year, y = value, fill = variable)) +
geom_bar(position = "fill", stat = "identity")

ggplot(df22, aes(x = Year, y = value, fill = variable)) +
geom_bar(position = "fill", stat = "identity")

ggplot(df3missing2, aes(x = Year, y = value, fill = variable)) +
geom_bar(position = "fill", stat = "identity")
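
The mismatch can be checked directly by looking at the factor levels that melt produces. This is a minimal sketch with two-row stand-ins for df1 and df2 (the shortened data and the names lv1 and lv2 are assumptions made to keep the example small). ggplot2 assigns default fill colors in level order, so differing levels mean differing colors.

```r
library(reshape2)

# Two-row stand-ins for df1 and df2 (shortened for illustration)
df1 <- data.frame(Year = c("2000", "2001"),
                  Country = c(1, 4), Flag = c(23, 2), Anthem = c(3, 7))
df2 <- data.frame(Year = c("2000", "2001"),
                  Country = c(1, 4), Anthem = c(23, 2), Flag = c(3, 7))

# melt() stores the former column names in a factor whose levels
# follow the column order of the input frame
lv1 <- levels(melt(df1, id.vars = "Year")$variable)
lv2 <- levels(melt(df2, id.vars = "Year")$variable)

print(lv1)  # "Country" "Flag"   "Anthem"
print(lv2)  # "Country" "Anthem" "Flag"
```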

Is there a way to reorganize the tables so that all the columns are in the same order so that the same color across 3 plots corresponds to the same column for all tables? In my data, there are 10+ columns.


Solution

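  • Maybe this gets you closer to your desired output.

    First add the missing Flag column to the incomplete frame, filled with NA, so that all three frames have the same columns. Then take the column order from df1 and apply it to every frame before calling melt. melt builds the variable factor in column order, so the levels then match across the frames.

    Lastly, set the colors manually with scale_fill_manual, using a named vector so that each variable gets the same color in every plot.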

    library(reshape2)
    library(ggplot2)

    # Copy the incomplete frame and add its missing column as NA
    df3 <- df3missing
    df3$Flag <- NA_real_

    # Use the column order of df1 for every frame
    cc <- colnames(df1)

    dff1 <- melt(df1[, cc], id.vars = "Year")
    dff2 <- melt(df2[, cc], id.vars = "Year")
    dff3 <- melt(df3[, cc], id.vars = "Year")

    # One named color vector, reused so each variable keeps its color
    cols <- c("Country" = "red", "Flag" = "orange", "Anthem" = "blue")

    ggplot(dff1) +
      geom_bar(aes(Year, value, fill = variable), stat = "identity") +
      scale_fill_manual("legend", values = cols)
    ggplot(dff2) +
      geom_bar(aes(Year, value, fill = variable), stat = "identity") +
      scale_fill_manual("legend", values = cols)
    ggplot(dff3) +
      geom_bar(aes(Year, value, fill = variable), stat = "identity") +
      scale_fill_manual("legend", values = cols)
    

    [Figure: the three resulting ggplot2 plots]
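
With 10+ columns, adding each missing column and typing each color by hand gets tedious. The same idea can be sketched more generally: collect the frames in a list, take the union of their tag columns and years, and derive one named color vector from it. The helper to_long, the names frames, tags, years and tag_colors, and the shortened data are all hypothetical. Filling absent columns with 0 is an assumption based on the question's statement that a missing column means zero frequency; use NA_real_ instead to mirror the answer above.

```r
library(reshape2)
library(ggplot2)

# Shortened stand-ins for two of the frames in the question
df1 <- data.frame(Year = c("2000", "2001", "2002"),
                  Country = c(1, 4, 5), Flag = c(23, 2, 4), Anthem = c(3, 7, 8))
df3missing <- data.frame(Year = c("2001", "2002"),
                         Anthem = c(4, 5), Country = c(2, 4))

frames <- list(df1 = df1, df3missing = df3missing)

# Union of all tag columns and of all years across the frames
tags  <- setdiff(unique(unlist(lapply(frames, colnames))), "Year")
years <- sort(unique(unlist(lapply(frames, function(d) as.character(d$Year)))))

# Hypothetical helper: pad absent columns, fix the column order, melt,
# and give both factors the same levels in every frame
to_long <- function(d) {
  for (tag in setdiff(tags, colnames(d))) d[[tag]] <- 0  # assumed zero frequency
  long <- melt(d[, c("Year", tags)], id.vars = "Year")
  long$variable <- factor(long$variable, levels = tags)
  long$Year <- factor(long$Year, levels = years)
  long
}

# One named color per tag, shared by all plots
tag_colors <- setNames(grDevices::hcl.colors(length(tags), "Dark 3"), tags)

plots <- lapply(lapply(frames, to_long), function(long) {
  ggplot(long, aes(x = Year, y = value, fill = variable)) +
    geom_col(position = "fill") +
    scale_x_discrete(drop = FALSE) +   # keep years that a frame lacks
    scale_fill_manual("legend", values = tag_colors, drop = FALSE)
})
```

Printing plots$df1 or plots$df3missing draws the corresponding chart. Because the fill colors are matched by name rather than by position, the original column order of each frame no longer matters.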