r ggplot2 histogram distribution data-representation

How to plot the distribution of two variables (meter & height) for two different scenarios?


I have two data frames, each representing one scenario. Each data frame has three variables: height, Suitable_Area and ScenarioTyp (scenario1 or scenario2).

What I want:

  1. I want to show how much Suitable_Area I have at each height for the two scenarios.
  2. I want a label INSIDE the plot at each height (e.g. on top of a histogram bar) that shows the exact value of Suitable_Area.
  3. The two scenarios should be visible in the same plot (e.g. overlapping).

Scenario1:

structure(list(height = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
Suitable_Area = c(20462L, 
        21952L, 23069L, 20184L, 18836L, 18141L, 17988L, 17732L, 17227L, 
        17184L), ScenarioTyp = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
        1L, 1L, 1L), .Label = "scenario1", class = "factor")), class = "data.frame", row.names = c("1","2", "3", "4", "5", "6", "7", "8",
"9", "10"))

Scenario2:

structure(list(height = c(1, 2, 3, 4, 5, 6, 7, 8, 9), Suitable_Area =
c(20462L, 
     20462L, 20457L, 16826L, 14847L, 13505L, 12726L, 11821L, 10853L
     ), ScenarioTyp = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
     1L), .Label = "scenario2", class = "factor")), class = "data.frame", row.names = c("1", 
     "2", "3", "4", "5", "6", "7", "8", "9"))

I have already tried a couple of plots (with ggplot2) that show the Suitable_Area but do not give the clear and precise information I want (see points 1 and 2 above):

ggplot(Scenario1,aes(x=height,y=Suitable_Area))+
 geom_area(alpha=0.8,fill="lightblue")+
 geom_area(data= Scenario2,aes(y=Suitable_Area),fill="lightgrey",alpha=0.9)

[Figure: overplotted geom_area]

and,

ggplot(Scenario2 , aes(x= height,y= Suitable_Area,colour="Scenario2"))+ 
  geom_line()+ 
  geom_line(data = Scenario1, aes(x= height,y= Suitable_Area,colour="Scenario1"))

[Figure: lines]

I imagine that two half-transparent, overlapping histograms (with bin width 1, corresponding to 1 meter of height) could do the job, but I couldn't manage to put Suitable_Area on the y-axis, which is usually either density or count.

I also considered merging the two data frames, but it didn't really help.

Is this possible at all? Or is it too much for one ggplot2 chart?

Thank you in advance.


Solution

  • I am unsure if this is what you are after, but would a grouped bar plot work?

    library(tidyverse)
    s1 = structure(list(height = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                        Suitable_Area = c(20462L, 21952L, 23069L, 20184L, 18836L,
                                          18141L, 17988L, 17732L, 17227L, 17184L),
                        ScenarioTyp = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
                                                .Label = "scenario1", class = "factor")),
                   class = "data.frame",
                   row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"))
    s2 = structure(list(height = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
                        Suitable_Area = c(20462L, 20462L, 20457L, 16826L, 14847L,
                                          13505L, 12726L, 11821L, 10853L),
                        ScenarioTyp = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
                                                .Label = "scenario2", class = "factor")),
                   class = "data.frame",
                   row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9"))
    
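    # Stack both scenarios into one data frame; ScenarioTyp tells them apart
    s3 =  bind_rows(list(s1,s2)) %>%
      as_tibble()
    
    s3 %>%
      ggplot(aes(x = height, y = Suitable_Area, fill = ScenarioTyp)) +
        # stat = "identity" uses Suitable_Area as the bar height instead of a count
        geom_bar(position = position_dodge(0.8), stat = "identity", width = 0.8) +
        # Print each value just above its bar, rotated so neighbours don't collide
        geom_text(aes(y = Suitable_Area + 1200, label = Suitable_Area), position = position_dodge(0.8), size = 4, angle = 90) +
        scale_x_continuous(breaks = 1:10) +
        theme_classic()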
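
    The question also asks about two half-transparent, overlapping "histograms" with Suitable_Area on the y-axis. Since the areas are already computed per height, that is really a bar chart of precomputed values rather than a histogram. A minimal sketch, assuming the same data as s1 and s2 (rebuilt compactly here so the snippet runs on its own; the name p_overlap is only illustrative), could look like this:

    ```r
    library(ggplot2)

    # Same values as s1 and s2, rebuilt compactly (assumption: plain
    # data.frame() with a character ScenarioTyp is good enough for plotting)
    s1 <- data.frame(
      height = 1:10,
      Suitable_Area = c(20462L, 21952L, 23069L, 20184L, 18836L,
                        18141L, 17988L, 17732L, 17227L, 17184L),
      ScenarioTyp = "scenario1"
    )
    s2 <- data.frame(
      height = 1:9,
      Suitable_Area = c(20462L, 20462L, 20457L, 16826L, 14847L,
                        13505L, 12726L, 11821L, 10853L),
      ScenarioTyp = "scenario2"
    )
    s3 <- rbind(s1, s2)

    p_overlap <- ggplot(s3, aes(x = height, y = Suitable_Area, fill = ScenarioTyp)) +
      # position = "identity" draws both scenarios at the same x instead of
      # dodging or stacking them; width = 1 matches the 1 m height step
      geom_col(position = "identity", alpha = 0.5, width = 1) +
      # Value label just above the top of each bar, coloured by scenario
      geom_text(aes(label = Suitable_Area, colour = ScenarioTyp),
                vjust = -0.4, size = 3, show.legend = FALSE) +
      scale_x_continuous(breaks = 1:10) +
      theme_classic()

    p_overlap
    ```

    With this layout the scenario2 labels sit inside the taller scenario1 bars, and where both scenarios have the same value (height 1) the two labels are drawn on top of each other, so the dodged version above may still be easier to read.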
    


    If you want areas with labels, something like this might work:

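    s3 %>%
      ggplot(aes(x = height, y = Suitable_Area, fill = ScenarioTyp)) +
        # position = "identity" overlays the two areas instead of stacking them
        geom_area(alpha = 0.5, position = "identity") +
        # White labels with coloured text mark the exact value at each height
        geom_label(aes(y = Suitable_Area, label = Suitable_Area, color = ScenarioTyp), size = 3, fill = "white") +
        scale_x_continuous(breaks = 1:10) +
        theme_classic()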
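
    In the area version the labels of the two scenarios can collide where the values are close (at height 1 both are 20462). One possible workaround, sketched here under the assumption that a fixed vertical offset is acceptable, is to place the labels on a separate y column. The names label_offset and label_y are made up for this example, and the value 700 is an arbitrary choice that would need tuning to the plot size:

    ```r
    library(ggplot2)

    # Same values as s1 and s2, rebuilt compactly so the snippet is self-contained
    s3 <- rbind(
      data.frame(height = 1:10,
                 Suitable_Area = c(20462L, 21952L, 23069L, 20184L, 18836L,
                                   18141L, 17988L, 17732L, 17227L, 17184L),
                 ScenarioTyp = "scenario1"),
      data.frame(height = 1:9,
                 Suitable_Area = c(20462L, 20462L, 20457L, 16826L, 14847L,
                                   13505L, 12726L, 11821L, 10853L),
                 ScenarioTyp = "scenario2")
    )

    # Hypothetical offset in area units: scenario1 labels move up,
    # scenario2 labels move down, so equal values no longer overlap
    label_offset <- 700
    s3$label_y <- ifelse(s3$ScenarioTyp == "scenario1",
                         s3$Suitable_Area + label_offset,
                         s3$Suitable_Area - label_offset)

    p_area <- ggplot(s3, aes(x = height, y = Suitable_Area, fill = ScenarioTyp)) +
      geom_area(alpha = 0.5, position = "identity") +
      # Labels are drawn at the shifted position but still show the true value
      geom_label(aes(y = label_y, label = Suitable_Area, colour = ScenarioTyp),
                 size = 3, fill = "white", show.legend = FALSE) +
      scale_x_continuous(breaks = 1:10) +
      theme_classic()

    p_area
    ```

    The labels then no longer sit exactly on the curve, which is the trade-off for keeping them readable.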
    
