Tags: r, ggplot2, tidyverse, mean, gplots

How to set custom n.label values (the number of observations) with paste0() when using the plotmeans() function from the gplots package in R


Problem:

I have a data frame called FID, and I want to label each mean with its number of observations and its upper and lower confidence limits. Because the data frame holds one row per month for each of three years (see below), plotmeans() labels every month as n=3 (see figure 1 below).

I created two vectors called Observations and month_levels, and I was hoping to use paste0() to insert the true n.label values into the plot (see R code). When I tried this, the line joining the mean values disappeared and so did the n.labels themselves (see figure 2 below). In addition, two of the x-axis month labels (January-December) disappeared (see figure 3).

If anyone can help me place the correct n.label values (see the true values below) on this plot, I would be deeply appreciative.

Many thanks in advance.

Key:

n.label = a logical value indicating whether text giving the number of observations in each group should be added to the plot.

 ##Three instances of each month over 3 years
 Year     Month       FID                             Month       FID  
 2018    January       86                             January     208
 2019    January       66                             February    176
 2020    January       56
 2018    February      76
 2019    February      55
 2020    February      45

 January (n=3)
 February (n=3) etc...............

The correct number of observations per month over three years (viewed below):

##the correct n.labels are these observations

       Month Observations 
1    January       113  
2   February        94  
3      March       111  
4      April       111  
5        May        33  
6       June         9   
7       July        14   
8     August        89  
9  September        86  
10   October        83  
11  November        81  
12  December       101  

R-code:

      library(gplots)
      library(tidyverse)

    ##Produce a vector showing the true n.label value of observations
        
     Observations<-c(113, 94, 111, 111, 33, 9, 14, 89, 86, 83, 81, 101)

    ##Create a vector to ensure the dates are in the right order

     month_levels = c('January', 'February', 'March', 'April', 'May', 'June', 
                 'July', 'August', 'September', 'October', 'November', 'December')
    
    ##Plot means, ci_labels, and n.labels for the column 'FID'

    ##Open plotting window
     dev.new()
    
    ##Plot the mean per month for FID but with incorrect n.label values
    ##Code for figure 1

           plotmeans(FID~Month, 
                    data=FID,
                    ci.label = TRUE,
                    mean.labels = TRUE,
                    n.label = TRUE,
                    digits = 2,
                    pch=0.3, 
                    col="red",
                    ccol="black",
                    barcol="blue",
                    ylab="Mean Blue Whale Sightings",
                    xlab="Months")

          ##Open plotting window
            dev.new()

          ##Code for figure 2

          plotmeans(FID~Month, 
                    data=FID,
                    ci.label = TRUE,
                    mean.labels = TRUE,
                    n.label = paste0("month_levels", levels=Observations),
                    digits = 2,
                    pch=0.3, 
                    col="red",
                    ccol="black",
                    barcol="blue",
                    ylab="FID",
                    xlab="Months")

       ##Plot means for the 'Final_New_Blue'
       ##Open plotting window
         dev.new(width=10, height=10, unit="in")

       ## Margins area
         par(oma=c(3,3,3,3)) # all sides have 3 lines of space

          Obs <-c(111, 33, 9, 14, 89, 86, 83, 81, 101, 113, 94, 111)
           
               plotmeans(FID~Month, 
                         data=FID,
                         ci.label = TRUE,
                         mean.labels = TRUE,
                         n.label = FALSE,
                         digits = 2,
                         pch=0.3, 
                         col="red",
                         ccol="black",
                         barcol="blue",
                         ylab="FID",
                         xlab="Months")
           
                         axis(1, at=1:12, labels = paste("n =", Obs), pos = -35, col = NA)
           
           

Figure 1


Figure 2


Figure 3


Data frame: FID

structure(list(Year = c(2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 
2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2016L, 2016L, 2016L, 
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 
2017L, 2017L, 2017L), Month = structure(c(5L, 4L, 8L, 1L, 9L, 
7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 4L, 8L, 1L, 9L, 7L, 6L, 2L, 
12L, 11L, 10L, 3L, 5L, 4L, 8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 
10L, 3L), .Label = c("April", "August", "December", "February", 
"January", "July", "June", "March", "May", "November", "October", 
"September"), class = "factor"), FID = c(65L, 88L, 43L, 54L, 
98L, 0L, 0L, 23L, 10L, 15L, 6L, 33L, 56L, 29L, 98L, 23L, 6L, 
10L, 7L, 65L, 53L, 41L, 25L, 30L, 44L, 65L, 38L, 27L, 20L, 0L, 
8L, 45L, 34L, 26L, 44L, 39L)), class = "data.frame", row.names = c(NA, 
-36L))

Solution

  • First, some bad news: plotmeans() might not be the best function for what you're doing. The problem is that n.label is just a TRUE/FALSE value that determines whether the function counts the observations in each group and adds those counts to the plot. plotmeans() doesn't let you supply your own values unless you edit the function's code, which takes time.
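
    To see where n=3 comes from, here is a minimal base R sketch. It assumes that the automatic label is simply the number of rows in each group, which is consistent with the n=3 labels in figure 1. The data frame toy is made up for illustration; it copies the January and February rows of FID.

    ```r
    # Toy data (illustrative): the January and February rows of FID,
    # one row per month per year
    toy <- data.frame(
      Year  = rep(2015:2017, each = 2),
      Month = factor(rep(c("January", "February"), times = 3),
                     levels = c("January", "February")),
      FID   = c(65, 88, 56, 29, 44, 65)
    )

    # Rows per month: this is all an automatic count can see,
    # regardless of how many sightings each row summarises
    rows_per_month <- table(toy$Month)
    print(rows_per_month)
    ```

    Each month appears once per year, so the count is 3 for every month no matter what the FID column contains.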

    Now for the good news: it is possible to get around this limitation and fix your plot manually. First, put the months in calendar order and set n.label to FALSE:

    # Put the factor levels in calendar order (month.name is a built-in constant)
    FID$Month <- factor(FID$Month, levels = month.name)
    
    ##Code for figure 2
    dev.new()
    plotmeans(FID~Month, 
              data=FID,
              ci.label = TRUE,
              mean.labels = TRUE,
              n.label = FALSE,
              digits = 2,
              pch=0.3, 
              col="red",
              ccol="black",
              barcol="blue",
              ylab="FID",
              xlab="Months")
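
    The factor() call matters because R sorts factor levels alphabetically by default, which is why the levels in the FID data start at April. A small base R sketch of the difference (the vector m is made up for illustration):

    ```r
    # Default behaviour: levels are sorted alphabetically
    m <- factor(c("January", "February", "March", "April"))
    levels(m)

    # Passing levels = month.name forces calendar order
    # (all twelve month names become levels, used or not)
    m_cal <- factor(m, levels = month.name)
    levels(m_cal)
    ```

    plotmeans() draws the groups in level order, so the x-axis follows whatever order the levels have.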
    

    Now you can manually add the number of observations above your x-axis using the Base R function axis():

    # Observations in the same order as the factor levels (January to December)
    Obs <- c(113, 94, 111, 111, 33, 9, 14, 89, 86, 83, 81, 101)
    axis(1, at = 1:12, labels = paste("n =", Obs), pos = -70, col = NA)
    

    To move the labels up or down, change the value of pos, which is the y-coordinate (in plot units) at which the axis is drawn.
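
    Because axis() simply places the labels at positions 1 to 12, they are only correct if Obs is in the same order as the factor levels. One way to guard against a mismatch, sketched here, is to key the counts by month name and index them by the levels. The names obs_by_month, lvls and n_labels are introduced for this example only.

    ```r
    # True observation counts, keyed by month name (values from the question)
    obs_by_month <- c(January = 113, February = 94, March = 111, April = 111,
                      May = 33, June = 9, July = 14, August = 89,
                      September = 86, October = 83, November = 81, December = 101)

    # Stand-in for levels(FID$Month) after the factor() call above
    lvls <- month.name

    # Look each level up by name, so the order always follows the levels
    n_labels <- paste("n =", obs_by_month[lvls])
    print(n_labels)

    # Then, after plotmeans(...):
    # axis(1, at = seq_along(lvls), labels = n_labels, pos = -70, col = NA)
    ```

    With this approach the labels stay attached to the right month even if the level order changes later.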

    Result: Alice's plot