r ggplot2 timeline

Timeline in R with ggplot without determining initial and final values


I need help in creating a script for a graph. The information is below:

I have this fictitious data

my_table <- data.frame(ind = c("Ind1","Ind1","Ind1","Ind1","Ind1","Ind1","Ind2",
                               "Ind2","Ind2","Ind3","Ind3","Ind3","Ind3","Ind4",
                               "Ind4","Ind4","Ind5","Ind5","Ind5","Ind5","Ind5",
                               "Ind5"),

                       photo = c("55", "62", "63", "65", "70", "97", "100", "105",
                                 "109", "72", "74", "76", "101", "140", "150", "170",
                                 "168", "172", "182", "185", "189", "194"),

                       data = c("jan/17", "mar/17", "mar/17", "apr/17",
                                "jun/17", "oct/17", "dec/17", "apr/18",
                                "may/18", "aug/17", "sep/17", "sep/17",
                                "dec/17", "aug/18", "nov/18", "feb/19",
                                "jan/19", "feb/19", "mar/19", "mar/19",
                                "mar/19", "jul/19"))

and I would like to generate a graph like this, with the names of the individuals on one axis and the date of the meeting on the other. The size of each symbol should reflect the number of photos taken in that month, and the number of photos should appear above the symbol (like this).

Everything I find on the internet uses a data frame with two columns (a starting x and a final x), e.g. here. Do I really need to split the dates into two columns? And how do I handle the intermediate values?


Solution

  • You can encode your month/year data using as.yearmon from the zoo package.

    To count the number of photos in each month, use group_by and summarise.

    To draw the line segments, create a second data frame that holds the minimum and maximum date for each individual.

    library(zoo)
    library(ggplot2)
    library(dplyr)
    
    my_table$ind <- factor(my_table$ind)
    my_table$mo_yr <- as.yearmon(my_table$data, "%b/%y")
    
    my_table_sum <- my_table %>%
      group_by(mo_yr, ind) %>%
      summarise(count = n())
    
    my_table_range <- my_table_sum %>%
      group_by(ind) %>%
      summarise(min = min(mo_yr),
                max = max(mo_yr))
    
    ggplot(data = my_table_sum, aes(x = mo_yr, y = ind)) +
      scale_x_yearmon() +
      geom_point(aes(size = count)) +
      geom_text(aes(label = ifelse(count > 1, as.character(count), '')), vjust = -1) +
      scale_size_continuous(range = c(1, 3), breaks = c(1,2,3)) +
      geom_segment(data = my_table_range, aes(x = min, xend = max, y = ind, yend = ind)) +
      theme(axis.title.x=element_blank(), axis.title.y=element_blank(), legend.position="none")
    

    timeline plot

    Edit: For greater flexibility in x-axis ticks and labels, you might want to use scale_x_date instead of scale_x_yearmon (the zoo package would then not be needed).

    scale_x_date lets you specify the breaks (here every 3 months) and the label format (here abbreviated month and 4-digit year, e.g., Mar 2019).

    Instead of converting your data to yearmon (month/year), we can just use Date, taking the 1st day of the month when converting.

    I also added a small margin around the plot.
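
    To see the Date conversion and the label format in isolation, here is a minimal sketch in base R. The vector name month_strings and its three values are made up for illustration, and it assumes an English or C locale, since %b matches month abbreviations in the current locale.

    ```r
    # "jan/17"-style strings carry no day, so prepend day 1 before parsing.
    # Assumption: %b is matched against English month abbreviations
    # (English or C locale).
    month_strings <- c("jan/17", "mar/17", "dec/17")
    dates <- as.Date(paste0("1/", month_strings), format = "%d/%b/%y")
    print(dates)

    # The same format string that is passed to date_labels in scale_x_date
    labels <- format(dates, "%b %Y")
    print(labels)
    ```

    If the conversion returns NA, the month names probably do not match the session locale.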

    #library(zoo)
    library(ggplot2)
    library(dplyr)
    
    my_table$ind <- factor(my_table$ind)
    #my_table$mo_yr <- as.yearmon(my_table$data, "%b/%y")
    my_table$dates <- as.Date(paste0("1/", my_table$data), format = "%d/%b/%y")
    
    my_table_sum <- my_table %>%
      group_by(dates, ind) %>%
      summarise(count = n())
    
    my_table_range <- my_table_sum %>%
      group_by(ind) %>%
      summarise(min = min(dates),
                max = max(dates))
    
    ggplot(data = my_table_sum, aes(x = dates, y = ind)) +
      scale_x_date(date_breaks = "3 months", date_labels = "%b %Y") +
      geom_point(aes(size = count)) +
      geom_text(aes(label = ifelse(count > 1, as.character(count), '')), vjust = -1) +
      scale_size_continuous(range = c(1, 3), breaks = c(1,2,3)) +
      geom_segment(data = my_table_range, aes(x = min, xend = max, y = ind, yend = ind)) +
      theme(axis.title.x=element_blank(), axis.title.y=element_blank(), legend.position="none",
            plot.margin=unit(c(1,1,1,1),"cm"))
    

    plot with ticks and labels every 3 months
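
    On the question of whether separate start and end columns are really needed: one possible alternative, not part of the code above, is to let geom_line connect each individual's points by grouping on ind, so that no min/max table is required. The sketch below uses a small made-up sample (photos, photo_counts and p are illustrative names) and dplyr's count as a shortcut for group_by plus summarise.

    ```r
    library(ggplot2)
    library(dplyr)

    # Small made-up sample in the same shape as the question's data
    photos <- data.frame(
      ind  = c("Ind1", "Ind1", "Ind1", "Ind2", "Ind2"),
      data = c("jan/17", "mar/17", "mar/17", "dec/17", "apr/18")
    )

    # First day of each month as a Date (assumes English month abbreviations)
    photos$dates <- as.Date(paste0("1/", photos$data), format = "%d/%b/%y")

    # One row per individual and month, with the number of photos in "count"
    photo_counts <- photos %>%
      count(ind, dates, name = "count")

    # geom_line() with group = ind joins each individual's points in date order,
    # which gives a horizontal line from the first to the last date
    p <- ggplot(photo_counts, aes(x = dates, y = ind)) +
      geom_line(aes(group = ind)) +
      geom_point(aes(size = count)) +
      geom_text(aes(label = count), vjust = -1) +
      scale_x_date(date_labels = "%b %Y")

    print(p)
    ```

    The geom_segment approach with an explicit range table remains the better fit when the line should extend beyond the first or last observation; the grouping approach simply follows whatever points exist.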