Tags: r, ggplot2, label, position-dodge, geom-sf

Position Dodge Doesn't Dodge geom_sf_label


I'm trying to create a map of the USA displaying the amount of water in each state. I've got the basics of the ggplot together; the problem I'm running into is putting labels on the map. When I use geom_sf_label, the labels all clump on top of one another in the eastern part of the country:

[Image: map of the USA. Labels are fine on the west side, but on the eastern side of the country they are all clumped together.]

I tried solving this with the position= argument of geom_sf_label, specifically position_dodge and position_dodge2. Neither improved the map: regardless of what number I pass to the width argument, I end up with the same cluttered visual. Is there a way to get position dodge to prevent labels from stacking on top of one another? Here is code to reproduce the image.

library(tidyverse)
library(sf)
library(tigris)
library(viridis)

us_states <- states(cb = TRUE, resolution = "20m")

us_states_shifted <- shift_geometry(us_states)

us_states_shifted%>% 
  ggplot(aes(fill = AWATER)) +
  geom_sf()+
  scale_fill_viridis(option = 'cividis') +
  geom_sf_label(aes(label=NAME),size=2, position=position_dodge2(width=2), fill = "white")+
  #geom_sf_text(aes(label = abbvAndRate,size=2))+
  #geom_text(aes(label=abbvAndRate, x = longitude, y = latitude))+
  theme_void()+
  theme(legend.position="none")+
  ggtitle("Amount of Water In Each State")

Solution

  • The ggrepel package is often touted as a solution for dealing with overlapping labels. There is no doubt it is an extremely powerful tool for automating label placement. It is also quite unwieldy, which can make it hard to get an aesthetically balanced result. It can be like playing 'Whack-A-Mole': fixing one issue often creates new issues elsewhere.
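    For comparison, here is a minimal sketch of the ggrepel route on an sf object. It assumes the ggrepel package is installed; the four points and the CRS are made up for illustration and are not part of the original data. The stat = "sf_coordinates" argument is what lets geom_label_repel() read positions from the geometry column.

    ```r
    library(sf)
    library(ggplot2)
    library(ggrepel)

    # Made-up points packed closely together (illustrative only, not real states)
    pts <- st_as_sf(
      data.frame(NAME = c("Alpha", "Beta", "Gamma", "Delta"),
                 x = c(0, 1e4, 2e4, 1e4),
                 y = c(0, 1e4, 0, -1e4)),
      coords = c("x", "y"),
      crs = 5070  # a projected CRS in metres, chosen arbitrarily for the toy data
    )

    p_repel <- ggplot(pts) +
      geom_sf() +
      # stat = "sf_coordinates" extracts x/y from the geometry column so that
      # ggrepel can position the labels
      geom_label_repel(aes(label = NAME, geometry = geometry),
                       stat = "sf_coordinates",
                       size = 2,
                       min.segment.length = 0,  # always draw a leader line
                       seed = 1)                # make the placement reproducible

    p_repel
    ```

    On a real map the tuning usually happens through arguments such as force, box.padding and max.overlaps, which is where the trial and error described above tends to come in.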

    Unless you are working with hundreds of labels, I find it is almost always faster to assign label placements manually to get a satisfactory result. Here is a potential workflow:

    1. plot your map (as you have done) and identify issues
    2. address issues where labels should remain inside the map extent e.g. Mississippi and Alabama
    3. address labels that are best placed outside the map extent e.g. the smaller states on the Eastern Seaboard. Shift these labels by a uniform longitudinal offset and space them evenly by latitude e.g. yend = seq(min(lat)-2.5e5, max(lat), length.out = n()) in the code below

    This workflow is fully adaptable and you can subset any number of values to customise their placement. It can take some trial and error to get the offsets right, but IMHO, it is worth it.
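    To see what the seq() call in step 3 does in isolation, here is a small base R sketch. The names and latitude values are made up (not real centroids); only the offsets 2.2e6 and 2.5e5 are taken from the full example.

    ```r
    # Toy centroid latitudes in projected metres (made-up values)
    labels <- data.frame(
      NAME = c("A", "B", "C", "D"),
      lat  = c(1.00e5, 1.20e5, 1.25e5, 3.00e5)
    )

    # Sort south to north so label order matches the order of the leader lines
    labels <- labels[order(labels$lat), ]

    # Same x position for every label; y positions spread evenly between
    # (southernmost centroid - 2.5e5) and the northernmost centroid
    labels$xend <- 2.2e6
    labels$yend <- seq(min(labels$lat) - 2.5e5, max(labels$lat),
                       length.out = nrow(labels))

    # Gaps between consecutive labels are identical even though the
    # centroids themselves are bunched together
    diff(labels$yend)
    ```

    Extending the lower bound below the southernmost centroid is what creates enough vertical room for the labels; the full example that follows applies the same call to the real centroids.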

    library(sf)
    library(tigris)
    library(viridis)
    library(dplyr)
    library(ggplot2)
    
    # Get US sf
    us_states <- states(cb = TRUE, resolution = "20m")
    
    # Shift non-contiguous states, add centroid lon/lat values for label placement,
    # address label placement issue with Mississippi and Alabama
    us_states_shifted <- shift_geometry(us_states) %>%
      mutate(lon = st_coordinates(st_centroid(.))[,1],
             lat = st_coordinates(st_centroid(.))[,2],
             lat = case_when(NAME == "Mississippi" ~ lat - 0.5e5,
                             NAME == "Alabama" ~ lat + 0.5e5,
                             .default = lat))
    
    # Create vector of remaining problem labels
    move_labels <- c("Connecticut", "Delaware", "District of Columbia",
                     "Maryland", "Massachusetts", "New Hampshire", "New Jersey",
                     "Rhode Island", "Vermont")
    
    # Subset states and assign offset values for labels and segments
    move_states <- us_states_shifted %>%
      filter(NAME %in% move_labels) %>%
      arrange(lat) %>%
      mutate(xend = 2.2e6,
             yend = seq(min(lat)-2.5e5, max(lat), length.out = n()))
    
    # Plot
    us_states_shifted %>% 
      ggplot(aes(fill = AWATER)) +
      geom_sf()+
      scale_fill_viridis(option = "viridis") +
      geom_label(data = filter(us_states_shifted, !NAME %in% move_labels),
                 aes(x = lon, y = lat,
                     label = NAME),
                 fill = "white",
                 size = 2) +
      geom_label(data = move_states,
                 aes(x = xend, y = yend,
                     label = NAME),
                 fill = "white",
                 size = 2,
                 hjust = 0) +
      geom_segment(data = move_states, 
                   aes(lon, lat, xend = xend, yend = yend),
                   colour = "grey60",
                   linewidth = 0.3) +
      coord_sf(clip = "off") +
      theme_void() +
      theme(legend.position = "none") +
      ggtitle("Amount of Water In Each State")
    

    There's an argument for offsetting the Puerto Rico and Hawaii labels too. It may also be better to use a halo around the text rather than a rectangle, as the rectangles obscure a lot of plot real estate underneath.
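
    One way to get a halo, sketched here on made-up points, is geom_shadowtext() from the shadowtext package. This assumes shadowtext is installed and is only one option among several; the data frame and its values are hypothetical.

    ```r
    library(ggplot2)
    library(shadowtext)

    # Made-up label positions (illustrative only)
    labs_df <- data.frame(NAME = c("Alpha", "Beta", "Gamma"),
                          lon = c(1, 2, 3),
                          lat = c(1, 3, 2))

    p_halo <- ggplot(labs_df, aes(lon, lat)) +
      geom_point(size = 6, colour = "steelblue") +
      # Black text with a white outline instead of a filled rectangle
      geom_shadowtext(aes(label = NAME),
                      colour = "black",
                      bg.colour = "white",
                      size = 3)

    p_halo
    ```

    In the map above, the same idea would mean swapping each geom_label() call for geom_shadowtext() with the same x, y and label mappings and dropping the fill argument.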