rggplot2geom-sf

How to clip a geometric object by a geom_sf with ggplot?


How can I clip a geometric object geom_tiles() (plotting fitted values of a s(longitude, latitude) smooth) by a geom_sf() geometric object (so the fitted values for the area outside the country's border are not displayed or at least shaded)?

library(ggplot)

ggplot() +
  geom_tile(
    data = gam,
    aes(
      x = longitude,
      y = latitude,
      fill = .fitted)
    ) +
  geom_sf(
    data = country,
    fill = NA,
    color = "black"
  )
)

Solution

  • The easiest way to do this would be to draw a geom_tile layer, then invert the sf country outline using st_difference to paint over the non-country areas with a white mask:

    library(sf)
    library(tidyverse)
    
    ggplot(gam, aes(fill = .fitted)) +
      geom_tile(aes(longitude, latitude), color = "black") +
      geom_sf(data = st_buffer(country, 1000000) |> st_difference(country),
              fill = "white") +
      coord_sf(xlim = c(-8, 2), ylim = c(49, 60)) +
      scale_fill_viridis_c() +
      theme_bw()
    

    enter image description here

    However, this has a number of drawbacks:

    1. You lose the panel background and grid
    2. You can only draw a single country per plot this way
    3. You have to manually specify the co-ordinate limits
    4. There is no structure in the underlying data that actually represents any of the irregular coastal tiles, so you cannot calculate things like areas or spatial averages.

    For these reasons I think it is much preferable to create the grid of tiles in sf and use st_intersection to keep the country parts of your grid.

    From the code in your question I assume that gam is a data frame representing a 2D grid of fitted values, with columns called longitude, latitude and fitted, whereas country is an sf object representing a country outline.

    That being the case, you can do

    width  <- mean(diff(sort(unique(gam$longitude)))) / 2
    height <- mean(diff(sort(unique(gam$latitude)))) / 2
    
    gam$xmin <- gam$longitude - width
    gam$xmax <- gam$longitude + width
    gam$ymin <- gam$latitude  - height
    gam$ymax <- gam$latitude  + height
    
    gam <- Map(function(x1, y1, x2, y2) {
        st_polygon(list(cbind(c(x1, x2, x2, x1, x1), c(y1, y1, y2, y2, y1))))
      }, gam$xmin, gam$ymin, gam$xmax, gam$ymax) |>
      st_sfc(crs = "WGS84") |>
      st_sf() |>
      `st_geometry<-`("geometry") |>
      cbind(gam) |>
      st_intersection(country)
    
    ggplot(gam, aes(fill = .fitted)) +
      geom_sf(color = "black", linewidth = 0.1) +
      scale_fill_viridis_c() +
      theme_bw()
    

    enter image description here

    It may be less cumbersome to do this with st_make_grid, but with your existing data structure I think building the grid manually seems a bit easier.


    Data used

    You did not include any data with your question, so I have created some with the same names and structure.

    library(mgcv)
    library(rnaturalearth)
    
    set.seed(1)
    country <- ne_countries(10, country = "United Kingdom", returnclass = "sf")
    
    gam <- expand.grid(longitude = seq(-8, 2, len = 30),
                       latitude = seq(49, 60, len = 60))
    
    gam$value <- sin((gam$longitude + gam$latitude)) + rnorm(nrow(gam))
    mod <- gam(value ~ s(longitude) + s(latitude), data = gam)
    gam$.fitted <- predict(mod)