rdplyrgisr-sf

sf: Find bounding boxes for groups and comparison with ArcGIS Pro results


I am new to sf and trying to generate bounding boxes (as well as calculate their area) for grouped points in the sf package. I am able to get bounding boxes that appear correct, but am running into issues when trying to plot things. I am also curious about the differences between how bounding boxes are created by st_bbox and ArcGIS Pro's 'Minimum Bounding Geometry' tool.

Question 1) Is there an easier way than my code below to get the grouped bounding boxes? I based it off of: https://github.com/r-spatial/sf/issues/1179. I also checked out Create polygons representing bounding boxes for subgroups using sf, but that approach did not work for me.

library(tidyverse)
library(sf)
library(units)

# Sim some locations

ID <- rep(seq(1:4),each = 4)
NC <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE) %>% filter(FIPS == c("37007", "37123", "37167", "37179")) %>% st_transform(crs = 26917)
points <- st_sample(NC, size = c(4,4), type = "random") %>% st_transform(crs = 26917)
Example <- bind_rows(list(geometry = points))
Example$ID <- ID
Example <- st_as_sf(Example, crs = 26917)
ggplot() + geom_sf(data = NC) + geom_sf(data = Example, aes(color = ID))
st_write(Example, "Example.csv", layer_options = "GEOMETRY=AS_XY") # For export to compare to ArcPro
Example2 <- Example # We'll use this later

# Find bounding box by group. Pulled the function and process from https://github.com/r-spatial/sf/issues/1179

st_bbox_by_feature = function(x) {
  x = st_geometry(x)
  f <- function(y) st_as_sfc(st_bbox(y))
  do.call("c", lapply(x, f))
}

# This works up to the point where I try to map it

Example <- Example %>% dplyr::group_by(ID) %>% dplyr::summarise() %>% st_cast("MULTIPOINT") # Convert this to a multipoint geometry
Example.BoundBox <- st_bbox_by_feature(Example)
Example$BoundBox <- st_cast(Example.BoundBox, "MULTIPOLYGON",
                            ids = Example$ID,
                            group_or_split = TRUE)
Example$BoundBox_Area <- st_area(Example.BoundBox)
Example.BoundBox <- st_as_sf(Example.BoundBox, crs = 26917)
Example <- st_transform(Example, crs = 26917)

Map1 <- ggplot() +  geom_sf(data = Example.BoundBox) + geom_sf(data = Example, aes(color = ID)) 
Map1

Map2 <- ggplot() + geom_sf(data = Example, aes(color = ID)) 
Map2

Map3 <- ggplot() + geom_sf(data = Example.BoundBox) 
Map3

I can make Map3, but I get the following error trying to map the original points (Map1 or Map2): Error in st_transform.sfc(X[[i]], ...) : cannot transform sfc object with missing crs.

But when I check the CRS and active geometry of the original points, it returns the correct CRS and says the MULTIPOINT geometry is active.

st_geometry(Example) 
st_crs(Example)
st_geometry(Example.BoundBox)

Question 2) Why does geom_sf think I don't have a CRS set in my approach above?

Question 3) I can get things to work and map properly using the approach below where I never append the bounding box to my original Example data frame. The problem here, however, is that the Example.BoundBox variable loses the column that has the group ID. Is there a way to keep that ID during my process so I can color the bounding boxes and original points by ID?

Example2 <- Example.ORIGINAL 
Example2 <- Example2 %>% dplyr::group_by(ID) %>% dplyr::summarise() %>% st_cast("MULTIPOINT") # Convert this to a multipoint geometry
Example2.BoundBox <- st_bbox_by_feature(Example2)
Example2$BoundBox_Area <- st_area(Example2.BoundBox)
Example2.BoundBox <- st_as_sf(Example2.BoundBox, crs = 26917)
st_geometry(Example2)
st_geometry(Example2.BoundBox) 

Map4 <- ggplot() +  geom_sf(data = Example2.BoundBox) + geom_sf(data = Example2, aes(color = ID)) 
Map4

Question 4) When I convert the XY data to points in ArcPro and use the 'Minimum Bounding Geometry" tool, it's clear that st_bbox and ArcPro have different processes for determining these bounding boxes (pictures attached). I think the ArcPro approach is better for my interests, is there a setting in st_bbox I can use to mirror this?

Bounding Boxes made by ArcPro Bounding Boxes made by sf


Solution

  • crs is an attribute of sfc (geometry column), for sf object it's the crs of active geometry column, but other geometry columns might have a different or missing crs. And single geometries in sfc do not have a crs attribute, e.g:

    st_crs(st_geometry(NC))$Name
    #> [1] "NAD83 / UTM zone 17N"
    st_crs(st_geometry(NC)[[1]])
    #> Coordinate Reference System: NA
    

    So when building bboxes from sfc items as in st_bbox_by_feature(), returned object will not have it either.

    Though it is kind of unexpected that that ggplot fails when there's an issue with a geometry column it should not really care about, but apparently it attempts to transform all existing sfc columns.

    Modified st_bbox_by_feature() and polygons from st_minimum_rotated_rectangle() :

    library(tidyverse)
    library(sf)
    #> Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE
    
    st_bbox_by_feature = function(x) {
      x = st_geometry(x)
      f <- function(y) st_as_sfc(st_bbox(y))
      do.call("c", lapply(x, f)) |> 
        st_set_crs(st_crs(x))
    }
    
    set.seed(42)
    NC <- 
      read_sf(system.file("shape/nc.shp", package = "sf"), quiet = TRUE) |> 
      filter(FIPS %in% c("37007", "37123", "37167", "37179")) |> 
      st_transform(crs = 26917)
    
    Example <- 
      NC[,"geometry"] |> 
      mutate(ID = row_number() |> factor()) |> 
      split(~ID) |> 
      map(\(x) mutate(x, geometry = st_sample(geometry, 4) |> st_union())) |> 
      bind_rows() |> 
      mutate(
        bbox_rect = st_bbox_by_feature(geometry),
        min_rect  = st_minimum_rotated_rectangle(geometry)
      )
    Example
    #> Simple feature collection with 4 features and 1 field
    #> Active geometry column: geometry
    #> Geometry type: MULTIPOINT
    #> Dimension:     XY
    #> Bounding box:  xmin: 534114.7 ymin: 3852971 xmax: 622364.9 ymax: 3927168
    #> Projected CRS: NAD83 / UTM zone 17N
    #> # A tibble: 4 × 4
    #>                                         geometry ID                    bbox_rect
    #> *                               <MULTIPOINT [m]> <fct>             <POLYGON [m]>
    #> 1 ((589272.7 3917694), (601535.2 3927168), (607… 1     ((589272.7 3894158, 6223…
    #> 2 ((561263.4 3907589), (564217.7 3925598), (568… 2     ((561263.4 3895442, 5837…
    #> 3 ((534114.7 3871532), (534554.2 3861650), (552… 3     ((534114.7 3852971, 5562…
    #> 4 ((578869.2 3886585), (580300.2 3879335), (599… 4     ((578869.2 3866860, 6029…
    #> # ℹ 1 more variable: min_rect <POLYGON [m]>
    
    # check sfc columns
    Example |> 
      select(where(\(x) is(x, "sfc"))) |> 
      map(\(geom) st_crs(geom) |> format()) |>
      str()
    #> List of 3
    #>  $ geometry : chr "NAD83 / UTM zone 17N"
    #>  $ bbox_rect: chr "NAD83 / UTM zone 17N"
    #>  $ min_rect : chr "NAD83 / UTM zone 17N"
    
    ggplot() + 
      geom_sf(data = NC, alpha = .5) + 
      geom_sf(data = Example, aes(color = ID)) +
      geom_sf(data = Example, aes(geometry = bbox_rect, color = ID), fill = NA, linewidth = .5, alpha = .2) +
      geom_sf(data = Example, aes(geometry = min_rect, fill = ID), alpha = .4) +
      theme_minimal() 
    

    Created on 2024-10-19 with reprex v2.1.1