I have a relatively large number of coordinates for which I'd like to get the census tract (in addition to the FIPS code). I know that I can look up individual lat/lon pairs using call_geolocator_latlon (as done here), but this seems impractical for my purposes: the function issues a single call to the Census Bureau's API for each pair, and I imagine it would take a very long time to run on my ~200,000 pairs.
Is there a faster way to do this, perhaps by downloading shapefiles for each state using the block_groups function and mapping from lat/lon to census tract from there?
This doesn't use tigris, but utilizes sf::st_within() to check a data frame of points for overlapping tracts.
I'm using tidycensus here to get a map of California's tracts into R.
library(sf)
library(magrittr) # provides the %>% pipe used below
ca <- tidycensus::get_acs(state = "CA", geography = "tract",
                          variables = "B19013_001", geometry = TRUE)
Now to simulate some data:
bbox <- st_bbox(ca)
my_points <- data.frame(
  x = runif(100, bbox[1], bbox[3]),
  y = runif(100, bbox[2], bbox[4])
) %>%
  # convert the points to same CRS
  st_as_sf(coords = c("x", "y"),
           crs = st_crs(ca))
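The simulated points above are drawn inside the bounding box of ca, so simply labelling them with st_crs(ca) is enough. Real lat/lon pairs usually come in their own CRS, so a minimal sketch of the extra step might look like the following. It assumes the coordinates are WGS84 (EPSG:4326); the column names lon and lat and the two sample coordinates are made up for illustration, and EPSG:4269 (NAD83) is used only as a stand-in for st_crs(ca).

```r
library(sf)

# Hypothetical input: plain longitude/latitude columns, assumed to be WGS84
coords <- data.frame(
  lon = c(-122.42, -118.24),
  lat = c(37.77, 34.05)
)

# Declare the CRS the coordinates are actually recorded in (EPSG:4326 = WGS84)
pts_wgs84 <- st_as_sf(coords, coords = c("lon", "lat"), crs = 4326)

# Reproject to the CRS of the tract polygons.
# EPSG:4269 (NAD83) is an assumed stand-in; with the real data use st_crs(ca).
pts_nad83 <- st_transform(pts_wgs84, 4269)
```

With the tract layer loaded, st_transform(pts_wgs84, st_crs(ca)) avoids hard-coding the target CRS.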
I'm doing 100 points here to be able to ggplot() the results, but the overlap calculation for 1e6 points is fast, taking only a few seconds on my laptop.
my_points$tract <- as.numeric(st_within(my_points, ca)) # this is fast for 1e6 points
The results:
head(my_points) # tract is the row-index for overlapping census tract record in 'ca'
# the overlap step is fast, but this plotting part would take forever with 1e6 points
library(ggplot2)
ggplot(ca) +
  geom_sf() +
  geom_sf(data = my_points, aes(color = is.na(tract)))
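Since tract is only a row index into ca, one more lookup is needed to get an actual tract identifier. The sketch below shows the idea on toy data so that it runs without a Census API key: the two square "tracts" and their GEOID values are invented, and toy_tracts and toy_points are hypothetical stand-ins for ca and my_points.

```r
library(sf)

# Helper that builds a unit square with its lower-left corner at (x0, y0)
sq <- function(x0, y0) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1), c(x0, y0 + 1), c(x0, y0)
  )))
}

# Two toy "tracts" with made-up GEOIDs (stand-in for `ca`)
toy_tracts <- st_sf(
  GEOID = c("06001000100", "06001000200"),
  geometry = st_sfc(sq(0, 0), sq(1, 0), crs = 4326)
)

# Three toy points: one in each square and one outside both
toy_points <- st_as_sf(
  data.frame(x = c(0.5, 1.5, 5), y = c(0.5, 0.5, 5)),
  coords = c("x", "y"), crs = st_crs(toy_tracts)
)

# Row index of the containing tract; NA when a point falls in no tract
idx <- as.integer(st_within(toy_points, toy_tracts))

# Look up the identifier by row index; NA indices propagate as NA
toy_points$GEOID <- toy_tracts$GEOID[idx]
```

On the real data the equivalent line would be my_points$GEOID <- ca$GEOID[my_points$tract], since get_acs() returns a GEOID column. For tracts that identifier is 11 digits (2 for the state, 3 for the county, 6 for the tract), so the state and county FIPS codes can be taken from it with substr().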