Can you help figure out the best way to resolve the length mismatch error thrown by dotsInPolys? I think it is because there are NA's or NULLs or some funk in the polygon data that makes it too long. Here's code that reproduces the error. Ultimately, I want to plot multiple races using Leaflet, but I can't produce the lat/lon needed for the random dots at this point.
require(maptools)
require(tidycensus)
person.number.divider <- 1000
census_api_key("ENTER KEY HERE", install = TRUE)
racevars <- c(White = "B02001_002", #"P005003"
Black = "B02001_003", #Black or African American alone
Latinx = "B03001_003"
)
nj.county <- get_acs(geography = "county", #tract
year = 2015,
variables = racevars,
state = "NJ", #county = "Harris County",
geometry = TRUE,
summary_var = "B02001_001")
library(sf)
st_write(nj.county, "nj.county.shp", delete_layer = TRUE)
nj <- rgdal::readOGR(dsn = "nj.county.shp") %>%
spTransform(CRS("+proj=longlat +datum=WGS84"))
nj@data <- nj@data %>%
tidyr::separate(NAME,
sep =",",
into = c("county", "state")) %>%
dplyr::select(estimat,variabl, GEOID, county) %>%
spread(key = variabl, value = estimat) %>%
mutate(county = trimws(county))
black.dots <- dplyr::select(nj@data, Black) / person.number.divider #%>%
black.dots <- dotsInPolys(nj, as.integer(black.dots$Black), f="random")
# Error in dotsInPolys(nj, as.integer(black.dots$Black), f = "random") :
# different lengths
length(nj) # 63 This seems too many, because I believe NJ has 21 counties.
length(black.dots$Black) # 21
This post (Advice on troubleshooting dotsInPolys error (maptools)) came close to helping me, but I couldn't see how to apply it to my case.
I can change the length of the nj spatialpolygonsdataframe by removing NA's and counties with a black pop greater than 0, but then the map doesn't plot multiple counties (maybe there is something wrong with the census download?).
It looks like you might have gotten this figured out, but I wanted to share another approach that uses sf::st_sample()
instead of maptools::dotsInPolys()
. One advantage of this is that you don't need to convert the sf
object you get from tidycensus
to a sp
object.
In the following example I split the census data by race into a list three sf
objects then perform st_sample()
on each element of the list (each race). Next, I recombine the sampled points into one sf
object with a new race variable for each point. Finally, I use tmap
to make a map, though you could use ggplot2
or leaflet
to map as well.
library(tidyverse)
library(tidycensus)
library(sf)
library(tmap)
person.number.divider <- 1000
racevars <- c(White = "B02001_002", #"P005003"
Black = "B02001_003", #Black or African American alone
Latinx = "B03001_003"
)
# get acs data with geography in "tidy" form
nj.county <- get_acs(geography = "county", #tract
year = 2015,
variables = racevars,
state = "NJ", #county = "Harris County",
geometry = TRUE,
summary_var = "B02001_001"
)
# split by race
county.split <- nj.county %>%
split(.$variable)
# randomly sample points in polygons based on population
points.list <- map(county.split, ~ st_sample(., .$estimate / person.number.divider))
# combine points into sf collections and add race variable
points <- imap(points.list, ~ st_sf(tibble(race = rep(.y, length(.x))), geometry = .x)) %>%
reduce(rbind)
# map!
tm_shape(nj.county) +
tm_borders(col = "darkgray", lwd = 0.5) +
tm_shape(points) +
tm_dots(col = "race", size = 0.01, pal = "Set2")
I don't have enough rep to post the map image directly, but here it is.