r purrr tidycensus

Error when using tidycensus and map_dfr for "place" geographies


I'm using map_dfr to loop over multiple years of ACS data for the "place" geography with tidycensus. I'm following the template described here and here.

I'm receiving the following error and am stuck debugging it. I would appreciate help figuring out what's going on. Thanks.

library(tidycensus, tidyverse, purrr)

#list years
years <- lst(2011:2020) 
names(years) <- years

# census variables
my_vars <- c(
   total_pop = "B01003_001",
   pop_poverty = "B17001_001"
)

# loop over list of years and get 1 year acs estimates
multi_year <- map_dfr(
   years,
   ~ {get_acs(
      geography = "place",
      variables = my_vars,
      year = .x,
      survey = "acs1",
      geometry = FALSE
   )  
   },
   .id = "year"  # when combining results, add id var (name of list item)
) %>%
   filter(GEOID = 0644000) %>% 
   select(-moe) %>%  
   arrange(variable, NAME) %>% 
   print()

Getting data from the 2011 1-year ACSGetting data from the 2012 1-year ACSGetting data from the 2013 1-year ACSGetting data from the 2014 1-year ACSGetting data from the 2015 1-year ACSGetting data from the 2016 1-year ACSGetting data from the 2017 1-year ACSGetting data from the 2018 1-year ACSGetting data from the 2019 1-year ACSGetting data from the 2020 1-year ACS
The 1-year ACS provides data for geographies with populations of 65,000 and greater.
Error in `map()`:
ℹ In index: 1.
ℹ With name: 2011:2020.
Caused by error in `parse_url()`:
! length(url) == 1 is not TRUE
Run `rlang::last_trace()` to see where the error occurred.
Warning messages:
1: In if (year < 2005) { :
  the condition has length > 1 and only the first element will be used
2: In if (year < 2013) { :
  the condition has length > 1 and only the first element will be used
3: In if (year < 2013) { :
  the condition has length > 1 and only the first element will be used


Solution

  • The problem is that dplyr::lst(2011:2020) creates a list with a single element, namely the whole vector of years (shown here with two years for brevity):

    years <- lst(2011:2012)
    names(years) <- years
    
    years
    #> $`2011:2012`
    #> [1] 2011 2012
    

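    Hence, map_dfr iterates only once and passes the entire vector of years to get_acs, which expects a single year. That is why the error reports `length(url) == 1 is not TRUE` and the warnings say that the condition in `if (year < 2005)` has length > 1.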
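
    One way to confirm this without calling the Census API is to check how many years the mapped function receives per call. The sketch below uses base list(), which has the same one-element shape as lst(2011:2012), and a hypothetical stand-in function n_years_received() in place of get_acs:

    ```r
    library(purrr)

    # One-element list: the whole vector 2011:2012 is a single item
    years_list <- list(2011:2012)
    # Atomic vector: each year is its own item
    years_vec <- 2011:2012

    # Hypothetical stand-in for get_acs(): reports how many years one call received
    n_years_received <- function(year) length(year)

    map_int(years_list, n_years_received)
    #> [1] 2
    map_int(years_vec, n_years_received)
    #> [1] 1 1
    ```

    The list version makes a single call that receives both years at once, while the vector version makes one call per year.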

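    Instead, use a named vector so that map_dfr calls get_acs once per year. Note also that the filter has to use == and compare against the string "0644000": GEOID is a character column, and an unquoted 0644000 would be read as the number 644000: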

    library(tidycensus)
    library(dplyr, warn.conflicts = FALSE)
    library(purrr)
    
    years <- c(2011:2012)
    names(years) <- years
    
    years
    #> 2011 2012 
    #> 2011 2012
    
    my_vars <- c(
      total_pop = "B01003_001",
      pop_poverty = "B17001_001"
    )
    
    multi_year <- map_dfr(
      years,
      ~ {
        get_acs(
          geography = "place",
          variables = my_vars,
          year = .x,
          survey = "acs1",
          geometry = FALSE
        )
      },
      .id = "year" # when combining results, add id var (name of list item)
    )
    #> Getting data from the 2011 1-year ACS
    #> The 1-year ACS provides data for geographies with populations of 65,000 and greater.
    #> Getting data from the 2012 1-year ACS
    #> The 1-year ACS provides data for geographies with populations of 65,000 and greater.
    
    multi_year %>%
      filter(GEOID == "0644000") %>%
      select(-moe) %>%
      arrange(variable, NAME)
    #> # A tibble: 4 × 5
    #>   year  GEOID   NAME                         variable    estimate
    #>   <chr> <chr>   <chr>                        <chr>          <dbl>
    #> 1 2011  0644000 Los Angeles city, California pop_poverty  3748793
    #> 2 2012  0644000 Los Angeles city, California pop_poverty  3789330
    #> 3 2011  0644000 Los Angeles city, California total_pop    3819708
    #> 4 2012  0644000 Los Angeles city, California total_pop    3857786
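
    As a side note, map_dfr is superseded as of purrr 1.0.0 in favour of map() followed by list_rbind(). A minimal sketch of that pattern follows, assuming purrr >= 1.0.0 and using a hypothetical stand-in function fake_get_acs() so that it runs without an API key; it is not a drop-in replacement for the call above:

    ```r
    library(purrr)  # list_rbind() requires purrr >= 1.0.0

    years <- 2011:2012
    names(years) <- years

    # Hypothetical stand-in for get_acs(): returns one row echoing the requested year
    fake_get_acs <- function(year) {
      data.frame(GEOID = "0644000", requested_year = year)
    }

    # map() returns a named list of data frames; list_rbind() stacks them
    # and stores the list names in a new "year" column
    multi_year <- list_rbind(map(years, fake_get_acs), names_to = "year")
    ```

    With the real get_acs call in place of the stand-in, names_to = "year" plays the same role as .id = "year" in map_dfr.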