r, blob, raw, zipcoder

Get area codes from area_code_list using zipcodeR


I'm using the {zipcodeR} package. My goal is to build an area-code-to-zipcode reference table that keeps the rest of the data from the package's output data frames, so that I can join it to other data on area code.

However, the column types show that area_code_list is stored as a blob. I'm not familiar with this data type and don't know how to extract its contents. Neither the package vignettes nor the documentation seem to point to any helper functions for this task. Looking at str(), the underlying type appears to be raw.

Ideally, I'd have one row per zipcode-area code combo as my final output. I appreciate any help.

library(zipcodeR)
library(dplyr)

search_state("CA") %>% 
  filter(zipcode %in% c("90201", "90210")) %>% 
  select(zipcode, area_code_list)

# Current output
# # A tibble: 2 × 2
# zipcode area_code_list
# <chr>           <blob>
# 90201       <raw 15 B>
# 90210       <raw 26 B>

# Ideal output
# # A tibble: 3 × 2
# zipcode area_code_list
# <chr>           <chr>
# 90201           323
# 90210           310
# 90210           424



Solution

  • zipcodeR seems to get its ZCTA database from the US Zipcode Project. In the internals of that project's Python package, area_code_list is typed as a CompressedJSONType.

    The type name suggests compressed JSON. Gzip was a guess on my part, but it checks out: jsonlite can decompress and parse each element of the column:

    zipcodeR::search_state("CA") %>% 
      dplyr::filter(zipcode %in% c("90201", "90210")) %>% 
      dplyr::select(zipcode, area_code_list) %>%
      # lapply always returns a list column; sapply could simplify it to a vector or matrix
      dplyr::mutate(area_code_list = lapply(area_code_list, jsonlite::parse_gzjson_raw)) %>%
      tidyr::unnest_longer(col = area_code_list)
    
    #> # A tibble: 4 × 2
    #>   zipcode area_code_list
    #>   <chr>   <chr>         
    #> 1 90201   323           
    #> 2 90210   310           
    #> 3 90210   323           
    #> 4 90210   424
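
To see what the decoding step amounts to without touching the zipcodeR database, here is a minimal round-trip sketch. It assumes each element of the column is a gzip/zlib-compressed JSON array of strings, which is what the result above suggests. The helper name decode_area_codes and the sample payload are made up for illustration and are not part of zipcodeR or jsonlite.

```r
library(jsonlite)

# Hypothetical helper (not part of zipcodeR): decode one element of the blob column.
# Assumes the element is a gzip/zlib-compressed JSON array of strings.
decode_area_codes <- function(x) {
  # Guard for empty entries; whether the real data contains any is an assumption.
  if (is.null(x) || length(x) == 0) {
    return(NA_character_)
  }
  json <- rawToChar(memDecompress(x, type = "gzip"))
  fromJSON(json)
}

# Build a stand-in for one cell of area_code_list from a made-up payload.
payload <- '["310", "323", "424"]'
cell <- memCompress(charToRaw(payload), type = "gzip")

class(cell)              # "raw", like the elements of the blob column
decode_area_codes(cell)  # "310" "323" "424"
```

A helper like this could be passed to the mutate() step in place of jsonlite::parse_gzjson_raw if you want explicit control over empty entries.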