I'm using the {zipcodeR} package. My goal is to build an area-code-to-zipcode reference table that keeps the rest of the data from that package's output data frames, so that I can join it to other data on area code.
However, the area_code_list column comes back as a blob data type, according to the column types. I'm not familiar with this data type and don't know how to extract its contents. Neither the package vignettes nor the documentation seem to point to any helper functions for this task. Looking at str(), the underlying type appears to be raw.
Ideally, I'd have one row per zipcode-area code combo as my final output. I appreciate any help.
library(zipcodeR)
library(dplyr)

search_state("CA") %>%
filter(zipcode %in% c("90201", "90210")) %>%
select(zipcode, area_code_list)
# Current output
# # A tibble: 2 × 2
# zipcode area_code_list
# <chr> <blob>
# 90201 <raw 15 B>
# 90210 <raw 26 B>
# Ideal output
# # A tibble: 3 × 2
# zipcode area_code_list
# <chr> <chr>
# 90201 323
# 90210 310
# 90210 424
zipcodeR seems to get its ZCTA database from this US Zipcode Project. Looking around at the project's Python package internals, the area_code_list column is typed as a CompressedJSONType in that Python code.
This was a guess, but the column looks like JSON compressed with Gzip. We can verify this by using jsonlite to parse the column:
zipcodeR::search_state("CA") %>%
dplyr::filter(zipcode %in% c("90201", "90210")) %>%
dplyr::select(zipcode, area_code_list) %>%
dplyr::mutate(area_code_list = lapply(area_code_list, jsonlite::parse_gzjson_raw)) %>%
tidyr::unnest_longer(col = area_code_list)
#> # A tibble: 4 × 2
#> zipcode area_code_list
#> <chr> <chr>
#> 1 90201 323
#> 2 90210 310
#> 3 90210 323
#> 4 90210 424
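If you'd rather see the decoding step spelled out, here is a minimal base-R sketch of roughly the same idea, assuming each blob is a JSON array compressed in a format that memDecompress(type = "gzip") can read. The blobs below are made-up stand-ins built with memCompress so the example runs without zipcodeR, and decode_area_codes is a hypothetical helper name, not something from either package.

```r
library(jsonlite)

# Hypothetical stand-ins for two entries of area_code_list: JSON arrays
# compressed in memory, mimicking what the column appears to contain.
blobs <- list(
  memCompress(charToRaw('["323"]'), type = "gzip"),
  memCompress(charToRaw('["310","424"]'), type = "gzip")
)

# Decompress one raw vector, convert the bytes to text, and parse the JSON.
decode_area_codes <- function(x) {
  json <- rawToChar(memDecompress(x, type = "gzip"))
  fromJSON(json)
}

# lapply keeps the result a list even if every zipcode has one area code.
decoded <- lapply(blobs, decode_area_codes)

# One row per zipcode / area code combination, using base R only.
result <- data.frame(
  zipcode = rep(c("90201", "90210"), lengths(decoded)),
  area_code = unlist(decoded),
  stringsAsFactors = FALSE
)
print(result)
```

With the real data, the same helper could be applied to the area_code_list column in place of the sample blobs, provided the compression guess holds for every row.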