I have downloaded a JSON extract from Big Query which has nested and repeated fields (similar to the package bigrquery) and am attempting to further manipulate the resulting tibble.
I have the following code to load from JSON and convert to a tibble
library(tidyverse)
ga.list <- lapply(readLines("temp.json"), jsonlite::fromJSON, flatten = TRUE)
ga.df <- tibble(dat = ga.list) %>%
unnest_wider(dat) %>%
mutate(id = row_number()) %>%
unnest_wider(b_nested) %>%
unnest_wider(b3) %>%
unnest_wider(b33)
So there were two list columns:
I am still working out how to extract id, rr1 and rr2 and make into a long table with repeated rows for each id row.
Note: this question has been edited a few times as I progress further along .. originally I had got stuck getting it from JSON to a tibble until I found unnest_wider()
temp.json:
{"a":"4000","b_nested":{"b1":"(not set)","b2":"some - text","b3":{"b31":"1591558980","b32":"60259425255","b33":{"b3311":"133997175"},"b4":false},"b5":true},"rr1":[],"rr2":[]} {"a":"4000","b_nested":{"b1":"asdfasdfa","b2":"some - text more","b3":{"b31":"11111","b32":"2222","b33":{"b3311":"3333333"},"b4":true},"b5":true}, "rr1":["v1","v2","v3"],"rr2":["x1","x2","x3"]} {"a":"6000","b_nested":{"b1":"asdfasdfa","b2":"some - text more","b3":{"b31":"11111","b32":"2222","b33":{"b3311":"3333333"},"b4":true},"b5":true},"rr1":["v1","v2","v3","v4","v5"],"rr2":["aja1","aja2","aja3","aja14","aja5"]}
The final piece of the puzzle; in order to get the repeating rows for repeating record
ga.df %>% select(id, rr1, rr2) %>%
unnest(cols = c(rr1, rr2))
FYI: Link to Big Query Specifying nested and repeated columns
Another solution (my preference) would be to create a tibble from rr1 and rr1 and keep as a column in ga.df so that purrr functions can be used
ga.df %>%
mutate(rr = map2(rr1, rr2, function(x,y) {
tibble(rr1 = x, rr2 = y)
})) %>%
select(-rr1, -rr2) %>%
mutate(rr_length = map_int(rr, ~nrow(.x)))