I'm encountering issues when attempting to unnest a dataframe containing list columns. The dataframe structure and the unnesting code I'm using are as follows:
library(dplyr)
library(tidyr)
df = structure(list(num_dos = c(41713200L, 41735799L, 41740459L, 41819734L
), `DAT_SD_29 ` = list("08/07/2024 15:41", c("11/07/2024 13:16",
"09/08/2024 14:17"), "16/07/2024 13:21", "01/08/2024 17:06"),
`DAT_SD_99 ` = list(c("09/07/2024 05:09", "09/07/2024 22:58"
), c("29/07/2024 16:35", "12/07/2024 05:09", "12/07/2024 22:56",
"23/08/2024 08:22"), c("05/08/2024 15:34", "10/07/2024 10:40",
"20/07/2024 12:22", "10/08/2024 10:18"), "26/07/2024 11:50")), row.names = c(NA,
-4L), class = c("tbl_df", "tbl", "data.frame"))
df |> tidyr::unnest()
This code used to work as expected, but it now fails with the following error:
Error in `unnest()`:
! In row 2, can't recycle input of size 2 to size 4.
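As the error states, the two list columns have different lengths within the same row: in row 2, `DAT_SD_29 ` has 2 elements and `DAT_SD_99 ` has 4, and unnest() can only recycle inputs of length 1. To get your desired result, call unnest_longer() twice, once per list column. This returns every combination of the two columns' values within each row: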
library(dplyr)
library(tidyr)
df |>
  unnest_longer(`DAT_SD_29 `) |>
  unnest_longer(`DAT_SD_99 `)
# # A tibble: 15 × 3
# num_dos `DAT_SD_29 ` `DAT_SD_99 `
# <int> <chr> <chr>
# 1 41713200 08/07/2024 15:41 09/07/2024 05:09
# 2 41713200 08/07/2024 15:41 09/07/2024 22:58
# 3 41735799 11/07/2024 13:16 29/07/2024 16:35
# 4 41735799 11/07/2024 13:16 12/07/2024 05:09
# 5 41735799 11/07/2024 13:16 12/07/2024 22:56
# 6 41735799 11/07/2024 13:16 23/08/2024 08:22
# 7 41735799 09/08/2024 14:17 29/07/2024 16:35
# 8 41735799 09/08/2024 14:17 12/07/2024 05:09
# 9 41735799 09/08/2024 14:17 12/07/2024 22:56
# 10 41735799 09/08/2024 14:17 23/08/2024 08:22
# 11 41740459 16/07/2024 13:21 05/08/2024 15:34
# 12 41740459 16/07/2024 13:21 10/07/2024 10:40
# 13 41740459 16/07/2024 13:21 20/07/2024 12:22
# 14 41740459 16/07/2024 13:21 10/08/2024 10:18
# 15 41819734 01/08/2024 17:06 26/07/2024 11:50
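To see where the lengths disagree before unnesting, you can count the elements of each list column per row. This is a minimal sketch on a made-up two-row tibble (`toy`, `id`, `a` and `b` are illustrative names, not from the question). It also checks that unnesting one column after the other yields the sum of the per-row products of those counts:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data with the same kind of mismatch as in the question
toy <- tibble(
  id = c(1L, 2L),
  a  = list("x", c("x", "y")),
  b  = list(c("p", "q"), c("p", "q", "r", "s"))
)

# Replace each list column by the number of elements it holds in each row
len <- toy |> mutate(across(where(is.list), lengths))
len

# unnest() fails on rows where the counts differ and neither count is 1
# (row 2 here: 2 vs 4). Unnesting the columns one at a time instead gives
# one output row per combination, so the row count is the sum of products.
expected_rows <- sum(len$a * len$b)
result <- toy |>
  unnest_longer(a) |>
  unnest_longer(b)

expected_rows == nrow(result)
```

For the data in the question the same arithmetic gives 1×2 + 2×4 + 1×4 + 1×1 = 15, which matches the 15 rows shown above.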
Based on the OP's comment: if you have many list columns, or the list columns are named dynamically (i.e. you don't know the column names beforehand), unnest every list column in turn with purrr::reduce():
library(dplyr)
library(tidyr)
library(purrr)
df |>
  select(where(is.list)) |>
  names() |>
  reduce(~ unnest_longer(.x, all_of(.y)), .init = df)
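If you need this in several places, the same reduce() pattern can be wrapped in a small helper. This is only a sketch: `unnest_all_list_cols()` is a hypothetical name, not a tidyr function, and `toy` is made-up data used to keep the example self-contained:

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical helper: unnest every list column of `data`, one after another
unnest_all_list_cols <- function(data) {
  # Names of all list columns, whatever they happen to be called
  list_cols <- data |>
    select(where(is.list)) |>
    names()
  # Start from `data` and apply unnest_longer() once per list column
  reduce(list_cols, \(acc, col) unnest_longer(acc, all_of(col)), .init = data)
}

# Made-up data for illustration
toy <- tibble(
  id = c(1L, 2L),
  a  = list("x", c("x", "y")),
  b  = list(c("p", "q"), c("p", "q", "r", "s"))
)

out <- unnest_all_list_cols(toy)
out
```

With no list columns, reduce() simply returns `.init`, so the helper leaves such a data frame unchanged.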