r list bind-rows

bind_rows on a list fails if some sub-list elements are empty. Why?


Assuming the following list:

x <- list(list(q = 1880L, properties = list(), last_Import_Date = "2024-09-16"), 
          list(q = 1888L, properties = list(list(a = "x", b = "y")), last_Import_Date = "2024-09-16"),
          list(q = 1890L, properties = list(list(a = "x", b = "y")), last_Import_Date = "2024-09-16"))

I want to convert this list into a data frame (row-wise). Usually, dplyr::bind_rows works well. However, one element of my sub-lists ("properties") is sometimes empty, and in that case bind_rows silently drops the affected rows and keeps only those where "properties" is non-empty.

Can someone explain why that is?

And is there any (short) fix for it? I'm currently using rather ugly workarounds using list2DF, then transposing, then converting to data frame, then assigning names.

Wrong results (only keep non-empty properties):

x |>
  bind_rows()

# A tibble: 2 × 3
      q properties       last_Import_Date
  <int> <list>           <chr>           
1  1888 <named list [2]> 2024-09-16      
2  1890 <named list [2]> 2024-09-16 

UPDATE: where I need some additional help is with unnesting such a special "properties" column. Using unnest_longer will result in the same "bug" that deletes the NULL row, and using unnest_wider requires some extra workaround for fixing names.


Solution

  • bind_rows uses vctrs::data_frame under the hood. It turns out vctrs::data_frame creates an empty data frame when any element has length 0 (e.g. list(), integer(0), character(0)), because the size-1 elements are recycled to the common size, which is then 0:

    vctrs::data_frame(!!!list(q = 1880L, properties = list(), last_Import_Date = "2024-09-16"), .name_repair = "unique")
    #> [1] q                properties       last_Import_Date
    #> <0 rows> (or 0-length row.names)

    vctrs::data_frame(a = list("a"), b = integer(0))
    #> [1] a b
    #> <0 rows> (or 0-length row.names)

    vctrs::data_frame(a = list(), b = 1)
    #> [1] a b
    #> <0 rows> (or 0-length row.names)
    

    One alternative is to use vctrs::vec_rbind:

    vctrs::vec_rbind(!!!x)
         q properties last_Import_Date
    1 1880       NULL       2024-09-16
    2 1888       x, y       2024-09-16
    3 1890       x, y       2024-09-16
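
If you would rather keep using bind_rows, another option is to pad the empty elements before binding. The following is only a minimal sketch under the assumptions above: pad_empty is a hypothetical helper (not part of dplyr or vctrs) that swaps any zero-length field for list(NULL), so that every field has size 1 and the row is no longer recycled down to zero rows.

```r
library(dplyr)
library(vctrs)

x <- list(
  list(q = 1880L, properties = list(), last_Import_Date = "2024-09-16"),
  list(q = 1888L, properties = list(list(a = "x", b = "y")), last_Import_Date = "2024-09-16"),
  list(q = 1890L, properties = list(list(a = "x", b = "y")), last_Import_Date = "2024-09-16")
)

# Why the first row vanishes: its fields have sizes 1, 0 and 1,
# and vctrs recycles size-1 inputs to the common size, here 0.
first_row_size <- vec_size_common(1880L, list(), "2024-09-16")

# Hypothetical helper (an assumption, not a dplyr function): replace every
# zero-length field with list(NULL) so that it has size 1.
pad_empty <- function(el) {
  lapply(el, function(field) if (length(field) == 0L) list(NULL) else field)
}

# Pad each sub-list, then bind as usual.
out <- bind_rows(lapply(x, pad_empty))
nrow(out)
```

With the padding in place, all three rows should be kept, and "properties" holds NULL for the first one. Unnesting that column afterwards is a separate step that this sketch does not cover.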