r dplyr data.table dtplyr

pivot_longer gives error when using dtplyr


I have a large dataset that I'm trying to tidy using dtplyr. It consists of a large number (>1000) of date-value pairs for various locations. The original code uses pivot_longer(), which works fine on a tibble but gives an error on a dtplyr lazy table. Is there a way to fix this while keeping the performance benefits of dtplyr?

This works:

library(tidyverse)
library(dtplyr)
library(data.table)

my_data_tb <- tribble(
  ~`date-A`, ~`value-A`, ~`date-B`, ~`value-B`,
  "date1", 1, "date2", 2,
  "date2", 1, "date3", 2 
)

my_data_tb %>%
  pivot_longer(
    cols = everything(),
    names_to = c(".value", "grid_square"),
    names_sep = "-"
  )

But this gives an error:

my_data_dt <- as.data.table(my_data_tb)
my_data_dt <- lazy_dt(my_data_dt)

my_data_dt %>%
  pivot_longer(
    cols = everything(),
    names_to = c(".value", "grid_square"),
    names_sep = "-"
  )

The error message is:

Error: Can't subset elements that don't exist.
x The locations 1 and 2 don't exist.
i There are only 0 elements.
Run rlang::last_error() to see where the error occurred.
In addition: Warning message:
Expected 2 pieces. Missing pieces filled with NA in 7 rows [1, 2, 3, 4, 5, 6, 7].
rlang::last_error()
Error: Internal error: Trace data is not square.

UPDATE: it now gives this error message:

Error in UseMethod("pivot_longer") : no applicable method for 'pivot_longer' applied to an object of class "c('dtplyr_step_first', 'dtplyr_step')"

As an aside, this also works, but I think it loses the dtplyr performance gain:

my_data_dt %>%
  as_tibble() %>%
  pivot_longer(
    cols = everything(),
    names_to = c(".value", "grid_square"),
    names_sep = "-"
  )
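For readers who cannot rely on dtplyr for this step, the same reshape can be sketched directly in data.table with melt() and patterns(). This is a swapped-in technique, not what dtplyr does internally. It assumes the column names follow the `date-X` / `value-X` scheme from the example and that the date and value columns appear in the same suffix order; the helper name `suffixes` is made up for illustration.

```r
library(data.table)

# Same toy data as in the question, built directly as a data.table
my_data_dt <- data.table(
  `date-A` = c("date1", "date2"), `value-A` = c(1, 1),
  `date-B` = c("date2", "date3"), `value-B` = c(2, 2)
)

# Suffixes ("A", "B") in column order. Assumption: the value columns
# use the same suffixes in the same order as the date columns.
suffixes <- sub("^date-", "", grep("^date-", names(my_data_dt), value = TRUE))

# Melt both column groups at once: one output column per pattern
long <- melt(
  my_data_dt,
  measure.vars  = patterns("^date-", "^value-"),
  value.name    = c("date", "value"),
  variable.name = "grid_square"
)

# With several patterns, melt() numbers the groups 1, 2, ...;
# map those positions back to the original suffixes
long[, grid_square := suffixes[as.integer(grid_square)]]

print(long)
```

The rows come out grouped by grid square rather than interleaved as pivot_longer() returns them, so add an explicit ordering step if row order matters.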

Solution

  • dtplyr version 1.2.0 is now available on CRAN, which resolves this issue!

    If you run into this error, check your installed version of dtplyr and update it so that you are running >= 1.2.0:

    install.packages("dtplyr")
    

    (NB: dtplyr isn't updated as part of the tidyverse packages, so make sure to update it separately.)

    https://www.tidyverse.org/blog/2021/12/dtplyr-1-2-0/

    https://cran.r-project.org/web/packages/dtplyr/index.html
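A minimal sketch of the version check, assuming the 1.2.0 threshold stated in the answer. The variable names `min_version` and `has_pivot_support` are illustrative only.

```r
# Minimum dtplyr version that, per the answer above, resolves the error
min_version <- "1.2.0"

# TRUE only if dtplyr is installed and at least min_version
has_pivot_support <- requireNamespace("dtplyr", quietly = TRUE) &&
  utils::packageVersion("dtplyr") >= min_version

if (!has_pivot_support) {
  message("dtplyr is missing or older than ", min_version,
          "; run install.packages(\"dtplyr\") and restart R.")
}
```

After updating, restart the R session before re-running the lazy_dt() pipeline so that the new version is the one actually loaded.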