
How to modify all the columns of each data set in a nested data frame in one go?


I have the nested data shown below.

I want to unnest it, but I have to standardize the classes of the columns before unnesting.

    library(tidyverse)

    nested_data <- iris %>% nest(data = !Species)

    # Add an additional variable to the third data set
    nested_data$data[[3]]$randomVar <- round(
      rnorm(nrow(nested_data$data[[3]]), 100, 5), 1)

    # Drop a column from the second data set
    nested_data$data[[2]]$Sepal.Length <- NULL

    # Change the type of certain variables
    nested_data$data[[2]]$Petal.Length <- as.character(
      nested_data$data[[2]]$Petal.Length)

    nested_data$data[[1]]$Petal.Width <- as.character(
      nested_data$data[[1]]$Petal.Width)

Because certain variables now have different classes across the data sets, I cannot unnest:

    nested_data %>% unnest(data)

I get this error message:

    Error: Can't combine `..1$Petal.Length` <double> and `..2$Petal.Length` <character>.
    Run `rlang::last_error()` to see where the error occurred.

I want to convert all the variables of each of the three data sets to character in one line of code, using a for loop or any vectorized method.

I have no idea how to do it.


Solution

  • If the column types differ only accidentally (for example, numbers stored as character), you can apply type.convert before the unnest:

    library(dplyr)
    library(tidyr)
    library(purrr)
    nested_data %>%
       mutate(data = type.convert(data, as.is = TRUE)) %>% 
       unnest(data)
    

    Output:

    # A tibble: 150 × 6
       Species Sepal.Length Sepal.Width Petal.Length Petal.Width randomVar
       <fct>          <dbl>       <dbl>        <dbl>       <dbl>     <dbl>
     1 setosa           5.1         3.5          1.4         0.2        NA
     2 setosa           4.9         3            1.4         0.2        NA
     3 setosa           4.7         3.2          1.3         0.2        NA
     4 setosa           4.6         3.1          1.5         0.2        NA
     5 setosa           5           3.6          1.4         0.2        NA
     6 setosa           5.4         3.9          1.7         0.4        NA
     7 setosa           4.6         3.4          1.4         0.3        NA
     8 setosa           5           3.4          1.5         0.2        NA
     9 setosa           4.4         2.9          1.4         0.2        NA
    10 setosa           4.9         3.1          1.5         0.1        NA
    # … with 140 more rows
    

    Or, if type.convert doesn't work (because of character elements), force all the columns to type character, unnest, and then restore the column types with type.convert:

    nested_data %>%
      mutate(data = map(data,~ 
       .x %>% 
        mutate(across(everything(), as.character)))) %>% 
      unnest(data) %>% 
      type.convert(as.is = TRUE)
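
Since the question also asks about a for loop, here is a minimal sketch of the same "convert everything to character, then unnest" idea written with base R's lapply and with an explicit for loop instead of purrr::map. It rebuilds a reduced version of the question's data (only the two type changes), and the names by_lapply, by_loop and result are made up for illustration.

```r
library(tidyr)

# Reduced version of the data from the question: two columns changed to character
nested_data <- nest(iris, data = !Species)
nested_data$data[[2]]$Petal.Length <- as.character(nested_data$data[[2]]$Petal.Length)
nested_data$data[[1]]$Petal.Width  <- as.character(nested_data$data[[1]]$Petal.Width)

# Variant 1: lapply over the list column, converting every column of each data set
by_lapply <- nested_data
by_lapply$data <- lapply(by_lapply$data, function(df) {
  df[] <- lapply(df, as.character)  # df[] keeps the data frame structure
  df
})

# Variant 2: the same conversion written as an explicit for loop
by_loop <- nested_data
for (i in seq_along(by_loop$data)) {
  by_loop$data[[i]][] <- lapply(by_loop$data[[i]], as.character)
}

# All columns are now character, so the data sets can be combined;
# type.convert then guesses a suitable type for each column again
result <- type.convert(unnest(by_loop, data), as.is = TRUE)
```

Both variants only touch the list column, so the grouping column Species is left as it is until the final type.convert call.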