
Selecting subsets of each dataset in a list in R


After using kfold from the dismo package, I am trying to select a subset of the groups that this function creates, for each of several datasets stored in a list in R. With an individual dataset, this is easy:

#With an individual dataset:

library(dismo)

data_car <- mtcars

group_presence <- kfold(x = data_car, k = 5) # kfold is in dismo package


# Separate observations into training and testing groups:
presence_train <- data_car[group_presence != 1, ]

But I can't seem to get it to work across multiple datasets stored in a list:


#Now, with listed datasets:

data_1 <- mtcars
data_2 <- iris

mylist <- list(data_1, data_2)

mylist_data <- lapply(mylist, function(q) {
  data = q
  return(data)
})

mylist_groups <- lapply(mylist, function(q) {
  group_item = kfold(x = q, 
                k = 5)
  q$group_obj = group_item
  return(q)
})


presence_train <- mylist_groups[group_obj != 1, ]

#Result:

Error: object 'group_obj' not found


Solution

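  • The error occurs because mylist_groups is a list of data frames: there is no top-level object called group_obj, and a list cannot be indexed with [rows, ]. Each element has to be subset on its own. One option is Map, which steps through the data frames and their group vectors in parallel:

    out <- Map(function(x, y) x[y != 1, ], mylist, mylist_groups)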
    

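    where mylist_groups here holds only the group vectors returned by kfold (one per dataset), not the modified data frames: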

    mylist_groups <- lapply(mylist, function(q) {
      kfold(x = q, k = 5)
    })
    

    Output:

    > str(out)
    List of 2
     $ :'data.frame':   26 obs. of  11 variables:
      ..$ mpg : num [1:26] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
      ..$ cyl : num [1:26] 6 6 4 6 8 6 8 4 4 6 ...
      ..$ disp: num [1:26] 160 160 108 258 360 ...
      ..$ hp  : num [1:26] 110 110 93 110 175 105 245 62 95 123 ...
      ..$ drat: num [1:26] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
      ..$ wt  : num [1:26] 2.62 2.88 2.32 3.21 3.44 ...
      ..$ qsec: num [1:26] 16.5 17 18.6 19.4 17 ...
      ..$ vs  : num [1:26] 0 0 1 1 0 1 0 1 1 1 ...
      ..$ am  : num [1:26] 1 1 1 0 0 0 0 0 0 0 ...
      ..$ gear: num [1:26] 4 4 4 3 3 3 3 4 4 4 ...
      ..$ carb: num [1:26] 4 4 1 1 2 1 4 2 2 4 ...
     $ :'data.frame':   120 obs. of  5 variables:
      ..$ Sepal.Length: num [1:120] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
      ..$ Sepal.Width : num [1:120] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
      ..$ Petal.Length: num [1:120] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
      ..$ Petal.Width : num [1:120] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
      ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
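
If you would rather keep the group labels as a column inside each data frame, as in the question's second lapply, the same subsetting can be done with lapply alone. The following is only a minimal sketch: assign_folds is a hypothetical base-R stand-in for kfold (not part of dismo) so that the example runs without extra packages, and the names presence_test and the list element names are assumptions added for illustration. Replacing assign_folds(q, k = 5) with kfold(x = q, k = 5) should give the dismo-based version.

```r
# Hypothetical stand-in for dismo::kfold(): assigns each row to one of
# k roughly equal-sized random groups (an assumption for illustration only).
assign_folds <- function(x, k = 5) {
  sample(rep(seq_len(k), length.out = nrow(x)))
}

set.seed(42)  # make the random group assignment reproducible

mylist <- list(mtcars = mtcars, iris = iris)

# Store the group labels as a column inside each data frame
mylist_groups <- lapply(mylist, function(q) {
  q$group_obj <- assign_folds(q, k = 5)
  q
})

# Subset each element of the list separately:
# group 1 is held out for testing, the remaining groups are used for training
presence_train <- lapply(mylist_groups, function(d) d[d$group_obj != 1, ])
presence_test  <- lapply(mylist_groups, function(d) d[d$group_obj == 1, ])

# Row counts per dataset
sapply(presence_train, nrow)
sapply(presence_test, nrow)
```

Because the group column travels with each data frame, the training and testing subsets can be rebuilt later without keeping a separate list of group vectors in sync.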