r resampling multilevel-analysis

Resampling multilevel data by group


I am trying to write a function that resamples names that are nested in groups. My function works when groups are ignored, but I don't want a name to be replaced by a name from a different group.

Here's the function, where x is a vector of all names (some repeated), a is a vector of unique name observations, and b is a vector of unique names in randomized order.

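    # Note: naming this function `rep` masks base::rep() for the session.
    rep <- function(x, a, b) {
      x1 <- x  # copy once, outside the loop, so earlier replacements are kept
      for (i in seq_along(a)) {
        # compare against the original x so replaced names are not matched again
        x1[which(x == a[i])] <- b[i]
      }
      x1
    }

    x <- c("Smith", "Jones", "Washington", "Miller", "Wells", "Smith", "Smith", "Miller")
    a <- sort(unique(x))
    b <- sample(a, length(a))

    dat <- rep(x, a, b)
    dat
    # dat is x with every name swapped for its counterpart in b;
    # the exact names depend on the random draw.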

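For reference, the substitution described above (every occurrence of `a[i]` replaced by `b[i]`) can also be sketched without a loop, using `match()` as a lookup. This is only an illustration: `x`, `a`, and `b` are as defined above, and the `set.seed(1)` call is an arbitrary addition to make the draw repeatable.

```r
x <- c("Smith", "Jones", "Washington", "Miller", "Wells", "Smith", "Smith", "Miller")
a <- sort(unique(x))

set.seed(1)  # arbitrary seed, added here only for repeatability
b <- sample(a, length(a))

# match(x, a) gives, for each element of x, its position in a;
# indexing b by those positions swaps each name for its counterpart in b.
dat <- b[match(x, a)]
dat
```

Because the lookup always reads from the original `x`, repeated names such as "Smith" all receive the same replacement.
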
However, each name is nested in a group, so I need to avoid creating samples of names that are not in the same group. For example:

    x           groupid
    Smith       A1
    Jones       B1
    Washington  C1
    Miller      A2
    Wells       B1
    Smith       A2
    Smith       A3
    Miller      A3

How can I account for that?


Solution

  • This is easier to do with the tidyverse packages: group the rows by groupid, then sample names (with replacement) from within each group:

    library(tidyverse)
    
    txt <- 'x         groupid
    Smith       A1
    Jones       B1
    Washington  C1
    Miller      A2
    Wells       B1
    Smith       A2
    Smith       A3
    Miller      A3'
    
    df <- read_table(file = txt)
    
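    set.seed(0)  # fix the seed so the draw is reproducible
    df.new <- df %>%
      group_by(groupid) %>%
      # within each group, draw n() names from that group's unique names
      mutate(
        b = sample(unique(x), n(), replace = TRUE)
      ) %>%
      arrange(groupid)

    df.new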
    
      x          groupid b         
      <chr>      <chr>   <chr>     
    1 Smith      A1      Smith     
    2 Miller     A2      Miller    
    3 Smith      A2      Smith     
    4 Smith      A3      Smith     
    5 Miller     A3      Miller    
    6 Jones      B1      Wells     
    7 Wells      B1      Jones     
    8 Washington C1      Washington
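
Note that `sample(..., replace = TRUE)` draws with replacement, so within a group a name can come up more than once or not at all. If a strict within-group shuffle is wanted instead (each unique name mapped to exactly one other unique name from the same group), a minimal base R sketch might look like the following. The helper `permute_names` is a hypothetical name introduced here for illustration, and the data frame is rebuilt by hand so the block runs on its own.

```r
df <- data.frame(
  x = c("Smith", "Jones", "Washington", "Miller", "Wells", "Smith", "Smith", "Miller"),
  groupid = c("A1", "B1", "C1", "A2", "B1", "A2", "A3", "A3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: shuffle the unique names of one group and map each
# original name to its shuffled counterpart.
permute_names <- function(names_in_group) {
  u <- unique(names_in_group)
  # Shuffle by index; a group with a single name simply maps to itself.
  shuffled <- u[sample.int(length(u))]
  shuffled[match(names_in_group, u)]
}

set.seed(0)  # arbitrary seed for repeatability
# ave() applies the helper separately to the names of each groupid
df$b <- ave(df$x, df$groupid, FUN = permute_names)
df[order(df$groupid), ]
```

With this approach, groups that contain only one name (A1 and C1 here) are left unchanged, and every other group keeps exactly the same set of names it started with.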