rexperimental-design

How to produce a a randomized number sequence within seven block without boundary repeat using R?


There are 9 treatments and we want to have 7 blocks. In each block, the treatment should be repeated once.

The 9 treatments are marked as follows:

-Treatment 1 (1-7)
-Treatment 2 (8-14)
-Treatment 3 (15-21)
-Treatment 4 (22-28)
-Treatment 5 (29-35)
-Treatment 6 (36-42)
-Treatment 7 (43-49)
-Treatment 8 (50-56)
-Treatment 9 (57-63)

Each number represents a pot. We want these pots randomised in 7 blocks (columns) but we don't want two pot of the same treatment adjacent to each other - highlighted in grey:

How would I go about this in R?


Solution

  • If I'm interpreting it correctly, this should work.

    We'll do a two-step sampling:

    1. First, sample the treatment group itself, making it much easier to determine if a particular row in the block is in the same treatment group as the same row, previous block.
    2. Second, sample one from each of the proven-safe groups.

    I'll use a random seed here for reproducibility, do not use set.seed(.) in production.

    set.seed(42)
    nBlocks <- 7
    treatments <- list(1:7, 8:14, 15:21, 22:28, 29:35, 36:42, 43:49, 50:56, 57:63)
    blocks <- Reduce(function(prev, ign) {
      while (TRUE) {
        this <- sample(length(treatments))
        if (!any(this == prev)) break
      }
      this
    }, seq.int(nBlocks)[-1], init = sample(length(treatments)), accumulate = TRUE)
    blocks <- do.call(cbind, blocks)
    blocks
    #       [,1] [,2] [,3] [,4] [,5] [,6] [,7]
    #  [1,]    1    3    4    2    8    2    1
    #  [2,]    5    1    2    4    5    7    9
    #  [3,]    9    8    9    3    1    3    5
    #  [4,]    7    9    3    6    7    9    3
    #  [5,]    2    4    8    5    4    1    4
    #  [6,]    4    7    1    9    6    4    2
    #  [7,]    8    6    5    7    2    6    8
    #  [8,]    3    5    6    8    9    5    6
    #  [9,]    6    2    7    1    3    8    7
    

    Here each column is a "block", and each number represents the treatment group assigned to each row. You can see that no rows contain the same group in subsequent columns.

    For instance, the first column ("block 1") will have something from the Treatment 1 group in the first row, Treatment 5 group in row two, etc. Further, inspection will show that all treatments are included in each block column, an inferred requirement of the experimental design.

    (FYI, it is theoretically possible that this will take a while based on the random conditions. Because it repeats per-column, it should be relatively efficient, though. I have no safeguards here for too-long-execution, but I don't think it is required: the conditions here do not lend to a high likelihood of "failure" requiring much repetition.)

    The next step is to convert each of these group numbers into a number from the respective treatment group.

    apply(blocks, 1:2, function(ind) sample(treatments[[ind]], 1))
    #       [,1] [,2] [,3] [,4] [,5] [,6] [,7]
    #  [1,]    6   17   22   11   54   14    3
    #  [2,]   30    3   13   22   33   48   58
    #  [3,]   63   55   61   15    4   21   33
    #  [4,]   49   60   21   36   43   58   21
    #  [5,]   12   25   55   32   27    7   25
    #  [6,]   24   46    4   58   38   28   11
    #  [7,]   53   38   35   49   11   36   56
    #  [8,]   16   29   36   56   63   29   40
    #  [9,]   36    8   47    3   19   50   43
    

    To verify, in the first matrix, our first three rows (block 1) were 1, 5, and 9, which should translate into 1-7, 29-35, and57-63, respectively. "6" is within 1-7, "30" is within 29-35, and "63" is within 59-63. Inspection will show the remainder to be correct.

    Because of the step of determining treatment groups first, it is much simpler to verify/guarantee that you will not repeat treatment groups in a row between two adjacent blocks.


    EDIT

    Rules:

    1. The same treatment group may not be on the same row in adjacent columns; and
    2. The same treatment (not group) may not be in any row in adjacent columns.

    We can use the same methodology as before. Note that as any groups become smaller, the iteration time may increase but I do not expect it likely to get into an infinite loop. (However, if you inadvertently have a group of length 1, then ... this will never end.)

    nBlocks <- 7
    treatments <- list(1:7, 8:14, 15:21, 22:28, 29:35, 36:42, 43:49, 50:56, 57:63)
    
    # helper function for randomized selection of treatments given groups
    func <- function(grp) cbind(grp, sapply(treatments[grp], sample, size = 1))
    set.seed(42)
    func(c(1,3,5))
    #      grp   
    # [1,]   1  1
    # [2,]   3 19
    # [3,]   5 29
    

    And then the same Reduce mindset:

    set.seed(42)
    blocks <- Reduce(function(prev, ign) {
      while (TRUE) {
        this1 <- sample(length(treatments))
        if (!any(this1 == prev[,1])) break
      }
      while (TRUE) {
        this2 <- func(this1)
        if (!any(this2[,2] %in% prev[,2])) break
      }
      this2
    }, seq.int(nBlocks-1), init = func(sample(length(treatments))), accumulate = TRUE)
    blocks <- do.call(cbind, blocks)
    groups <- blocks[, seq(1, by = 2, length.out = nBlocks)]
    treats <- blocks[, seq(2, by = 2, length.out = nBlocks)]
    

    From this, we have two products (though you will likely only care about the second):

    1. The treatment groups, good to verify rule 1 above: no group may be in the same row in adjacent columns:

      groups
      #       grp grp grp grp grp grp grp
      #  [1,]   1   3   1   7   8   5   1
      #  [2,]   5   1   2   8   2   7   3
      #  [3,]   9   8   5   2   1   4   6
      #  [4,]   7   9   6   3   4   8   5
      #  [5,]   2   4   7   9   3   9   4
      #  [6,]   4   7   4   5   7   1   2
      #  [7,]   8   6   9   1   9   6   7
      #  [8,]   3   5   8   6   5   2   9
      #  [9,]   6   2   3   4   6   3   8
      
    2. The treatments themselves, for rule 2 above, where no treatment may be in adjacent columns:

      treats
      #                           
      #  [1,]  7 19  2 47 51 33  3
      #  [2,] 35  4 12 50  8 44 15
      #  [3,] 60 51 35 10  1 22 41
      #  [4,] 43 58 41 21 26 55 31
      #  [5,] 12 24 43 57 17 57 26
      #  [6,] 27 49 26 34 48  6 11
      #  [7,] 53 36 62  6 62 36 47
      #  [8,] 16 33 54 42 32 10 62
      #  [9,] 37  9 15 27 37 18 56
      

    Edit 2:

    Another rule:

    1. Each treatment group must be seen exactly once in each row and column (requiring a square experimental design).

    I think this is effectively generating a sudoku-like matrix of treatment groups, and once that is satisfied, backfill rule #2 (no repeat treatments in adjacent columns). One way (though it is hasty) is suggested by https://gamedev.stackexchange.com/a/138228:

    set.seed(42)
    vec <- sample(9)
    ind <- sapply(cumsum(c(0, 3, 3, 1, 3, 3, 1, 3, 3)), rot, x = vec)
    apply(ind, 1, function(z) all(1:9 %in% z)) # all rows have all 1-9, no repeats
    # [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
    apply(ind, 1, function(z) all(1:9 %in% z)) # ... columns ...
    # [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
    ind
    #       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
    #  [1,]    1    7    8    3    5    2    4    6    9
    #  [2,]    5    2    3    6    9    4    8    1    7
    #  [3,]    9    4    6    1    7    8    3    5    2
    #  [4,]    7    8    1    5    2    3    6    9    4
    #  [5,]    2    3    5    9    4    6    1    7    8
    #  [6,]    4    6    9    7    8    1    5    2    3
    #  [7,]    8    1    7    2    3    5    9    4    6
    #  [8,]    3    5    2    4    6    9    7    8    1
    #  [9,]    6    9    4    8    1    7    2    3    5
    

    This makes a rather fixed-style of random group arrangements given the constraints on groups. Since this is a design of experiments, if you're going to use this method (and proximity between blocks is at all a concern), then you should likely randomize columns and/or rows of the ind matrix before sampling the treatments themselves. (You can do columns and rows, just do them piece-wise, and it should preserve the constraints.)

    ind <- ind[sample(9),][,sample(9)]
    ind
    #       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
    #  [1,]    2    3    8    1    4    7    9    6    5
    #  [2,]    7    8    4    6    2    9    5    3    1
    #  [3,]    1    7    9    4    5    6    3    2    8
    #  [4,]    8    1    6    9    3    4    2    5    7
    #  [5,]    5    2    7    8    9    1    6    4    3
    #  [6,]    3    5    1    7    6    8    4    9    2
    #  [7,]    4    6    3    5    8    2    7    1    9
    #  [8,]    6    9    5    2    1    3    8    7    4
    #  [9,]    9    4    2    3    7    5    1    8    6
    

    From here, we can enact rule 2:

    treatments <- list(1:7, 8:14, 15:21, 22:28, 29:35, 36:42, 43:49, 50:56, 57:63)
    
    mtx <- do.call(rbind, Reduce(function(prev, ind) {
      while (TRUE) {
        this <- sapply(treatments[ind], sample, size = 1)
        if (!any(prev %in% this)) break
      }
      this
    }, asplit(ind, 2)[-1],
    init = sapply(treatments[ind[,1]], sample, size = 1),
    accumulate = TRUE))
    mtx
    #       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
    #  [1,]   11   44    4   52   30   15   23   41   59
    #  [2,]   16   56   49    3   12   33   39   57   27
    #  [3,]   52   24   60   40   46    2   20   29   13
    #  [4,]    1   37   23   63   56   48   32   12   17
    #  [5,]   24   10   30   16   58   39   50    2   47
    #  [6,]   49   57   41   25    6   52   11   17   34
    #  [7,]   59   31   19   14   38   23   47   51    7
    #  [8,]   41   17   11   33   24   61    5   43   54
    #  [9,]   29    4   51   45   20    8   58   28   40