r, dplyr, rep

Is there a simpler version of renaming columns with alternating patterns? Or tidyverse methods?


My Data

I am working with the data frame below:

data1 <- structure(list(V1 = c(3L, 3L, 3L, 2L, 4L, 1L), V2 = c(1L, 1L, 
1L, 1L, 1L, 1L), V3 = c(2L, 2L, 2L, 1L, 3L, 2L), V4 = c(2L, 2L, 
3L, 1L, 1L, 1L), V5 = c(3L, 3L, 4L, 1L, 3L, 3L), V6 = c(3L, 3L, 
4L, 3L, 3L, 3L), V7 = c(2L, 2L, 1L, 1L, 3L, 3L), V8 = c(3L, 3L, 
4L, 4L, 3L, 3L), V9 = c(3L, 3L, 3L, 2L, 3L, 3L), V10 = c(2L, 
2L, 1L, 1L, 1L, 1L)), row.names = c(NA, 6L), class = "data.frame")

It looks like this:

  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1  3  1  2  2  3  3  2  3  3   2
2  3  1  2  2  3  3  2  3  3   2
3  3  1  2  3  4  4  1  4  3   1
4  2  1  1  1  1  3  1  4  2   1
5  4  1  3  1  3  3  3  3  3   1
6  1  1  2  1  3  3  3  3  3   1

Solution So Far

The best code I have come up with for renaming the variables quickly is this:

new_names <- outer("cope",
                   1:10, 
                   paste, 
                   sep="_")
names(data1) <- new_names
data1

Which gives me this data frame:

  cope_1 cope_2 cope_3 cope_4 cope_5 cope_6 cope_7 cope_8 cope_9 cope_10
1      3      1      2      2      3      3      2      3      3       2
2      3      1      2      2      3      3      2      3      3       2
3      3      1      2      3      4      4      1      4      3       1
4      2      1      1      1      1      3      1      4      2       1
5      4      1      3      1      3      3      3      3      3       1
6      1      1      2      1      3      3      3      3      3       1

Question

While this serves my purpose well enough, it has made me consider two questions for the future. First, is there a way to simplify the code to a single line? Ideally it would work within dplyr, because that is what I am most accustomed to.

Second, I foresee issues if there are, say, 30 variables, some sharing a repeating pattern and some unique. What is the most time-efficient way to rename variables like these? I know rep is one option, but I only know how to use it to repeat values, not how to split them into several patterns. I'm thinking of something like the vector below, which would be easier to write with some kind of pattern and stopping points:

names <- c("v1","v2","v3","c1","c2","c3","u","p","z1","z2")

For example:

names <- c("v1","v2","v3","c1","c2","c3","u","p","z1","z2")
colnames(data1) <- names
data1

  v1 v2 v3 c1 c2 c3 u p z1 z2
1   3  1  2  2  3  3 2 3  3  2
2   3  1  2  2  3  3 2 3  3  2
3   3  1  2  3  4  4 1 4  3  1
4   2  1  1  1  1  3 1 4  2  1
5   4  1  3  1  3  3 3 3  3  1
6   1  1  2  1  3  3 3 3  3  1
7   3  1  3  1  3  2 2 2  3  2
8   3  2  1  2  3  2 3 3  2  1
9   3  2  4  1  2  4 2 3  4  1
10  4  2  4  2  3  4 3 3  4  1

This is time-consuming if you spell it out manually:

names <- c("cope_1", "cope_2","cope_3","sad_1","sad_2","sad_3","u","p","zip_1","zip_2")
colnames(data1) <- names
data1

Which does get you what you want, yet slowly:

  cope_1 cope_2 cope_3 sad_1 sad_2 sad_3 u p zip_1 zip_2
1      3      1      2     2     3     3 2 3     3     2
2      3      1      2     2     3     3 2 3     3     2
3      3      1      2     3     4     4 1 4     3     1
4      2      1      1     1     1     3 1 4     2     1
5      4      1      3     1     3     3 3 3     3     1
6      1      1      2     1     3     3 3 3     3     1

And something like outer doesn't seem to fit here:

outer("cope",
      1:3,
      paste,
      sep="_",
      "sad",
      1:3,
      paste,
      sep="_",
      "u",
      "p")

So if there is a better way of naming chunks of variables like this, that would be great to know.


Solution

  • If you have a vector x with the names and a vector r with the number of replications, then you could do:

    x <- c("v", "c", "u", "p", "z")
    r <- c(3L, 3L, 1L, 1L, 3L)
    
    f <- function(n) if (n > 1L) seq_len(n) else character(n)
    paste0(rep(x, r), unlist(lapply(r, f)))
    ## [1] "v1" "v2" "v3" "c1" "c2" "c3" "u"  "p"  "z1" "z2" "z3"
    

    If you are fine with "u1" and "p1", then you can simplify a bit:

    paste0(rep(x, r), unlist(lapply(r, seq_len)))
    ## [1] "v1" "v2" "v3" "c1" "c2" "c3" "u1" "p1" "z1" "z2" "z3"
    

    There is also base R's make.unique. It reads more naturally, but it numbers only the duplicates (the first occurrence of each name stays unnumbered), so it doesn't quite give you what you want:

    make.unique(rep(x, r), sep = "")
    ## [1] "v"  "v1" "v2" "c"  "c1" "c2" "u"  "p"  "z"  "z1" "z2"
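
To tie this back to the names in the question, here is a rough sketch that wraps the rep/paste0 idea in a small helper and applies it to a data frame. The helper name make_names, its sep argument, and the toy data frame raw are all made up for illustration; the group sizes are chosen to match the 10 columns in the question. The dplyr line assumes dplyr 1.0.0 or later, where rename_with is available, and is skipped if dplyr is not installed.

```r
# Hypothetical helper (not an existing function): build column names from
# prefixes and group sizes. Groups of size 1 get no numeric suffix.
make_names <- function(prefix, times, sep = "_") {
  stopifnot(length(prefix) == length(times))
  suffix <- unlist(lapply(times, function(n) {
    if (n > 1L) paste0(sep, seq_len(n)) else character(n)
  }))
  paste0(rep(prefix, times), suffix)
}

# Toy stand-in for the question's data: 10 columns named V1..V10.
raw <- as.data.frame(matrix(1:20, nrow = 2))

new_names <- make_names(c("cope", "sad", "u", "p", "zip"),
                        c(3, 3, 1, 1, 2))
new_names
## [1] "cope_1" "cope_2" "cope_3" "sad_1"  "sad_2"  "sad_3"  "u"      "p"
## [9] "zip_1"  "zip_2"

# Base R: rename in one line.
data1 <- setNames(raw, new_names)

# dplyr alternative (assumption: dplyr >= 1.0.0 is installed).
if (requireNamespace("dplyr", quietly = TRUE)) {
  data2 <- dplyr::rename_with(raw, ~ new_names)
}

# The single-prefix case from the first question also fits on one line.
cope_only <- setNames(raw, paste0("cope_", seq_along(raw)))
```

The helper only generates the character vector, so it works equally well with names<-, setNames, or rename_with; which one to use is mostly a matter of taste.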