
Replicate a range of values if a condition is satisfied


I am trying to add a column (Transaction) to the sample data frame below. The first row should start at 1, and the values should cycle through the sequence 1:3, starting over at 1 after every 3. The exception is a row labeled "New" in the Index column: there the sequence restarts at 1 immediately, wherever it is in the cycle. The "New" labels occur at random positions throughout the data frame (150,000+ rows). I have tried rep() and ifelse() in various combinations with little success. The Transaction column is currently empty; the values shown below are the desired output. Thank you in advance!

    Index  Transaction
           1
           2
           3
           1
           2
           3
    New    1
           2
    New    1
           2
           3
           1
    New    1
           2

Solution

  • Here is a first attempt:

    library(tidyverse)
    
    # Creating the data frame: 
    df <- data.frame(index = rep("", 14))
    df[c(7,9,13), 'index'] <- 'New'
    
    
    # Defining a run index:
    df$run <- cumsum(df$index == "New")
    
    
    # Within each run, number the rows and wrap the count into 1..3:
    df %>% 
      group_by(run) %>% 
      mutate(Transaction = (row_number() - 1) %% 3 + 1) %>%
      ungroup() %>%
      select(-run)
    
      # A tibble: 14 x 2
       index Transaction
       <chr>       <dbl>
     1 ""              1
     2 ""              2
     3 ""              3
     4 ""              1
     5 ""              2
     6 ""              3
     7 "New"           1
     8 ""              2
     9 "New"           1
    10 ""              2
    11 ""              3
    12 ""              1
    13 "New"           1
    14 ""              2
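
For comparison, the same idea (a run counter that increases at every "New", then a 1..3 cycle inside each run) can be sketched in base R without dplyr. This is a minimal sketch on the same 14-row sample, not a tuned solution for the full 150,000+ rows; `pos` is a name introduced here for illustration, and `ave()` stands in for the `group_by()`/`mutate()` step.

```r
# Rebuild the sample data: 14 rows, with "New" at rows 7, 9 and 13.
df <- data.frame(index = rep("", 14), stringsAsFactors = FALSE)
df[c(7, 9, 13), "index"] <- "New"

# Every "New" starts a new run, so the cumulative count of "New" labels the runs.
run <- cumsum(df$index == "New")

# pos (illustrative name): position of each row within its run (1, 2, 3, 4, ...).
pos <- ave(seq_along(run), run, FUN = seq_along)

# Wrap the position into the cycle 1, 2, 3, 1, 2, 3, ...
df$Transaction <- (pos - 1) %% 3 + 1

print(df$Transaction)
# 1 2 3 1 2 3 1 2 1 2 3 1 1 2
```

The expression `(pos - 1) %% 3 + 1` avoids the special case for multiples of 3: shifting to a zero-based position before taking the remainder maps positions 3, 6, 9, ... to 3 instead of 0.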