rdplyrtibble

Mutate a column based on pairing pattern


As said in my title, I've got a database of fighters' statistics where the names alternate. My database illustrates what I mean :

Database <- c("Fighter A", "Fighter B", "Fighter A", "Fighter  B",
                   "Fighter A", "Fighter B", "Fighter A", "Fighter B",
                   "Fighter  C", "Fighter  D", "Fighter  C", "Fighter  D", 
                   "Fighter  A", "Fighter C")

I want to add a column based on the observations in the fighters' names column. In my example above I would like a round column built as follows:

Database <- mutate(round = c(0, 0, 1, 1, 2, 2, 3, 3, 0, 0, 1, 1, 0, 0)

If the pair (fighter A and fighter B for example) is repeated, the round will receive 0, 0, 1, 1 as many times as the pattern is repeated. If the pattern is broken (fighter C and fighter D) then the round restarts at 0. It's a way to count the number of rounds where round = 0 is the average of the following rounds. In all cases, the round will always have 0, 0, 1, 1 values.

Here's what I tried without success:

fighters_name <- c("Combattant A", "Combattant B", "Combattant A", "Combattant B",
                   "Combattant A", "Combattant B", "Combattant A", "Combattant B",
                   "Combattant C", "Combattant D", "Combattant C", "Combattant D", 
                   "Combattant A", "Combattant C")


df <- tibble(fighters_name)

df <- df %>%
  mutate(pair_id = cumsum(fighters_name != lag(fighters_name, default = fighters_name[1]))) %>%
  group_by(pair_id) %>%
  mutate(round = (row_number() - 1) %% 2) %>%
  ungroup() %>%
  select(-pair_id)

If someone has an idea to build the round column I would be grateful. Thank you.


Solution

  • Possible option:

    library(tidyverse)
    
    Database <- c(
      "Fighter A", "Fighter B", "Fighter A", "Fighter  B",
      "Fighter A", "Fighter B", "Fighter A", "Fighter B",
      "Fighter  C", "Fighter  D", "Fighter  C", "Fighter  D",
      "Fighter  A", "Fighter C"
    )
    
    Database |> 
      as_tibble() |> 
      mutate(
        pairing = (row_number() + 1) %/% 2,
        value = str_squish(value)
        ) |> 
      summarise(fighters = str_c(value, collapse = "|"), .by = pairing) |> 
      mutate(round = row_number() - 1, .by = fighters) |> 
      separate_longer_delim(fighters, delim = "|")
    #> # A tibble: 14 × 3
    #>    pairing fighters  round
    #>      <dbl> <chr>     <dbl>
    #>  1       1 Fighter A     0
    #>  2       1 Fighter B     0
    #>  3       2 Fighter A     1
    #>  4       2 Fighter B     1
    #>  5       3 Fighter A     2
    #>  6       3 Fighter B     2
    #>  7       4 Fighter A     3
    #>  8       4 Fighter B     3
    #>  9       5 Fighter C     0
    #> 10       5 Fighter D     0
    #> 11       6 Fighter C     1
    #> 12       6 Fighter D     1
    #> 13       7 Fighter A     0
    #> 14       7 Fighter C     0
    

    Created on 2024-05-03 with reprex v2.1.0