rstatisticspivot-tabletapplytabulate

Create a two-way table in R base that displays concatenated game results between players


Objective To create a function in R {base} that produces a square matrix showing concatenated game results between players. The matrix should display the result of each pairwise match, concatenating results if multiple matches occurred. The desired output is a custom string representation of game results and not a standard statistical summary.

# Example: 
#   player 1 draws player 4  (column 1)
#   player 2 beats player 3  (colums 2)
#   player 3 loses to player (column 3)
#
hch <- intToUtf8(0x00BD)  # half character   
idx <-
  data.frame( player   = c(  1, 2, 3,  4,   1, 2, 3, 4,   1, 2,  3, 4,  4, 1 ),
              opponent = c(  4, 3, 2,  1,   2, 1, 4, 3,   3, 4,  1, 2,  1, 4 ),
              result   = c(hch, 1, 0,hch,   0, 0, 1, 0,  NA, 1, NA, 0,  1, 0 )
            )

How to convert the above data into a cross table as below:
Desired output

#     [,1] [,2] [,3] [,4] 
# [1,] x    0    .    ½0  
# [2,] 0    x    1    1   
# [3,] .    0    x    1   
# [4,] ½1   0    0    x   

If I change ‘½’ to .5 then gives xtabs(result ~ player + opponent, idx) a matrix with a summary result, namely the sum of the players'' results among themselves.

#     opponent
#player   1   2   3   4
#     1 0.0 0.0     0.5
#     2 0.0 0.0 1.0 1.0
#     3     0.0 0.0 1.0
#     4 1.5 0.0 0.0 0.0

I do not understand how to adopt the standard aggregation functions for this specific string concatenation requirement.

I tried the following rather cumbersome coding:

# Create a cross table from an index data frame (idx)
# If players meet multiple times, concatenate the results
crosstab <- matrix(nrow = 4, ncol = 4, NA )     
for (i in  seq_along(idx[,1]) ) {
  if (is.na(crosstab[idx[i, 1], idx[i, 2]])) {
    crosstab[idx[i, 1], idx[i, 2]] <- idx[i, 3]
  } else {
    crosstab[idx[i, 1], idx[i, 2]] <- paste0(crosstab[idx[i, 1], idx[i, 2]], idx[i, 3])
  }
}
print(head(crosstab, 20), quote = FALSE, na.print=".") 

I am looking for a more concise approach using aggregate functions like tabulate, xtabs

Edit

After the answer of @Onyambu, I started looking for a function to convert the three column data frame directly to a matrix. And found: How to convert dataframe with 3 columns into matrix in R.

tapply(idx$result, idx[1:2], FUN = paste0, collapse=‘’)

Which gives:

#       opponent
# player 1    2   3    4   
#      1 NA   "0" "NA" "½0"
#      2 "0"  NA  "1"  "1" 
#      3 "NA" "0" NA   "1" 
#      4 "½1" "0" "0"  NA

Note that without the collapse = parameter, the result becomes quite different:

#tapply(idx$result, idx[1:2], FUN = paste)
#      opponent
#player 1           2    3    4          
#     1 NULL        "0"  "NA" character,2
#     2 "0"         NULL "1"  "1"        
#     3 "NA"        "0"  NULL "1"        
#     4 character,2 "0"  "0"  NULL

Depending on the details, some preprocessing (remove NA's) or clean-up (after) is needed.


Solution

  • Try the following:

    
    fn <- function(x){
      x <- ifelse(is.na(x), '.', x)
      paste0(iconv(x, "latin1", "UTF-8"), collapse = "")
    }
    
    # you could also use fn2 instead of fn
    fn2 <- function(x){
      x <- ifelse(is.na(x), '.', x)
      paste0(stringi::stri_unescape_unicode(encodeString(x)), collapse = "")
    }
    library(tidyverse)
    idx %>%
      arrange(opponent) %>%
      pivot_wider(id_cols = player, names_from = opponent,
                  values_from = result, 
                  values_fn = fn, values_fill = 'x') %>% 
      arrange(player) %>%
      column_to_rownames('player') %>%
      as.matrix() %>%
      matrix(nrow = nrow(.),
             dimnames = list(player = rownames(.), opponent = colnames(.)))
    
          opponent
    player 1    2   3   4   
         1 "x"  "0" "." "½0"
         2 "0"  "x" "1" "1" 
         3 "."  "0" "x" "1" 
         4 "½1" "0" "0" "x"
    

    Base R:

    fn <- function(x){
      x <- ifelse(is.na(x), '.', x)
      paste0(iconv(x, "latin1", "UTF-8"), collapse = "")
    }
    
    a <- aggregate(result~., idx, fn)
    b <- matrix(".", max(a$player), max(a$opponent))
    b[as.matrix(a[-3])] <- a$result
    diag(b) <- "x"
    dimnames(b) <- lapply(a[-3], \(x)sort(unique(x)))
    b
          opponent
    player 1    2   3   4   
         1 "x"  "0" "." "½0"
         2 "0"  "x" "1" "1" 
         3 "."  "0" "x" "1" 
         4 "½1" "0" "0" "x"