rmatrixindicestriangular

How to efficiently generate lower triangle indices of a symmetric matrix


I need to generate lower triangle matrix indices (row and columns pairs). The current implementation is inefficient (memory wise) specially when symmetric matrix gets big (more than 50K rows). Is there a better way?

rows <- 2e+01
id <- which(lower.tri(matrix(, rows, rows)) == TRUE, arr.ind=T)
head(id)

#      row col
# [1,]   2   1
# [2,]   3   1
# [3,]   4   1
# [4,]   5   1
# [5,]   6   1
# [6,]   7   1

Solution

  • Here's another approach:

    z <- sequence(rows)
    cbind(
      row = unlist(lapply(2:rows, function(x) x:rows), use.names = FALSE),
      col = rep(z[-length(z)], times = rev(tail(z, -1))-1))
    

    Benchmarks with larger data:

    library(microbenchmark)
    
    rows <- 1000
    m <- matrix(, rows, rows)
    
    ## Your current approach
    fun1 <- function() which(lower.tri(m) == TRUE, arr.ind=TRUE)
    
    ## An improvement of your current approach
    fun2 <- function() which(lower.tri(m), arr.ind = TRUE)
    
    ## The approach shared in this answer
    fun3 <- function() {
      z <- sequence(rows)
      cbind(
        row = unlist(lapply(2:rows, function(x) x:rows), use.names = FALSE),
        col = rep(z[-length(z)], times = rev(tail(z, -1))-1))
    }
    
    ## Sven's answer
    fun4 <- function() {
      row <- rev(abs(sequence(seq.int(rows - 1)) - rows) + 1)
      col <- rep.int(seq.int(rows - 1), rev(seq.int(rows - 1)))
      cbind(row, col)
    }
    
    microbenchmark(fun1(), fun2(), fun3(), fun4())
    # Unit: milliseconds
    #    expr       min        lq   median       uq       max neval
    #  fun1() 77.813577 85.343356 90.60689 95.71648 130.40059   100
    #  fun2() 73.812204 82.103600 85.87555 90.59235 138.66547   100
    #  fun3()  9.016237  9.382506 10.63291 13.20085  55.42137   100
    #  fun4() 20.591863 24.999702 28.82232 31.90663  65.05169   100