arraysrderivativematrix-indexing

Clean way to compute jacobian of array summation


I'm doing some optimization in R and in connection with that I need to write a function that returns a jacobian. It's a very simple jacobian -- just zeros and ones -- but I'd like to populate it quickly and cleanly. My current code works but is very sloppy.

I have a four-dimensional array of probabilities. Index the dimensions by i, j, k, l. My constraint is that, for each i, j, k, the sum of probabilities over index l must equal 1.

I compute my constraint vector like this:

get_prob_array_from_vector <- function(prob_vector, array_dim) {
    return(array(prob_vector, array_dim))
}

constraint_function <- function(prob_vector, array_dim) {
    prob_array <- get_prob_array_from_vector(prob_vector, array_dim)
    prob_array_sums <- apply(prob_array, MARGIN=c(1, 2, 3), FUN=sum)
    return(as.vector(prob_array_sums) - 1)  # Should equal zero
}

My question is: what is a clean, fast way of computing the jacobian of as.vector(apply(array(my_input_vector, array_dim), MARGIN=c(1, 2, 3), FUN=sum)) -- i.e., my constraint_function in the code above -- with respect to my_input_vector?

Here is my sloppy solution (which I check for correctness against the jacobian function from the numDeriv package):

library(numDeriv)

array_dim <- c(5, 4, 3, 3)

get_prob_array_from_vector <- function(prob_vector, array_dim) {
    return(array(prob_vector, array_dim))
}

constraint_function <- function(prob_vector, array_dim) {
    prob_array <- get_prob_array_from_vector(prob_vector, array_dim)
    prob_array_sums <- apply(prob_array, MARGIN=c(1, 2, 3), FUN=sum)
    return(as.vector(prob_array_sums) - 1)
}

constraint_function_jacobian <- function(prob_vector, array_dim) {
    prob_array <- get_prob_array_from_vector(prob_vector, array_dim)
    jacobian <- matrix(0, Reduce("*", dim(prob_array)[1:3]), length(prob_vector))
    ## Must be a faster, clearner way of populating jacobian
    for(i in seq_along(prob_vector)) {
        dummy_vector <- rep(0, length(prob_vector))
        dummy_vector[i] <- 1
        dummy_array <- get_prob_array_from_vector(dummy_vector, array_dim)
        dummy_array_sums <- apply(dummy_array, MARGIN=c(1, 2, 3), FUN=sum)
        jacobian_row_idx <- which(dummy_array_sums != 0, arr.ind=FALSE)
        stopifnot(length(jacobian_row_idx) == 1)
        jacobian[jacobian_row_idx, i] <- 1
    }  # Is there a fast, readable one-liner that does the same as this for loop?
    stopifnot(sum(jacobian) == length(prob_vector))
    stopifnot(all(jacobian == 0 | jacobian == 1))
    return(jacobian)
}

## Example of a probability array satisfying my constraint
my_prob_array <- array(0, array_dim)
for(i in seq_len(array_dim[1])) {
    for(j in seq_len(array_dim[2])) {
        my_prob_array[i, j, , ] <- diag(array_dim[3])
    }
}
my_prob_array[1, 1, , ] <- 1 / array_dim[3]
my_prob_array[2, 1, , ] <- 0.25 * (1 / array_dim[3]) + 0.75 * diag(array_dim[3])

my_prob_vector <- as.vector(my_prob_array)  # Flattened representation of my_prob_array
should_be_zero_vector <- constraint_function(my_prob_vector, array_dim)
is.vector(should_be_zero_vector)
all(should_be_zero_vector == 0)  # Constraint is satistied

## Check constraint_function_jacobian for correctness using numDeriv
jacobian_analytical <- constraint_function_jacobian(my_prob_vector, array_dim)
jacobian_numerical <- jacobian(constraint_function, my_prob_vector, array_dim=array_dim)
max(abs(jacobian_analytical - jacobian_numerical))  # Very small

My functions take prob_vector as input -- i.e., a flattened representation of my probability array -- because optimization functions require vector arguments.


Solution

  • Spend some time to understand what you were trying to do, but here is a proposition to replace your constraint_function_jacobian:

    enhanced <- function(prob_vector, array_dim) {
      firstdim <- Reduce("*", array_dim[1:3])
      seconddim <- length(prob_vector)
      jacobian <- matrix(0, firstdim, seconddim)
      idxs <- split(1:seconddim, cut(1:seconddim, array_dim[4], labels=FALSE))
      for (i in seq_along(idxs)) {
        diag(jacobian[, idxs[[i]] ]) <- 1
      }
      stopifnot(sum(jacobian) == length(prob_vector))
      stopifnot(all(jacobian == 0 | jacobian == 1))
      jacobian
    }
    

    Unless I'm wrong, the jacobian construction is filling diagonals with 1, as it is not a square matrix we have to split it on array_dim[4] square matrix to fill up their diagonals with 1.

    I did get rid of the transformation of prob_vector to an array to then get its dim as it will be the same as array_dim, skipping this step is not a huge improvement but it simplify the code IMO.

    Results are ok according to test:

    identical(constraint_function_jacobian(my_prob_vector, array_dim),
              enhanced(my_prob_vector, array_dim))
    # [1] TRUE
    

    According to benchmark it gives a great speedup:

    microbenchmark::microbenchmark(
      original=constraint_function_jacobian(my_prob_vector, array_dim),
      enhanced=enhanced(my_prob_vector, array_dim), times=100)
    
    # Unit: microseconds
    #     expr       min        lq      mean     median         uq       max neval cld
    # original 16946.979 18466.491 20150.304 19066.7410 19671.4100 28148.035   100   b
    # enhanced   678.222   737.948   799.005   796.3905   834.5925  1141.773   100  a