ralgorithmsubset-sum

Select rows where sum of a column equals a fixed value in R


I would like to get one (or all) possible combination of rows where sum of quantity column equals 20

here an example :

structure(list(id = 1:10, quantity = c(11L, 1L, 4L, 12L, 19L, 10L, 3L, 13L, 16L, 14L)), class ="data.frame", row.names = c(NA,-10L))

id quantity
1   11          
2   1           
3   4           
4   12          
5   19          
6   10          
7   3           
8   13          
9   16          
10  14  

desired output (one possible set) :

id quantity
3   4           
7   3           
8   13

or

id quantity
 2  1           
 5  19          

Solution

  • Here is another base R solution by defining a recursive function subsetSum (I guess this would be faster since it avoids checking through all combinations)

    subsetSum <- function(v, target, r = c()) {
        if (sum(r) == target) {
            return(list(r))
        }
        unlist(lapply(seq_along(v), function(k) subsetSum(v[-(1:k)], target, c(r, v[k]))), recursive = FALSE)
    }
    

    Then, when running

    target <- 20
    lst <- subsetSum(setNames(df$quantity, seq(nrow(df))), target)
    res <- Map(function(v) df[as.integer(names(v)), ], lst)
    

    you will get

    > res
    [[1]]
      id quantity
    2  2        1
    3  3        4
    4  4       12
    7  7        3
    
    [[2]]
      id quantity
    2  2        1
    5  5       19
    
    [[3]]
      id quantity
    2  2        1
    7  7        3
    9  9       16
    
    [[4]]
      id quantity
    3  3        4
    7  7        3
    8  8       13
    
    [[5]]
      id quantity
    3  3        4
    9  9       16
    

    If you want only one of the subset sum which reaches the given value, you can try subsetsum from package adagio

    library(adagio)
    target <- 20
    res <- df[subsetsum(df$quantity,target)$inds,]
    

    which gives

    > res
      id quantity
    2  2        1
    5  5       19