rmixed-integer-programminglpsolve

Large mixed integer programming in R - possible to solve?


I would like to solve a large mixed integer programming problem, and I have tried with R, package lpSolveAPI. The problem is large - 410 variables each of which can be either 0 or 1, and about 49422 constraints. I have tried to let it run for 1,5 days, but once I try to stop it, it says that R needs to be terminated. The same happens if I let it run for short time, say 15 minutes, and then try to stop it by clicking on the red button. Since this happens, I am not sure whether there is something wrong with my computer or whether such problem is way too large for a computer. When it runs, it uses maybe 20% of the CPU power and about 70% of memory. My computer is a 2022 Lenovo X1 Yoga with i7 2.80GHz processor and 16GB of ram.

The problem itself is constructed in the following way:

library(lpSolveAPI)
#A has length of 410. No constraints to begin with
lprec <- make.lp(0, 410)
set.objfn(lprec, round(A,0))
lp.control(lprec, sense="max")
set.type(lprec,c(1:A),"binary")

#Defining the constraints with a for loop. Will not go into details, but it adds 49422 constraints

for (){
...
      add.constraint(lprec, MyConstraint, "<=", 1)
...
}

lprec
#This says: Model name: a linear program with 410 decision variables and 49422 constraints

solve(lprec)

The vector "MyConstraint" is different in every iteration, but it has length 410 where 408 elements are 0 and two elements are 1.

That is, I have 410 objects in total, and I want to choose a set of those objects (1 if an object chosen and 0 otherwise) such that the objective function is maximized. However, some pairs of objects are not allowed, and so each of the 49422 constraints specifies which two objects cannot be chosen chosen at once: each constraint says that the sum cannot be above 1.

So, my question is if there is any way to solve this? If not, how large can such problem be in order to be solvable?

Thank you!

EDIT: ---------------------------------------------

In the comments I was asked to provide an example, so here is it. A similar, but much smaller problem. Suppose we have 7 different objects, and these can allocated into 5 groups. Let us define the groups and the associated savings denoted by A:

MyGroups <- c(1,0,0,0,1,0,0,
              0,0,1,1,0,0,0,
              0,0,1,0,0,1,0,
              0,0,0,1,0,1,0,
              0,0,1,1,0,1,0)
MyGroups <- matrix(MyGroups,nrow=5,ncol=7,byrow=TRUE)
rownames(MyGroups) <- paste0("Group", 1:5)
colnames(MyGroups) <- paste0("Object", 1:7)

A=c(50,30,100,100,200)

That is, group 1 consists of Object 1 and Object 5 (denoted by the first row in the matrix MyGroups). Such a group will give a saving of 50. Objective: to maximize the total saving by choosing the right groups. Problem: each object can only be a part of one group. For example, if group 2 is implemented, then group 3 cannot be implemented, since both groups require object 3. Here we see that the optimal solution is to choose Group 1 and Group 5, which will give a total saving of 50+200=250. I want to be able to find this for a bigger problem. So, first I can create a matrix with constraints where specifies which 2 groups cannot be implemented at the same time.

lprec2 <- make.lp(0, 5)
set.objfn(lprec2, A)
lp.control(lprec2, sense="max")
set.type(lprec2,c(1:5),"binary")

#Defining the constraints
for (i in 1:(5-1)){
  for (j in (i+1):5) {
    
    if(max(colSums(MyGroups[c(i,j),]))>1){
      #group i and group j cannot be together. Add constraint
      MyConstraint=integer(5)
      MyConstraint[c(i,j)]=1
      add.constraint(lprec2, MyConstraint, "<=", 1)
    }
  }
}

lprec2

This gives the following mixed integer problem: The final problem

When I solve it, then the solution is:

solve(lprec2)
get.objective(lprec2)
get.variables(lprec2)

Which gives 250 and (1 0 0 0 1) respectively.

In the original problem I have 410 possible groups, implying 410 decision variables. The number of constraints is 49422, but in all rows there are exactly two 1 and the remaining are 0.

If you could help me to solve such a problem, I would be happy :-). Thanks!


Solution

  • Here is the model formulated using ompr:

    MyGroups <- c(1,0,0,0,1,0,0,
                  0,0,1,1,0,0,0,
                  0,0,1,0,0,1,0,
                  0,0,0,1,0,1,0,
                  0,0,1,1,0,1,0)
    MyGroups <- matrix(MyGroups,nrow=5,ncol=7,byrow=TRUE)
    
    ngroups <- nrow(MyGroups)
    nobjects <- ncol(MyGroups)
    
    coeffs <- c(50, 30, 100, 100, 200)
    
    model <- MIPModel() %>%
      add_variable(group[i], i=1:ngroups, type = 'binary') %>% 
      add_variable(assign[i, j], i=1:ngroups, j=1:nobjects, type = 'binary', MyGroups[i, j] == 1) %>% 
      set_objective(sum_over(coeffs[i] * group[i], i=1:ngroups, sense = 'max')) %>% 
      add_constraint(sum_over(assign[i, j], i=1:ngroups, MyGroups[i, j] == 1) <= 1, j=1:nobjects) %>%
      add_constraint(assign[i, j] == group[i], i=1:ngroups, j=1:nobjects, MyGroups[i, j] == 1) %>% 
      add_constraint(sum_over(group[i], i=1:ngroups) <= 2)
    
    result <- solve_model(model, with_ROI("glpk", verbose = TRUE))
    result
    <SOLVER MSG>  ----
    GLPK Simplex Optimizer, v4.47
    16 rows, 16 columns, 35 non-zeros
    *     0: obj =  0.000000000e+000  infeas = 0.000e+000 (11)
    *    12: obj =  2.500000000e+002  infeas = 0.000e+000 (3)
    OPTIMAL SOLUTION FOUND
    GLPK Integer Optimizer, v4.47
    16 rows, 16 columns, 35 non-zeros
    16 integer variables, all of which are binary
    Integer optimization begins...
    +    12: mip =     not found yet <=              +inf        (1; 0)
    +    13: >>>>>  2.500000000e+002 <=  2.500000000e+002   0.0% (1; 0)
    +    13: mip =  2.500000000e+002 <=     tree is empty   0.0% (0; 1)
    INTEGER OPTIMAL SOLUTION FOUND
    <!SOLVER MSG> ----
    result
    Status: success
    Objective value: 250
    

    ompr is a model management wrapper around the ROI package. It using an algebraic paradigm like GAMS or AMPL but has less embedded logic to simplify the syntax. Although with ompr, you can test other solvers that ROI offers as plug-ins: http://roi.r-forge.r-project.org/

    Some are free, others like Mosek, CPLEX and Gurobi are commercial products. Suggest running a large subset problem and checking the relative performance of the different solvers.

    Also note that your toy problem is degenerate. Group(1, 3, 4) is also a solution. I added an additional constraint that can limit the number of groups selected. If your objective function coefficients are integer values the formulation may have many degenerate solutions, a simple test is to add a small random epsilon to each of the coefficients to eliminate degeneracy and see if that improves performance.