pythonoptimizationcvxpyconvex-optimization

Two matrix multiplication operations in CVXPY gives UNKNOWN curvature and makes the problem not DCP


I want to solve the following optimization problem using cvxpy:

Maximize(y @ A @ x)

where A is a (n,m)-size matrix of positive valued elements; x is boolean vector of is length m; y is boolean vector is length n.

The constraints are: sum(x)==1 and sum(y)==1

For example, if A = [[ 87, 96, 127, 46], [155, 166, 92, 11], [111, 163, 126, 112]] then the solution is: x = [0,1,0,0] y = [0,1,0] resulting in: y @ A @ x = 166, which is the largest value given the constraints.

However, cvxpy throws an error and says:

DCPError: Problem does not follow DCP rules. Specifically: The objective is not DCP. Its following subexpressions are not: var27 @ [[ 87. 96. 127. 46.] [155. 166. 92. 11.] [111. 163. 126. 112.]] @ var26

The expression y @ A @ x is evaluated to Expression(UNKNOWN, NONNEGATIVE, ()), suggesting the curvature is unknown even though A @ x and y @ A are both AFFINE and NONNEGATIVE.

What is the workaround for this seemingly simple problem to make it work using cvxpy?

My code is:

import cvxpy as cp
import numpy as np

rows = 3
cols = 4

A = np.random.randint(0,255,size=(rows,cols))
x =  cp.Variable(cols, boolean=True)
y =  cp.Variable(rows, boolean=True)

objective = cp.Maximize(y @ A @ x)
constraints = [cp.sum(x)==1,cp.sum(y)==1]
prob = cp.Problem(objective,constraints)

prob.solve()

I tried doing the following to give an attribute to the randint matrix:

a = np.random.randint(0,255,size=(rows,cols))
A = cp.Variable((rows,cols),nonneg=True)
A.value = A.project(a)

I tried making it a square matrix with PSD=True. Tried, pos=True. I also tried setting prob.solve(qcp=True). All of these still resulted in the above error message about DCP.


Solution

  • I can linearize the problem by solving for a single variable w that encodes the values of x and y. Since the constraint for both x and y is that there is only 1 non-zero element and y @ A @ x only chooses 1 element from matrix A - then we can solve for A_flat @ w, where A_flat is the flattened matrix and the constraint for w is that there is only one non-zero element. In this way the problem can be solved using cvxpy. Solving for the example in the question:

    A = np.asarray([[87,  96, 127,  46], [155, 166,  92,  11], [111, 163, 126, 112]])
    Af = cp.vec(A) #Flattened into vector array
    
    w = cp.Variable(A.size, boolean=True) #encodes x and y boolean array values
    
    objective = cp.Maximize(Af @ w) #Equivalent to y @ A @ x
    
    constraints = [cp.sum(w)==1] #constraint boolean array to only have only 1 non-zero value
    
    prob = cp.Problem(objective,constraints)
    prob.solve() #solve using cvxpy
    
    xy = np.reshape(w.value,(A.shape[0],A.shape[1]),order="F")
    index = np.argwhere(xy==1)
    x = xy[index[0][0],:] #solution for the x boolean array
    y = xy[:,index[0][1]] #solution for the y boolean array
    

    Then the result is:

    A @ w = 166.0
    y @ A @ x = 166.0
    

    The new formulation is equal to the old one so the problem is solved.

    Edit: Shorter method

    Similar to above, but we can reduce two lines of code by having variable w the same shape as matrix A and changing the form of the objective function to cp.sum(cp.multiply(A,w)):

    w = cp.Variable(A.shape, boolean=True)
    
    objective = cp.Maximize(cp.sum(cp.multiply(A,w)))
    
    constraints = [cp.sum(w)==1] #constraint boolean array to only have only 1 non-zero value
    
    prob = cp.Problem(objective,constraints)
    prob.solve() #solve using cvxpy
    
    index = np.argwhere(w.value==1)
    x = w.value[index[0][0],:] #solution for the x boolean array
    y = w.value[:,index[0][1]] #solution for the y boolean array