Tags: python, scipy, gradient, minimize, hessian

Large differences between the results of linear_sum_assignment and minimize when using jac and hess


Dear stackoverflow users, I am having some trouble with linear assignment problems using the functions linear_sum_assignment and minimize, both from the scipy library. Before moving on to more complex problems, I first need to check that the optimal result given by minimize is the same as the one from linear_sum_assignment. However, the results I obtain are quite different when I use a gradient function or a Hessian.

The assignment problem is well described in this section of the Wikipedia page on the assignment problem.

The function linear_sum_assignment simply takes a cost (or weight) matrix as a 2-D array. It assumes that the variables are also stored in matrix form, let's call it X, and it deals internally with the bounds and restrictions on these variables: each row and each column of X must sum to one, and every entry of X is either 0 or 1.
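
In formulas (LaTeX notation, with C the cost matrix and X the assignment matrix), the problem reads:

\min_X \sum_{i,j} C_{ij} X_{ij}
\quad \text{subject to} \quad
\sum_j X_{ij} = 1 \;\; \forall i, \qquad
\sum_i X_{ij} = 1 \;\; \forall j, \qquad
X_{ij} \in \{0, 1\}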

When I try to obtain the same result using the minimize function, I get very different results when I use a gradient and/or a Hessian.

The following code runs the optimization of an assignment problem using both functions:

import numpy as np
from scipy.optimize import linear_sum_assignment, minimize, LinearConstraint 

# example of cost matrix
cost = np.array(((54,54,51,53),(51,57,52,52),(50,53,54,56),(56,54,55,53)))

# number of variables:
n = 4 # matrix size
n2 = n*n # the real number of variables

# OPTIMAL ASSIGNMENT USING linear_sum_assignment:
row_ind, col_ind = linear_sum_assignment(cost)
X = np.zeros_like(cost)
X[row_ind, col_ind] = 1
# print(X)

# OPTIMAL ASSIGNMENT USING minimize:

# function definition
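# note: the objective f(x) = c.dot(x) is linear in x, so its gradient is the
# constant vector c and its Hessian is the (n2 x n2) zero matrix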
fun  = lambda x, c: c.dot(x)
jac  = lambda x, c: c
hess = lambda x, c: np.zeros((len(c), len(c)))

# argument:
cost2 = cost.flatten().astype(float)

# matrix of the linear constraints:
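# the first n rows force each row of X (entries x[i*n:(i+1)*n]) to sum to 1,
# the last n rows force each column of X (entries x[i::n]) to sum to 1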
A = np.zeros((2*n,n2))
for i in range(n):
    A[i, i*n:(i+1)*n] = 1
    A[n+i, i::n] = 1
# print(A)

# array of ones for the rhs of the equality constraints:
ones = np.ones(2*n)

# linear constraints:
lin_constraints = LinearConstraint(A, lb=ones, ub=ones)

# bounds of variables:
bounds = [(0,1)]*n2

# initial condition:
x0 = np.eye(n).flatten()

# call to minimize:
res = minimize(fun, x0, args=(cost2,), constraints=[lin_constraints], bounds=bounds,
                   method='trust-constr', jac=jac, hess=hess)

# COMPARE RESULTS
X2 = res.x.reshape((n,n))
print(np.abs(X2-X).max())

I get the best results when I call minimize with jac=None, hess=None: the error is of the order of 1.5e-7. However, when I use the Hessian, the error increases to 8e-4. The worst result comes when I use both the gradient and the Hessian, where the error is greater than 0.07.
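
To make the comparison concrete, this is roughly how the three variants can be run (a minimal sketch reusing the objects defined in the code above; the labels are only for printing):

# compare the three ways of calling minimize against the linear_sum_assignment result X
for label, j, h in [('no jac, no hess', None, None),
                    ('hess only',       None, hess),
                    ('jac and hess',    jac,  hess)]:
    r = minimize(fun, x0, args=(cost2,), constraints=[lin_constraints],
                 bounds=bounds, method='trust-constr', jac=j, hess=h)
    print(label, np.abs(r.x.reshape((n, n)) - X).max())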

Why is this? It seems that there is a problem with the gradient definition, and probably also with the Hessian. However, I have checked the functions jac and hess several times and cannot find the error.
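
For example, the derivatives can be compared against finite differences like this (a sketch; check_grad and approx_fprime come from scipy.optimize, and the other names are from the code above):

from scipy.optimize import approx_fprime, check_grad

# the objective is linear, so the analytic gradient should match a
# finite-difference gradient up to floating-point noise
print(check_grad(fun, jac, x0, cost2))
print(np.abs(approx_fprime(x0, fun, 1e-8, cost2) - jac(x0, cost2)).max())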

Any help would be greatly appreciated.


Solution

  • I don't see a problem with your Jacobian or Hessian either.

    I'm not 100% sure why this works, but I did find that it helped to convert the equality constraints into inequality constraints by adding and subtracting a small value.

    eps = 1e-15
    lin_constraints = LinearConstraint(A, lb=ones - eps, ub=ones + eps)
    

    and

    res = minimize(fun, x0, args=(cost2,), constraints=[lin_constraints], bounds=bounds,
                   method='trust-constr', jac=jac, hess=hess,
                   options=dict(
                       # verbose=3,
                       gtol=1e-13,
                       xtol=1e-13,
                   ))
    

    This allows me to solve the problem to an accuracy of 1e-12.
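
    To double-check that this really recovers the same assignment, the relaxed solution can be rounded and compared with the linear_sum_assignment result (a short sketch using the names from the question; X_round is just my name for the rounded matrix):

    # round the relaxed solution back to a 0/1 assignment matrix
    X_round = np.round(res.x).reshape((n, n)).astype(int)
    print(np.array_equal(X_round, X))                            # same assignment matrix?
    print((cost * X_round).sum(), cost[row_ind, col_ind].sum())  # total costs should match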