scipyconstraintsscipy-optimize-minimizeobjective-function

Writing Objective Function using scipy.optimize.minimize Troubleshoot


I am trying to solve an optimization problem applied to bridges, in the context of a parametric design. The objective function is to try and split precast barriers into a minimum amount of unique barriers. There are two kind of barriers, normal and barriers with light poles. I have shared an image with a bridge of 3 spans.

The parameters are as follows:

The light pole barrier width is equal to w.

The light pole barriers are evenly spaced by the parameter s.

The light pole starting point is defined by the variable a.

The spans of the bridge, [l1,l2,...ln]

x is the width that all of the equations will have to be divisible by.

The idea is to as many barriers with the width of x and some remaining lengths to fill the spans kept at a minimum. I have to get all the distances [a,b,c,...f] in an array of equations (see picture), where I am also appending the light between the special barriers (s-w), and have all of them divisible by x. I also added in the elements arr array to account for constants in case the remaining length is not divisible by x so we will compensate with a different barrier (basically, we use that constant to subtract).

Here is a sketch of what I am talking about: bridge

from scipy.optimize import minimize
import numpy as np

s = 12500  
spans = np.array([25450, 25100, 25450])
ejs = np.zeros(len(spans)-1)  # expansion or separation joints . ignore for now

min_bw, max_bw  = 0, 1200  #light pole barrier width

consts = abs((len(spans) * 2 - 1))  # Number of distance elements [b,c,d...]
init = np.array([1500, 1500, 6250]) #x, w, a + w/2

n = (spans - s / 2) // s  # consecutive spacings of barriers for a given span. At times we have more 
m = np.ones(len(spans) * 2)  # the length of the equation array that has to be divisible

def objective(arr, n, s, spans, ejs):

    x = arr[0]
    w = arr[1]
    a =  arr[2] # (the offset is until the mid axis of the light pole)
    m = np.ones(len(spans) * 2 + 1) # We make room for the last constraint h=(s-w)
    m[0] = a - w / 2 

    for i in range(1, len(spans) * 2):
        j = (i - 1) // 2
        if i % 2 == 0:
            m[i] = (s - w) - (m[i - 1] + ejs[j] + arr[2 + i])
        else:
            m[i] = spans[j] - (m[i - 1] + n[j] * s + w + arr[2 + i])
    m[-1] = s - w  # We also need the light distance between two poles to be divisible by x
    #print(m)
    return np.sum(m%x)

#Bounds
bounds = np.array([(1500, 2100), (1500, 2600), (3000, s)])
extras = np.repeat([(min_bw, max_bw)], consts, axis=0) # Add the bounds for each additional element
all_bounds = np.concatenate((bounds, extras), axis=0)


initial = np.pad(init, (0, consts), 'constant', constant_values=min_bw)

sol = minimize(fun=objective, x0=initial, bounds=all_bounds, method='Nelder-Mead', args=(n, s, spans, ejs))
print(np.round(sol.x))

#m = [ 5500. , 5950.,  5050.  ,6050.  ,4950. , 6500. ,11000.] #the print m statment

The problem that I can't seem to figure out is that the np.sum(m%x) does not return 0! I get the right distances in the m array compared to the drawing. The divisibility equation constraints are not satisfied yet the function says it terminates successfully. I'd appreciate any insights! Thank you so much!


Solution

  • The main problems that Nick calls out are true (though not comprehensive): there are many local minima, and the objective is non-differentiable.

    But just as importantly: your objective isn't well-suited as an objective at all - it should be a constraint; and this entire system (when re-constructed) is linear. That means it's time for integer linear programming, which has better guarantees of convergence and accuracy than a generic minimize (no matter which method you select). More specifically, you're doing constraint programming, where there is no objective at all. (Or maybe you have some idea of what a better objective should be, such as minimizing the coefficients of x).

    In addition to the variables for x, w, a, your "distance elements" and m, you also need a binary selection matrix of which multiple of x is equal to m. This gets around the problem that multiplication of x by some multiple to get m is non-linear, by using so-called big-M constraints.

    I've tried to make this easy-to-follow, and at any point you can call some_constraint.A.toarray() and view the result in e.g. PyCharm's matrix visualiser to see what's happening; but you may also need to dig into some introductory material on linear programming to get a better handle on it. Anyway, the results seem sane.

    import numpy as np
    import scipy.sparse as sp
    from scipy.optimize import milp, Bounds, LinearConstraint
    
    s = 12_500
    spans = np.array((25_450, 25_100, 25_450))
    ejs = np.zeros(len(spans) - 1)  # expansion or separation joints
    min_bw, max_bw = 0, 1200  # light pole barrier width
    consts = spans.size*2 - 1  # Number of distance elements [b,c,d...]
    m_count = spans.size*2 + 1
    n = (spans - s/2)//s  # consecutive spacings of barriers for a given span
    
    # In the old solution's objective for the initial guess, m/x ranges from 3.6 to 7.3. Ballpark 20 as a maximum.
    mul_count = 20
    multiples = np.arange(1, 1 + mul_count)[:, np.newaxis]
    
    '''
    Variables: x, w, a, consts, m, and then
    a m_count * mul_count matrix of selectors of integer multiples of m.
    Each selector is binary (0 or 1), and indicates times 1, times 2, ... times mul_count respectively.
    '''
    
    # Constraint matrix for all m equations, starting with the 'm' term itself
    m_A = sp.hstack((
        sp.csc_array((m_count, 3 + consts)),
        -sp.eye(m_count),
        sp.csc_array((m_count, m_count*mul_count)),
    ), format='lil')
    
    # Column positions: x, w, a, consts, m, mi
    # First row: a - w/2 == m0: -0.5w + a - m == 0
    m_A[0, 1:3] = -0.5, 1
    # Middle rows
    m_A[1:, 1] = -1  # -w (middle and last row)
    m_A[1:-1, 3: 3+consts] = -sp.eye(consts)  # -consts
    m_A[1:-1, 3+consts: 3+consts+m_count - 2] -= sp.eye(m_count - 2)  # -m[i - 1]
    
    m_rhs = np.empty(m_count)
    # First row
    m_rhs[0] = 0
    # Even rows: m[i] = s - w - m[i-1] - ejs[(i - 1)//2] - consts
    #            -s + ejs[(i - 1)//2] = -w - m[i] - m[i-1] - consts
    m_rhs[2:-1:2] = ejs - s
    # Odd rows: m[i] = spans[(i - 1)//2] - (m[i - 1] + n[(i - 1)//2] * s + w + arr[2 + i])
    #            -spans[(i - 1)//2] + n[(i - 1)//2]*s = -m[i] - m[i - 1] - w - consts
    m_rhs[1:-1:2] = n*s - spans
    # Last row: m[-1] = s - w
    #           -w - m[-1] = -s
    m_rhs[-1] = -s
    
    constraint_m = LinearConstraint(A=m_A, lb=m_rhs, ub=m_rhs)
    # To see the constraint matrix: view constraint_m.A.toarray() in a matrix viewer
    
    # Exactly one selector per m-variable must be on
    constraint_selection = LinearConstraint(
        A=sp.hstack((
            sp.csc_array((m_count, 3 + consts + m_count)),
            sp.kron(
                A=sp.eye(m_count),
                B=np.ones(mul_count),
                format='csc',
            )
        )),
        lb=np.ones(m_count),
        ub=np.ones(m_count),
    )
    
    '''
    Enforce the multiple constraint:
    m >= x*i                     m <= x*i       (if mi is selected)
    -i*x + m >= 0                i*x - m >= 0   (if mi is selected)
    -i*x + m + M(1 - mi) >= 0    i*x - m + M(1 - mi) >= 0
    -i*x + m - M*mi >= -M        i*x - m - M*mi >= -M
    Column positions: x, w, a, consts, m, mi
    '''
    ix_m = sp.hstack((
        np.tile(multiples, (m_count, 1)),
        sp.csc_array((m_count * mul_count, 2 + consts)),
        sp.kron(
            -sp.eye(m_count),
            np.ones((mul_count, 1)),
        ),
    ))
    M = 1e6  # big-M constraint coefficient
    mmi = -M*sp.eye(m_count * mul_count)
    constraint_mul = LinearConstraint(
        A=sp.bmat((
            ( ix_m, mmi),
            (-ix_m, mmi),
        )),
        lb=np.full(shape=2*m_count*mul_count, fill_value=-M),
        ub=np.inf,
    )
    
    
    result = milp(
        c=np.zeros(3 + consts + m_count + m_count*mul_count),  # There is no optimization objective
        integrality=np.concatenate((
            np.zeros(3 + consts + m_count),  # x, w, a, consts and m continuous
            np.ones(m_count * mul_count),    # mi integral
        )),
        bounds=Bounds(
            lb=np.concatenate((
                (1_500, 1_500, 3_000),  # x, w, a
                np.full(shape=consts, fill_value=min_bw),  # consts
                np.zeros(m_count + m_count*mul_count),     # m, mi
            )),
            ub=np.concatenate((
                (2_100, 2_600, s),  # x, w, a
                np.full(shape=consts, fill_value=max_bw),   # consts
                np.full(shape=m_count, fill_value=np.inf),  # m unbounded
                np.ones(m_count*mul_count),  # mi are binary, so 0-1
            )),
        ),
        constraints=(
            constraint_m,
            constraint_selection,
            constraint_mul,
        ),
    )
    assert result.success, result.message
    (x, w, a), distances, m, mi = np.split(result.x, (3, 3+consts, 3+consts+m_count))
    mi = mi.reshape((m_count, mul_count))
    multipliers = (mi @ multiples).ravel()
    
    print(result.message)
    print(f'x = {x}')
    print(f'w = {w}')
    print(f'a = {a}')
    print(f'distance elements = {distances}')
    print(f'm = {m}')
    print(f'm/x quotients: {m/x}')
    print(f'multipliers = {multipliers}')
    print('multiply selectors =')
    print(mi)
    
    Optimization terminated successfully. (HiGHS Status 7: Optimal)
    x = 1650.0
    w = 2600.0
    a = 6250.0
    distance elements = [450.   0. 100.   0. 450.]
    m = [4950. 4950. 4950. 4950. 4950. 4950. 9900.]
    m/x quotients: [3. 3. 3. 3. 3. 3. 6.]
    multipliers = [3. 3. 3. 3. 3. 3. 6.]
    multiply selectors =
    [[0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
     [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
     [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
     [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
     [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
     [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
     [0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
    

    The problem is of fairly small size and high sparsity, so solves quickly. You can visualise the constraint sparsity pattern via

    plt.imshow(
        np.sign(
            sp.vstack((
                constraint_m.A,
                constraint_selection.A,
                constraint_mul.A,
            )).toarray()
        ),
    )
    plt.axis('scaled')
    plt.show()
    

    constraint pattern