pythondifferential-evolution

Problems with pseudo-inverse of Jacobian matrix


I'm implementing the E-Constrained Differential Evolution with Gradient-Based Mutation algorithm (Takahama, 2006). For the mutation part, I have to calculate the gradient of each constraint function for the trial vector, thus defining the Jacobian matrix. I'm doing it by finite differences. I understand the matrix should be m x n, with m = number of constraints and n = number of dimensions (length of the trial vector). The perturbation on the trial vector will then be pseudo-inverse_of_Jacobian x delta_C. I don't know what is wrong but I'm getting the message operands could not be broadcast together with shapes (4,5) (5,1). Can you please help? Shouldn't this multiplication yield a 4x1 array? Thanks.

trial = np.zeros(4)
inv_Jc = np.linalg.pinv(Jacobian(trial))
delta_C = np.array(calculate_C(trial))
delta_C = delta_C.reshape(len(delta_C),1)
trial = trial-inv_Jc*delta_C

Below you can find the relevant definitions for the example problem I'm trying to solve.

import numpy as np

def fobj(x): 
    return 3*x[0]+0.000001*x[0]**3+2*x[1]+(0.000002/3)*x[1]**3

def constraints(x): 
    g = [-x[3]+x[2]-0.55,
         -x[2]+x[3]-0.55]
    h = [1000*np.sin(-x[2]-0.25)+1000*np.sin(-x[3]-0.25)+894.8-x[0],
         1000*np.sin(x[2]-0.25)+1000*np.sin(x[2]-x[3]-0.25)+894.8-x[1],
         1000*np.sin(x[3]-0.25)+1000*np.sin(x[3]-x[2]-0.25)+1294.8]
    
    return np.array([max(0,g[i]) for i in range(len(g))]), np.array(h)
     
def calculate_C(v):
    g, h = constraints(v)
    return np.array([*g,*h])

def Jacobian(v,delta=0.01):
    C = calculate_C(v)
    Jc = np.zeros(shape=(len(C),len(v)))

    for i in range(len(delta_C)):
        for j in range(len(v)):
            aux = np.copy(v)
            aux[j] = v[j]+delta
            C_plus = calculate_C(aux)[i]
            aux[j] = v[j]-delta
            C_minus = calculate_C(aux)[i]
            Jc[i,j] = ((C_plus-C_minus)/(2*delta))

    return Jc

Text from the article:

gradient-based mutation


Solution

  • @falafelocelot hinted me to the right direction. Verifying the shapes of the arrays, I noticed that I was testing the function with a trial vector defined as trial = np.zeros(4), whose shape is (4,). Using np.dot() was providing a (4,1) array, as expected, but the operation trial-np.dot(a,b) was providing a (4,4) array. When I reshaped the trial vector to (4,1), everything worked as expected.