python numpy scipy flatten scipy-optimize

Using the scipy.optimize library to find the minimum of a function


I wrote this code over many iterations, and it now works fine.

import numpy as np
from scipy.optimize import minimize

# Define the sigmoid function


def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define the sigmoid-based loss function


def sigmoid_loss(theta, X, y_true):
    return np.mean((sigmoid(X @ theta) - y_true)**2)


# Generate synthetic data
np.random.seed(0)  # Set random seed for reproducibility
m = 1000
n = 5
X = np.random.randn(m, n)

# True values as a column vector
theta_true = np.array(
    [-0.4277, -0.5794, 0.9260, 0.0055, -0.6345]).reshape(-1, 1)

y_true = sigmoid(X @ theta_true)  # True output values (between 0 and 1)

theta_init = np.random.randn(5, 1)

tolerance = 1e-10

# Optimization using SciPy's minimize function
result_NM = minimize(sigmoid_loss, theta_init.flatten(), args=(
    X, y_true.flatten()), method='Nelder-Mead', tol=tolerance)

# Display optimization results
print(f'Number of iterations: {result_NM.nit}')
print(f'Optimal solution: {result_NM.x}')
print(f'Minimum value of the loss function: {result_NM.fun}')

At first it didn't work, until I called .flatten() on the arguments passed to minimize. I'm not sure why that is. I know that in NumPy I need to be very careful about the actual dimensions of the data, which is why I used .reshape(-1, 1) to make sure the matrices multiply correctly. So (1000, 5) times (5, 1) produces a vector of shape (1000, 1), and the loss function is defined as the mean squared error.

However, the minimize function requires "flattened" data, so that, for example

print(theta_init.flatten().shape)

will produce the following output (5, ), so the other dimension is "undefined". I don't understand why this is the case. Can you please explain why minimize cannot work correctly with (5, 1), but can with (5,)? Thank you!

I did try to use consistent dimensions when calling the minimize function, and it even returns a result, but that result is not the real solution. The minimize function also generates a warning: "DeprecationWarning: Use of minimize with x0.ndim != 1 is deprecated. Currently, singleton dimensions will be removed from x0"


Solution

  • will produce the following output (5, ), so the other dimension is "undefined"

    It's not that the other dimension is undefined. (5, ) is how Python represents a tuple with one element. The trailing comma is necessary because (5) would be ambiguous: it could mean a 5 with parentheses around it to control evaluation order. So a shape of (5,) simply describes an array with one dimension of length 5, i.e. a vector.
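
To make the tuple point concrete, here is a small sketch (the variable names are only for illustration) comparing (5) with (5,) and the shapes of a 1-D and a 2-D NumPy array:

```python
import numpy as np

not_a_tuple = (5)    # parentheses only group the expression: this is the int 5
one_tuple = (5,)     # the trailing comma makes a tuple with one element

vec = np.zeros(5)         # 1-D array: one axis of length 5
col = np.zeros((5, 1))    # 2-D array: 5 rows, 1 column

print(type(not_a_tuple).__name__, type(one_tuple).__name__)
print(vec.shape, vec.ndim)
print(col.shape, col.ndim)
```

Both arrays hold five numbers, but only vec has ndim equal to 1, which is what the shape (5,) expresses.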

    Can you please explain, why minimize cannot work correctly with (5,1), but, it can with (5,)? Thank you!

    This is because of how minimize is defined:

    https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html

    x0: ndarray, shape (n,)

    Initial guess. Array of real elements of size (n,), where n is the number of independent variables.

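    So minimize() is defined to operate on 1-D arrays. (5, 1) is the shape of a 2-D array, so it's not supported. As the deprecation warning you quoted says, singleton dimensions are currently removed from x0, so the optimizer works with a 1-D parameter vector even if you pass in a column vector.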
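
A likely reason the "consistent dimensions" call returns a wrong answer rather than failing is NumPy broadcasting. The sketch below assumes that the objective receives a 1-D theta (as the warning suggests) while y_true is still a (1000, 1) column; the data is random and only the shapes matter:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
theta_1d = rng.standard_normal(5)         # shape (5,): assumed to be what the objective sees
y_col = rng.standard_normal((1000, 1))    # column vector, like an unflattened y_true

pred = X @ theta_1d                   # shape (1000,)
residual_bad = pred - y_col           # (1000,) - (1000, 1) broadcasts to (1000, 1000)
residual_ok = pred - y_col.ravel()    # (1000,) - (1000,) stays (1000,)

print(pred.shape, residual_bad.shape, residual_ok.shape)
```

Under that assumption, np.mean(residual_bad**2) averages over every pairing of a prediction with a target, not just the matching ones, so the function being minimized is no longer the intended mean squared error.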

    As you've observed, you can work around this by using .flatten() to convert higher-dimensional arrays to 1-D before calling minimize, and .reshape(original_shape) to restore the original shape afterwards.
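
One way to apply the flatten/reshape workaround while keeping the column-vector math is a thin adapter around the loss. This is only a sketch: loss_2d and loss_flat are hypothetical helper names, and the data is a smaller synthetic set modeled on the question's.

```python
import numpy as np
from scipy.optimize import minimize


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def loss_2d(theta_col, X, y_col):
    # Expects theta_col with shape (n, 1) and y_col with shape (m, 1).
    return np.mean((sigmoid(X @ theta_col) - y_col) ** 2)


def loss_flat(theta_flat, X, y_col, shape):
    # Hypothetical adapter: minimize passes a 1-D vector, so restore the 2-D shape here.
    return loss_2d(theta_flat.reshape(shape), X, y_col)


rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
theta_true = np.array([-0.4277, -0.5794, 0.9260, 0.0055, -0.6345]).reshape(-1, 1)
y_col = sigmoid(X @ theta_true)            # shape (200, 1)
theta_init = rng.standard_normal((5, 1))   # shape (5, 1)

# Flatten x0 on the way in ...
result = minimize(loss_flat, theta_init.flatten(),
                  args=(X, y_col, theta_init.shape), method='Nelder-Mead')

# ... and restore the original shape on the way out.
theta_hat = result.x.reshape(theta_init.shape)
print(result.x.shape, theta_hat.shape, result.fun)
```

With this layout, only the adapter needs to know about the 1-D requirement; the loss itself keeps working on (n, 1) and (m, 1) arrays.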