I wrote this code over many iterations, and it now works fine.
import numpy as np
from scipy.optimize import minimize
# Define the sigmoid function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define the sigmoid-based loss function (mean squared error)
def sigmoid_loss(theta, X, y_true):
    return np.mean((sigmoid(X @ theta) - y_true)**2)
# Generate synthetic data
np.random.seed(0) # Set random seed for reproducibility
m = 1000
n = 5
X = np.random.randn(m, n)
# True values as a column vector
theta_true = np.array(
    [-0.4277, -0.5794, 0.9260, 0.0055, -0.6345]).reshape(-1, 1)
y_true = sigmoid(X @ theta_true) # True output values (between 0 and 1)
theta_init = np.random.randn(5, 1)
tolerance = 1e-10
# Optimization using SciPy's minimize function
result_NM = minimize(sigmoid_loss, theta_init.flatten(), args=(
X, y_true.flatten()), method='Nelder-Mead', tol=tolerance)
# result_NM = minimize(sigmoid_loss, theta_init.flatten(), args=(
# X, y_true.flatten()), method='Nelder-Mead', tol=tolerance)
# Display optimization results
print(f'Number of iterations: {result_NM.nit}')
print(f'Optimal solution: {result_NM.x}')
print(f'Minimum value of the loss function: {result_NM.fun}')
At first it didn't work, until I used the .flatten() method in the call to minimize. I'm not sure why that is. I know that in NumPy I need to be very careful about the actual dimensions of the data, which is why I used .reshape(-1, 1) to make sure the matrices multiply correctly. So (1000, 5) x (5, 1) produces a (1000, 1) vector, and the loss function is defined as the mean squared error.
However, the minimize function requires "flattened" data, so that, for example,
print(theta_init.flatten().shape)
will produce the following output (5, ), so the other dimension is "undefined". I don't understand why this is the case. Can you please explain why minimize cannot work correctly with (5,1), but can with (5,)? Thank you!
I did try to use consistent dimensions when calling the minimize function, and it even returns a result, but that result is not a real solution.
The minimize function also generates a warning: "DeprecationWarning: Use of minimize with x0.ndim != 1 is deprecated. Currently, singleton dimensions will be removed from x0"
will produce the following output (5, ), so the other dimension is "undefined"
It's not that the other dimension is undefined. (5, ) is how Python represents a tuple with one element. The trailing comma is necessary because (5) would be ambiguous: it could mean a 5 with parentheses around it to control evaluation order. So (5,) is just the shape of an array with one dimension of length 5, i.e. a vector.
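To make the difference concrete, here is a small sketch comparing the two shapes. The array names `a` and `b` are made up for illustration and do not appear in the question.

```python
import numpy as np

a = np.zeros(5)          # 1-D array: its shape is the one-element tuple (5,)
b = np.zeros((5, 1))     # 2-D array: 5 rows and 1 column

print(a.shape, a.ndim)   # (5,) 1
print(b.shape, b.ndim)   # (5, 1) 2

# (5) is just the integer 5; the trailing comma is what makes it a tuple.
print(type((5)), type((5,)))
```

Both arrays hold five numbers, but only `a` is one-dimensional.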
Can you please explain why minimize cannot work correctly with (5,1), but can with (5,)? Thank you!
This is because of how minimize is defined:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html
x0: ndarray, shape (n,)
Initial guess. Array of real elements of size (n,), where n is the number of independent variables.
So minimize() is defined to operate on 1D arrays. (5, 1) is a 2D array, so it's not supported.
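This may also explain the "result which is not a real solution". The warning in the question says singleton dimensions are removed from x0, so the loss function would receive theta with shape (5,). If y_true is still a (1000, 1) column at that point (an assumption about how the call was made), the subtraction broadcasts to a (1000, 1000) matrix instead of comparing element by element. A minimal sketch of that effect, using random stand-in data rather than the data from the question:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
theta = rng.standard_normal(5)          # 1-D, the shape the loss would receive
y_col = rng.standard_normal((1000, 1))  # column vector that was not flattened

pred = X @ theta                        # shape (1000,)
diff = pred - y_col                     # broadcasts to shape (1000, 1000)
print(pred.shape, diff.shape)

diff_ok = pred - y_col.flatten()        # shape (1000,), element-wise as intended
print(diff_ok.shape)
```

Taking np.mean over the (1000, 1000) array still returns a single number, so nothing fails, but it is the mean over all pairs of predictions and targets rather than the mean squared error.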
As you've observed, you can work around this by using .flatten() to convert higher-dimensional arrays to 1D, and using .reshape(original_shape) to restore the original shape again afterwards.
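One way to apply this, sketched under a few assumptions: the loss function reshapes the 1-D vector it receives back into a column, so the rest of the code can keep using (n, 1) arrays. The name `loss_2d`, the smaller problem size, and the values in `theta_true` are illustrative and not taken from the question.

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def loss_2d(theta_flat, X, y_col):
    # minimize passes a 1-D array; restore the column shape used elsewhere.
    theta = theta_flat.reshape(-1, 1)            # (n,) -> (n, 1)
    return np.mean((sigmoid(X @ theta) - y_col) ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
theta_true = np.array([[0.5], [-1.0], [0.25]])   # illustrative values, shape (3, 1)
y_col = sigmoid(X @ theta_true)                  # shape (200, 1)

theta_init = np.zeros((3, 1))
result = minimize(loss_2d, theta_init.flatten(), args=(X, y_col),
                  method='Nelder-Mead', tol=1e-10)

# result.x is 1-D; reshape it back to the original column shape.
theta_hat = result.x.reshape(theta_init.shape)
print(theta_hat.shape)
```

With this arrangement only the boundary with minimize deals in 1-D arrays, and y_col does not need to be flattened.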