I am trying to minimize this function with scipy's minimize. I expected it to work with an array of the same size as the input, i.e. (92445,), but it tries to allocate an array of shape (92445, 92445) and I get the following error:
MemoryError: Unable to allocate 63.7 GiB for an array with shape (92445, 92445) and data type float64
I don't understand whether I have simply misunderstood how minimize works or whether there is another problem that I'm not seeing.
import numpy as np
from scipy.optimize import minimize

# freezer is a DataFrame defined elsewhere; Xc must have shape (18489, 5)
# for the matrix products below to be consistent.
X = np.asarray(freezer.dropna())
Xc = X
D = np.random.random((5, 18489))
Z = np.random.random((5, 5))
B = np.ones((5, 5))
u = 0.5
us = np.sqrt(u)
l = 0.1

def P1(Di):
    Di = Di.reshape(D.shape)
    A = Z - np.matmul(Di, Xc) - B
    return np.linalg.norm(A, 'fro')**2

def P2(Xi):
    Xi = Xi.reshape(Xc.shape)
    A1 = np.vstack((X, us * (Z - B)))
    A2 = np.vstack((np.identity(18489), us * D))
    A3 = np.matmul(A2, Xi)
    A4 = A1 - A3
    return np.linalg.norm(A4, 'fro')**2

def P3(Zi):
    Zi = Zi.reshape(Z.shape)
    A = Zi - np.matmul(D, Xc) - B
    return np.linalg.norm(A, 'fro')**2 + (l / u) * np.linalg.norm(Zi, 1)

Df = D.flatten()
Df = minimize(P1, Df)
D = Df.x.reshape(D.shape)

Xcf = Xc.flatten()
Xcf = minimize(P2, Xcf)
Xc = Xcf.x.reshape(Xc.shape)

Zf = Z.flatten()
Zf = minimize(P3, Zf)
Z = Zf.x.reshape(Z.shape)

B = Z - np.matmul(D, Xc) - B
For the case of P1, notice that each row of the argument D affects only the corresponding row of A. Therefore, you can optimize each row of D separately and combine the results into a single D matrix at the end. Solving an 18489-dimensional problem five times is much easier than solving one 92445-dimensional problem, so this is both faster and far lighter on memory.
I also found that it was necessary to use L-BFGS-B. The solver you are currently using, BFGS (the default for unconstrained problems), maintains a dense n x n approximation of the inverse Hessian; with n = 92445 that is 92445 x 92445 x 8 bytes, roughly 63.7 GiB, which is exactly the allocation your error message complains about. L-BFGS-B keeps only a limited history of updates, so it uses far less memory.
Code:
def P1(Di):
    Di = Di.reshape(D.shape)
    A = Z - np.matmul(Di, Xc) - B
    return np.linalg.norm(A, 'fro')**2

def P1_simplified(Di, idx):
    """A version of P1 which operates on only one row of D,
    one row of Z, and one row of B."""
    Di = Di.reshape((1, D.shape[1]))
    A = Z[idx] - np.matmul(Di, Xc) - B[idx]
    return np.linalg.norm(A, 'fro')**2

def solve_P1(D):
    D_list = []
    for idx in range(D.shape[0]):
        print(f"solving row {idx}")
        result = minimize(
            P1_simplified,
            D[idx],
            args=(idx,),
            method='L-BFGS-B',
            options=dict(maxfun=int(1e7)),
        )
        D_row = result.x
        D_list.append(D_row)
    D = np.array(D_list)
    print(f"solution has objective value {P1(D)}")
    return D

D = solve_P1(D)
This is sufficient to solve the problem in under a gigabyte of memory, reaching an objective value (the squared Frobenius norm) of about 1e-9. Since 0 is the best possible value, this is pretty close to optimal.
I think a similar approach could be used for P2 and P3, though I haven't worked through it in detail.
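A rough, untested sketch of what that might look like for P2: each column of Xc only affects the corresponding column of A4, and because of the identity block in A2 that part of the residual is simply X[:, j] - xj, so the 18489 x 18489 identity matrix never has to be built. Each sub-problem is then 18489-dimensional, just like the rows of P1.

# Rough, untested sketch: split P2 by columns of Xc.
def P2_simplified(xj, j):
    r_top = X[:, j] - xj                              # residual from the identity block of A2
    r_bot = us * ((Z - B)[:, j] - np.matmul(D, xj))   # residual from the us*D block
    return np.dot(r_top, r_top) + np.dot(r_bot, r_bot)

def solve_P2(Xc):
    cols = []
    for j in range(Xc.shape[1]):
        result = minimize(
            P2_simplified,
            Xc[:, j],
            args=(j,),
            method='L-BFGS-B',
            options=dict(maxfun=int(1e7)),
        )
        cols.append(result.x)
    return np.column_stack(cols)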
Another approach, which would be way faster, is to reformulate the problem as a linear regression.
Here is a short sketch of a proof.
Taking the square of the Frobenius norm is equivalent to summing the squares of the elements of A. Since A = Z - np.matmul(Di, Xc) - B, minimizing it is equivalent to minimizing the squared difference between np.matmul(Di, Xc) and (Z - B). Split this into two pieces and let Y = (Z - B). In a standard linear regression the parameters must be on the right of the matrix product, but in np.matmul(Di, Xc) they are on the left. Fix this by transposing the inputs and swapping them to get np.matmul(Xc.T, Di.T); Y has to be transposed as well, so Y = (Z - B).T. We are now trying to find np.matmul(Xc.T, Di.T) ~ Y. This is an ordinary least squares problem, so we can use np.linalg.lstsq(), which is much faster than minimize().
Code:
def P1_lstsq(Xc, Z, B):
    Y = (Z - B).T
    Di, residuals, rank, s = np.linalg.lstsq(Xc.T, Y, rcond=None)
    return Di.T

Di = P1_lstsq(Xc, Z, B)
print(P1(Di))
Not only can this solve the problem in under a millisecond, it also finds a smaller objective value than the minimize() solution.
I think you could apply this to P2 and P3 as well. P3 looks like P1 except that it adds an L1 penalty on Z, so it is closer to a lasso-style problem than a plain regression. I don't quite understand P2, but since A1 and A2 are constant with respect to Xi, I think you could do something similar.
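For P3 specifically, if the L1 term is meant elementwise (as it usually is in this kind of update), you don't need minimize at all: the problem separates per element and has a closed-form soft-thresholding solution. A rough sketch under that assumption:

# Rough sketch, assuming the L1 term is elementwise: each element z of Z
# minimizes (z - m)**2 + (l/u)*|z|, where m is the corresponding element of
# D @ Xc + B, and the minimizer is soft-thresholding of m with threshold (l/u)/2.
def solve_P3(D, Xc, B, l, u):
    M = np.matmul(D, Xc) + B
    thresh = (l / u) / 2.0
    return np.sign(M) * np.maximum(np.abs(M) - thresh, 0.0)

This gives the minimizer of P3 in one shot, in the same way that lstsq() handles P1.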