I am trying to use 'mle' with a custom negative log-likelihood function, but I get the following error:
Requested 1200000x1200000 (10728.8GB) array exceeds maximum array size preference (15.6GB). This might cause MATLAB to become unresponsive.
The data I am using is a 1x1200000 binary array (which I had to convert to double), and the function has 10 arguments: one for the data, 3 known paramenters, and 6 to be optimized. I tried setting 'OptimFun' to both 'fminsearch' and 'fmincon'. Also, optimizing the parameters using 'fminsearch' and 'fminunc' instead of 'mle' works fine.
The problem happens in the 'checkFunErrs' functions, inside the 'mlecustom.m' file (call at line 173, actuall error at line 705).
With 'fminunc' I could calculate the optimal parameters, but it does not give me confidence intervals. Is there a way to circumvent this? Or am I doing something wrong?
Thanks for the help.
T_1 = 50000;
T_2 = 100000;
npast = 10000;
start = [0 0 0 0 0 0];
func = @(x, data, cens, freq)loglike(data, [x(1) x(2) x(3) x(4) x(5) x(6)],...
T_1, T_2, npast);
params = mle(data, 'nloglf', func, 'Start', start, 'OptimFun', 'fmincon');
% Computes the negative log likehood
function out = loglike(data, params, T_1, T_2, npast)
size = length(data);
if npast == 0
past = 0;
else
past = zeros(1, size);
past(npast+1:end) = movmean(data(npast:end-1),[npast-1, 0]); % Average number of events in the previous n years
end
lambda = params(1) + ...
(params(2)*cos(2*pi*(1:size)/T_1)) + ...
(params(3)*sin(2*pi*(1:size)/T_1)) + ...
(params(4)*cos(2*pi*(1:size)/T_2)) + ...
(params(5)*sin(2*pi*(1:size)/T_2)) + ...
params(6)*past;
out = sum(log(1+exp(lambda))-data.*lambda);
end
Your issue is line 228 (as of MATLAB R2017b) of the in-built mle
function, which happens just before the custom function is called:
data = data(:);
The input variable data
is converted to a column array without warning. This is typically done to ensure that all further calculations are robust to the orientation of the input vector.
However, this is causing you issues, because your custom function assumes data
is a row vector, specifically this line:
out = sum(log(1+exp(lambda))-data.*lambda);
Due to implicit expansion, when the row vector lambda
and the column vector data
interact, you get a huge square matrix per your error message.
Adding these two lines to make it explicit that both are column vectors resolves the issue, avoids implicit expansion, and applies the calculation element-wise as you intended.
lambda = lambda(:);
data = data(:);
So your function becomes
function out = loglike(data, params, T_1, T_2, npast)
N = length(data);
if npast == 0
past = 0;
else
past = zeros(1,N);
past(npast+1:end) = movmean(data(npast:end-1),[npast-1, 0]); % Average number of events in the previous n years
end
lambda = params(1) + ...
(params(2)*cos(2*pi*(1:N)/T_1)) + ...
(params(3)*sin(2*pi*(1:N)/T_1)) + ...
(params(4)*cos(2*pi*(1:N)/T_2)) + ...
(params(5)*sin(2*pi*(1:N)/T_2)) + ...
params(6)*past;
lambda = lambda(:);
data = data(:);
out = sum(log(1+exp(lambda))-data.*lambda);
end
An alternative would be to re-write your function so that it uses column vectors, but you create new row vectors with the (1:N)
steps and the concatenation within the movmean
. The suggested approach is arguably "lazier", but also robust to row or column inputs.
Note also I've changed your variable name from size
to N
, since size
is an in-built function which you should avoid shadowing.