I am trying to find the coefficients of polynomials of degree n using `fminunc`. I know about `polyfit`, but I have to use `fminunc` (or `lsqnonlin`) because I need to extend my model at a later point. Anyway, my objective function is simply the nonlinear least-squares problem, and I applied the same centering and scaling approach that `polyfit` uses:
```matlab
tmpx = (xdat - mean(xdat)) ./ std(xdat); % center and scale x
x0 = zeros(degree+1, 1);                 % initial point
objfctn = @(pars) mdl(pars, tmpx, ydat); % objective function handle
```
with
```matlab
function [sse, grad] = mdl(pars, xdat, ydat)
    res = polyval(pars, xdat) - ydat;  % residuals
    sse = sum(res.^2);                 % sum of squared errors
    degree = length(pars) - 1;
    grad = [];
    for d = 0:degree
        % prepend so grad is ordered highest degree first, like polyval
        grad = [sum(2*res.*(xdat.^d)); grad]; %#ok<AGROW>
    end
end
```
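For completeness, the actual call to `fminunc` is not shown above; assuming the analytic gradient returned by `mdl` is meant to be used, it might look like the sketch below (the option names are an assumption; older MATLAB releases use `optimset('GradObj', 'on')` instead of `optimoptions`):

```matlab
% Sketch of the optimization call (assumed; not shown in the original post).
% 'SpecifyObjectiveGradient' tells fminunc that the objective returns the
% gradient as its second output, so no finite differencing is needed.
opts = optimoptions('fminunc', ...
    'Algorithm', 'trust-region', ...
    'SpecifyObjectiveGradient', true);
pars_fit = fminunc(objfctn, x0, opts);
```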
I then generated some test data according to some heuristics derived from my target measurement data set:
```matlab
degree = 2;
tpars = [ ...                                   % true parameters
    randraw('uniform', [-5e-20, 5e-20], 1), ...
    randraw('uniform', [-10e-20, 10e-20], 1), ...
    randraw('uniform', [-20e-9, 20e-9], 1), ...
    randraw('uniform', [-50e-6, 50e-6], 1), ...
    randraw('uniform', [0, 89e9], 1)];
tpars = tpars(end-degree:end);                  % keep degree+1 coefficients
xdat = sort(randi(1800e9, 100e3, 1));
yreal = polyval(tpars, xdat);
ydat = yreal + randraw('norm', [0, 100], length(xdat));
```
using the well-known `randraw` script available here.
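If `randraw` is not at hand, the same test data can be generated with MATLAB built-ins alone (a sketch; this assumes `randraw`'s parameter conventions, i.e. `[a, b]` bounds for `'uniform'` and `[mu, sigma]` for `'norm'`):

```matlab
% Equivalent test data using only built-ins (sketch; randraw not required)
unif = @(a, b) a + (b - a) * rand(1);          % uniform draw on [a, b]
degree = 2;
tpars = [unif(-5e-20, 5e-20), unif(-10e-20, 10e-20), ...
         unif(-20e-9, 20e-9), unif(-50e-6, 50e-6), unif(0, 89e9)];
tpars = tpars(end-degree:end);                 % keep degree+1 coefficients
xdat  = sort(randi(1800e9, 100e3, 1));
yreal = polyval(tpars, xdat);
ydat  = yreal + 100 * randn(length(xdat), 1);  % N(0, 100) noise
```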
So far so good: this works well for polynomials of degree up to 2 (i.e., `length(pars) <= 3`). With the above test data, it even beats `polyfit` in accuracy. However, as soon as I try to fit a polynomial of degree higher than 2, `fminunc` stops right after the initial iteration, saying:

> Optimization stopped because the objective function cannot be decreased in the current search direction. Either the predicted change in the objective function, or the line search interval is less than eps.
However, `polyfit` is still able to fit the polynomial, albeit with increasing error. Does anyone have an idea how to fit higher-degree polynomials with `fminunc`?

My feeling is that some numerical issue prevents `fminunc` from doing its job. Maybe the gradients are too large and I need some y-axis scaling as well?
It seems that my guess about the y-axis scaling was right: `fminunc` appears to run into numerical issues when the SSE is too high. I solved the problem by simply applying the same kind of scaling to the y-axis as I did for the x-axis:
```matlab
scaling_factor = std(ydat);
ydat_scaled = ydat / scaling_factor;
% fit the polynomial to ydat_scaled using fminunc or lsqnonlin
result = result * scaling_factor; % scale the fitted coefficients back
```
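Putting the x- and y-scaling together, the whole fit might look like the sketch below (variable names and the `fminunc` options are assumptions; `mdl` is the objective from the question). Scaling the coefficients back by `yscale` works because `polyval` is linear in the coefficients:

```matlab
% Sketch of the complete scaled fit (names assumed, not from the original post)
mu = mean(xdat); sigma = std(xdat);
tmpx = (xdat - mu) / sigma;                 % center/scale x as in the question
yscale = std(ydat);
ydat_scaled = ydat / yscale;                % scale y (the fix)

objfctn = @(pars) mdl(pars, tmpx, ydat_scaled);
opts = optimoptions('fminunc', 'SpecifyObjectiveGradient', true);
pars_fit = fminunc(objfctn, zeros(degree+1, 1), opts);
pars_fit = pars_fit * yscale;               % undo the y scaling

% the coefficients live in the scaled x domain, so evaluate accordingly:
yfit = polyval(pars_fit, (xdat - mu) / sigma);
```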
I tested this approach with polynomials of degrees up to 4, and it works like a charm. In terms of the root-mean-square error (RMSE) of the resulting polynomial with respect to the original polynomial, `fminunc` performs much better than `polyfit`. Specifically, I did 100 runs for each degree from 0 to 4, each run with a new set of test data (see the question above). The average RMSEs over the 100 runs were:
- degree 0: avgrmse_fminunc = 0.2608 and avgrmse_polyfit = 0.2608
- degree 1: avgrmse_fminunc = 0.3810 and avgrmse_polyfit = 0.3809
- degree 2: avgrmse_fminunc = 2.5897 and avgrmse_polyfit = 53.3994
- degree 3: avgrmse_fminunc = 34.0733 and avgrmse_polyfit = 340.6380
- degree 4: avgrmse_fminunc = 5.2805e+13 and avgrmse_polyfit = 2.9255e+14
These results suggest that for applications with high SSE values and high accuracy requirements, a "manual" least-squares minimization may be preferable to `polyfit`.