Tags: c++, eigen, levenberg-marquardt

Understanding Levenberg-Marquardt status enumeration returns.


Problem: I have recently been tasked with designing a nonlinear solver, but my solver is not converging to the correct solution.

Expected: The `minimize(x)` method should reduce my parameter vector, x, to the minimum.

Observed: After I call `minimize(x)`, I get a status return of `RelativeErrorTooSmall`.

Question: Could someone please explain what this enumeration value means?

Documentation: The only available documentation for the Eigen Levenberg-Marquardt class is essentially its .h file. Here is its list of status enumerations:

enum Status {
    NotStarted = -2,
    Running = -1,
    ImproperInputParameters = 0,
    RelativeReductionTooSmall = 1,
    RelativeErrorTooSmall = 2,
    RelativeErrorAndReductionTooSmall = 3,
    CosinusTooSmall = 4,
    TooManyFunctionEvaluation = 5,
    FtolTooSmall = 6,
    XtolTooSmall = 7,
    GtolTooSmall = 8,
    UserAsked = 9
};

Here is a link to the header file: https://eigen.tuxfamily.org/dox/unsupported/NonLinearOptimization_2LevenbergMarquardt_8h_source.html

Here is a previous Stack Overflow question that has a test program in it: How to use the Eigen unsupported levenberg marquardt implementation?


Solution

  • Searching through the .h file reveals that this return code means the algorithm could not declare success at some step, yet made little progress in adjusting the parameters on that step:

      if (delta <= parameters.xtol * xnorm)
          return LevenbergMarquardtSpace::RelativeErrorTooSmall;

    The term xnorm is calculated on the fly by the algorithm. It is an estimate of how large the parameters tend to be in some absolute sense. (It is good practice to scale your problem so that the parameters tend to be around unity in absolute value.) parameters.xtol is a number that the user may set as a "small" difference in parameters. A typical value is the square root of the machine epsilon; indeed, that is the default value in the code.
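
    To make the tolerance concrete, here is a minimal, self-contained sketch against the unsupported NonLinearOptimization module. The `BoothFunctor` (residuals of the Booth function, minimum at x = (1, 3)) and the starting guess are my own illustration, not from the question; `lm.parameters.xtol`, `minimize`, and the status type come from the header linked above.

        #include <iostream>
        #include <Eigen/Dense>
        #include <unsupported/Eigen/NonLinearOptimization>

        // Illustrative least-squares problem: residuals of the Booth function,
        //   r0 = x0 + 2*x1 - 7,   r1 = 2*x0 + x1 - 5,
        // which vanish at the minimum x = (1, 3).
        struct BoothFunctor
        {
            int operator()(const Eigen::VectorXd &x, Eigen::VectorXd &fvec) const
            {
                fvec(0) = x(0) + 2.0 * x(1) - 7.0;
                fvec(1) = 2.0 * x(0) + x(1) - 5.0;
                return 0;
            }

            // Analytic Jacobian of the residuals (constant for this problem).
            int df(const Eigen::VectorXd &, Eigen::MatrixXd &fjac) const
            {
                fjac(0, 0) = 1.0;  fjac(0, 1) = 2.0;
                fjac(1, 0) = 2.0;  fjac(1, 1) = 1.0;
                return 0;
            }

            int inputs() const { return 2; }  // number of parameters
            int values() const { return 2; }  // number of residuals
        };

        int main()
        {
            Eigen::VectorXd x(2);
            x << 0.0, 0.0;  // starting guess

            BoothFunctor functor;
            Eigen::LevenbergMarquardt<BoothFunctor> lm(functor);

            // Default xtol is roughly sqrt(machine epsilon), ~1.5e-8 for double;
            // adjust it here if RelativeErrorTooSmall fires sooner than expected.
            lm.parameters.xtol = 1.0e-10;

            Eigen::LevenbergMarquardtSpace::Status status = lm.minimize(x);
            std::cout << "status = " << status
                      << ", x = " << x.transpose() << std::endl;
            return 0;
        }

    If the solver stops earlier than you expect, compare the returned value against the enumeration above and adjust the corresponding tolerance in `lm.parameters` before calling `minimize` again.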

    The failure to converge, assuming the library code is correct, can be due to any of the following:

    1. Too optimistic an estimate of how accurate the calculation of the function is. Try setting parameters.xtol somewhat larger. Use at least double precision. Be sure all your parameters are of approximately the same scale.

    2. The problem is not well conditioned, meaning the Hessian is much larger in some directions of parameter space than in others. Be sure your parameters are scaled well, and use at least double precision. It might be necessary to use a conditioning matrix; that is too deep to go into here, but a lightweight rescaling is sketched after this list.

    3. The calculated gradient is not a good estimate of the gradient of the loss function. If the problem is well conditioned, either the gradient calculation or the loss-function calculation is faulty. Test them against a finite-difference estimate of the gradient, as in the check sketched below.
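
    On the scaling point in items 1 and 2: one lightweight way to present unit-scale parameters to the solver, without touching the underlying model, is to wrap the functor so the solver works in scaled variables. This wrapper is my own sketch, not part of Eigen, and it assumes the functor interface shown earlier.

        #include <Eigen/Dense>

        // Hypothetical wrapper: the solver sees u, while the true parameters are
        // x = s.cwiseProduct(u), where s holds the typical magnitude of each
        // parameter. With a sensible s, every entry of u is O(1), which is the
        // regime the xtol test assumes.
        template <typename Inner>
        struct ScaledFunctor
        {
            const Inner &inner;
            Eigen::VectorXd s;  // user-chosen typical magnitudes, s(i) > 0

            int operator()(const Eigen::VectorXd &u, Eigen::VectorXd &fvec) const
            {
                return inner(s.cwiseProduct(u), fvec);
            }

            int df(const Eigen::VectorXd &u, Eigen::MatrixXd &fjac) const
            {
                int ret = inner.df(s.cwiseProduct(u), fjac);
                fjac = fjac * s.asDiagonal();  // chain rule: dr/du = (dr/dx) * diag(s)
                return ret;
            }

            int inputs() const { return inner.inputs(); }
            int values() const { return inner.values(); }
        };

    Start the solver from u0 = x0.cwiseQuotient(s), and recover the answer as x = s.cwiseProduct(u).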
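
    For item 3, a generic finite-difference check can be written once and pointed at any functor of this shape. The helper below is illustrative, not part of Eigen; note that the unsupported modules also ship a NumericalDiff wrapper that can stand in for a hand-written df.

        #include <Eigen/Dense>

        // Illustrative helper: largest absolute discrepancy between the analytic
        // Jacobian (functor.df) and a central finite-difference estimate built
        // from the residual function (functor.operator()).
        template <typename Functor>
        double maxJacobianError(const Functor &functor, Eigen::VectorXd x,
                                double h = 1.0e-6)
        {
            const int m = functor.values();
            const int n = functor.inputs();

            Eigen::MatrixXd analytic(m, n);
            functor.df(x, analytic);

            Eigen::MatrixXd numeric(m, n);
            Eigen::VectorXd fPlus(m), fMinus(m);
            for (int j = 0; j < n; ++j)
            {
                const double xj = x(j);
                x(j) = xj + h;  functor(x, fPlus);
                x(j) = xj - h;  functor(x, fMinus);
                x(j) = xj;      // restore the perturbed coordinate
                numeric.col(j) = (fPlus - fMinus) / (2.0 * h);
            }
            return (analytic - numeric).cwiseAbs().maxCoeff();
        }

    For a smooth, unit-scale problem, a discrepancy much larger than about 1e-8 suggests a bug in either df or the residual code.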

    I have my own well-tested and very fast solvers. I would love to get in touch with you, but SO is not keen on that.