python scipy statistics curve-fitting

Gauss Fitting data with scipy and getting strange answers on Fit Quality


I have gamma spectra and I am fitting Gaussians to the peaks using Python with scipy. This works well, but when I try to get a number for the fit quality (with the intent of automating some of this), I get very odd numbers.

The scipy command is:

response = scipy.optimize.curve_fit(gauss, e, c, param0, full_output=True, absolute_sigma=True, method=FitMethod, bounds=bounds)

I get the fit quality from:

fitQuality = np.linalg.cond(response[1])     # PCOV is response[1]

FitQuality is a value from zero (excellent fit) up to infinity (beyond lousy). As an example, here is a fit (shaded green) to the K-40 gamma line in a CsI detector.

[Figure: Gauss fit to the K-40 gamma line from a CsI detector]

Given the jittery nature of the data, I am pleased with the result. And so is scipy, giving it a FitQuality rating of 18.

Next picture shows a fit (shaded red) to a Na-22 line in a High-Resolution Ge detector.

[Figure: Gauss fit to the Na-22 line from a high-resolution Ge detector]

Again, I am very pleased with the result. But scipy is giving it a FitQuality rating of 56,982,136; this means very, very poor.

This does not make sense. The fit is nearly perfect!

Is my FitQuality formula inappropriate? Does it need additions? Am I completely off the mark? Please, help.


Solution

  • The formula you have is useful, but it is not a direct measure of the quality of the fit. It is related to the "uncertainty" of the fit, and the condition number specifically is useful for diagnosing problems that may have occurred during the fit. For instance, I've seen other SO posts where `curve_fit` returns wild-looking parameter values. In those cases the condition number turns out to be astronomical, so the covariance matrix is essentially singular. The cause is that the callable is over-parameterized: the parameters are not independent, and some of them can be either removed or combined. Even in these cases, the fit may look perfect.

    See, for example, "good r2 score but huge parameter uncertainty". There the fit looks great, but the condition number of `pcov` would turn out to be huge. This behaviour is described in the `curve_fit` documentation.

    Something that correlates better with visual "quality" is the sum of squared errors, that is, the objective function of least-squares curve fitting. The covariance, on the other hand, is related to the Hessian (roughly, the curvature) of the objective function with respect to the fitting parameters. So it is not a measure of the "quality" of the fit, but of how that quality changes with respect to the parameters.
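To make the distinction concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration: the three-parameter `gauss`, the deliberately redundant `gauss_redundant` (only the product `a*b` matters), the noise level, and the starting values. None of it is taken from your spectra. It computes the sum of squared errors and the reduced chi-square as residual-based quality numbers, and compares them with the condition number of `pcov` for both models.

```python
import numpy as np
from scipy.optimize import curve_fit


def gauss(x, amp, mu, sigma):
    """Three-parameter Gaussian peak (hypothetical stand-in for the question's `gauss`)."""
    return amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


def gauss_redundant(x, a, b, mu, sigma):
    """Same curve, but only the product a*b matters, so a and b are not independent."""
    return a * b * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


# Synthetic peak with Gaussian noise (made-up numbers, not real detector data).
rng = np.random.default_rng(0)
x = np.linspace(-5.0, 5.0, 201)
noise_sd = 2.0
y = gauss(x, 100.0, 0.3, 1.2) + rng.normal(0.0, noise_sd, x.size)
sigma_y = np.full_like(x, noise_sd)  # assumed known per-point uncertainty

# Well-posed model.
popt, pcov = curve_fit(gauss, x, y, p0=[80.0, 0.0, 1.0],
                       sigma=sigma_y, absolute_sigma=True, method="trf")
residuals = y - gauss(x, *popt)
sse = np.sum(residuals**2)                 # least-squares objective
chi2 = np.sum((residuals / sigma_y) ** 2)  # residuals weighted by uncertainty
dof = x.size - popt.size                   # degrees of freedom
red_chi2 = chi2 / dof                      # near 1 when sigma_y is realistic
cond_good = np.linalg.cond(pcov)

# Over-parameterized model fitted to the same data.
popt_r, pcov_r = curve_fit(gauss_redundant, x, y, p0=[8.0, 10.0, 0.0, 1.0],
                           sigma=sigma_y, absolute_sigma=True, method="trf")
chi2_r = np.sum(((y - gauss_redundant(x, *popt_r)) / sigma_y) ** 2)
red_chi2_r = chi2_r / dof                  # same dof: only 3 parameters are effective
cond_redundant = np.linalg.cond(pcov_r)

print("SSE (well-posed):", sse)
print("reduced chi-square:", red_chi2, "vs redundant:", red_chi2_r)
print("cond(pcov):", cond_good, "vs redundant:", cond_redundant)
```

Both models should describe the data about equally well in terms of residuals, while the condition number of `pcov` differs by many orders of magnitude. Note that the reduced chi-square is only meaningful if `sigma_y` reflects the real uncertainties; for counting data a common assumption is the square root of the counts, but that is a choice you would have to check for your detectors.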