Tags: numpy, numpy-ndarray, svd, convergence

polynomial extrapolation using numpy


I have a signal that I want to fit with a polynomial and then use that polynomial expression for extrapolation. Originally, my x axis range is 192.1-196 THz, and I have corresponding y axis values for this range. But I want to plot the signal from 191.325-196.125 THz, which means I need to predict y axis values outside my original x axis range. Here is my code:

import numpy as np
import matplotlib.pyplot as plt
import numpy.polynomial.polynomial as poly




y=[9.996651009,10.13327408,10.27003353,10.43754263,10.60523071,10.71136134,10.81753489,10.92299416,11.02853271,11.10093835,11.17349619,11.16565259,11.15787539,11.16191795,11.16626826,11.16087842,11.15561384,11.16940196,11.18346164,11.22434472,11.26537218,11.35981126,11.45427174,11.6102862,11.76633923,11.89058563,12.01486743,12.21133227,12.40782023,12.55682327,12.7059072,12.88910065,13.07234341,13.20139753,13.33048759,13.46960371,13.60875466,13.71956075,13.83042354,13.91157425,13.99277241,14.08548465,14.17824161,14.26091536,14.34364144,14.43504833,14.52651725,14.59548578,14.66449439,14.73751377,14.81058218,14.88687742,14.96322511,15.038278,15.11342222,15.1644832,15.21556914,15.31798775,15.42044285,15.51000031,15.59959896,15.68089263,15.76229354,15.86213633,15.96214484,16.0269976,16.0923806,16.16483146,16.23762336,16.24233076,16.24719576,16.26116812,16.27543692,16.28311024,16.29128992,16.23458771,16.17830522,16.12636211,16.07478778]

y.reverse()


# x axis for polyfit
i = 192.1e12
x=[]
while i <= 196e12:
  x.append(i)
  i += 50e9

plt.plot(x, y)


# x axis for extrapolation
i1= 191.325e12
x2=[]
while i1 <= 196.125e12:

  x2.append(i1)
  i1 += 50e9


z = poly.polyfit(x, y, 20)
f = poly.polyval(x, z)

plt.plot(x2, f(x2), linestyle='--')

Problems:

1. The error I am getting: 'numpy.ndarray' object is not callable, raised at the plt.plot(x2, f(x2), linestyle='--') line. I cannot find a workaround.

2. If I ask for more than 21 coefficients, the poly.polyfit call fails with: SVD did not converge in Linear Least Squares. I basically have no idea what causes this.

Can anyone give me some ideas? Thank you.


Solution

  • It is true that the polynomial will go crazy. However, I do want to understand this error message in general, and I would also like to compare the accuracy of different orders of polynomial fitting.

    It's useful to take a moment and think about what intermediate values the polynomial fitting code works with while it is fitting your polynomial. Your X values are on the order of 1.9 * 10^14. The 22nd power of that is about 2.8 * 10^314, which is bigger than the maximum representable double-precision float (about 1.8 * 10^308).
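
    You can check both numbers directly in NumPy; powers this large overflow to inf, which is the likely reason the least-squares SVD fails:

    import numpy as np
    
    # The largest value a double-precision float can hold
    print(np.finfo(np.float64).max)   # 1.7976931348623157e+308
    
    # The 22nd power of an x value near the top of your range overflows
    print(np.float64(1.96e14) ** 22)  # inf, with a RuntimeWarning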

    Interestingly, even though your X^20 term does fit into the range of a float, the fit still suffers from floating-point precision problems. Here are the coefficients it fits for a degree 20 polynomial:

    >>> print(z)
    [ 7.01389718e+008 -7.23272471e-006 -6.18262251e-021  1.28113394e-034
      6.59034262e-049 -5.97742101e-066 -1.75197282e-077 -9.00617222e-092
      1.16972981e-106  3.58691274e-120 -9.24694178e-135  0.00000000e+000
      0.00000000e+000  0.00000000e+000  0.00000000e+000  0.00000000e+000
      0.00000000e+000  0.00000000e+000  0.00000000e+000  0.00000000e+000
      0.00000000e+000]
    

    In other words, the coefficients it fits for everything beyond X^10 are zero. This is happening due to floating point precision problems.
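
    One way to make this concrete is to look at the condition number of the Vandermonde matrix that underlies the least-squares fit. A small sketch (the grid construction mirrors the while loops above):

    import numpy as np
    import numpy.polynomial.polynomial as poly
    
    x = 192.1e12 + 50e9 * np.arange(79)   # the 79-point fitting grid
    
    # polyfit solves a least-squares system against this matrix; its
    # columns range from 1 up to roughly 10^286.
    V = poly.polyvander(x, 20)
    print(np.linalg.cond(V))              # astronomically large, possibly inf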

    How can this be fixed? You could take inspiration from machine-learning practitioners and standardize your inputs. "Standardization" refers to subtracting the mean and dividing by the standard deviation of the input. Since this is an affine transformation of x, a degree-n polynomial in the scaled variable is mathematically equivalent to a degree-n polynomial in the original variable, so nothing is lost while the overflow problems are avoided. When I do this, the coefficient it chooses for X^22 is -0.11, which is the equivalent of -5.3 * 10^-316 without standardization, a number at the very edge of what a double can represent.
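
    If you would rather not depend on scikit-learn, the same transformation is a couple of lines of plain NumPy. A minimal sketch; the key point is to reuse the mean and standard deviation of the fitting range when scaling the extrapolation grid:

    import numpy as np
    
    x = 192.1e12 + 50e9 * np.arange(79)     # fitting grid, 192.1-196 THz
    x2 = 191.325e12 + 50e9 * np.arange(97)  # extrapolation grid
    
    mu, sigma = x.mean(), x.std()           # statistics of the fitting range only
    x_rescaled = (x - mu) / sigma           # now roughly in [-1.7, 1.7]
    x2_rescaled = (x2 - mu) / sigma         # same transform, same mu and sigma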

    Here is the code I used:

    import numpy as np
    import matplotlib.pyplot as plt
    import numpy.polynomial.polynomial as poly
    from sklearn.preprocessing import StandardScaler
    
    
    
    
    y=[9.996651009,10.13327408,10.27003353,10.43754263,10.60523071,10.71136134,10.81753489,10.92299416,11.02853271,11.10093835,11.17349619,11.16565259,11.15787539,11.16191795,11.16626826,11.16087842,11.15561384,11.16940196,11.18346164,11.22434472,11.26537218,11.35981126,11.45427174,11.6102862,11.76633923,11.89058563,12.01486743,12.21133227,12.40782023,12.55682327,12.7059072,12.88910065,13.07234341,13.20139753,13.33048759,13.46960371,13.60875466,13.71956075,13.83042354,13.91157425,13.99277241,14.08548465,14.17824161,14.26091536,14.34364144,14.43504833,14.52651725,14.59548578,14.66449439,14.73751377,14.81058218,14.88687742,14.96322511,15.038278,15.11342222,15.1644832,15.21556914,15.31798775,15.42044285,15.51000031,15.59959896,15.68089263,15.76229354,15.86213633,15.96214484,16.0269976,16.0923806,16.16483146,16.23762336,16.24233076,16.24719576,16.26116812,16.27543692,16.28311024,16.29128992,16.23458771,16.17830522,16.12636211,16.07478778]
    
    y.reverse()
    
    
    # x axis for polyfit
    i = 192.1e12
    x=[]
    while i <= 196e12:
        x.append(i)
        i += 50e9
    
    plt.plot(x, y)
    
    
    # x axis for extrapolation
    i1= 191.325e12
    x2=[]
    while i1 <= 196.125e12:
    
        x2.append(i1)
        i1 += 50e9
    
    x = np.array(x)
    x2 = np.array(x2)
    
    # Standardize: fit the scaler on the original fitting range only, then
    # apply the same transformation to both grids.
    scaler = StandardScaler()
    scaler.fit(x.reshape(-1, 1))
    x_rescaled = scaler.transform(x.reshape(-1, 1))
    x2_rescaled = scaler.transform(x2.reshape(-1, 1))
    
    # Fit and evaluate in the scaled coordinates. Note that polyval returns
    # an array of y values, not a callable, so it is plotted directly.
    z = poly.polyfit(x_rescaled.reshape(-1), y, 20)
    f = poly.polyval(x2_rescaled.reshape(-1), z)
    
    plt.plot(x2, f, linestyle='--')
    plt.ylim([7, 20])
    

    [Plot: the data with the 20th-degree polynomial fit, extrapolated over 191.325-196.125 THz]
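
    As an aside, numpy's object-oriented polynomial interface can do this rescaling for you: Polynomial.fit maps the x data into the window [-1, 1] by default before fitting, and it returns a callable object, which also sidesteps the 'numpy.ndarray' object is not callable error from the question. A sketch, reusing x, x2 and y from the script above:

    from numpy.polynomial import Polynomial
    
    # Polynomial.fit standardizes x into [-1, 1] internally (see its
    # domain/window arguments), so no manual scaling is needed.
    p = Polynomial.fit(x, y, 20)
    
    # The returned Polynomial object is callable.
    plt.plot(x2, p(x2), linestyle='--')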

    I should mention that extrapolating with a 20th-degree polynomial is likely to be useless outside the region you fit. A polynomial of that degree is extremely sensitive to small changes in the data.

    For example, I added random noise of magnitude 0.01, then re-fit the polynomial. I repeated this process nine times. Visually, you can barely see the noise, but the noise has an enormous effect on the polynomial fit. Half of the resulting polynomials point up, and half of them point down.
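
    A sketch of that experiment, assuming x_rescaled, x2, x2_rescaled and y from the script above (the noise scale 0.01 matches the magnitude described):

    rng = np.random.default_rng(0)
    
    for _ in range(9):
        # Noise that is barely visible at plot scale...
        y_noisy = np.asarray(y) + rng.normal(0, 0.01, size=len(y))
        # ...then re-fit and extrapolate on the same standardized grid.
        z_i = poly.polyfit(x_rescaled.reshape(-1), y_noisy, 20)
        plt.plot(x2, poly.polyval(x2_rescaled.reshape(-1), z_i), alpha=0.5)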

    [Plot: nine re-fits after adding tiny noise; the curves diverge wildly outside the fitted range]

    See also: "Why is the use of high order polynomials for regression discouraged?"