pythonnumpyio

How to save numpy Polynomial fit


Given a fitted polynomial using the numpy Polynomial convenience class (>1.14)

p = np.polynomial.Polynomial.fit(...)

what is the best way of saving and loading this fit?

There are three arguments which are needed to reproduce the object, namely p.coef, p.domain and p.window. See this question for what happens when only the coefficients are loaded.

One could save all three as different files and re-load them, but that's cumbersome and annoying.

My two initial ideas would be to save the entire object, either as a 0d-array

np.save("polyfit.npy", p, allow_pickle=True)
np.load("polyfit.npy", allow_pickle=True).item()

or as a 1d-array, as I honestly had to look up .item() for accessing the element.

np.save("polyfit.npy", [p], allow_pickle=True)
np.load("polyfit.npy", allow_pickle=True)[0]

Both of these ways require using the pickle module, which causes cross-version / platform issues and is a potential security issue. So I'd rather avoid using it.
Are there any recommended ways of doing this? The documentation doesn't seem to contain any good hint

https://numpy.org/doc/stable/reference/routines.polynomials.html
https://numpy.org/doc/stable/reference/routines.polynomials.classes.html


Solution

  • Make a sample poly series:

    In [432]: x=np.linspace(0,1,10); y = x**3
    In [433]: p = np.polynomial.Polynomial.fit(x,x**3, 3); p
    Out[433]: 
    Polynomial([0.125, 0.375, 0.375, 0.125], domain=[0., 1.], window=[-1.,  1.], symbol='x')
    

    So it's a class instance with several attributes, which are arrays and string. And we can make new fitted array with p(x):

    In [434]: np.allclose(x**3,p(x))
    Out[434]: True
    

    As you note we can save and load it, allowing pickle:

    In [436]: np.save('test.npy', p); pp = np.load('test.npy',allow_pickle=True)
    
    In [437]: pp
    Out[437]: 
    array(Polynomial([0.125, 0.375, 0.375, 0.125], domain=[0., 1.], window=[-1.,  1.], symbol='x'),
          dtype=object)
    

    This is a single item object dtype array, containing the poly series instance. We get at that instance with .item():

    In [438]: np.allclose(x**3,pp.item()(x))
    Out[438]: True
    

    We can recreate a poly series using the 3 attributes:

    In [439]: np.polynomial.Polynomial(p.coef,p.domain, p.window)
    Out[439]: 
    Polynomial([0.125, 0.375, 0.375, 0.125], domain=[0., 1.], window=[-1.,  1.], symbol='x')
    

    And do the same with the attributes of pp.item()

    In [440]: p1=pp.item(); np.polynomial.Polynomial(p1.coef,p1.domain, p1.window)
    Out[440]: 
    Polynomial([0.125, 0.375, 0.375, 0.125], domain=[0., 1.], window=[-1.,  1.], symbol='x')
    

    An alternate save is with savez, which saves multiple npy files in a zip archive:

    In [441]: np.savez('test.npz', **vars(p))
    In [442]: dd = np.load('test.npz')
    In [443]: list(dd.keys())
    Out[443]: ['coef', 'domain', 'window', '_symbol']
    

    dd isn't a poly series, but the 3 keys can be used to recreate one:

    In [444]: np.polynomial.Polynomial(dd['coef'],dd['domain'], dd['window'])
    Out[444]: 
    Polynomial([0.125, 0.375, 0.375, 0.125], domain=[0., 1.], window=[-1.,  1.], symbol='x')
    

    This does't use pickle since the 3 keys are arrays.

    We don't have to worry about _symbol, since it's a python string in the original, and an char array in the load:

    In [471]: p._symbol, dd['_symbol']
    Out[471]: ('x', array('x', dtype='<U1'))