python, scipy, delaunay, sungridengine, spatial-interpolation

First call to interpolator constructed by LinearNDInterpolator (with precomputed triangulation) slow regardless of grid size or values


I am using SciPy to interpolate several fields defined on a large irregular grid onto a regular one. I have precomputed the Delaunay triangulation of the grid and pickled it. Constructing the first LinearNDInterpolator is fast, but the first evaluation of that interpolator is orders of magnitude slower than every subsequent evaluation. This holds regardless of which field I am interpolating and is relatively insensitive to the size of the target grid (a smaller target grid makes every call faster, but the first call is still vastly slower than the rest).
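For reference, the pattern I am describing — triangulate once, pickle, then build one interpolator per field on the same triangulation — can be sketched minimally like this (toy grid sizes and field values are made up, not my real data):

```python
import pickle
import numpy as np
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator

# Toy irregular grid (stand-in for the real 17,000 x 14,000 one)
rng = np.random.default_rng(0)
pts = rng.random((200, 2))

# Precompute the triangulation once and pickle it for reuse
tri = Delaunay(pts)
blob = pickle.dumps(tri)

# Later: unpickle and build an interpolator for each field on the same grid
tri2 = pickle.loads(blob)
field = np.sin(pts[:, 0]) + pts[:, 1]
interp = LinearNDInterpolator(tri2, field)

# Evaluate on a small regular target grid
xs, ys = np.meshgrid(np.linspace(0.2, 0.8, 17), np.linspace(0.2, 0.8, 14))
vals = interp(xs, ys)
```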

Here's all the code; excuse the detritus from my debugging:

import xarray as xr
import os
import matplotlib.pyplot as plt
import gsw
from multiprocessing.pool import ThreadPool 
import dask

import numpy as np
from dask.diagnostics import ProgressBar
from scipy.interpolate import griddata
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator

import matplotlib.tri as mtri
import time
import pickle



def regrid(tri,fnames):
        ds1 = xr.open_dataset(fnames[0],chunks={"xc":10,"yc":10})
        ds2 = xr.open_dataset(fnames[1],chunks={"xc":100})
        ds3 = xr.open_dataset(fnames[2],chunks={"xc":100})
        ds4 = xr.open_dataset(fnames[3],chunks={"xc":100})
        fullds = xr.combine_nested([ds1,ds2,ds3,ds4], concat_dim=["x"])
        im = fullds.sel(zc=52,drop=True)
        maskflat = np.logical_and(im.yc.values>-70,-im.depth.values<ds1.rf.values[52]).flatten()
        del(fullds)
        del(ds1)
        del(ds2)
        del(ds3)
        del(ds4)
        xmin,xmax = np.nanmin(im.xc),np.nanmax(im.xc)
        ymin,ymax = np.nanmin(im.yc),np.nanmax(im.yc)
        xs=np.linspace(xmin,xmax,17)
        ys=np.linspace(-70,-60,14)
        newxs , newys = np.meshgrid(xs,ys)
        uflat = im.u.values.flatten()
        vflat = im.v.values.flatten()
        xy = (im.xc.values.flatten()[maskflat],im.yc.values.flatten()[maskflat])

        print("starting v")
        valid = ~np.isnan(vflat)  # keep the mask before vflat is overwritten below
        vflat = vflat[valid]
        start = time.time()
        interpolator = LinearNDInterpolator(tri, vflat)
        #print("interpolator constructed")
        newv = interpolator(newxs, newys)
        end = time.time()
        print("newv done: ", end - start)

        print("starting u")
        start = time.time()
        interpolator = LinearNDInterpolator(tri, uflat[valid])
        print("interpolator constructed")
        newu = interpolator(newxs, newys)
        end = time.time()
        print("newu done: ", end - start)

        print("starting mask")
        start = time.time()
        interpolator = LinearNDInterpolator(tri, maskflat[valid])
        maskflat = interpolator(newxs, newys)
        end = time.time()
        print("mask done: ", end - start)
        print("mask")

        ds = xr.Dataset(
            data_vars=dict(
                u=(["y", "x"], newu),
                v=(["y", "x"], newv),
                mask=(["y", "x"], maskflat),
            ),
            coords=dict(
                x=xs,
                y=ys,
            ),
            attrs=dict(description=" by the slice"),
        )
        outname = os.path.basename(fnames[0])
        ds.to_netcdf(outname.replace(".nc",'-slice.nc'))

with open("data/tri.pickle","rb") as f:
    tri = pickle.load(f)

#File names redacted
fnames = ["1","2","3","4"]
regrid(tri,fnames)
#File names redacted
fnames = ["a","b","c","d"]
regrid(tri,fnames)

The output of executing this is:

starting v
newv done:  570.6315619945526
starting u
interpolator constructed
newu done:  0.6614155769348145
starting mask
mask done:  0.4872896671295166
mask
starting v
newv done:  0.1620798110961914
starting u
interpolator constructed
newu done:  0.5560829639434814
starting mask
mask done:  0.47842955589294434
mask

I would have expected every call to the linear interpolator to be very fast, since the triangulation has already been computed. What especially confuses me is that I expected the first call to the interpolator in each invocation of regrid to be slow (my first guess was a memory-management problem, hence the dels), but in fact only the very first interpolator call in the first regrid is slow.

Some extra context:

The original grid is quite large (17,000 x 14,000) and contains quite a few NaNs. You'll notice that for debugging purposes I am interpolating onto a 17x14 grid, and the first call still takes ten minutes. I am running this on an SGE cluster across 14 cores.


Solution

  • From https://github.com/scipy/scipy/issues/8856:

    On first use, the interpolator computes some additional data structures, which are cached on the interpolator object.

    This appears to explain the behavior above: the expensive lazily computed structures (the barycentric coordinate transforms, exposed as the cached Delaunay.transform attribute) live on the triangulation, and since the same precomputed tri is shared by every interpolator in both regrid calls, only the very first evaluation in the whole run pays the cost.
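One way to verify this, and to move the one-time cost out of the first evaluation, is to touch tri.transform before building any interpolator; accessing it forces the lazy barycentric-transform computation up front (this is a sketch, assuming the lazy work lives on Delaunay.transform as described above):

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator

rng = np.random.default_rng(0)
pts = rng.random((5000, 2))
tri = Delaunay(pts)

# Accessing tri.transform triggers the lazy barycentric-transform
# computation and caches it on the Delaunay object, so no interpolator
# built from tri pays that cost on its first call.
_ = tri.transform

# Two fields on the same triangulation, as in regrid()
f = LinearNDInterpolator(tri, pts[:, 0] + pts[:, 1])
g = LinearNDInterpolator(tri, pts[:, 0] * pts[:, 1])

xs, ys = np.meshgrid(np.linspace(0.3, 0.7, 5), np.linspace(0.3, 0.7, 5))
a = f(xs, ys)  # linear field: interpolation is exact inside the hull
b = g(xs, ys)
```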