Tags: python, numpy, scipy, interpolation, bilinear-interpolation

Finding nearest points in scattered data


I am struggling to speed up the interpolation of a large dataset, which I am currently interpolating with gridfit. I have already posted a question about it on Stack Overflow but haven't received a response.

So I am thinking of trying an alternative approach. Suppose I have a huge dataset, as generated by the Python snippet below:

    import numpy as np

    arr_len = 932826
    xi = np.random.uniform(low=0, high=4496, size=arr_len)
    yi = np.random.uniform(low=-74, high=492, size=arr_len)
    zi = np.random.uniform(low=-30, high=97, size=arr_len)

I need to interpolate and get the values at given points, say (x, y). What would be the quickest way to find the 4 neighbouring points in the scattered data xi, yi and zi, so that a bilinear interpolation could be performed using interp2d (see image below)? I don't know whether this would be faster than using griddata, but it would be nice to try it out.

[image]


Solution

  • I think what you have in mind is essentially nearest-neighbors regression: for each query point, find the k closest data points and average their z-values. Here's how you could do this with scikit-learn. Note that using 4 neighbors is an arbitrary choice, so you could also try other values.

    import numpy as np
    from sklearn.neighbors import KNeighborsRegressor
    
    arr_len = 932826
    np.random.seed(42)
    xi = np.random.uniform(low=0, high=4496, size=arr_len)
    yi = np.random.uniform(low=-74, high=492, size=arr_len)
    zi = np.random.uniform(low=-30, high=97, size=arr_len)
    
    # points to get z-values for (e.g.):
    x_new = [100, 500, 2000]
    y_new = [400, 300, 100]
    
    # in machine learning notation:
    X_train = np.vstack([xi, yi]).T
    y_train = zi
    X_predict = np.vstack([x_new, y_new]).T
    
    # fit 4-nearest neighbors regressor to the training data
    neigh = KNeighborsRegressor(n_neighbors=4)
    neigh.fit(X_train, y_train)
    
    # get "interpolated" z-values
    print(neigh.predict(X_predict))
    
    # prints: [39.37712018  4.36600728 47.00192216]
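
If you want the 4 neighbouring points themselves rather than only the averaged prediction, a KD-tree returns their indices and distances directly. The following is a minimal sketch, assuming SciPy is available. It uses `scipy.spatial.cKDTree` and then applies inverse-distance weighting, which is my own substitution and not true bilinear interpolation: the 4 nearest points are not guaranteed to surround the query point. The names `eps`, `z_idw` and `z_mean` are illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

# same synthetic data as in the question (seeded for reproducibility)
rng = np.random.default_rng(42)
arr_len = 932826
xi = rng.uniform(low=0, high=4496, size=arr_len)
yi = rng.uniform(low=-74, high=492, size=arr_len)
zi = rng.uniform(low=-30, high=97, size=arr_len)

# points to get z-values for (same illustrative values as above)
x_new = np.array([100.0, 500.0, 2000.0])
y_new = np.array([400.0, 300.0, 100.0])

# build the tree once; it can be reused for any number of queries
tree = cKDTree(np.column_stack([xi, yi]))

# dist and idx both have shape (n_queries, 4), ordered by increasing distance
dist, idx = tree.query(np.column_stack([x_new, y_new]), k=4)

# inverse-distance weights; eps (an assumed small constant) avoids division
# by zero when a query point coincides with a data point
eps = 1e-12
w = 1.0 / (dist + eps)
z_idw = (w * zi[idx]).sum(axis=1) / w.sum(axis=1)

# plain mean of the 4 neighbours, i.e. uniform weighting
z_mean = zi[idx].mean(axis=1)

print(z_idw)
print(z_mean)
```

Here `xi[idx]`, `yi[idx]` and `zi[idx]` give the coordinates and values of the 4 neighbours of each query point, so you can plug in whatever weighting scheme you prefer. The uniform mean should correspond to what `KNeighborsRegressor` computes with its default `weights="uniform"`; passing `weights="distance"` to the regressor gives a distance-weighted variant without leaving scikit-learn.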