python numpy

Index numpy array by other array as indices


I have an array

a = [1,5,4,5,7,8,9,8,4,13,43,42]

and array

b = [3,5,6,2,7]

I want the values in b to be used as indexes into a, i.e. to build a new array that is

[a[b[0]], a[b[1]], a[b[2]], a[b[3]] ...]

So the values in b are indexes into a. There are approximately 500k entries in a and 500k in b. Is there a fast way to do this in numpy, ideally one that uses all cores? I already do it just fine with for loops, but it is very slow.

Edit to clarify: the solution has to work for 2D and 3D arrays, so maybe

b = [(2,3), (5,4), (1,2), (1,0)]

and we want

c = [a[b[0]], a[b[1]], ...]
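
For reference, the lookup described above corresponds to NumPy's integer-array indexing. The following is a minimal sketch, assuming a and b are converted to NumPy arrays; the 2D array `a2` and the index pairs `b2` are hypothetical stand-ins for the 2D case, not data from the question.

```python
import numpy as np

a = np.array([1, 5, 4, 5, 7, 8, 9, 8, 4, 13, 43, 42])
b = np.array([3, 5, 6, 2, 7])

# 1D case: integer-array indexing gathers a[b[0]], a[b[1]], ... in one vectorized call.
c = a[b]  # equivalent to np.take(a, b)

# 2D case: a2 is a hypothetical 6x5 array; each row of b2 is a (row, col) pair.
a2 = np.arange(30).reshape(6, 5)
b2 = np.array([(2, 3), (5, 4), (1, 2), (1, 0)])

# Split the pairs into one index array per axis and index with both at once.
c2 = a2[b2[:, 0], b2[:, 1]]

# The same idea for any number of dimensions: one index array per axis, as a tuple.
c2_general = a2[tuple(b2.T)]

print(c)   # [5 8 9 4 8]
print(c2)  # [13 29  7  5]
```

The gather runs in compiled code rather than a Python loop, but as far as I know NumPy does not spread it across cores, so this addresses the loop overhead rather than the multi-core part of the question.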

Solution

  • I solved this by writing a C extension to numpy called Tensor Weighted Interpolative Transfer (Twit), in order to get speed and multi-threading. In pure Python, a scale and fade across a 200x100x3 image takes 3 seconds; in multi-threaded C on 8 cores, the same operation takes 0.5 milliseconds.

    The core C code ended up looking like this:

    t2[dstidxs2[i2] + doff1] += t1[srcidxs2[i2] + soff1] * w1 * ws2[i2];
    

    Here doff1 is the offset into the destination array, soff1 is the corresponding offset into the source array, and w1 and ws2 are the interpolated weights. All the code is heavily optimized in C for speed, not for code size or maintainability.

    All code is available on https://github.com/RMKeene/twit and on PyPI.

    I expect further optimizations in the future, such as a special case for when all weights are 1.0.

    --- Dec 2024 note: I dropped Twit because it is not what I need for my AI research any more.
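
The core update line quoted in the solution can be approximated in plain NumPy. This is only a sketch under stated assumptions, not the Twit implementation: `weighted_transfer` is a hypothetical helper, the arrays are assumed to be flat 1D, and the offsets are assumed to be plain integers. The parameter names simply mirror those in the C line.

```python
import numpy as np

def weighted_transfer(t1, t2, srcidxs, dstidxs, ws, w1=1.0, soff=0, doff=0):
    """Hypothetical helper: t2[dst + doff] += t1[src + soff] * w1 * ws, vectorized."""
    # Gather the source values and apply both weights in one step.
    contributions = t1[srcidxs + soff] * w1 * ws
    # np.add.at is unbuffered, so repeated destination indices accumulate,
    # matching what the += does in a sequential C loop.
    np.add.at(t2, dstidxs + doff, contributions)
    return t2

# Small made-up example: two source elements land on the same destination slot.
t1 = np.array([1.0, 2.0, 3.0, 4.0])
t2 = np.zeros(3)
srcidxs = np.array([0, 1, 2, 3])
dstidxs = np.array([0, 0, 1, 2])
ws = np.array([0.5, 0.5, 1.0, 0.25])

weighted_transfer(t1, t2, srcidxs, dstidxs, ws)
print(t2)  # [1.5 3.  1. ]
```

A plain `t2[dstidxs] += ...` would not accumulate when a destination index appears more than once, which is why the sketch uses `np.add.at`. It still runs on a single core, so it does not replace the multi-threaded C version.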