pythonarraysnumpyindexingnumpy-indexing

Fast numpy row slicing on a matrix


I have the following issue: I have a matrix yj of size (m,200) (m = 3683), and I have a dictionary that for each key, returns a numpy array of row indices for yj (for each key, the size array changes, just in case anyone is wondering).

Now, I have to access this matrix lots of times (around 1M times) and my code is slowing down because of the indexing (I've profiled the code and it takes 65% of time on this step).

Here is what I've tried out:

  1. First of all, use the indices for slicing:
>> %timeit yj[R_u_idx_train[1]]
10.5 µs ± 79.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

The variable R_u_idx_train is the dictionary that has the row indices.

  1. I thought that maybe boolean indexing might be faster:
>> yj[R_u_idx_train_mask[1]]
10.5 µs ± 159 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

R_u_idx_train_mask is a dictionary that returns a boolean array of size m where the indices given by R_u_idx_train are set to True.

  1. I also tried np.ix_
>> cols = np.arange(0,200)
>> %timeit ix_ = np.ix_(R_u_idx_train[1], cols); yj[ix_]
42.1 µs ± 353 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
  1. I also tried np.take
>> %timeit np.take(yj, R_u_idx_train[1], axis=0)
2.35 ms ± 88.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

And while this seems great, it is not, since it gives an array that is shape (R_u_idx_train[1].shape[0], R_u_idx_train[1].shape[0]) (it should be (R_u_idx_train[1].shape[0], 200)). I guess I'm not using the method correctly.

  1. I also tried np.compress
>> %timeit np.compress(R_u_idx_train_mask[1], yj, axis=0)
14.1 µs ± 124 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
  1. Finally I tried to index with a boolean matrix
>> %timeit yj[R_u_idx_train_mask2[1]]
244 µs ± 786 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

So, is 10.5 µs ± 79.7 ns per loop the best I can do? I could try to use cython but that seems like a lot of work for just indexing...

Thanks a lot.


Solution

  • A very smart solution was given by V.Ayrat in the comments.

    >> newdict = {k: yj[R_u_idx_train[k]] for k in R_u_idx_train.keys()}
    >> %timeit newdict[1]
    202 ns ± 6.7 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
    

    Anyway maybe it would still be cool to know if there is a way to speed it up using numpy!