python numpy

How to make Numpy.where() return only the first match?


I'm trying to optimize the performance of a script that is full of calls to NumPy's where(), after which only the first returned element is actually used. Example:

F = np.where(Y>p/100)[0]

For the huge data sets that we are processing, creating a large array and then discarding all but the first element doesn't look like a good solution, in terms of either speed or memory consumption. Is there any way to skip the overhead, maybe by tweaking the condition?


Solution

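  • You can use np.argmax when you only need the first match. Applied to a boolean array, it returns the index of the first True element. If no element satisfies the condition it returns 0, so check the value at that index before using it: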

    idx = np.argmax(Y > p/100)
    if Y[idx] > p/100:
        F = idx
    else:
        F = None
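
For reuse, the same check can be wrapped in a small helper. This is only a minimal sketch: the function name first_index_above and the sample data are made up for illustration, and it assumes Y is a non-empty 1-D array. Note that the comparison still builds a full boolean array, so this mainly avoids materializing the index array that where() would return.

```python
import numpy as np

def first_index_above(values, threshold):
    """Return the index of the first element > threshold, or None if there is none.

    Hypothetical helper; assumes `values` is a non-empty 1-D NumPy array.
    """
    mask = values > threshold      # the full boolean array is still created here
    idx = int(np.argmax(mask))     # index of the first True, or 0 if all are False
    # argmax cannot signal "no match", so confirm the hit explicitly.
    return idx if mask[idx] else None

# Illustrative data, not taken from the question.
Y = np.array([0.1, 0.4, 0.7, 0.9])
p = 50
print(first_index_above(Y, p / 100))   # 2
print(first_index_above(Y, 2.0))       # None
```

Returning None for "no match" mirrors the snippet above; a caller that prefers an integer sentinel such as -1 could swap that in.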