python, arrays, numpy, parallel-processing, numpy-ufunc

How to efficiently apply a function to a NumPy array view in-place?


I have a view of a NumPy array. I want to apply a function to each of its elements and save the result into said view (essentially, I want to do an in-place map). I do not want to use for loops, because they do not benefit from any NumPy optimizations/parallelization. I also cannot do something like arr = map(fn, arr), since it creates a new object.


Solution

  • You aren't the first to ask about applying a scalar function to all elements of an array. That comes up often. Just search for the use of "vectorize" in SO questions.

    By stressing that this is a view, I assume you want to make sure that the changes apply to the corresponding elements of the base array. Others stress in-place because they think this will save memory or be faster.

    If the array is 2d, then the old-fashioned nested loop

     for i in range(arr.shape[0]):
         for j in range(arr.shape[1]):
             arr[i, j] = func(arr[i, j])
    

    takes care of the in-place requirement.
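
    As a minimal sketch (the base array, the slice, and func here are just placeholders for illustration):

     import numpy as np

     def func(x):                        # any scalar function
         return x * 2 + 1

     base = np.arange(12).reshape(3, 4)
     arr = base[1:, :2]                  # arr is a view of base

     for i in range(arr.shape[0]):
         for j in range(arr.shape[1]):
             arr[i, j] = func(arr[i, j])

     print(base)                         # the viewed elements of base changed too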

    If you can rework func to work with a 1d array, you can do

    for i in range(arr.shape[0]):
        arr[i, :] = func(arr[i, :])
    

    func will produce a new array, but those values can be copied back into arr[i,:] (and arr.base) without problem.
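
    A rough sketch of that pattern (assuming func has been reworked to take and return a 1d row; the row * 2 + 1 body is just a stand-in):

     def func(row):
         return row * 2 + 1              # fully vectorized 1d version

     for i in range(arr.shape[0]):
         arr[i, :] = func(arr[i, :])     # new row array copied back into the view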

    The ideal, speed-wise, is a function that works on the whole nd array, using operators and numpy functions. That's fastest, but it produces a temporary array that then has to be copied back into the view (though the out parameter of a ufunc can help avoid that).

     arr[:] = func(arr)
    
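    When func is built from ufuncs, out lets the result be written straight into the view instead of into a temporary. A small sketch (np.multiply and np.add are just stand-ins for whatever func computes):

     np.multiply(arr, 2, out=arr)        # ufunc writes its result directly into arr (and arr.base)
     np.add(arr, 1, out=arr)             # same idea for any binary ufunc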

    Even arr[indx] += 1 uses a temporary buffer, which can be a problem if indx has duplicate indices. For that case, ufuncs have an at method that performs unbuffered, in-place iteration.

    https://numpy.org/doc/stable/reference/generated/numpy.ufunc.at.html
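
    A small sketch of the difference (indx here is just an illustrative index array with a repeat):

     indx = np.array([0, 0, 1])

     a = np.zeros(3)
     a[indx] += 1                        # buffered: a is [1., 1., 0.], the duplicate is lost

     b = np.zeros(3)
     np.add.at(b, indx, 1)               # unbuffered: b is [2., 1., 0.]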

    There aren't many numpy operations that work in-place. Most produce a new array. It's easier to create a building-block language that way.

    There are some tools that "streamline" iterating on an array - np.vectorize, np.frompyfunc, np.nditer, and (my least favorite) np.apply_along_axis - but they don't offer any real performance enhancement, even though questions about them come up often. Working in-place is trickier with these functions.
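
    For example, np.vectorize just wraps a Python-level loop, and its result still has to be assigned back into the view:

     vfunc = np.vectorize(func)          # convenience wrapper, not a speed-up
     arr[:] = vfunc(arr)                 # copy the new array back into the view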

    But for real performance you have to use a tool that compiles your function, such as numba or cython. There are lots of SO questions about those.
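
    A rough sketch of the numba route (assuming numba is installed and func's body can be compiled in nopython mode; the x * 2 + 1 body is just a stand-in):

     from numba import njit

     @njit
     def inplace_map(a):                 # compiled loops, modifies a in place
         for i in range(a.shape[0]):
             for j in range(a.shape[1]):
                 a[i, j] = a[i, j] * 2 + 1

     inplace_map(arr)                    # works on a 2d view; changes show up in arr.base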

    Python's map just sets up an iteration, which is 'run' with a for loop or list(). I prefer the list-comprehension notation. None of that is special to numpy.
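
    For example, neither of these changes arr; they just build an ordinary Python list:

     out = list(map(func, arr.flat))     # map sets up the iteration, list() runs it
     out = [func(x) for x in arr.flat]   # same thing as a list comprehension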

    Other SO questions deal with multithreading and multiprocessing.