I have a view of a NumPy array. I want to apply a function to each of its elements and save the result into said view (essentially, I want to do an in-place map). I do not want to use for loops, because they do not benefit from any NumPy optimizations/parallelization. I also cannot do something like arr = map(fn, arr), since it creates a new object.
You aren't the first to ask about applying a scalar function to all elements of an array. That comes up often. Just search for the use of "vectorize" in SO questions.
By stressing that this is a view, I assume you want to make sure that the changes apply to the corresponding elements of the base array. Others stress in-place because they think this will save memory or be faster.
If the array is 2d, then the old-fashioned nested loop
    for i in range(arr.shape[0]):
        for j in range(arr.shape[1]):
            arr[i,j] = func(arr[i,j])
takes care of the in-place requirement.
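As a minimal sketch of that nested loop writing through a view, assuming a made-up scalar func and a small made-up base array (neither is from the question):

```python
import numpy as np

def func(x):
    # hypothetical scalar function, used only for illustration
    return x * 2 + 1

base = np.arange(12).reshape(3, 4)
arr = base[1:, :2]          # a view: rows 1-2, columns 0-1 of base

# Element-by-element assignment writes straight into the shared memory.
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        arr[i, j] = func(arr[i, j])

print(base)                 # the selected block of base has changed as well
```

It is slow for large arrays, but nothing is reallocated, so the view and its base stay linked.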
If you can rework func to work with a 1d array, you can do
    for i in range(arr.shape[0]):
        arr[i,:] = func(arr[i,:])
func will produce a new array, but those values can be copied back into arr[i,:] (and arr.base) without problem.
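A sketch of the row-at-a-time version, assuming a hypothetical func that takes a 1d array and returns a new one of the same length:

```python
import numpy as np

def func(row):
    # hypothetical 1d function: subtract the row mean, returning a new array
    return row - row.mean()

base = np.arange(12, dtype=float).reshape(3, 4)
arr = base[:, 1:3]          # a view of columns 1 and 2

# func builds a new array for each row; the slice assignment copies
# those values back into the view, and so into base.
for i in range(arr.shape[0]):
    arr[i, :] = func(arr[i, :])

print(base)
```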
The ideal, speed-wise, is a function that can work with the whole nd array, using operators and numpy functions. That's fastest, but it will generally produce a temporary buffer that has to be copied back to the view (though the out parameter of a ufunc can help).
    arr[:] = func(arr)
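To illustrate the two whole-array styles, here is a small sketch with made-up data: a slice assignment that copies a temporary back into the view, followed by a ufunc call that uses out= to write into the view directly.

```python
import numpy as np

base = np.arange(10, dtype=float)
arr = base[::2]             # a view of every other element: 0, 2, 4, 6, 8

# Whole-array expression: the right-hand side is a temporary array,
# and arr[:] = ... copies it into the view.
arr[:] = arr * 2 + 1        # view now holds 1, 5, 9, 13, 17

# ufunc with out=: the result is written into the view itself.
np.multiply(arr, 10, out=arr)

print(base)
```

Plain `arr = arr * 2 + 1` would instead rebind the name arr to a new array and leave base untouched.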
Even arr[indx] += 1 uses a temporary buffer, which can be a problem if indx has duplicate indices. For that, ufuncs have an at method that performs the operation unbuffered.
https://numpy.org/doc/stable/reference/generated/numpy.ufunc.at.html
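A short sketch of the duplicate-index issue, using a made-up index array:

```python
import numpy as np

indx = np.array([0, 0, 1, 2, 2, 2])   # indices 0 and 2 are repeated

a = np.zeros(4, dtype=int)
a[indx] += 1                # buffered: each repeated index is incremented only once
print(a)                    # [1 1 1 0]

b = np.zeros(4, dtype=int)
np.add.at(b, indx, 1)       # unbuffered: every occurrence is applied
print(b)                    # [2 1 3 0]
```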
There aren't many numpy operations that work in-place. Most produce a new array. It's easier to create a building-block language that way.
There are some tools that "streamline" iterating on an array, but they don't offer any real performance enhancement. Questions about them come up often, though: np.vectorize, np.frompyfunc, np.nditer, and (my least favorite) np.apply_along_axis. In-place is trickier with these functions.
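As a rough sketch, both np.vectorize and np.frompyfunc return a new array, so one way to get the values into a view is a slice assignment. The scalar func and the data below are made up for illustration:

```python
import numpy as np

def func(x):
    # hypothetical scalar function: add 100 to odd values, keep even ones
    return x + 100 if x % 2 else x

base = np.arange(8)
arr1 = base[2:6]            # view holding 2, 3, 4, 5
arr2 = base[6:]             # view holding 6, 7

# np.vectorize: otypes fixes the result dtype; the result is still a new array.
vfunc = np.vectorize(func, otypes=[int])
arr1[:] = vfunc(arr1)

# np.frompyfunc: returns an object-dtype array, cast back on assignment.
ufunc_like = np.frompyfunc(func, 1, 1)
arr2[:] = ufunc_like(arr2)

print(base)
```

Both still call func once per element at Python speed, which is why they do not help performance.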
But for real performance you have to use a tool that compiles your function, such as numba or cython. There are lots of SO questions about those.
Python map just sets up an iteration, which is 'run' with a for or list(). I prefer the list-comprehension notation. None of that is special to numpy.
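A small sketch of that point, with a hypothetical fn and made-up data; in both forms the slice assignment is what puts the values into the view:

```python
import numpy as np

def fn(x):
    # hypothetical scalar function
    return x ** 2

base = np.arange(6)
arr = base[1:4]             # view holding 1, 2, 3

lazy = map(fn, arr)         # nothing is computed yet
arr[:] = list(lazy)         # list() runs the iteration; view holds 1, 4, 9

# The same thing with a list comprehension.
arr[:] = [fn(x) for x in arr]

print(base)
```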
Other SO questions deal with multithreading and multiprocessing.