I have two numpy.ndarray instances with different shapes. If I add these two arrays, broadcasting will occur between them:
import numpy as np
x = np.array([1, 2, 3])
y = np.array([[2, 3, 5],
              [7, 11, 13]])
print(x + y)
# [[ 3  5  8]
#  [ 8 13 16]]
Will the broadcast array ever be created? That is, will the following array be physically created from x before the operation?
[[1, 2, 3],
 [1, 2, 3]]
The problem is minor for small arrays, but for large arrays the difference can be considerable. If implicit broadcasting does create a new array, a significant amount of memory is wasted on repeated copies of the same numbers:
x = np.random.rand(10000)
y = np.random.rand(10000, 10000)
print(x + y)
If the broadcast array is actually created from x, it holds 10000 copies of the same 10000 values, so the memory wastage becomes very large.
If the broadcast array is created, is there a way to avoid creating it? If it is not created, how are binary operations between mismatched shapes implemented?
The arrays are expanded via the magic of strides, without copying any data. np.broadcast_arrays and np.broadcast_to let you see the intermediate product, which is a view.
In [75]: x=np.array([1,2,3]); y = np.array([[1,2,3],[4,5,6]])
In [76]: X=np.broadcast_to(x,y.shape)
In [77]: y.shape, y.strides
Out[77]: ((2, 3), (12, 4))
In [78]: X.shape, X.strides
Out[78]: ((2, 3), (0, 4))
In [79]: X.base
Out[79]: array([1, 2, 3])
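So X has the same shape as y, but its leading stride is 0 (the other values, 4 and 12, are byte counts for the 4-byte integers used here). This allows that dimension to be 'repeated' without actually copying. Note the base: X is a view of the original x.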
In [80]: X+y
Out[80]:
array([[2, 4, 6],
[5, 7, 9]])
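One way to check that no copy is made is sketched below. It is a minimal illustration rather than a description of NumPy's internals; it uses only np.shares_memory and the array's strides and flags attributes, and it avoids hard-coding byte counts because the item size depends on the platform's default integer type.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([[1, 2, 3], [4, 5, 6]])

X = np.broadcast_to(x, y.shape)

# The view reuses x's buffer; the second row is never stored.
shares = np.shares_memory(X, x)
print(shares)

# The leading stride is 0 whatever the dtype; the trailing stride is one item.
print(X.strides == (0, x.itemsize))

# broadcast_to hands back a read-only view, since every "row" aliases
# the same memory and writing to one would change them all.
print(X.flags.writeable)
```

The three prints should show True, True and False respectively.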
We get the same strides if we add a leading dimension with None/np.newaxis:
In [81]: x.strides
Out[81]: (4,)
In [82]: x[None,:].strides
Out[82]: (0, 4)
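Applied to the larger case from the question, the same idea can be sketched as follows. The size n is an assumption chosen to keep the demo light (the question uses 10000), and the use of the ufunc out= argument is a suggestion for avoiding the one allocation that x + y still makes, namely the result array itself.

```python
import numpy as np

n = 1000  # assumed demo size; the question uses 10000
x = np.random.rand(n)
y = np.random.rand(n, n)

X = np.broadcast_to(x, y.shape)

# nbytes is derived from the shape, so it reports what a real copy would cost.
virtual_bytes = X.nbytes
# The memory actually backing the view is just x's buffer.
actual_bytes = x.nbytes
print(virtual_bytes, actual_bytes)

# x + y allocates one new (n, n) array for the result, nothing for x.
expected = x + y

# Writing the result into y instead avoids even that allocation,
# at the price of overwriting y.
np.add(x, y, out=y)
```

With float64 data, virtual_bytes is n * n * 8 while actual_bytes is only n * 8.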