python numpy numpy-einsum

einsum not giving overflow error when applied to int arrays


I just hit a bug caused by np.sum and a supposedly equivalent np.einsum call not giving the same result. Here is an example:

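import numpy as np
import matplotlib.pyplot as plt

# Random int16 data; summing 4*100*200 values per output element
array = np.random.randint(-10000, 10000, size=(4, 100, 200, 600), dtype=np.int16)

# Two ways of reducing over the first three axes
sum1 = np.sum(array, axis=(0, 1, 2))
sum2 = np.einsum('aijt->t', array)

print(np.allclose(sum1, sum2))

# Plot both results to compare them visually
plt.figure()
plt.plot(sum1)
plt.plot(sum2)
plt.show()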

After some searching, I found that this is due to overflow of the integer data type.

My question:


Solution

  • Define a large int16:

    In [322]: y=np.int16(32000)
    

    Addition produces a warning:

    In [323]: y+y
    C:\Users\paul\AppData\Local\Temp\ipykernel_8828\1714217578.py:1: RuntimeWarning: overflow encountered in short_scalars
      y+y
    Out[323]: -1536
    

    np.sum promotes them to a larger int and gives no warning:

    In [324]: np.sum((y,y))
    Out[324]: 64000
    
    In [325]: _.dtype
    Out[325]: dtype('int32')
    

    Make an array from that:

    In [326]: Y = np.array(y)
    

    Adding the arrays overflows without a warning:

    In [327]: Y+Y
    Out[327]: -1536
    

    I don't recall the details, but it has been explained that checking each element of an array for overflow is (or was) considered too expensive.

    Rather than checking 'by hand', just be aware of the overflow possibility, and don't use smaller dtypes unnecessarily.

    A possible duplicate: Sum of positive numbers results in a negative number
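
To make the difference between np.sum and np.einsum concrete, here is a minimal sketch on a small int16 array. The array shape and values are my own choice for illustration, not the data from the question. It relies on np.einsum's dtype argument, which should work here because int16 to int64 is a safe cast under the default casting rule; converting the input with astype first is an alternative.

```python
import numpy as np

# Small int16 array whose column sums (4 * 32000 = 128000) exceed the int16 range.
array = np.full((4, 3), 32000, dtype=np.int16)

# np.sum accumulates narrow integers in the wider default platform integer.
sum1 = np.sum(array, axis=0)

# np.einsum keeps the input dtype, so the int16 result wraps around silently
# (128000 wraps to 128000 - 2 * 65536 = -3072).
sum2 = np.einsum('ij->j', array)

# Requesting a wider dtype from einsum avoids the wraparound.
sum3 = np.einsum('ij->j', array, dtype=np.int64)

# Converting the input before the call is an alternative with the same result.
sum4 = np.einsum('ij->j', array.astype(np.int64))

print(sum1.dtype, sum1)
print(sum2.dtype, sum2)
print(sum3.dtype, sum3)
print(sum4.dtype, sum4)
```

Either variant costs extra memory or time compared with staying in int16, which is the trade-off behind "don't use smaller dtypes unnecessarily".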