How does numpy reshaping of a slice works

I'm trying to reimplement numpy from scratch but I can't figure out how slicing works exactly. Take as an example this array:

> a = np.array(list(range(0,20)))

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])

Let's reshape it to

> b = a.reshape(5,4)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])

The strides here make sense

> b.strides

(32, 8) # float64 has 8 bytes, so row jumps by 4 floats and column moves by 1

> b.shape

(5, 4)

I can index to this array by using 4*row+col expression

for row in range(b.shape[0]):
    for col in range(b.shape[1]):
        assert a[4*row+col] == b[row,col]

Now let's take a slice

> c = b[1:,1:]

array([[ 5,  6,  7],
       [ 9, 10, 11],
       [13, 14, 15],
       [17, 18, 19]])

> c.strides

(32, 8) # It has the same strides, which sounds logical.

> c.shape

(4, 3)

I suppose that c just stores an offset to a. So indexing into this array would be equivalent to offset + 4*row+col

for row in range(c.shape[0]):
    for col in range(c.shape[1]):
        assert a[5 + 4*row+col] == c[row,col]

But the real magic happens when I perform reshape.

d = c.reshape(3,4)

array([[ 5,  6,  7,  9],
       [10, 11, 13, 14],
       [15, 17, 18, 19]])

Notice that 8, 12, and 16 are missing (as expected).

However, when I query strides

> d.strides

(32, 8)

> d.shape

(3, 4)

they look the same. Those strides are clearly lying. There is no way to index into a with such strides. So how does this magic happen? There must be some missing piece of information.

This must be implemented in some clever way. I can't believe that numpy would use some tricky wrapper objects. Especially if we consider that the exact same mechanism should be implemented in pytorch/tensorflow and these framework run on CUDA, which obviously cannot support advanced C++ magic with virtual classes and intermediate layers of abstraction. There must be some simple obvious trick to this that could run on GPU as well.

Solution

I like to look at a.__array_interface__ which includes a databuffer 'pointer'.. From that you can see the c offset. I suspect d is a copy, no longer a view. This should be obvious from interface.

In [155]: a=np.arange(20)
In [156]: a.__array_interface__
Out[156]: 
{'data': (34619344, False),
 'strides': None,
 'descr': [('', '<i8')],
 'typestr': '<i8',
 'shape': (20,),
 'version': 3}
In [157]: b = a.reshape(5,4)
In [158]: b.__array_interface__
Out[158]: 
{'data': (34619344, False),       # same
 'strides': None,
 'descr': [('', '<i8')],
 'typestr': '<i8',
 'shape': (5, 4),
 'version': 3}
In [159]: c = b[1:,1:]
In [160]: c.__array_interface__
Out[160]: 
{'data': (34619384, False),       # offset by 32+8
 'strides': (32, 8),
 'descr': [('', '<i8')],
 'typestr': '<i8',
 'shape': (4, 3),
 'version': 3}
In [161]: d=c.reshape(3,4)
In [162]: d.__array_interface__
Out[162]: 
{'data': (36537904, False),       # totally different
 'strides': None,
 'descr': [('', '<i8')],
 'typestr': '<i8',
 'shape': (3, 4),
 'version': 3}