I know that numpy stores numbers in contiguous memory. So is it possible to take
a = np.array([127,127,127,127,127,127,127,127], dtype=np.uint8)
the binary representation of 'a' is all ones
to this:
b = np.array([72057594037927935], dtype=np.uint64)
as well as back again from b->a.
The binary representation is all ones; however, the eight elements are combined into one single 64-bit int. The in-memory representation should be the same in NumPy; only the metadata should change.
This sounds like a job for stride tricks but my best guess is:
np.lib.stride_tricks.as_strided(a, shape=(1,), strides=(8,8))
and
np.lib.stride_tricks.as_strided(b, shape=(8,), strides=(1,8))
only to get ValueError: mismatch in length of strides and shape
This only needs to be read-only, so I'm under no illusion that I need to change the data.
If you want to reinterpret the existing data in an array you need numpy.ndarray.view. That's the main difference between .astype and .view (i.e. the former converts to a new type with the values being preserved, while the latter maintains the same memory and changes how it's interpreted):
import numpy as np
a = np.array([127,127,127,127,127,127,127,127], dtype=np.uint8)
b = a.view(np.uint64)
print(a)
print(b)
print(b.view(np.uint8))
This outputs
[127 127 127 127 127 127 127 127]
[9187201950435737471]
[127 127 127 127 127 127 127 127]
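To make the contrast with .astype concrete, here is a minimal sketch (the names converted and reinterpreted are only for illustration). It compares shapes, byte counts, and memory sharing for the two calls:

```python
import numpy as np

a = np.array([127] * 8, dtype=np.uint8)

# .astype allocates a new buffer: still 8 elements, each widened to 8 bytes
converted = a.astype(np.uint64)
# .view reuses the same 8 bytes and reads them as a single 64-bit integer
reinterpreted = a.view(np.uint64)

print(converted.shape, converted.nbytes)          # (8,) 64
print(reinterpreted.shape, reinterpreted.nbytes)  # (1,) 8
print(np.shares_memory(a, converted))             # False
print(np.shares_memory(a, reinterpreted))         # True
```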
Note that 127 has a leading zero in its 8-bit pattern (01111111), so it's not all ones, which is why the value we get in b is different from what you expect:
>>> bin(b[0])
'0b111111101111111011111110111111101111111011111110111111101111111'
>>> bin(72057594037927935)
'0b11111111111111111111111111111111111111111111111111111111'
The number you expected has 56 one bits, so what you seem to assume is a set of eight uint7 values with every bit set...
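If it helps to see the bits directly, the following sketch uses np.unpackbits to show the leading zero in each byte, and then builds the array that really is all ones (eight bytes of 255). The variable names are just for illustration:

```python
import numpy as np

a = np.array([127] * 8, dtype=np.uint8)

# One row per byte, most significant bit first: every row is 0 1 1 1 1 1 1 1
bits = np.unpackbits(a).reshape(8, 8)
print(bits[0])

# Eight bytes of 255 have every bit set, so the uint64 view is 2**64 - 1
all_ones = np.array([255] * 8, dtype=np.uint8).view(np.uint64)
print(int(all_ones[0]) == 2**64 - 1)
```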
Anyway, the best part about .view is that the exact same block of memory will be used unless you explicitly copy:
>>> b.base is a
True
The corollary, of course, is that mutating b will affect a:
>>> b += 3
>>> a
array([130, 127, 127, 127, 127, 127, 127, 127], dtype=uint8)
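Since you only need read access, one option (a sketch, not the only way) is to clear the writeable flag on the view so that accidental in-place updates raise instead of silently changing a. The flag name mutated is purely illustrative:

```python
import numpy as np

a = np.array([127] * 8, dtype=np.uint8)
b = a.view(np.uint64)

# Mark only the view as read-only; a itself stays writeable
b.flags.writeable = False

try:
    b += 3            # in-place update on a read-only array
    mutated = True
except ValueError:    # NumPy refuses to write into a read-only array
    mutated = False

print(mutated)  # False
print(a)        # unchanged: all 127
```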
To control endianness you'd want to use string-valued dtype specifications, i.e. a.view('<u8') (little endian) or a.view('>u8') (big endian). We can use this to reproduce the faulty number in your question:
>>> a2 = np.array([0] + [255] * 7, dtype=np.uint8)
>>> a2.view('>u8')
array([72057594037927935], dtype=uint64)
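For comparison, here is a small sketch that views the same eight bytes with both explicit byte orders and then goes back to uint8. The names big and little are illustrative only:

```python
import numpy as np

a2 = np.array([0] + [255] * 7, dtype=np.uint8)

big = a2.view('>u8')     # first byte is the most significant one
little = a2.view('<u8')  # first byte is the least significant one

print(int(big[0]) == 2**56 - 1)        # True: the zero byte sits on top
print(int(little[0]) == 2**64 - 256)   # True: the zero byte sits at the bottom

# Viewing either one as uint8 again recovers the original bytes
print(big.view(np.uint8))
print(little.view(np.uint8))
```

Because the byte order is spelled out in the dtype string, these results should not depend on the machine's native endianness.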