
Converting numpy array to structured array


Let's say I have the following array:

arr = np.array([[1,2], [3,4]], dtype='u1')

and I want to convert it into a structured array like this one:

strarr = np.array([(1,2), (3,4)], dtype=[('a', 'u1'), ('b', 'u1')])

If I just try

arr.astype([('a', 'u1'), ('b', 'u1')])

it returns

array([[(1, 1), (2, 2)],
       [(3, 3), (4, 4)]], dtype=[('a', 'u1'), ('b', 'u1')])

How can I convert the array so that each row fills the fields of one record (provided that the number of columns matches the number of fields), instead of each element being duplicated into every field?


Solution

  • There are special helper functions for this:

    >>> from numpy.lib.recfunctions import unstructured_to_structured
    

    So,

    >>> import numpy as np
    >>> arr = np.array([[1,2], [3,4]], dtype='u1')
    >>> unstructured_to_structured(arr, dtype=np.dtype([('a', 'u1'), ('b', 'u1')]))
    array([(1, 2), (3, 4)], dtype=[('a', 'u1'), ('b', 'u1')])
    

    You can also create a view:

    >>> arr.ravel().view(dtype=np.dtype([('a', 'u1'), ('b', 'u1')]))
    array([(1, 2), (3, 4)], dtype=[('a', 'u1'), ('b', 'u1')])
    

    In this simple case that is fine, but if you choose to use a view you have to pay attention to how the array is packed in memory: a view reinterprets the raw bytes, so the fields of the dtype must line up exactly with the bytes of each row. Note that a view doesn't copy the underlying buffer, which can make it much more efficient if you are working with large arrays.
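
To make the packing caveat concrete, here is a minimal sketch of when the view approach shares memory and when it quietly stops doing what you want. The variable names (`viewed`, `from_transpose`, `wide`, and so on) are made up for illustration; the calls themselves (`ravel`, `view`, `np.shares_memory`) are standard NumPy.

```python
import numpy as np

dt = np.dtype([('a', 'u1'), ('b', 'u1')])
arr = np.array([[1, 2], [3, 4]], dtype='u1')

# C-contiguous input: ravel() returns a view, so the structured
# result shares its buffer with arr.
viewed = arr.ravel().view(dt)
shares = np.shares_memory(viewed, arr)

# A write through the original array is visible through the view.
arr[0, 0] = 9
seen = int(viewed['a'][0])

# Non-contiguous input (here a transpose): ravel() has to copy,
# so the result is correct but no longer shares memory with arr.
from_transpose = arr.T.ravel().view(dt)
shares_t = np.shares_memory(from_transpose, arr)

# Mismatched item size: the bytes of each 8-byte integer are
# reinterpreted as 'u1' fields, so 4 values become 16 two-byte records.
wide = np.array([[1, 2], [3, 4]], dtype='i8')
reinterpreted = wide.ravel().view(dt)
n_records = reinterpreted.shape[0]

print(shares, seen, shares_t, n_records)
```

If the element type or layout of the input isn't under your control, `unstructured_to_structured` is the safer choice, since it converts by value rather than reinterpreting bytes.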