
Changing `order` option from "C" (row-major) to "Fortran" (column-major) of `numpy` arrays has no effect


As the title says, I was trying to verify the ordering in numpy arrays by changing the ordering in the following test script:

import numpy as np


# Standard array
arr = [[1, 2, 3], [-7, -8, -9], ['A', 'B', 'C']]
print(arr, '\n')

for row_index, row_entries in enumerate(arr):
    print('Row ' + str(row_index+1))
    for column_index, column_entries in enumerate(row_entries):
        print(' Column ' + str(column_index+1) + '\n', '\t [' + str(column_entries) + ']')


# NumPy array
arr = np.asarray([[1, 2, 3], [-7, -8, -9], ['A', 'B', 'C']], order='F')    # Try 'C' vs. 'F'!!
print('\n\n', arr, '\n')

for row_index, row_entries in enumerate(arr):
    print('Row ' + str(row_index+1))
    for column_index, column_entries in enumerate(row_entries):
        print(' Column ' + str(column_index+1) + '\n', '\t [' + str(column_entries) + ']')

----------------------------------------------------------------------------------------------
Output:

[[1, 2, 3], [-7, -8, -9], ['A', 'B', 'C']] 

Row 1
 Column 1
         [1]
 Column 2
         [2]
 Column 3
         [3]
Row 2
 Column 1
         [-7]
 Column 2
         [-8]
 Column 3
         [-9]
Row 3
 Column 1
         [A]
 Column 2
         [B]
 Column 3
         [C]


 [['1' '2' '3']
 ['-7' '-8' '-9']
 ['A' 'B' 'C']] 

Row 1
 Column 1
         [1]
 Column 2
         [2]
 Column 3
         [3]
Row 2
 Column 1
         [-7]
 Column 2
         [-8]
 Column 3
         [-9]
Row 3
 Column 1
         [A]
 Column 2
         [B]
 Column 3
         [C]

Why am I getting identical outputs?


Solution

  • You start with a list (of lists):

    In [29]: alist = [[1, 2, 3], [-7, -8, -9], ['A', 'B', 'C']]
    In [30]: alist
    Out[30]: [[1, 2, 3], [-7, -8, -9], ['A', 'B', 'C']]
    

    Obviously we can iterate through the list, and through the sublists.

    We can make an array from that list. Usually we don't specify the order; an array built from a list is laid out in 'C' order unless 'F' is requested:

    In [31]: arr1 = np.array(alist, order='C')
    In [32]: arr1
    Out[32]: 
    array([['1', '2', '3'],
           ['-7', '-8', '-9'],
           ['A', 'B', 'C']], dtype='<U21')
    

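    Note that the dtype is a string type: the list mixes integers and strings, so numpy converts every element to a string (I suppose I could have specified dtype=object to keep the original Python objects).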

    Same thing but with 'F':

    In [34]: arr2 = np.array(alist, order='F')
    In [35]: arr2
    Out[35]: 
    array([['1', '2', '3'],
           ['-7', '-8', '-9'],
           ['A', 'B', 'C']], dtype='<U21')
    

    Display is the same.

    To see how the elements are stored we have to 'ravel' the arrays. The result is a 1d array; with 'K' order the elements are read in the order they occur in memory. See np.reshape or np.ravel docs for the use of 'K' order:

    In [36]: arr1.ravel('K')
    Out[36]: array(['1', '2', '3', '-7', '-8', '-9', 'A', 'B', 'C'], dtype='<U21')
    
    In [38]: arr2.ravel('K')
    Out[38]: array(['1', '-7', 'A', '2', '-8', 'B', '3', '-9', 'C'], dtype='<U21')
    

    Here the values of arr2 are read down the columns. Raveling the first array with 'F' order produces the same thing:

    In [39]: arr1.ravel('F')
    Out[39]: array(['1', '-7', 'A', '2', '-8', 'B', '3', '-9', 'C'], dtype='<U21')
    

    Iterating the way you do doesn't change with the order. It effectively treats the array as a list of rows.

    In [40]: [row for row in arr1]
    Out[40]: 
    [array(['1', '2', '3'], dtype='<U21'),
     array(['-7', '-8', '-9'], dtype='<U21'),
     array(['A', 'B', 'C'], dtype='<U21')]
    In [41]: [row for row in arr2]
    Out[41]: 
    [array(['1', '2', '3'], dtype='<U21'),
     array(['-7', '-8', '-9'], dtype='<U21'),
     array(['A', 'B', 'C'], dtype='<U21')]
    In [42]: arr2.tolist()
    Out[42]: [['1', '2', '3'], ['-7', '-8', '-9'], ['A', 'B', 'C']]
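
    As a small sketch of the point above (the names arr_c and arr_f are mine, not from the question), the two arrays compare equal element by element, and the layout difference only shows up in the flags attribute:

```python
import numpy as np

alist = [[1, 2, 3], [-7, -8, -9], ['A', 'B', 'C']]
arr_c = np.array(alist, order='C')
arr_f = np.array(alist, order='F')

# Indexing and comparison work on the logical (row, column) positions,
# so they give the same answers for both memory layouts.
same_values = np.array_equal(arr_c, arr_f)
same_element = arr_c[1, 2] == arr_f[1, 2]

# Only the contiguity flags reveal which layout each array uses.
c_flags = (arr_c.flags['C_CONTIGUOUS'], arr_c.flags['F_CONTIGUOUS'])
f_flags = (arr_f.flags['C_CONTIGUOUS'], arr_f.flags['F_CONTIGUOUS'])

print(same_values, same_element)
print('order=C flags:', c_flags)
print('order=F flags:', f_flags)
```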
    

    You have to use numpy's own methods and tools to see the effect of order. order is more useful when creating an array via reshape:

    In [43]: np.arange(12)
    Out[43]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
    In [44]: np.arange(12).reshape(3,4)
    Out[44]: 
    array([[ 0,  1,  2,  3],
           [ 4,  5,  6,  7],
           [ 8,  9, 10, 11]])
    In [45]: np.arange(12).reshape(3,4,order='F')
    Out[45]: 
    array([[ 0,  3,  6,  9],
           [ 1,  4,  7, 10],
           [ 2,  5,  8, 11]])
    

    Tweaking the shape, and then applying a transpose:

    In [46]: np.arange(12).reshape(4,3,order='F')
    Out[46]: 
    array([[ 0,  4,  8],
           [ 1,  5,  9],
           [ 2,  6, 10],
           [ 3,  7, 11]])
    In [47]: np.arange(12).reshape(4,3,order='F').T
    Out[47]: 
    array([[ 0,  1,  2,  3],
           [ 4,  5,  6,  7],
           [ 8,  9, 10, 11]])
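
    A rough way to check the transpose relationship yourself (a sketch; the variable names are my own): the 'F' reshape followed by .T gives the same values as the plain 'C' reshape, and transposing a C-ordered array gives an F-ordered view of the same data with the strides reversed.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)                 # default 'C' order
b = np.arange(12).reshape(4, 3, order='F').T    # 'F' reshape, then transpose

# Same logical contents either way.
values_match = np.array_equal(a, b)

# Transposing does not move any data: it returns a view with swapped strides,
# so the transpose of a C-contiguous array is F-contiguous.
t = a.T
is_view = np.shares_memory(a, t)
strides_reversed = t.strides == a.strides[::-1]
t_is_fortran = t.flags['F_CONTIGUOUS']

print(values_match, is_view, strides_reversed, t_is_fortran)
```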
    

    edit

    It may be clearer if I make a smaller array with just 1 byte per element.

    The 2 orders:

    In [70]: arr1 = np.array([[1,2,3],[4,5,6]], 'uint8')
    In [72]: arr2 = np.array([[1,2,3],[4,5,6]], 'uint8',order='F')
    In [73]: arr1
    Out[73]: 
    array([[1, 2, 3],
           [4, 5, 6]], dtype=uint8)
    In [74]: arr2
    Out[74]: 
    array([[1, 2, 3],
           [4, 5, 6]], dtype=uint8)
    

    Instead of ravel, use tobytes with 'A' order to preserve the underlying order (see tobytes docs):

    In [75]: arr1.tobytes(order='A')
    Out[75]: b'\x01\x02\x03\x04\x05\x06'
    In [76]: arr2.tobytes(order='A')
    Out[76]: b'\x01\x04\x02\x05\x03\x06'
    

    The difference can also be seen in the strides:

    In [77]: arr1.strides
    Out[77]: (3, 1)
    In [78]: arr2.strides
    Out[78]: (1, 2)
    

    strides gives the number of bytes to step to reach the next element along each dimension. It controls how numpy iterates through the array in compiled code (but not when using Python-level iteration).
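
    To make the role of strides concrete, here is a minimal sketch. element_via_strides is a hypothetical helper of mine, and it assumes a contiguous 2d array with 1 byte per element, like the uint8 arrays above. It finds arr[i, j] in the raw bytes using nothing but the strides:

```python
import numpy as np

arr1 = np.array([[1, 2, 3], [4, 5, 6]], dtype='uint8')             # 'C' order
arr2 = np.array([[1, 2, 3], [4, 5, 6]], dtype='uint8', order='F')  # 'F' order


def element_via_strides(arr, i, j):
    """Hypothetical helper: read arr[i, j] straight from the raw bytes.

    Assumes a contiguous 2d array with 1-byte elements.
    """
    raw = arr.tobytes(order='A')  # bytes in the underlying memory order
    offset = i * arr.strides[0] + j * arr.strides[1]
    return raw[offset]


# The same (i, j) lands on a different byte offset in each layout,
# but the value found there is the same.
for arr in (arr1, arr2):
    for i in range(arr.shape[0]):
        for j in range(arr.shape[1]):
            assert element_via_strides(arr, i, j) == arr[i, j]

print(element_via_strides(arr1, 1, 0), element_via_strides(arr2, 1, 0))
```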

    A comment suggested using nditer to iterate via numpy's own methods. Generally I don't recommend using nditer, but here it is illustrative:

    In [79]: [i for i in np.nditer(arr1)]
    Out[79]: 
    [array(1, dtype=uint8),
     array(2, dtype=uint8),
     array(3, dtype=uint8),
     array(4, dtype=uint8),
     array(5, dtype=uint8),
     array(6, dtype=uint8)]
    In [80]: [i for i in np.nditer(arr2)]
    Out[80]: 
    [array(1, dtype=uint8),
     array(4, dtype=uint8),
     array(2, dtype=uint8),
     array(5, dtype=uint8),
     array(3, dtype=uint8),
     array(6, dtype=uint8)]
    

    nditer takes an order, but 'K' is default (in contrast to many other cases where 'C' is the default).
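
    A short sketch of that last point, using the same F-ordered uint8 array (the list names are my own): passing order='C' or order='F' to nditer forces the traversal order regardless of how the data is stored.

```python
import numpy as np

arr2 = np.array([[1, 2, 3], [4, 5, 6]], dtype='uint8', order='F')

# Default order='K' follows the memory layout (column by column here).
default_order = [int(x) for x in np.nditer(arr2)]

# An explicit order overrides the memory layout.
c_order = [int(x) for x in np.nditer(arr2, order='C')]  # row by row
f_order = [int(x) for x in np.nditer(arr2, order='F')]  # column by column

print(default_order)
print(c_order)
print(f_order)
```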