python arrays numpy

How to extract N elements every M elements from an array?


Suppose I have a NumPy array [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]. How do I take 4 elements out of every 8 elements? Here is the expected result:

a -> [1,2,3,4, 9,10,11,12]
b -> [5,6,7,8, 13,14,15,16]

My array has hundreds of elements. I went through the NumPy array documentation, but I could not find a way to perform this computation other than a loop, which is very slow.

EDIT: The array can contain up to 3 interleaved sub-arrays of 4 elements each:

4 elts of sample 0, 4 elts of sample 1, 4 elts of sample 2, 4 elts of sample 0, 4 elts of sample 1, 4 elts of sample 2, 4 elts of sample 0, 4 elts of sample 1, 4 elts of sample 2 ...

My array has 499875840 elements!


Solution

  • For a generic, pure NumPy approach, you could argsort and then split:

    import numpy as np
    
    arr = np.arange(1, 17)  # example array from the question
    
    N = 4 # number of consecutive elements
    M = 2 # number of output arrays
    
    idx = np.argsort(np.arange(len(arr))%(N*M)//N, kind='stable')
    # array([ 0,  1,  2,  3,  8,  9, 10, 11,  4,  5,  6,  7, 12, 13, 14, 15])
    
    a, b = np.split(arr[idx], M)
    

    As a one-liner:

    out = np.split(arr[np.argsort(np.arange(len(arr))%(N*M)//N, kind='stable')], M)
    

    Output:

    # a / out[0]
    array([ 1,  2,  3,  4,  9, 10, 11, 12])
    
    # b / out[1]
    array([ 5,  6,  7,  8, 13, 14, 15, 16])
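
To see why the argsort works, it may help to look at the intermediate sort key. The sketch below uses the example array from the question; the variable name `group` is introduced here for illustration only. Each index is mapped to the number of the output array it belongs to, and a stable sort then gathers the indices of each output array while preserving their original order.

```python
import numpy as np

N = 4  # number of consecutive elements per chunk
M = 2  # number of interleaved output arrays

arr = np.arange(1, 17)  # the example array from the question

# Position within each period of N*M elements, then the chunk it falls in.
group = np.arange(len(arr)) % (N * M) // N
print(group)
# [0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1]

# A stable sort keeps the original order inside each group.
idx = np.argsort(group, kind='stable')
print(idx)
# [ 0  1  2  3  8  9 10 11  4  5  6  7 12 13 14 15]

a, b = np.split(arr[idx], M)
```

Note that `np.split` with an integer requires the length of the array to be divisible by `M`, so this assumes the input holds a whole number of periods.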
    

    Output with arr = np.arange(32) as input:

    # a
    array([ 0,  1,  2,  3,  8,  9, 10, 11, 16, 17, 18, 19, 24, 25, 26, 27])
    
    # b
    array([ 4,  5,  6,  7, 12, 13, 14, 15, 20, 21, 22, 23, 28, 29, 30, 31])
    

    Output with arr = np.arange(32), N = 4, M = 4:

    (array([ 0,  1,  2,  3, 16, 17, 18, 19]),
     array([ 4,  5,  6,  7, 20, 21, 22, 23]),
     array([ 8,  9, 10, 11, 24, 25, 26, 27]),
     array([12, 13, 14, 15, 28, 29, 30, 31]))
    

    timings

    Paul's approach is faster than mine (but limited to 2 arrays as output).
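
A rough way to run such a comparison on your own machine is sketched below. The function names, the array size and the repeat count are assumptions made for illustration, and the reshape function is a generic reshape/transpose variant used as a stand-in, not necessarily Paul's exact code. Absolute numbers will vary with hardware and NumPy version.

```python
import timeit

import numpy as np

N, M = 4, 2
arr = np.arange(N * M * 100_000)  # assumed test size, a multiple of N*M


def argsort_split(arr, N, M):
    # Sort indices by output-array number, then cut into M equal parts.
    idx = np.argsort(np.arange(len(arr)) % (N * M) // N, kind='stable')
    return np.split(arr[idx], M)


def reshape_split(arr, N, M):
    # View as (period, sample, element), bring the sample axis first, flatten.
    return list(arr.reshape(-1, M, N).transpose(1, 0, 2).reshape(M, -1))


repeats = 10
for func in (argsort_split, reshape_split):
    total = timeit.timeit(lambda: func(arr, N, M), number=repeats)
    print(f"{func.__name__}: {total / repeats * 1e3:.2f} ms per call")
```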


    generalization

    A reshaping approach, as proposed by @hpaulj, can be generalized using:

    N = 4 # number of consecutive elements
    M = 3 # number of samples/output arrays
    
    out = arr.reshape(-1, M, N).transpose(1, 0, 2).reshape(-1, arr.size//M)
    
    # or to map to individual variables
    a,b,c = arr.reshape(-1, M, N).transpose(1, 0, 2).reshape(-1, arr.size//M) 
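
The reshape chain can be easier to follow one step at a time. The following is a minimal sketch on a small assumed input (`np.arange(24)`); the names `step1` and `step2` are only for illustration. It assumes the array length is a multiple of `N*M`, otherwise the first `reshape` raises an error.

```python
import numpy as np

N = 4  # number of consecutive elements
M = 3  # number of samples/output arrays

arr = np.arange(24)  # assumed input; its length must be a multiple of N*M

step1 = arr.reshape(-1, M, N)     # shape (2, 3, 4): period, sample, element
step2 = step1.transpose(1, 0, 2)  # shape (3, 2, 4): sample, period, element
out = step2.reshape(M, -1)        # shape (3, 8): one row per sample

print(out)
# [[ 0  1  2  3 12 13 14 15]
#  [ 4  5  6  7 16 17 18 19]
#  [ 8  9 10 11 20 21 22 23]]
```

Here `reshape(M, -1)` gives the same result as `reshape(-1, arr.size//M)`. The transposed array is no longer contiguous, so the final reshape has to copy the data, which is worth keeping in mind for an array with hundreds of millions of elements.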
    

    @U13-Forward's approach only works when M = 2. It can, however, be generalized using a list comprehension:

    N = 4 # number of consecutive elements
    M = 3 # number of samples/output arrays
    
    reshaped = arr.reshape(-1, N*M)
    out = [reshaped[:, n*N:n*N+N].ravel() for n in range(M)]
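
The list comprehension could be wrapped in a small function with a length check, for example as sketched below. `deinterleave` is a hypothetical helper name, not a NumPy function, and raising `ValueError` on a partial trailing period is just one possible design choice.

```python
import numpy as np


def deinterleave(arr, N, M):
    """Split arr into M arrays, taking N consecutive elements in turn.

    Hypothetical helper for illustration; not part of NumPy.
    """
    arr = np.asarray(arr)
    if arr.size % (N * M) != 0:
        raise ValueError(
            f"array length {arr.size} is not a multiple of N*M = {N * M}"
        )
    reshaped = arr.reshape(-1, N * M)  # one row per period of N*M elements
    # Columns n*N .. (n+1)*N - 1 of every row belong to sample n.
    return [reshaped[:, n * N:(n + 1) * N].ravel() for n in range(M)]


s0, s1, s2 = deinterleave(np.arange(24), N=4, M=3)
print(s1)
# [ 4  5  6  7 16 17 18 19]
```

With N = 4 and M = 3, the 499875840 elements mentioned in the question divide evenly into periods of 12, so the length check would pass for that array.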
    
