python numpy

Repeat rows of a 2D array


I have a numpy array and I want to repeat it n times while preserving the original order of the rows:

>>>a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Desired output (for n = 2):

>>>a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

I found the np.repeat function, but it doesn't preserve the original order of the rows. Is there another built-in function or a trick that will repeat the array while preserving the order?
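
For reference, here is a minimal sketch of what np.repeat does with this array. It assumes `a` is built with np.arange, which the question does not show, and the names `repeated_rows` and `repeated_flat` are only illustrative.

```python
import numpy as np

# Assumption: same values as the array in the question, built with arange.
a = np.arange(12).reshape(3, 4)

# With axis=0, np.repeat duplicates each row in place,
# so the copies of a row end up next to each other.
repeated_rows = np.repeat(a, 2, axis=0)
print(repeated_rows[:, 0])  # first column: [0 0 4 4 8 8]

# Without an axis, np.repeat flattens the input and repeats each element.
repeated_flat = np.repeat(a, 2)
print(repeated_flat[:6])  # [0 0 1 1 2 2]
```

In both cases the block of rows is not repeated as a whole, which is why the row order differs from the desired output.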


Solution

  • Here is another way to do it: tile the flattened array, then reshape it. I have also added timing comparisons against @coldspeed's solutions and @user2357112's solution.

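    import numpy as np

    n = 2
    # Tile the flattened array n times, then reshape back to the original column count.
    a_new = np.tile(a.flatten(), n)
    a_new.reshape((n * a.shape[0], a.shape[1]))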
    # array([[ 0,  1,  2,  3],
    #        [ 4,  5,  6,  7],
    #        [ 8,  9, 10, 11],
    #        [ 0,  1,  2,  3],
    #        [ 4,  5,  6,  7],
    #        [ 8,  9, 10, 11]])
    

    Performance comparison with coldspeed's solution

    My method for n = 10000

    a = np.array([[ 0,  1,  2,  3],
                  [ 4,  5,  6,  7],
                  [ 8,  9, 10, 11]])
    n = 10000
    
    def tile_flatten(a, n):
        a_new = np.tile(a.flatten(), n).reshape((n*a.shape[0], a.shape[1])) 
        return a_new
    
    %timeit tile_flatten(a,n)
    # 149 µs ± 20.2 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)   
    

    coldspeed's solution 1 for n = 10000

    a = np.array([[ 0,  1,  2,  3],
                  [ 4,  5,  6,  7],
                  [ 8,  9, 10, 11]])
    n = 10000
    
    def concatenate_repeat(a, n):
        a_new =  np.concatenate(np.repeat(a[None, :], n, axis=0), axis=0)
        return a_new
    
    %timeit concatenate_repeat(a,n)
    # 7.61 ms ± 1.37 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
    

    coldspeed's solution 2 for n = 10000

    a = np.array([[ 0,  1,  2,  3],
                  [ 4,  5,  6,  7],
                  [ 8,  9, 10, 11]])
    n = 10000
    
    def broadcast_reshape(a, n):
        a_new =  np.broadcast_to(a, (n, *a.shape)).reshape(-1, a.shape[1])
        return a_new
    
    %timeit broadcast_reshape(a,n)
    # 162 µs ± 29.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    

    @user2357112's solution for n = 10000 (reusing a and n from above)

    def tile_only(a, n):
        a_new = np.tile(a, (n, 1))
        return a_new
    
    %timeit tile_only(a,n)
    # 142 µs ± 21.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
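
As a sanity check, the four approaches above can be compared for equality outside IPython. This is only a sketch: the `candidates` dict, the `reference` built with np.vstack, and the smaller `n` are my own assumptions for illustration, and the timings it prints will differ from the %timeit numbers above.

```python
import timeit

import numpy as np

a = np.arange(12).reshape(3, 4)
n = 1000  # assumption: smaller than the 10000 used above, to keep the check quick

# The four approaches from this answer, wrapped as zero-argument callables.
candidates = {
    "tile_flatten": lambda: np.tile(a.flatten(), n).reshape((n * a.shape[0], a.shape[1])),
    "concatenate_repeat": lambda: np.concatenate(np.repeat(a[None, :], n, axis=0), axis=0),
    "broadcast_reshape": lambda: np.broadcast_to(a, (n, *a.shape)).reshape(-1, a.shape[1]),
    "tile_only": lambda: np.tile(a, (n, 1)),
}

# Independent reference: stack n copies of a on top of each other.
reference = np.vstack([a] * n)

results = {}
for name, func in candidates.items():
    results[name] = np.array_equal(func(), reference)
    seconds = timeit.timeit(func, number=20) / 20
    print(f"{name}: matches={results[name]}, {seconds * 1e6:.1f} microseconds per call")
```

All four should report `matches=True`, since each one stacks whole copies of `a` in the original row order.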