pythonarraysnumpypermutation

Fastest way of creating a pair-wise matching matrix from a single numpy array


I want to do quick pair-wise matchings of all elements in a numpy array, ignoring self-matching.

For example given a numpy array like

a = np.array(range(4))

I want to get a matrix like

array([[0, 1], 
       [0, 2],
       [0, 3],
       [1, 0],
       [1, 2],
       [1, 3],
       [2, 0],
       [2, 1],
       [2, 3],
       [3, 0],
       [3, 1],
       [3, 2]])

In my real case, I have different sizes of arrays varying from 1 to 1000. I need to constantly call a function that does this pair-wise matching. I wrote the code below

import numpy as np
from itertools import permutations

a = np.array(range(1000))
xs, ys = zip(*[(x, y) for x, y in permutations(a, 2)])
res = np.vstack([np.array(xs), np.array(ys)]).T
print(res)

I want to know if there are faster ways (maybe more numpyic ways) to do this?

I would like to see some benchmarkings of runtime vs. array size.


Solution

  • Making a meshgrid and then filtering out the "matching" rows (e.g. 0,0 and 12,12) would likely be quite a bit faster at this scale.

    import numpy as np
    from itertools import permutations
    
    %%time
    n = 1000
    a = np.array(np.arange(n))
    xs, ys = zip(*[(x, y) for x, y in permutations(a, 2)])
    res = np.vstack([np.array(xs), np.array(ys)]).T
    res
    CPU times: user 1.08 s, sys: 176 ms, total: 1.26 s
    Wall time: 1.28 s
    
    %%time
    n = 1000
    a = np.arange(n)
    arr = np.array(np.meshgrid(a, a)).T.reshape(-1,2)
    arr = arr[arr[:, 0] != arr[:, 1]]
    CPU times: user 49.8 ms, sys: 33.2 ms, total: 83 ms
    Wall time: 89.2 ms
    
    %%time
    n = 2000
    a = np.array(np.arange(n))
    xs, ys = zip(*[(x, y) for x, y in permutations(a, 2)])
    res = np.vstack([np.array(xs), np.array(ys)]).T
    res
    CPU times: user 3.94 s, sys: 924 ms, total: 4.86 s
    Wall time: 5.03 s
    
    %%time
    n = 2000
    a = np.arange(n)
    arr = np.array(np.meshgrid(a, a)).T.reshape(-1,2)
    arr = arr[arr[:, 0] != arr[:, 1]]
    CPU times: user 218 ms, sys: 105 ms, total: 323 ms
    Wall time: 328 ms
    
    np.testing.assert_array_equal(res, arr) # passes