pythonnumpymultidimensional-arraycartesian-productn-dimensional

Cartesian product of arbitrary-dimensional coordinates in arbitrary-dimensional arrays


This is somewhat related to Numpy: cartesian product of x and y array points into single array of 2D points

I'm looking for a concise way to create a cartesian product of two arrays with arbitrary dimensionality.

Examples:

Similar to the related thread, I want

x = numpy.array([1,2,3]) #ndim 1
y = numpy.array([4,5])  #ndim 1
cartesian_product(x,y) == numpy.array([[[1, 4],
                                        [2, 4],
                                        [3, 4]],
                                       [[1, 5],
                                        [2, 5],
                                        [3, 5]]]) #ndim "2" = ndim x + ndim y

The resulting array is 2 dimensional because [1, 4], [2, 4], etc. are coordinates and hence not a true dimension. To generalise, it might be better to write x/y as [[1], [2], [3]].

The above is equal to

numpy.dstack(numpy.meshgrid(x,y))

But I also want

x2 = numpy.array([[1,1], [2,2], [3,3]]) #ndim "1", since [1, 1] is a coordinate
cartesian_product(x2,y) == numpy.array([[[1, 1, 4],
                                         [2, 2, 4],
                                         [3, 3, 4]],

                                        [[1, 1, 5],
                                         [2, 2, 5],
                                         [3, 3, 5]]]) #ndim 2 = ndim x2 + ndim y


y2 = numpy.array([[10, 11], [20, 21]]) #ndim 1
(cartesian_product(x2, y2) ==
numpy.array([[[1, 1, 10, 11],
              [2, 2, 10, 11],
              [3, 3, 10, 11]],

             [[1, 1, 20, 21],
              [2, 2, 20, 21],
              [3, 3, 20, 21]]])) #ndim x2 + ndim y2

x3 = numpy.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]) #ndim 2
(cartesian_product(x3, y) ==
numpy.array([[[[1, 2, 4], [3, 4, 4]], [[5, 6, 4], [7, 8, 4]]],
             [[[1, 2, 5], [3, 4, 5]], [[5, 6, 5], [7, 8, 5]]]]) #ndim 3

To visualise what I'm trying to do: As I said, [[0, 0], [0, 1], [1, 1], [1, 0]] should be interpreted as a 1-dimensional list of coordinates, which corresponds to a line. If I then do a cartesian product with [1, 2, 3, 4], I'm extruding this line in the z-direction, turning into a surface (i.e. 2-dimensional). But now the array will of course be 3-dimensional.

I suppose I can find away to solve this with loops, but is there any way to achieve this with numpy/scipy tools?


Solution

  • A memory efficient way is broadcasted assignment:

    def cartesian_product(x, y):
        if x.ndim < 2:
            x = np.atleast_2d(x).T
        if y.ndim < 2:
            y = np.atleast_2d(y).T
    
        sx, sy = x.shape, y.shape
        sz = sy[:-1] + sx[:-1] + (sy[-1] + sx[-1],)
        z = np.empty(sz, np.result_type(x, y))
    
        # Broadcasted assignment
        z[...,:sx[-1]] = x
        z[...,sx[-1]:] = y.reshape(sy[:-1] + (x.ndim-1)*(1,) + (sy[-1],))
    
        return z
    

    In case you need the details on broadcasting, this page has you covered.