python, python-3.x, numpy, multidimensional-array, tensordot

understanding numpy np.tensordot


import numpy as np

arr1 = np.arange(8).reshape(4, 2)
arr2 = np.arange(4, 12).reshape(2, 4)
ans = np.tensordot(arr1, arr2, axes=([1], [0]))
ans2 = np.tensordot(arr1, arr2, axes=([0], [1]))
ans3 = np.tensordot(arr1, arr2, axes=([1, 0], [0, 1]))

I am trying to understand how this tensordot function works. I know that it returns the tensordot product.

but the axes part is a little difficult for me to comprehend. What I have observed is that:

for ans, the number of columns in arr1 is paired with the number of rows in arr2, and those determine the final matrix.

for ans2, it is the other way around: the number of columns in arr2 is paired with the number of rows in arr1.

I don't understand axes=([1,0],[0,1]). Let me know if my understanding for ans and ans2 is correct.


Solution

  • You forgot to show the arrays:

    In [87]: arr1
    Out[87]: 
    array([[0, 1],
           [2, 3],
           [4, 5],
           [6, 7]])
    In [88]: arr2
    Out[88]: 
    array([[ 4,  5,  6,  7],
           [ 8,  9, 10, 11]])
    In [89]: ans
    Out[89]: 
    array([[  8,   9,  10,  11],
           [ 32,  37,  42,  47],
           [ 56,  65,  74,  83],
           [ 80,  93, 106, 119]])
    In [90]: ans2
    Out[90]: 
    array([[ 76, 124],
           [ 98, 162]])
    In [91]: ans3
    Out[91]: array(238)
    

    ans is just the regular dot, matrix product:

    In [92]: np.dot(arr1,arr2)
    Out[92]: 
    array([[  8,   9,  10,  11],
           [ 32,  37,  42,  47],
           [ 56,  65,  74,  83],
           [ 80,  93, 106, 119]])
    

    The sum-of-products is performed over the axes given in ([1],[0]): axis 1 of arr1 and axis 0 of arr2 (the conventional "across the columns, down the rows"). With 2d arrays the "sum across ..." phrasing can be confusing; it's clearer when dealing with 1d or 3d arrays. Here the matching size-2 dimensions are summed out, leaving the (4,4) result.
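    That axis pairing can be spelled out explicitly with broadcasting. This is a sketch to illustrate which axes are multiplied and summed, not how tensordot is actually implemented:

    ```python
    import numpy as np

    arr1 = np.arange(8).reshape(4, 2)
    arr2 = np.arange(4, 12).reshape(2, 4)

    # pair axis 1 of arr1 with axis 0 of arr2, then sum over that shared axis
    # (4,2,1) * (1,2,4) -> (4,2,4), summed on the middle axis -> (4,4)
    manual = (arr1[:, :, None] * arr2[None, :, :]).sum(axis=1)
    print(np.array_equal(manual, np.tensordot(arr1, arr2, axes=([1], [0]))))  # True
    ```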

    ans2 reverses them, summing on the 4's, producing a (2,2):

    In [94]: np.dot(arr2,arr1)
    Out[94]: 
    array([[ 76,  98],
           [124, 162]])
    

    tensordot has just transposed the 2 arrays and performed a regular dot:

    In [95]: np.dot(arr1.T,arr2.T)
    Out[95]: 
    array([[ 76, 124],
           [ 98, 162]])
    

    ans3 uses a transpose and a reshape (ravel) to sum on both axes:

    In [98]: np.dot(arr1.ravel(),arr2.T.ravel())
    Out[98]: 238
    

    In general, tensordot uses a mix of transpose and reshape to reduce the problem to a 2d np.dot problem. It may then reshape and transpose the result.
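    That reduction can be sketched by hand for the axes=([0],[1]) case: move each contracted axis into position with a transpose, then hand the flattened 2d problem to np.dot. The exact internal steps tensordot takes may differ; this just shows the idea:

    ```python
    import numpy as np

    arr1 = np.arange(8).reshape(4, 2)
    arr2 = np.arange(4, 12).reshape(2, 4)

    # emulate tensordot(arr1, arr2, axes=([0],[1])):
    a = arr1.transpose(1, 0).reshape(2, 4)   # free axis first, contracted axis last
    b = arr2.transpose(1, 0).reshape(4, 2)   # contracted axis first, free axis last
    result = a.dot(b)                        # plain 2d matrix product, shape (2,2)
    print(result)
    ```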

    I find the dimensions control of einsum to be clearer:

    In [99]: np.einsum('ij,jk->ik',arr1,arr2)
    Out[99]: 
    array([[  8,   9,  10,  11],
           [ 32,  37,  42,  47],
           [ 56,  65,  74,  83],
           [ 80,  93, 106, 119]])
    In [100]: np.einsum('ji,kj->ik',arr1,arr2)
    Out[100]: 
    array([[ 76, 124],
           [ 98, 162]])
    In [101]: np.einsum('ij,ji',arr1,arr2)
    Out[101]: 238
    

    With the development of einsum and matmul/@, tensordot has become less necessary. It's harder to understand, and doesn't have any speed or flexibility advantages. Don't worry about understanding it.
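    For these 2d cases the @ operator gives the same result directly, a quick check using the arrays above:

    ```python
    import numpy as np

    arr1 = np.arange(8).reshape(4, 2)
    arr2 = np.arange(4, 12).reshape(2, 4)

    # matmul reproduces the axes=([1],[0]) contraction
    same = np.array_equal(arr1 @ arr2, np.tensordot(arr1, arr2, axes=([1], [0])))
    print(same)  # True
    ```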

    ans3 is the trace (sum of the diagonal) of the other two results:

    In [103]: np.trace(ans)
    Out[103]: 238
    In [104]: np.trace(ans2)
    Out[104]: 238