Tags: python-3.x, numpy, numpy-ndarray, self-attention

Identify identical vectors as part of a multidimensional dot product


I want to identify identical vectors after a dot-product calculation. The code below works for a single dimension but not multi-dimensionally.

Single Dimension

import numpy as np

a = np.array([0.8,0.5])
b = np.array([0.8,0.5])
x = a @ b / np.linalg.norm(a) / np.linalg.norm(b)
print(x) #Output: 1

Multi-dimensional

a = np.array([[0.8,0.5],[0.4,1]])
b = np.array([[0.8,0.5],[0.4,1]])
x = a @ b / np.linalg.norm(a) / np.linalg.norm(b)
print(x) #Output: [[0.4097561  0.43902439] [0.35121951 0.58536585]]
#Desired Output: [[1 0.43] [0.35 1]] (0.43 and 0.35 are placeholders and will be different values; I just wouldn't expect them to be 1)

I would expect this to output at least two 1s, on the diagonal. I recognise this is likely because the normalisation happens after the @. Is there a way to do this calculation as part of it and have the final output as a multidimensional result?


Solution

  • Your two arrays, the 1d and the 2d. Since a and b are identical in each of your examples, there's no need to define them twice for this demo.

    In [16]: a = np.array([0.8,0.5])
    
    In [17]: b = np.array([[0.8,0.5],[0.4,1]])
    

    The 1d dot, and its norm:

    In [18]: a@a
    Out[18]: np.float64(0.8900000000000001)
    
    In [19]: np.linalg.norm(a)
    Out[19]: np.float64(0.9433981132056605)
    

    With those we can get your desired 1:

    In [21]: (a@a)/(np.linalg.norm(a)**2)
    Out[21]: np.float64(1.0)
    

    The 2d 'dot':

    In [22]: b@b
    Out[22]: 
    array([[0.84, 0.9 ],
           [0.72, 1.2 ]])
    

    But wait, we actually want the transpose, so that each row is dotted with each row:

    In [23]: b@b.T
    Out[23]: 
    array([[0.89, 0.82],
           [0.82, 1.16]])
    
    In [24]: b[1]@b[1]
    Out[24]: np.float64(1.1600000000000001)
    

    That gives us the 0.89 from a@a in the top-left corner, and a corresponding 1d dot for the 2nd row; the diagonal of b@b.T holds each row's dot with itself.
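
    A quick check of that diagonal (not part of the original session):

    import numpy as np

    b = np.array([[0.8, 0.5], [0.4, 1]])
    print(np.diag(b @ b.T))   # [0.89 1.16] -- each row dotted with itself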

    The norm of b is a single number! Per the docs, that's the norm of the flattened b. We need to specify the axis, here axis=1, to get row-wise norms:

    In [27]: np.linalg.norm(b[1])
    Out[27]: np.float64(1.077032961426901)
    
    In [28]: np.linalg.norm(b[0])     # norm(a)
    Out[28]: np.float64(0.9433981132056605)
    
    In [29]: np.linalg.norm(b,axis=1)
    Out[29]: array([0.94339811, 1.07703296])
    

    Now we can get the desired 1s on the diagonal for the 2d array:

    In [30]: (b@b.T)/(np.linalg.norm(b,axis=1)**2)
    Out[30]: 
    array([[1.        , 0.70689655],
           [0.92134831, 1.        ]])
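
    Note that dividing by the squared row norms only guarantees the 1s on the diagonal; the off-diagonal values are not symmetric (0.71 vs 0.92). If what you actually want is the usual symmetric cosine similarity, divide by the outer product of the row norms instead. A minimal sketch of that variation (not from the original session):

    import numpy as np

    b = np.array([[0.8, 0.5], [0.4, 1]])
    norms = np.linalg.norm(b, axis=1)           # row-wise norms
    cos = (b @ b.T) / np.outer(norms, norms)    # entry (i,j) divided by norm_i * norm_j
    print(cos)  # diagonal 1 (up to floating point); off-diagonal symmetric, about 0.807 here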
    

    edit

    Norm without the axis:

    In [36]: np.linalg.norm(b)
    Out[36]: np.float64(1.4317821063276355)

    In [37]: np.linalg.norm(b.ravel())
    Out[37]: np.float64(1.4317821063276355)

    In [39]: np.sqrt(b.ravel()@b.ravel())
    Out[39]: np.float64(1.4317821063276355)
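
    Squaring that flattened norm gives 2.05, and that is what your original 2d attempt divided every element by. A quick check (not from the original session) reproduces the output in your question:

    import numpy as np

    b = np.array([[0.8, 0.5], [0.4, 1]])
    print((b @ b) / np.linalg.norm(b)**2)
    # [[0.4097561  0.43902439]
    #  [0.35121951 0.58536585]]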
    

    Your 1d case, x = a @ b / np.linalg.norm(a) / np.linalg.norm(b), with a and b identical, is effectively

    x = (a @ a) / (np.sqrt(a@a)**2)

    i.e. (a@a)/(a@a), which is why it prints 1.
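
    Putting the pieces together, a minimal self-contained sketch; the helper name cosine_rows is mine, not from the original answer:

    import numpy as np

    def cosine_rows(m):
        """Cosine similarity between every pair of rows of a 2d array."""
        norms = np.linalg.norm(m, axis=1)
        return (m @ m.T) / np.outer(norms, norms)

    b = np.array([[0.8, 0.5], [0.4, 1.0], [1.6, 1.0]])
    sim = cosine_rows(b)
    # row pairs pointing in the same direction score ~1 (row 2 is 2x row 0)
    print(np.argwhere(np.isclose(sim, 1.0)))

    One caveat: a cosine of 1 flags rows that point in the same direction, not necessarily identical rows; if you need exact row equality, compare rows directly (e.g. with np.array_equal).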