python numpy

NumPy: get a mask of false positives from two vectors, y_true and y_pred


Given two arrays whose values belong to three classes (5, 6, 7):

y_true = np.array([5,6,7,5])
y_pred = np.array([5,7,7,5])

Since the second element is mispredicted (true class 6, predicted 7), how can I return a one-hot encoded array of false positives per class, like this?

y_falsep_class5: [0,0,0,0]
y_falsep_class6: [0,0,0,0]
y_falsep_class7: [0,1,0,0]

So the returned array will have shape (3, 4), where 3 is the number of classes and 4 is the length of the vectors.


Solution

  • IIUC, use np.unique to get the classes, then use broadcasting to flag, for each class, the positions where y_pred equals that class but y_true does not:

    # identify unique classes (sorted)
    u = np.unique(y_true)[:, None]
    # array([[5],
    #        [6],
    #        [7]])
    
    # set as True/1 the y_pred values that match the class
    # and are different from y_true
    out = ((u == y_pred) & (u != y_true)).astype(int)
    

    Output:

    array([[0, 0, 0, 0],  # 5
           [0, 0, 0, 0],  # 6
           [0, 1, 0, 0]]) # 7
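
For reference, the same idea can be wrapped in a small self-contained function. This is only a sketch: the name false_positive_mask and its optional classes argument are hypothetical and not part of the answer above. It also assumes, unlike the snippet above, that classes should be collected from both arrays with np.union1d, so that a class predicted but never present in y_true still gets its own row.

```python
import numpy as np

def false_positive_mask(y_true, y_pred, classes=None):
    """Return an (n_classes, n_samples) int array of per-class false positives.

    Hypothetical helper for illustration only.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if classes is None:
        # Assumption: take the sorted union of labels from both arrays
        classes = np.union1d(y_true, y_pred)
    # Column vector of classes so comparisons broadcast to (n_classes, n_samples)
    classes = np.asarray(classes)[:, None]
    # False positive for class c: predicted c, but the true label is not c
    return ((classes == y_pred) & (classes != y_true)).astype(int)

y_true = np.array([5, 6, 7, 5])
y_pred = np.array([5, 7, 7, 5])
mask = false_positive_mask(y_true, y_pred)
print(mask)
# [[0 0 0 0]
#  [0 0 0 0]
#  [0 1 0 0]]
```

Passing classes explicitly (for example classes=[5, 6, 7]) fixes the row order and keeps a row for a class that happens to be absent from both arrays.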