
Project a multi-class array into a binary matrix


I have a simple numpy array (e.g. [1,4,2,3,1]) and want to project it into a binary matrix, where each value in the array maps to an indicator in that column of the matrix.

For example, this array would map to a matrix like:

[1], => [1,0,0,0],
[4],    [0,0,0,1],
[2],    [0,1,0,0],
[3],    [0,0,1,0],
[1]     [1,0,0,0]

I can do this by iterating with list comprehensions, but is there an elegant numpy solution?


Solution

  • We can use broadcasting -

    (a[:,None] == np.arange(a.max())+1).astype(int)
    

    Sample run -

    In [28]: a = np.array([1,4,2,3,1,2,1,4])
    
    In [29]: a[:,None] == np.arange(a.max())+1 # Boolean array
    Out[29]: 
    array([[ True, False, False, False],
           [False, False, False,  True],
           [False,  True, False, False],
           [False, False,  True, False],
           [ True, False, False, False],
           [False,  True, False, False],
           [ True, False, False, False],
           [False, False, False,  True]], dtype=bool)
    
    In [30]: (a[:,None] == np.arange(a.max())+1).astype(int) # Int array
    Out[30]: 
    array([[1, 0, 0, 0],
           [0, 0, 0, 1],
           [0, 1, 0, 0],
           [0, 0, 1, 0],
           [1, 0, 0, 0],
           [0, 1, 0, 0],
           [1, 0, 0, 0],
           [0, 0, 0, 1]])
    

    If the integers are not sequential and the result should have no all-False columns, we can compare the 2D version of the input array a directly against np.unique(a). The columns then follow the sorted order of the unique values, like so -

    In [49]: a = np.array([14,12,33,71,97])
    
    In [50]: a[:,None] == np.unique(a) # Boolean array
    Out[50]: 
    array([[False,  True, False, False, False],
           [ True, False, False, False, False],
           [False, False,  True, False, False],
           [False, False, False,  True, False],
           [False, False, False, False,  True]], dtype=bool)
    
    In [51]: (a[:,None] == np.unique(a)).astype(int) # Int array
    Out[51]: 
    array([[0, 1, 0, 0, 0],
           [1, 0, 0, 0, 0],
           [0, 0, 1, 0, 0],
           [0, 0, 0, 1, 0],
           [0, 0, 0, 0, 1]])
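
Both variants above compare a column vector of shape (n, 1) against a 1D array of candidate values of shape (k,), which broadcasts to an (n, k) result. As a minimal sketch, the two cases could be wrapped in one helper. The name indicator_matrix and its classes parameter are hypothetical, chosen here for illustration; they are not part of numpy.

```python
import numpy as np

def indicator_matrix(a, classes=None):
    """Return (matrix, classes) where matrix[i, j] is 1 if a[i] == classes[j].

    Hypothetical helper: `classes` lists the value each column stands for.
    """
    a = np.asarray(a)
    if classes is None:
        # Sorted distinct values of a, so no column is all zeros.
        classes = np.unique(a)
    else:
        classes = np.asarray(classes)
    # (n, 1) == (k,) broadcasts to an (n, k) boolean array.
    matrix = (a[:, None] == classes).astype(int)
    return matrix, classes

# Sequential labels 1..max, as in the first sample run.
a = np.array([1, 4, 2, 3, 1])
m_seq, cols_seq = indicator_matrix(a, classes=np.arange(1, a.max() + 1))

# Non-sequential labels: columns follow the sorted unique values.
b = np.array([14, 12, 33, 71, 97])
m_unq, cols_unq = indicator_matrix(b)
```

Returning classes alongside the matrix keeps track of which value each column represents, which matters in the np.unique case because the column order is the sorted order rather than the order of first appearance.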