[SOLVED] Project a multi-class array into a binary matrix

I have a simple numpy array (e.g. [1,4,2,3,1]) and want to project it into a binary matrix, where each value in the array selects the column that holds a 1 in the corresponding row.

For example, this array would map to a matrix like:

[1], => [1,0,0,0],
[4],    [0,0,0,1],
[2],    [0,1,0,0],
[3],    [0,0,1,0],
[1]     [1,0,0,0]

I can do this by iterating with list comprehensions, but is there an elegant numpy solution?

Solution

We can use broadcasting -

(a[:,None] == np.arange(a.max())+1).astype(int)

Sample run -

In [28]: a = np.array([1,4,2,3,1,2,1,4])

In [29]: a[:,None] == np.arange(a.max())+1 # Boolean array
Out[29]: 
array([[ True, False, False, False],
       [False, False, False,  True],
       [False,  True, False, False],
       [False, False,  True, False],
       [ True, False, False, False],
       [False,  True, False, False],
       [ True, False, False, False],
       [False, False, False,  True]], dtype=bool)

In [30]: (a[:,None] == np.arange(a.max())+1).astype(int) # Int array
Out[30]: 
array([[1, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [1, 0, 0, 0],
       [0, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1]])
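
To make the shapes in the broadcasting step explicit, here is a minimal sketch that wraps the same comparison in a function. The name one_hot_from_labels is hypothetical (it is not part of numpy), and the sketch assumes the labels are positive integers starting at 1, as in the question -

```python
import numpy as np

def one_hot_from_labels(a):
    """Hypothetical helper: one-hot encode 1-based integer labels via broadcasting."""
    a = np.asarray(a)
    # Column vector of labels, shape (n, 1)
    col = a[:, None]
    # Candidate class labels 1..a.max(), shape (k,); same values as np.arange(a.max())+1
    classes = np.arange(1, a.max() + 1)
    # Broadcasting (n, 1) against (k,) compares every label with every class,
    # giving an (n, k) boolean array that is then cast to int
    return (col == classes).astype(int)

a = np.array([1, 4, 2, 3, 1])
result = one_hot_from_labels(a)
print(result)
```

Each row should contain exactly one 1, because every label matches exactly one of the candidate classes.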

For integers that are not sequential, where we don't want any all-False columns, we can compare the 2D version of the input array a against np.unique(a) directly. Since np.unique returns the sorted unique values, the columns follow the ascending order of the values, like so -

In [49]: a = np.array([14,12,33,71,97])

In [50]: a[:,None] == np.unique(a) # Boolean array
Out[50]: 
array([[False,  True, False, False, False],
       [ True, False, False, False, False],
       [False, False,  True, False, False],
       [False, False, False,  True, False],
       [False, False, False, False,  True]], dtype=bool)

In [51]: (a[:,None] == np.unique(a)).astype(int) # Int array
Out[51]: 
array([[0, 1, 0, 0, 0],
       [1, 0, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 0, 1]])
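
With non-sequential values it can help to keep the column labels next to the matrix, since the column order is no longer obvious from the values themselves. The sketch below is one possible way to do that; the function name one_hot_from_values is hypothetical, and the round trip through argmax is only an illustration that the mapping is reversible -

```python
import numpy as np

def one_hot_from_values(a):
    """Hypothetical helper: return the column labels and the one-hot matrix."""
    a = np.asarray(a)
    # Sorted unique values; column j of the matrix corresponds to classes[j]
    classes = np.unique(a)
    # Same broadcasting comparison as above, against the unique values
    matrix = (a[:, None] == classes).astype(int)
    return classes, matrix

a = np.array([14, 12, 33, 71, 97])
classes, matrix = one_hot_from_values(a)

# Recover the original values: argmax finds the column holding the 1 in each row
recovered = classes[matrix.argmax(axis=1)]
print(classes)
print(matrix)
print(recovered)
```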