When I use label_binarize, I do not get the correct number of classes even though I specify it. This is my simple code:
import numpy as np
from sklearn.preprocessing import label_binarize
y = ['tap', 'not_tap', 'tap', 'tap', 'not_tap', 'tap', 'not_tap','not_tap']
y = label_binarize(y, classes=[0, 1])
n_classes = y.shape[1]
I get n_classes = 1. When I run this code, I also get the warning message:
FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
mask |= (ar1 == a)
Can you tell me how to correctly get n_classes = 2 as in this example?
Thank you!
label_binarize binarizes the values in a one-vs-all fashion.
Consider this example:
from sklearn.preprocessing import label_binarize
print(label_binarize([1, 6], classes=[1, 2, 4, 6]))
[[1 0 0 0]
[0 0 0 1]]
The columns correspond to the classes [1, 2, 4, 6], and a 1 marks the column whose class matches the value.
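The way you're invoking it now (label_binarize(y, classes=[0, 1])), none of the values ('tap', 'not_tap') match any of the classes (0, 1), and hence all values are 0. In addition, when there are only two classes the result is a single column rather than one column per class, which is why y.shape[1] is 1.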
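As a minimal sketch of the fix within label_binarize itself, you can pass the labels that actually occur in y as the classes. The variable names classes and y_bin are introduced here only for illustration.

```python
from sklearn.preprocessing import label_binarize

y = ['tap', 'not_tap', 'tap', 'tap', 'not_tap', 'tap', 'not_tap', 'not_tap']
classes = ['not_tap', 'tap']  # the labels that actually occur in y

y_bin = label_binarize(y, classes=classes)

# A two-class problem is encoded as one column:
# 1 for classes[1] ('tap'), 0 for classes[0] ('not_tap').
print(y_bin.ravel())
print(y_bin.shape)

# Count the classes from the class list, not from the output shape.
n_classes = len(classes)
print(n_classes)
```

The values now match the classes, so the column is no longer all zeros, but the shape is still (n_samples, 1) because the problem is binary.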
What you're looking for is a LabelBinarizer:
from sklearn.preprocessing import LabelBinarizer
y = ['tap', 'not_tap', 'tap', 'tap', 'not_tap', 'tap', 'not_tap','not_tap']
lb = LabelBinarizer()
label = lb.fit_transform(y)
print(label)
[[1]
[0]
[1]
[1]
[0]
[1]
[0]
[0]]
n_classes = len(lb.classes_)  # 2
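If the code you are adapting expects a matrix with one column per class, so that y.shape[1] equals 2, one possible approach is to expand the single binary column by hand. This is only a sketch under that assumption; y_onehot is a name made up for this example.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

y = ['tap', 'not_tap', 'tap', 'tap', 'not_tap', 'tap', 'not_tap', 'not_tap']

lb = LabelBinarizer()
label = lb.fit_transform(y)  # single column: 1 where the label is lb.classes_[1]

# Column 0 marks lb.classes_[0] ('not_tap'), column 1 marks lb.classes_[1] ('tap').
y_onehot = np.hstack((1 - label, label))

n_classes = y_onehot.shape[1]
print(lb.classes_)
print(y_onehot.shape)
```

The column order follows lb.classes_, which LabelBinarizer stores in sorted order.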