Tags: python, machine-learning, classification, imbalanced-data

Can I use RandomUnderSampler for categorical data as well?


AFAIK, unlike SMOTE, RandomUnderSampler selects a subset of the existing data. But I am not confident about using it on categorical data.

So, is it really applicable for categorical data?


Solution

  • Random under/oversampling has nothing to do with the features. It looks only at the target and under/oversamples the majority/minority class, no matter whether the data is composed of continuous variables, categorical ones, or elephants :)
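
To make that concrete, here is a minimal hand-rolled sketch of random undersampling using only the standard library. It is not imbalanced-learn's implementation: the function name `random_undersample`, the `seed` parameter and the toy data are all made up for illustration. The point is that the selection step only ever reads the labels, so the rows can hold strings just as well as numbers.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, labels, seed=0):
    """Hypothetical helper: shrink every class to the size of the smallest one.

    Only `labels` is inspected; `rows` is indexed but never looked into.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    # Size of the smallest class is the target size for all classes.
    n_min = min(len(indices) for indices in by_class.values())

    # Draw n_min indices per class without replacement.
    keep = []
    for indices in by_class.values():
        keep.extend(rng.sample(indices, n_min))
    keep.sort()  # preserve the original row order

    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data with purely categorical features (made up for this example).
X = [
    ("red", "S"), ("blue", "M"), ("red", "L"), ("green", "M"),
    ("blue", "S"), ("green", "L"), ("red", "M"), ("blue", "L"),
]
y = ["no", "no", "no", "no", "no", "no", "yes", "yes"]

X_res, y_res = random_undersample(X, y)
print(Counter(y_res))  # both classes now have 2 rows
print(X_res)           # every kept row is an untouched original row
```

With imbalanced-learn itself, the corresponding call should be `RandomUnderSampler(random_state=0).fit_resample(X, y)`; as far as I know its documentation states that heterogeneous data such as string columns is accepted, but check the version you have installed.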