Tags: python, xgboost, xgbclassifier

XGBoostError: value 0 for Parameter num_class should be greater equal to 1


I'm trying to compare two different feature sets for classifying customers into high-value, mid-value, and low-value. This is the code I used:

ltv_xgb_model = xgb.XGBClassifier(max_depth=5, learning_rate=0.1, objective='multi:softmax', n_jobs=-1).fit(X_train, y_train)

The first dataset has 11 customers in the training data, and 2 customers in the testing data. The classifier is able to achieve 50% precision for one of the feature sets, despite the limited number of customers.

The second dataset has 14 customers in the training data, and 2 customers in the testing data. Although we have a bigger training set, the classifier threw an error:

XGBoostError: value 0 for Parameter num_class should be greater equal to 1

Previous posts on the forum mention that the .fit() method sets the num_class parameter automatically. See here: XGBClassifier num_class is invalid. So the problem seems to be caused by something else.

Does anybody have any idea where the problem is? Any help is appreciated!


Solution

  • The reason is that XGBoost deduces the number of classes from the training data you give it, and for multi:softmax the minimum number of classes is 3 (if you have 2 classes, you should use a binary classification objective instead). So the most likely problem is that your second training set contains only 2 or fewer unique target values.

    In general, 11 and 14 examples are very small datasets. I would strongly recommend against training ML models at that scale. If you really want to check how good your model is with so few training samples, you should do full leave-one-out cross-validation (i.e. train the model the same way while holding out a single example, test the resulting model on that example, and repeat for every example). If the results look good (they most likely will not), you can then train a model on the full dataset and use it.
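A quick way to confirm this diagnosis is to count the distinct labels in each training set before calling .fit(). A minimal sketch, assuming y_train is a plain list of labels (the helper name and example labels are mine, not from your code):

```python
# Count distinct target labels before fitting; multi:softmax needs >= 3
# classes, while exactly 2 calls for a binary objective instead.
def check_targets(y_train):
    classes = sorted(set(y_train))
    if len(classes) >= 3:
        return f"{len(classes)} classes {classes}: multi:softmax is fine"
    if len(classes) == 2:
        return f"2 classes {classes}: use binary:logistic instead"
    return f"only {len(classes)} class(es) {classes}: not enough to train"

# Hypothetical 14-row training set where 'mid' happens never to appear,
# which would trigger exactly this num_class error with multi:softmax.
print(check_targets(['high', 'low', 'high', 'low', 'low', 'high',
                     'low', 'high', 'low', 'low', 'high', 'low',
                     'high', 'low']))
```

If this prints that only 2 classes are present, the fix is either to resample so all three value tiers appear in training, or to switch to a binary objective.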
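The leave-one-out procedure itself is simple to sketch. In the toy version below, the fit_predict placeholder just votes for the majority training label; in practice you would refit your XGBClassifier inside it the same way you do now (the function names and toy data are illustrative, not from your code):

```python
from collections import Counter

def fit_predict(train_X, train_y, test_x):
    # Placeholder model: always predicts the most common training label.
    # Replace this body with refitting your actual classifier on
    # (train_X, train_y) and predicting on test_x.
    return Counter(train_y).most_common(1)[0][0]

def leave_one_out_accuracy(X, y):
    # Hold out each example once, train on the rest, score the prediction.
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i+1:]
        train_y = y[:i] + y[i+1:]
        pred = fit_predict(train_X, train_y, X[i])
        hits += (pred == y[i])
    return hits / len(X)

X = [[0], [1], [2], [3], [4]]
y = ['low', 'low', 'low', 'high', 'low']
print(leave_one_out_accuracy(X, y))  # → 0.8
```

With only 11–14 examples this costs just 11–14 model fits, so it is cheap to run and gives you an honest per-example score instead of relying on a 2-customer test set.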