I am trying to train an xgboost model on the iris dataset. The training code is shown below, and both prediction calls produce the same result. However, the result has length 135, while test_data has only 45 rows. The values also look like predicted probabilities, but the label has 3 classes, and the result is a single vector rather than a matrix of predicted probabilities for the three classes. So, how can I get the predicted probability for each class, and also the predicted class?
data("iris")
iris$Species <- as.numeric(as.factor(iris$Species)) - 1
indexes <- caret::createDataPartition(iris$Species, p = .7, list = F)
train_data <- iris[indexes, ]
test_data <- iris[-indexes, ]
xgb.train <- xgb.DMatrix(data = as.matrix(train_data), label = train_data$Species)
xgb.test <- xgb.DMatrix(data = as.matrix(test_data), label = test_data$Species)
params <- list(objective = "multi:softprob",
               eval_metric = "mlogloss",
               num_class = 3)
xgb.model <- xgboost::xgb.train(params = params, data = xgb.train, nrounds = 1000)
predict(xgb.model, newdata = xgb.test)
predict(xgb.model, newdata = xgb.test, type = "prob")
0.985415220 0.008038994 0.006545801 0.985415220 0.008038994 0.006545801 0.985415220 0.008038994 0.006545801
0.985415220 0.008038994 0.006545801 0.985415220 0.008038994 0.006545801 0.985415220 0.008038994 0.006545801
0.985415220 0.008038994 0.006545801 0.985415220 0.008038994 0.006545801 0.985415220 0.008038994 0.006545801
0.985415220 0.008038994 0.006545801 0.985415220 0.008038994 0.006545801 0.985415220 0.008038994 0.006545801
0.985415220 0.008038994 0.006545801 0.985415220 0.008038994 0.006545801 0.985415220 0.008038994 0.006545801
0.985415220 0.008038994 0.006545801 0.985415220 0.008038994 0.006545801 0.985415220 0.008038994 0.006545801
0.977108896 0.016400522 0.006490625 0.985415220 0.008038994 0.006545801 0.008124468 0.983585954 0.008289632
0.005110676 0.989674747 0.005214573 0.003452316 0.993025184 0.003522499 0.005499140 0.988889933 0.005610934
0.011182932 0.977406859 0.011410273 0.005110676 0.989674747 0.005214573 0.011182932 0.977406859 0.011410273
0.011182932 0.977406859 0.011410273 0.003452316 0.993025184 0.003522499 0.010401487 0.978985548 0.010612942
0.005250969 0.005771303 0.988977730 0.005250969 0.005771303 0.988977730 0.005250969 0.005771303 0.988977730
0.005239322 0.007976402 0.986784279 0.005239322 0.007976402 0.986784279 0.005239322 0.007976402 0.986784279
0.005250969 0.005771303 0.988977730 0.005219116 0.011802264 0.982978642 0.005250969 0.005771303 0.988977730
0.005219116 0.011802264 0.982978642 0.005219116 0.011802264 0.982978642 0.005250969 0.005771303 0.988977730
0.005250969 0.005771303 0.988977730 0.005250969 0.005771303 0.988977730 0.005180326 0.019146746 0.975672841
The predicted values are concatenated into a single vector: with num_class = 3 and 45 test rows, predict() returns 45 × 3 = 135 probabilities. You can convert them into a matrix with one column per class. Make sure to set byrow = TRUE, since the three class probabilities for each observation are stored consecutively.
pred <- predict(xgb.model, newdata = xgb.test)
# one row per test observation, one column per class
pred <- matrix(pred, ncol = xgb.model$params$num_class, byrow = TRUE)
head(pred)
[,1] [,2] [,3]
[1,] 0.9858927 0.007713272 0.006394033
[2,] 0.9858927 0.007713272 0.006394033
[3,] 0.9858927 0.007713272 0.006394033
[4,] 0.9858927 0.007713272 0.006394033
[5,] 0.9858927 0.007713272 0.006394033
[6,] 0.9790474 0.014603053 0.006349638
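To also get the predicted class, take the column with the highest probability in each row with max.col(). A minimal sketch, assuming the 0/1/2 label coding created above; the species_levels vector is written out by hand here because iris$Species was overwritten with numbers (as.factor() sorts the original levels alphabetically, so 0/1/2 correspond to setosa/versicolor/virginica):

pred_class <- max.col(pred) - 1                 # 0-based labels, matching the training coding
species_levels <- c("setosa", "versicolor", "virginica")
pred_species <- species_levels[pred_class + 1]  # map back to the original names
head(pred_species)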
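Depending on your version of the xgboost R package, predict() may also do the reshaping for you: predict.xgb.Booster has accepted a reshape argument that returns the probabilities as a matrix directly when there are several prediction outputs per case. Check ?predict.xgb.Booster before relying on it:

# if supported, reshape = TRUE returns an n x num_class probability matrix
pred <- predict(xgb.model, newdata = xgb.test, reshape = TRUE)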