When I create a model with the 'train' function from the caret package to do gradient boosting with weights, I get an error when using the 'varImp' function saying it did not detect a tree model. But when I remove the weights it works.
The code below produces the error:
set.seed(123)
model_weights <- ifelse(modelo_df_sseg$FATALIDADES == 1,
                        yes = (1/table(modelo_df_sseg$FATALIDADES)[2]) * 0.5,
                        no = (1/table(modelo_df_sseg$FATALIDADES)[1]) * 0.5)
model <- train(
  as.factor(FATALIDADES) ~ .,
  data = modelo_df_sseg,
  method = "xgbTree",
  trControl = trainControl("cv", number = 10),
  weights = model_weights
)
varImp(model)
But if I don't apply weights, it works. Why doesn't varImp recognize my tree?
EDIT 04-SEP-2020
It was suggested by missuse in the comments section to use wts instead of weights. Now I get the error below:
Error in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, : formal argument 'wts' matched by multiple actual arguments
I made a small example with an R built-in dataset so you can test it yourself:
set.seed(123)
basex <- Arrests
model_weights <- ifelse(basex$released == 2,
                        yes = (1/table(basex$released)[2]) * 0.5,
                        no = (1/table(basex$released)[1]) * 0.5)
y = basex$released
x = basex
tc = trainControl("cv", number = 10)
mtd = "xgbTree"
model <- train(
  x,
  y,
  method = mtd,
  trControl = tc,
  wts = model_weights,
  verbose = TRUE
)
Maybe I'm creating the weights vector wrong, but I can't find any documentation on the 'wts' parameter.
The example code has several problems.
The correct way to apply weights in caret is to use the weights argument to train.
I was mistaken in the comments when I recommended the argument wts. My error was due to the xgbTree source, specifically the lines:
if (!is.null(wts))
  xgboost::setinfo(x, 'weight', wts)
which suggested wts might be the correct argument.
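As a side note, a hedged way to check where that snippet comes from is to pull the method definition that train uses internally via caret's getModelInfo (just one optional way to inspect it):
library(caret)
#grab the xgbTree method definition used by train()
xgb_mod <- getModelInfo("xgbTree", regex = FALSE)[[1]]
#print the body of its fit function; the wts handling quoted above lives in here
body(xgb_mod$fit)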
Let's go through the example and fix all the problems.
library(caret)
library(car) #for the data set
library(tidyverse) #because I like to use it
data(Arrests)
basex <- Arrests
table(basex$released) #released is the outcome class
No Yes
892 4334
Here we see "Yes" outcome is much more frequent then "No" outcome. This will skew the predicted probabilities and favor a model which will tend to predict "Yes". One way to fix it is to give higher weight to the "No" observations. A meaningful weight for the "No" observations would be the proportion of the "Yes" class, and a meaningful weight for the "Yes" observations would be the proportion of the "No" class:
model_weights <- ifelse(basex$released == "Yes",
table(basex$released)[1]/nrow(basex),
table(basex$released)[2]/nrow(basex))
The weight for a "Yes" observation and the weight for a "No" observation sum to 1, so both classes end up carrying the same total weight:
head(data.frame(basex,
                weights = model_weights))
released colour year age sex employed citizen checks weights
1 Yes White 2002 21 Male Yes Yes 3 0.170685
2 No Black 1999 17 Male Yes Yes 3 0.829315
3 Yes White 2000 24 Male Yes Yes 3 0.170685
4 No Black 2000 46 Male Yes Yes 1 0.829315
5 Yes Black 1999 27 Female Yes Yes 1 0.170685
6 Yes Black 1998 16 Female Yes Yes 0 0.170685
"Yes" is more frequent so we give it a lesser weight.
From the above we can see the data frame has several categorical predictors (like colour, sex...). xgbTree cannot handle them, so you will need to convert them to numeric prior to modeling. One way to convert categorical predictors to numeric is dummy coding. There are other ways, but they are not within the scope of this answer.
To use dummy coding:
dummies <- dummyVars(released ~ ., data = basex)
x <- predict(dummies, newdata = basex)
head(x)
colour.Black colour.White year age sex.Female sex.Male employed.No employed.Yes citizen.No citizen.Yes checks
1 0 1 2002 21 0 1 0 1 0 1 3
2 1 0 1999 17 0 1 0 1 0 1 3
3 0 1 2000 24 0 1 0 1 0 1 3
4 1 0 2000 46 0 1 0 1 0 1 1
5 1 0 1999 27 1 0 0 1 0 1 1
6 1 0 1998 16 1 0 0 1 0 1 0
y <- basex$released
Now we have our weights, x and y.
Since I will fit several models below, I will first create the resampling folds and use them within each call to train so that they don't differ:
folds <- createFolds(basex$released, 10)
Since there is an imbalance in the class frequencies, I will use twoClassSummary so we can see the sensitivity and specificity of the trained models:
tc <- trainControl(method = "cv",
                   number = 10,
                   summaryFunction = twoClassSummary,
                   index = folds, #predefined folds
                   classProbs = TRUE) #needed for twoClassSummary
mtd <- "xgbTree"
model <- train(x = x,
               y = y,
               method = mtd,
               trControl = tc,
               weights = model_weights,
               verbose = TRUE,
               metric = "ROC")
#no errors
model$results %>%
filter(ROC == max(ROC))
eta max_depth gamma colsample_bytree min_child_weight subsample nrounds ROC Sens Spec ROCSD SensSD SpecSD
1 0.3 1 0 0.8 1 1 50 0.7031076 0.6185944 0.693945 0.009074758 0.03516597 0.01536701
Here we see that with the model weights, the model with the highest AUC has a sensitivity of 0.6185944 and a specificity of 0.693945.
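This also resolves the original question: with the weights passed via the weights argument (rather than wts), variable importance can be extracted from the fitted model. A quick check, assuming the model object trained above:
varImp(model)
#no error about not detecting a tree model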
Without the weights:
model2 <- train(x = x,
                y = y,
                method = mtd,
                trControl = tc,
                verbose = TRUE,
                metric = "ROC")
#no errors
model2$results %>%
filter(ROC == max(ROC))
eta max_depth gamma colsample_bytree min_child_weight subsample nrounds ROC Sens Spec ROCSD SensSD SpecSD
1 0.3 1 0 0.8 1 0.75 50 0.701109 0.1000325 0.9713885 0.0101395 0.03343579 0.01236701
A model without the weights has sensitivity of 0.1000325 and specificity of 0.9713885.
So meaningful weights passed via the weights argument fixed the model's tendency to predict "Yes" all the time.
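Since both models were trained on the same predefined folds, one optional way to compare them fold by fold (assuming the model and model2 objects above) is caret's resamples:
comparison <- resamples(list(weighted = model, unweighted = model2))
summary(comparison) #ROC, Sens and Spec summarised across the 10 folds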