xgboost has a parameter `feature_weights` that is supposed to influence the probability of a feature being selected when building the model, i.e. we can give each feature more or less weight. But the parameter seems to have no effect. Am I doing something wrong?
X <- as.matrix(iris[,-5])
Y <- ifelse(iris$Species=="setosa", 1, 0)
library(xgboost)
dm1 <- xgb.DMatrix(X, label = Y)
# set a different sampling weight for each feature
dm2 <- xgb.DMatrix(X, label = Y, feature_weights = c(1, 0, 0, 0.01))
params <- list(objective = "binary:logistic", eval_metric = "logloss")
set.seed(1)
xgb1 <- xgboost(data = dm1, params = params, nrounds = 10, print_every_n = 5)
[1] train-logloss:0.448305
[6] train-logloss:0.090220
[10]	train-logloss:0.033148
xgb2 <- xgboost(data = dm2, params = params, nrounds = 10, print_every_n = 5)
[1] train-logloss:0.448305
[6] train-logloss:0.090220
[10]	train-logloss:0.033148
But the two models behave exactly the same; `feature_weights` appears to be simply ignored.
It turns out the parameter only has an effect when one of the `colsample_by*` parameters (`colsample_bytree`, `colsample_bylevel`, `colsample_bynode`) is set below 1. By default they are all 1, meaning every feature is always available at every split, so the sampling weights never come into play.
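For example, re-running the comparison above with `colsample_bytree = 0.5` should make the weights visible (a sketch reusing `dm1` and `dm2` from the question; exact loss values will vary):

```r
library(xgboost)

X <- as.matrix(iris[, -5])
Y <- ifelse(iris$Species == "setosa", 1, 0)
dm1 <- xgb.DMatrix(X, label = Y)
dm2 <- xgb.DMatrix(X, label = Y, feature_weights = c(1, 0, 0, 0.01))

# Column subsampling must be active for feature_weights to matter
params2 <- list(objective = "binary:logistic", eval_metric = "logloss",
                colsample_bytree = 0.5)

set.seed(1)
xgb3 <- xgboost(data = dm1, params = params2, nrounds = 10, print_every_n = 5)
set.seed(1)
xgb4 <- xgboost(data = dm2, params = params2, nrounds = 10, print_every_n = 5)

# With colsample_bytree < 1 the two runs diverge: features with
# weight 0 are never sampled, so xgb4 can only split on
# Sepal.Length and (rarely) Petal.Width.
xgb.importance(model = xgb4)
```

The importance table for `xgb4` should list only the features with nonzero weight, confirming that `feature_weights` biases the column subsampling rather than acting as a standalone regularizer.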