r machine-learning decision-tree oversampling

ROSE: Error in data.frame(ynew, Xnew) : object 'ynew' not found


I'm trying to build a decision tree model on this data https://www.kaggle.com/raosuny/success-of-bank-telemarketing-data and I want to oversample my training data, but I get this error message:

Error in data.frame(ynew, Xnew) : object 'ynew' not found

This is the relevant code:

library(caTools)     # sample.split()
library(ROSE)        # ovun.sample()
library(rpart)       # rpart()
library(rpart.plot)  # rpart.plot()

# Change the target column to logical
levels(subscribe.prep$Subscribed) <- c(FALSE, TRUE)
subscribe.prep$Subscribed <- as.logical(subscribe.prep$Subscribed)

str(subscribe.prep$Subscribed)

# Final df
subscribe <- subscribe.prep

# Train/test split for the decision tree
filter <- sample.split(subscribe$Subscribed, SplitRatio = 0.7)
subscribe.train <- subset(subscribe, filter == TRUE)
subscribe.test <- subset(subscribe, filter == FALSE)

# Oversample the training set with ROSE
subscribe.train.over <- ovun.sample(Subscribed ~ ., data = subscribe.train, method = 'over', N = 36000)$data

# Fit and plot the tree
model.dt <- rpart(Subscribed ~ ., subscribe.train.over)
rpart.plot(model.dt, box.palette = "RdBu", shadow.col = "gray", nn = TRUE)

# Predict on the test set and build the confusion matrix
prediction.dt <- predict(model.dt, subscribe.test, type = "class")
actual.dt <- subscribe.test$Subscribed
confusion_matrix <- table(actual.dt, prediction.dt > 0.5)

All the features are factors except Age, which is an integer. The target (Subscribed) is logical.


Solution

  • All the variables have to be either continuous (numeric) or categorical (factor), and a logical column is neither. If you change Subscribed from logical (TRUE/FALSE) to a factor variable with values (0,1), it should work.

    I got this error because my factor variable's values were (1,2). After I changed them to (0,1) it worked for me.
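A minimal sketch of that recoding, assuming the ROSE and rpart packages are installed. The `toy` data frame and its `Job` column are hypothetical stand-ins for `subscribe.train`, and the sizes are made up for illustration. Because the target is now a factor, the tree is fitted as a classifier, so `predict(..., type = "class")` returns class labels that can be tabulated directly without the `> 0.5` threshold.

```r
library(ROSE)   # ovun.sample()
library(rpart)  # rpart()

set.seed(42)

# Hypothetical stand-in for subscribe.train: an integer column, a factor
# column, and an imbalanced logical target (roughly 10% TRUE).
n <- 1000
toy <- data.frame(
  Age        = sample(18:90, n, replace = TRUE),
  Job        = factor(sample(c("admin", "technician", "services"), n, replace = TRUE)),
  Subscribed = runif(n) < 0.1
)

# Recode the logical target as a factor with values 0 and 1.
toy$Subscribed <- factor(as.integer(toy$Subscribed), levels = c(0, 1))

# Oversample the minority class; N is the desired size of the result.
toy.over <- ovun.sample(Subscribed ~ ., data = toy,
                        method = "over", N = 1500, seed = 1)$data

# Fit a classification tree on the balanced data.
model <- rpart(Subscribed ~ ., data = toy.over, method = "class")

# type = "class" returns predicted factor levels, so no threshold is needed.
pred <- predict(model, toy, type = "class")
confusion_matrix <- table(actual = toy$Subscribed, predicted = pred)
```

In the question's code, the same recoding would be applied to `subscribe$Subscribed` before the train/test split, so that both `subscribe.train` and `subscribe.test` carry the factor target.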