Code:
ranger(outcome~., data, num.trees=500, probability=TRUE)
Error: Missing data in columns
Is there a format that the data needs to be in? How do I get past this error?
You need to remove the rows that contain NAs. Example:
ranger(outcome~., data[complete.cases(data),], num.trees=500, probability=TRUE)
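Before dropping rows, it may help to check which columns actually contain NAs and how many rows would survive. A minimal base R sketch, where `toy` is a made-up data frame standing in for your `data`:

```r
# Hypothetical toy data standing in for `data`
toy <- data.frame(
  outcome = factor(c("a", "b", NA, "a")),
  x1 = c(1, NA, 3, 4),
  x2 = c(NA, NA, 1, 2)
)

# Number of missing values per column
na_counts <- colSums(is.na(toy))

# Keep only the columns that have at least one NA
na_counts[na_counts > 0]

# Number of rows that complete.cases() would keep
n_complete <- sum(complete.cases(toy))
n_complete
```

If `n_complete` is much smaller than `nrow(toy)`, dropping rows throws away a lot of data and imputation is probably the better route.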
Other methods use packages like mice or miceFast to impute (fill in) the NAs.
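As a rough sketch of the mice route, assuming the package is installed: the example below uses the built-in `airquality` data set (which has NAs in two columns) in place of your `data`, and the `m = 1` and `seed = 1` settings are arbitrary choices for illustration.

```r
library(mice)

# airquality ships with R and has NAs in Ozone and Solar.R
df <- airquality

# Run the imputation; m = 1 produces a single imputed data set,
# seed makes the run reproducible, printFlag = FALSE silences the log
imp <- mice(df, m = 1, seed = 1, printFlag = FALSE)

# Extract the first (and only) completed data set as a data.frame
df_imputed <- complete(imp, 1)

# Count the missing values left after imputation
n_missing <- sum(is.na(df_imputed))
n_missing
```

The completed data frame can then be passed to ranger in place of the original data. With `m` greater than 1 you get several imputed data sets, and you would need to decide how to combine the resulting models.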
Another simple solution is to impute the data with random values drawn from each column:
data_cs = data.frame(Map(function(x) Hmisc::impute(x,'random'), data))
ranger(outcome~., data_cs, num.trees=500, probability=TRUE)