I am using the Bioconductor package MLSeq on Ubuntu with R version 3.1.2. I have run through the example provided with the package, and that works just fine. However, I want to use the bagsvm method for the classify function, so at chunk 14 I changed the code from
svm <- classify(data = data.trainS4, method = "svm", normalize = "deseq",
deseqTransform = "vst", cv = 5, rpt = 3, ref = "T")
to
bagsvm <- classify(data = data.trainS4, method = "bagsvm", normalize = "deseq",
deseqTransform = "vst", cv = 5, rpt = 3, ref = "T")
which produced the error:
Something is wrong; all the Accuracy metric values are missing:
Accuracy Kappa
Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA
Median : NA Median : NA
Mean :NaN Mean :NaN
3rd Qu.: NA 3rd Qu.: NA
Max. : NA Max. : NA
NA's :1 NA's :1
Error in train.default(counts, conditions, method = "bag", B = B, bagControl = bagControl(fit = svmBag$fit, :
Stopping
In addition: There were 17 warnings (use warnings() to see them)
The warnings were:
Warning messages:
1: executing %dopar% sequentially: no parallel backend registered
2: In eval(expr, envir, enclos) :
model fit failed for Fold1.Rep1: vars=150 Error in fitter(btSamples[[iter]], x = x, y = y, ctrl = bagControl, v = vars, :
task 1 failed - "could not find function "lev""
Warning 2 was then repeated 14 times, followed by:
17: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, ... :
There were missing values in resampled performance measures.
traceback() produced
4: stop("Stopping")
3: train.default(counts, conditions, method = "bag", B = B, bagControl = bagControl(fit = svmBag$fit,
predict = svmBag$pred, aggregate = svmBag$aggregate), trControl = ctrl,
...)
2: train(counts, conditions, method = "bag", B = B, bagControl = bagControl(fit = svmBag$fit,
predict = svmBag$pred, aggregate = svmBag$aggregate), trControl = ctrl,
...)
1: classify(data = data.trainS4, method = "bagsvm", normalize = "deseq",
deseqTransform = "vst", cv = 5, rpt = 3, ref = "T")
I thought the problem might be that the kernlab library, which I think the MLSeq code uses, was not loaded, so I tried
library(kernlab)
bagsvm <- classify(data = data.trainS4, method = "bagsvm", normalize = "deseq",
deseqTransform = "vst", cv = 5, rpt = 3, ref = "T")
which resulted in the same error, but the warnings changed to:
Warning messages:
1: In eval(expr, envir, enclos) :
model fit failed for Fold1.Rep1: vars=150 Error in fitter(btSamples[[iter]], x = x, y = y, ctrl = bagControl, v = vars, :
task 1 failed - "no applicable method for 'predict' applied to an object of class "c('ksvm', 'vm')""
This warning was repeated 15 times, followed by
16: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, ... :
There were missing values in resampled performance measures.
I don't believe this problem is specific to MLSeq, as I tried running caret's train function directly as
ctrl <- trainControl(method = "repeatedcv", number = 5,
repeats = 3)
train <- train(counts, conditions, method = "bag", B = 100,
bagControl = bagControl(fit = svmBag$fit, predict = svmBag$pred,
aggregate = svmBag$aggregate), trControl = ctrl)
where counts is a data frame with the RNA-Seq data and conditions is a factor with the classes, and I got exactly the same results. Any help is much appreciated.
While trying to debug my problem, I seem to have inadvertently found a solution. Since the problem seemed to be in the predict function, I stored a copy of the svmBag$pred function in a variable predfunct so I could see where it was failing
predfunct <- function(object, x) {
  if (is.character(lev(object))) {
    out <- predict(object, as.matrix(x), type = "probabilities")
    colnames(out) <- lev(object)
    rownames(out) <- NULL
  } else {
    out <- predict(object, as.matrix(x))[, 1]
  }
  out
}
and then calling
train <- train(counts, conditions, method = "bag", B = 100,
bagControl = bagControl(fit = svmBag$fit, predict = predfunct,
aggregate = svmBag$aggregate), trControl = ctrl)
as in the last code block of the problem description, with predfunct replacing svmBag$pred. Somehow this fixed the problem, and everything now runs just fine. If anyone can explain why this worked, and preferably suggest a solution that is less of a kluge, I will accept your response as the answer.
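One possible explanation, offered as an assumption rather than something verified against the caret source: an R function looks up free names such as predict and lev starting from the environment in which it was defined. The copy stored inside svmBag and the copy pasted at the prompt could therefore resolve the same names to different functions, even though their bodies are identical. The toy sketch below uses only base R; pkg_env, describe, from_pkg and from_workspace are hypothetical names standing in for a package namespace, predict, svmBag$pred and predfunct respectively.

```r
# Toy stand-in for a package namespace (hypothetical): it carries its own
# `describe` and falls back to the global environment for everything else.
pkg_env <- new.env(parent = globalenv())
assign("describe", function(x) "package version", envir = pkg_env)

# The workspace defines a different `describe`, much as attaching another
# package can make a different `predict` visible at the prompt.
describe <- function(x) "workspace version"

# Two functions with identical bodies but different enclosing environments.
from_pkg <- function(x) describe(x)
environment(from_pkg) <- pkg_env           # looks up names in pkg_env first
from_workspace <- function(x) describe(x)  # looks up names where it was defined

print(from_pkg(1))        # resolves `describe` inside pkg_env
print(from_workspace(1))  # resolves `describe` in the workspace
```

If this is what is happening here, comparing environment(svmBag$pred) with environment(predfunct) should show two different environments.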