I am using the Bioconductor package MLSeq on Ubuntu with R version 3.1.2. I ran through the example provided by the package, and that works just fine. However, I want to use the bagsvm method for the classify function, so at chunk 14 I changed the code from
svm <- classify(data = data.trainS4, method = "svm", normalize = "deseq",
deseqTransform = "vst", cv = 5, rpt = 3, ref = "T")
to
bagsvm <- classify(data = data.trainS4, method = "bagsvm", normalize = "deseq",
deseqTransform = "vst", cv = 5, rpt = 3, ref = "T")
which produced the error:
Something is wrong; all the Accuracy metric values are missing:
Accuracy Kappa
Min. : NA Min. : NA
1st Qu.: NA 1st Qu.: NA
Median : NA Median : NA
Mean :NaN Mean :NaN
3rd Qu.: NA 3rd Qu.: NA
Max. : NA Max. : NA
NA's :1 NA's :1
Error in train.default(counts, conditions, method = "bag", B = B, bagControl = bagControl(fit = svmBag$fit, :
Stopping
In addition: There were 17 warnings (use warnings() to see them)
The warnings were:
Warning messages:
1: executing %dopar% sequentially: no parallel backend registered
2: In eval(expr, envir, enclos) :
model fit failed for Fold1.Rep1: vars=150 Error in fitter(btSamples[[iter]], x = x, y = y, ctrl = bagControl, v = vars, :
task 1 failed - "could not find function "lev""
Warning 2 was then repeated 14 more times, followed by:
17: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, ... :
There were missing values in resampled performance measures.
traceback() produced:
4: stop("Stopping")
3: train.default(counts, conditions, method = "bag", B = B, bagControl = bagControl(fit = svmBag$fit, predict = svmBag$pred, aggregate = svmBag$aggregate), trControl = ctrl, ...)
2: train(counts, conditions, method = "bag", B = B, bagControl = bagControl(fit = svmBag$fit, predict = svmBag$pred, aggregate = svmBag$aggregate), trControl = ctrl, ...)
1: classify(data = data.trainS4, method = "bagsvm", normalize = "deseq", deseqTransform = "vst", cv = 5, rpt = 3, ref = "T")
I thought the problem might be that the kernlab library, which I think the MLSeq code uses, didn't get loaded, so I tried
library(kernlab)
bagsvm <- classify(data = data.trainS4, method = "bagsvm", normalize = "deseq",
deseqTransform = "vst", cv = 5, rpt = 3, ref = "T")
which resulted in the same error, but the warnings changed to:
Warning messages:
1: In eval(expr, envir, enclos) :
model fit failed for Fold1.Rep1: vars=150 Error in fitter(btSamples[[iter]], x = x, y = y, ctrl = bagControl, v = vars, :
task 1 failed - "no applicable method for 'predict' applied to an object of class "c('ksvm', 'vm')""
repeated 15 times, followed by
16: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, ... :
There were missing values in resampled performance measures.
I don't believe this problem is specific to MLSeq, because I tried calling the train function directly as
ctrl <- trainControl(method = "repeatedcv", number = 5,
repeats = 3)
train <- train(counts, conditions, method = "bag", B = 100,
bagControl = bagControl(fit = svmBag$fit, predict = svmBag$pred,
aggregate = svmBag$aggregate), trControl = ctrl)
where counts is a data frame with the RNA-Seq data and conditions is a factor with the classes, and I got exactly the same results. Any help is much appreciated.
While trying to debug my problem, I seem to have inadvertently found a solution. Since the problem seemed to be in the predict function, I stored the svmBag$pred function as a variable predfunct so I could see where it was failing:
predfunct <- function(object, x)
{
    if (is.character(lev(object))) {
        out <- predict(object, as.matrix(x), type = "probabilities")
        colnames(out) <- lev(object)
        rownames(out) <- NULL
    }
    else out <- predict(object, as.matrix(x))[, 1]
    out
}
and then calling
train <- train(counts, conditions, method = "bag", B = 100,
bagControl = bagControl(fit = svmBag$fit, predict = predfunct,
aggregate = svmBag$aggregate), trControl = ctrl)
as in the last code block of the problem description, with predfunct replacing svmBag$pred. Somehow this fixed the problem and everything runs just fine. If anyone can figure out why this worked, and preferably find a solution that isn't such a kluge, I will accept your response as the answer.
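One plausible reading, which has not been checked against the caret source, is a scoping effect. svmBag$pred belongs to the caret namespace, so the name predict inside it is resolved from caret's point of view (presumably reaching a generic that does not know about ksvm objects). A copy defined at top level resolves predict through the search path, where library(kernlab) has put its own version first. The sketch below only illustrates that mechanism in base R; pkg_env, describe, pkg_fun, and user_fun are hypothetical stand-ins, not caret or kernlab objects.

```r
# Toy illustration in base R only. All names here are hypothetical stand-ins
# and none of them come from caret or kernlab.

# Stand-in for a package namespace that already has its own `describe`
# (assumed to be analogous to caret seeing a predict that lacks ksvm support).
pkg_env <- new.env()
pkg_env$describe <- function(x) "version seen from inside the package"

# Stand-in for a function made visible later by attaching a library
# (assumed to be analogous to the predict provided by kernlab).
describe <- function(x) "version from the attached library"

# Two functions with identical bodies but different enclosing environments.
pkg_fun <- function(x) describe(x)
environment(pkg_fun) <- pkg_env       # plays the role of svmBag$pred

user_fun <- function(x) describe(x)   # plays the role of predfunct

# Name lookup starts in each function's enclosing environment,
# so the same call resolves to different functions.
from_pkg <- pkg_fun(1)
from_user <- user_fun(1)

print(from_pkg)
print(from_user)
```

If this is indeed the cause, calling kernlab::lev and kernlab::predict explicitly inside the prediction function may be a less fragile alternative to relying on where the function happens to be defined. That suggestion is an untested assumption.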