I have sequence data (similar to protein sequences) and I want to cluster them with a mixture hidden Markov model (MHMM). I chose the seqHMM
package for this, but when I try to train an MHMM, I get this error:
build_mhmm(observations = dat, n_states = c(4, 4, 6))
Error in FUN(X[[i]], ...) : seqdata should be a state sequence object, an event sequence object, or a suffix tree. Use seqdef or seqecreate.
I tried two structures for dat. One stores each sequence element in its own column, and the other is a matrix of whole sequences.
For instance:
dat<-data.frame(matrix(c("e","f","j","o","d","o","p","k","k","a","d","c"),ncol = 4,nrow = 3))
# X1 X2 X3 X4
#1 e o p a
#2 f d k d
#3 j o k c
and
matrix(paste0(dat$X1,dat$X2,dat$X3,dat$X4),nrow = nrow(dat))
#1 "eopa"
#2 "fdkd"
#3 "jokc"
How should I change the format of my data so that build_mhmm can read it? The data already exist and I don't want to recreate them using any package. I only want to reshape them into a proper input.
I found the answer: I should pass seqdef(dat), from the TraMineR package, to the build_mhmm function instead of dat.
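For anyone with the same problem, here is a minimal sketch of that fix using the toy data from the question. It assumes TraMineR and seqHMM are installed. The names seq_dat, str_dat and seq_from_str are only illustrative, and splitting the strings with strsplit is my own assumption about how to handle the second format, not something the packages require.

```r
library(TraMineR)  # provides seqdef()
library(seqHMM)    # provides build_mhmm()

# Toy data from the question: one sequence per row, one element per column.
dat <- data.frame(matrix(c("e","f","j","o","d","o","p","k","k","a","d","c"),
                         ncol = 4, nrow = 3))

# Convert the data frame into a state sequence object (class "stslist").
seq_dat <- seqdef(dat)

# If the sequences are stored as whole strings instead, split them into
# single characters first (assumes one character per state).
str_dat <- c("eopa", "fdkd", "jokc")
seq_from_str <- seqdef(as.data.frame(do.call(rbind, strsplit(str_dat, ""))))

# Same call as in the question, now with the state sequence object.
model <- build_mhmm(observations = seq_dat, n_states = c(4, 4, 6))
```

The existing data are not recreated here. seqdef only wraps them in the class that build_mhmm checks for.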