I am new to statistical analysis, so I will describe my problem in detail. I have a data set like this:
ObjectID Timestamp State
1 t1 1
1 t2 3
1 t3 5
1 t4 2
2 t11 2
2 t22 5
2 t33 3
2 t44 1
and so on.
The total number of states is fixed at 20. All objects are similar and can be grouped into one class. So in the end I have variable-length sequences of states, one per object of the same class, with their respective timestamps.
I want to train an HMM on this kind of data set and predict the next state, given a sequence of previous states as input.
How do I approach this kind of problem, and which parameters do I need to set when using the hmmlearn Python library? Any code would also be appreciated.
I guess that reading the documentation of the hmmlearn library would at least have helped you get started. Basically, in the simplest case the workflow looks like this:
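from hmmlearn import hmm
# Set the HMM structure; n_components is the number of hidden states.
# (In hmmlearn 0.3 and later, integer-coded symbols are handled by hmm.CategoricalHMM instead.)
model = hmm.MultinomialHMM(n_components=2)
# Train the model with your data, an array of shape (n_samples, 1).
# For several sequences, concatenate them and also pass lengths=[len(seq1), len(seq2), ...].
model.fit(your_data)
# Infer the most likely hidden-state sequence for the observations (with Viterbi)
Z = model.predict(your_data)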
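The `your_data` placeholder has to be built from your table first. Here is a minimal sketch of one way to do that, assuming the rows are available as (ObjectID, Timestamp, State) tuples. The `rows` values, the integer timestamps and the variable names are made up for illustration.

```python
from itertools import groupby

# Hypothetical rows in the question's layout: (ObjectID, Timestamp, State).
# Timestamps are plain integers here purely for illustration.
rows = [
    (1, 1, 1), (1, 2, 3), (1, 3, 5), (1, 4, 2),
    (2, 11, 2), (2, 22, 5), (2, 33, 3), (2, 44, 1),
]

# Sort by object, then by time, so each object's states are in order.
rows.sort(key=lambda r: (r[0], r[1]))

flat_states = []  # all sequences laid end to end
lengths = []      # number of observations contributed by each object
for _, group in groupby(rows, key=lambda r: r[0]):
    # States are 1..20 in the question; shift them to 0..19 so the
    # symbols are zero-based integer codes.
    seq = [state - 1 for _, _, state in group]
    flat_states.extend(seq)
    lengths.append(len(seq))

# One observation per row, one column: the (n_samples, 1) layout.
X = [[s] for s in flat_states]
```

Wrapped in a NumPy array, `X` and `lengths` can then be passed as `model.fit(np.array(X), lengths=lengths)`, so that the boundaries between objects are respected during training.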
To predict the next state, you can use the last state inferred from the Viterbi sequence (the hidden-state chain of an HMM is first-order Markov, so the next state depends only on the current one) along with the transition matrix. The row of the transition matrix for that last state is a probability mass function over the states the model can move to; you can draw the next state from it, or simply take its most probable entry.
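As a rough illustration of that last step, here is a sketch in plain Python. The two small matrices are made-up stand-ins for what a fitted model exposes as `model.transmat_` and `model.emissionprob_`, and `last_state` stands in for `Z[-1]`. Since what you actually observe are symbols rather than hidden states, the sketch also pushes the next-state distribution through the emission matrix to get a distribution over the next observed symbol.

```python
import random

# Made-up 3-state transition matrix (stand-in for model.transmat_);
# row i holds P(next hidden state = j | current hidden state = i).
transmat = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.3, 0.5],
]
# Made-up emission matrix (stand-in for model.emissionprob_);
# row i holds P(observed symbol = k | hidden state = i).
emissionprob = [
    [0.5, 0.3, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.2, 0.2, 0.4],
]

last_state = 1  # stand-in for Z[-1], the last Viterbi state

# PMF over the next hidden state: the row for the last state.
next_state_pmf = transmat[last_state]
n_states = len(next_state_pmf)

# Option 1: take the most probable next hidden state.
most_likely_next_state = max(range(n_states), key=lambda j: next_state_pmf[j])

# Option 2: draw the next hidden state from the PMF.
rng = random.Random(0)
sampled_next_state = rng.choices(range(n_states), weights=next_state_pmf, k=1)[0]

# PMF over the next observed symbol:
# P(symbol k) = sum_j P(next state j | last state) * P(symbol k | state j)
n_symbols = len(emissionprob[0])
next_obs_pmf = [
    sum(next_state_pmf[j] * emissionprob[j][k] for j in range(n_states))
    for k in range(n_symbols)
]
most_likely_next_symbol = max(range(n_symbols), key=lambda k: next_obs_pmf[k])
```

Taking the most probable entry gives a single deterministic prediction, while sampling reproduces the model's uncertainty; which one fits better depends on what the prediction is used for.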
In my answer to this question I elaborate more on this last point.