python machine-learning scikit-learn hmmlearn

How to use sklearn HMM to calculate the likelihood of the observed data


There are three fundamental problems for HMMs:

  1. Given the model parameters and observed data, estimate the optimal sequence of hidden states.
  2. Given the model parameters and observed data, calculate the likelihood of the data.
  3. Given just the observed data, estimate the model parameters.

Problems 1 and 3 are covered by the sklearn HMM tutorial. But how can we use sklearn to solve problem 2?


Solution

  • Use the score() method, which returns the log-likelihood of the observed data under the model. From the source code:

    def score(self, X, lengths=None):
        """Compute the log probability under the model.
    
        Parameters
        ----------
        X : array-like, shape (n_samples, n_features)
            Feature matrix of individual samples.
    
        lengths : array-like of integers, shape (n_sequences, ), optional
            Lengths of the individual sequences in ``X``. The sum of
            these should be ``n_samples``.
    
        Returns
        -------
        logprob : float
            Log likelihood of ``X``.