Tags: nlp, hidden-markov-models, viterbi

Are start and end states in an HMM necessary when implementing the Viterbi algorithm for POS tagging?


I do not fully understand how to use the start and end states in the Hidden Markov Model. Are these necessary in order to design and implement the transition and emission matrices?


Solution

  • The start/end states are necessary for modeling whether a tag is likely to come at the beginning or end of a sentence.

    For example, if you had a five-word sentence and you were considering two taggings

    1. Det Noun Verb Det Noun
    2. Det Noun Verb Det Adj

    Both of these look pretty good in terms of transitions, because Det->Noun and Det->Adj are both very likely. BUT, it is much less likely for a sentence to end in an Adj than a Noun, something that you would not capture without an end tag. So what you really want to compare is

    1. START Det Noun Verb Det Noun END
    2. START Det Noun Verb Det Adj END

    Then you will be computing P(END|Noun) and P(END|Adj).
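To make this concrete, here is a sketch of how the END transition changes which tagging scores higher. The transition probabilities below are made up purely for illustration; only their rough relative sizes matter (Adj->END is rare, Noun->END is common):

```python
from math import prod

# Hypothetical transition probabilities (illustrative values, not estimated
# from any real corpus). Note the large gap between Noun->END and Adj->END.
trans = {
    ("START", "Det"): 0.6, ("Det", "Noun"): 0.7, ("Det", "Adj"): 0.25,
    ("Noun", "Verb"): 0.4, ("Verb", "Det"): 0.3,
    ("Noun", "END"): 0.3, ("Adj", "END"): 0.01,
}

def seq_score(tags):
    """Product of transition probabilities over the START/END-padded sequence."""
    padded = ["START"] + tags + ["END"]
    return prod(trans[(a, b)] for a, b in zip(padded, padded[1:]))

s1 = seq_score(["Det", "Noun", "Verb", "Det", "Noun"])
s2 = seq_score(["Det", "Noun", "Verb", "Det", "Adj"])
```

Without the final transition into END, the two sequences differ only in the equally plausible Det->Noun vs. Det->Adj step; with it, the Noun ending wins because P(END|Adj) is tiny.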


    If you're doing supervised training, then estimating the probabilities involving START/END is no different from estimating them for the other tags; you just append the special tags to each sentence before counting. So if your training corpus has:

    Det Noun Verb
    Det Noun Verb Det Noun
    

    Then you would modify it to be

    START Det Noun Verb END
    START Det Noun Verb Det Noun END
    

    And compute, for example: P(Det|START) = 2/2 = 1 (both sentences begin with Det), P(END|Verb) = 1/2 (Verb is followed by END in one of its two occurrences), and P(END|Noun) = 1/3.

    Also, the emissions for the special states are trivial: P(START|START) = 1 and P(END|END) = 1, i.e. the START state always emits the START token and the END state always emits the END token, so they contribute nothing to the score.
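Putting it together, a minimal Viterbi sketch in which START and END appear only in the initialization and termination steps (so they never need cells in the trellis) might look like this. `p_trans(prev, cur)` and `p_emit(tag, word)` are assumed to be plain lookup functions returning 0.0 for unseen events:

```python
import math

def viterbi(words, tags, p_trans, p_emit):
    """Minimal Viterbi sketch with explicit START/END states.

    `tags` excludes START and END; those only enter via the
    initialization and termination transitions below.
    """
    def log(p):
        # log-space scores; log(0) becomes -inf so impossible paths lose
        return math.log(p) if p > 0 else float("-inf")

    # Initialization: transition out of START into the first tag
    best = {t: log(p_trans("START", t)) + log(p_emit(t, words[0])) for t in tags}
    backs = []
    for word in words[1:]:
        new_best, back = {}, {}
        for t in tags:
            # Best predecessor for tag t at this position
            prev = max(tags, key=lambda s: best[s] + log(p_trans(s, t)))
            new_best[t] = best[prev] + log(p_trans(prev, t)) + log(p_emit(t, word))
            back[t] = prev
        backs.append(back)
        best = new_best
    # Termination: the transition into END decides the final tag
    last = max(tags, key=lambda t: best[t] + log(p_trans(t, "END")))
    # Trace back through the stored predecessors
    path = [last]
    for back in reversed(backs):
        path.append(back[path[-1]])
    return list(reversed(path))
```

On the five-word example from the answer, the termination step is exactly where P(END|Noun) vs. P(END|Adj) tips the decision toward the Noun ending.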