rstatisticsprobabilitymarkov-chainsmarkov

Understanding Markov Chain source code in R


The following source code is from a book. Comments are written by me to understand the code better.

#==================================================================
# markov(init,mat,n,states) = Simulates n steps of a Markov chain 
#------------------------------------------------------------------
# init = initial distribution 
# mat = transition matrix 
# labels = a character vector of states used as label of data-frame; 
#           default is 1, .... k
#-------------------------------------------------------------------
markov <- function(init,mat,n,labels) 
{ 
    if (missing(labels)) # check if 'labels' argument is missing
    {
        labels <- 1:length(init) # obtain the length of init-vecor, and number them accordingly.
    }

    simlist <- numeric(n+1) # create an empty vector of 0's
    states <- 1:length(init)# ???? use the length of initial distribution to generate states.
    simlist[1] <- sample(states,1,prob=init) # sample function returns a random permutation of a vector.
                        # select one value from the 'states' based on 'init' probabilities.

    for (i in 2:(n+1))
    { 
        simlist[i] <- sample(states, 1, prob = mat[simlist[i-1],]) # simlist is a vector.
                                                    # so, it is selecting all the columns 
                                                    # of a specific row from 'mat'
    }

    labels[simlist]
}
#==================================================================

I have a few confusions regarding this source code.

Why is states <- 1:length(init) used to generate states? What if states are like S ={-1, 0, 1, 2,...}?


Solution

  • Names of the states don't really need to have any statistical meaning as long as they are different. So, while simulating transitions between states, it's perfectly fine to choose states <- 1:length(init) or any other names for them. Ultimately, though, for practical purposes we often have in mind some labels in mind, such as -1, 0, ..., n, as in your example. You can provide those names as the labels parameter and then labels[simlist] will rename 1:length(init) to labels, element by element. I.e., if initially we had c(1, 2, 3) and you provided labels as c(5, 10, 12), then the output will be in terms of the latter vector. For instance,

    (states <- sample(1:3, 10, replace = TRUE))
    # [1] 1 3 3 2 2 1 2 1 3 3
    labels <- c(5, 10, 12)
    labels[states]
    # [1]  5 12 12 10 10  5 10  5 12 12