It seems that PST cannot predict the conditional probabilities of the next state after contexts that consist of a single state, e.g. EX-EX.
Consider this code:
# Load libraries
library(RCurl)
library(TraMineR)
library(PST)
# Get data
x <- getURL("https://gist.githubusercontent.com/aronlindberg/08228977353bf6dc2edb3ec121f54a29/raw/c2539d06771317c5f4c8d3a2052a73fc485a09c6/challenge_level.csv")
data <- read.csv(text = x)
# Load and transform data
data <- read.table("thread_level.csv", sep = ",", header = F, stringsAsFactors = F)
# Create sequence object
data.seq <- seqdef(data[2:nrow(data),2:ncol(data)], missing = NA, right= NA, nr = "*")
# Make a tree
S1 <- pstree(data.seq, ymin = 0.05, L = 6, lik = TRUE, with.missing = TRUE)
# Mine the context
context <- seqdef("EX-EX")
p_context <- predict(S1.p1, context, decomp = F, output = "prob")
The line context <- seqdef("EX-EX") yields:
[>] 1 distinct states appear in the data:
1 = EX
Error:
[!] alphabet contains only one state
which means that predict() cannot be executed.
How do I predict the conditional probabilities of the next state based on contexts which only have 1 state, which may be repeated multiple times?
This is an issue with seqdef that has been fixed since TraMineR version 1.8-12.
Here is what I get with TraMineR 1.8-13:
> context <- seqdef("EX-EX")
[>] 1 distinct states appear in the data:
1 = EX
[>] state coding:
[alphabet] [label] [long label]
1 EX EX EX
[>] 1 sequences in the data set
[>] min/max sequence length: 2/2
> p_context <- predict(S1, context, decomp = F, output = "prob")
[>] 1 sequence(s) - min/max length: 2/2
[>] max. context length: L=6
[>] found 2 distinct context(s)
[>] total time: 0.019 secs
> p_context
prob
[1] 0.000476372
Note that I replaced your undefined S1.p1 with S1.
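If the error still appears, it may be worth checking which TraMineR version is installed. The following is only a minimal sketch: seqdef_needs_update is a hypothetical helper written for this illustration (it is not part of TraMineR or PST), and it simply compares a version string against the 1.8-12 mentioned above.

```r
# Hypothetical helper (not part of TraMineR): returns TRUE when the given
# version string is older than 1.8-12, the version named above.
seqdef_needs_update <- function(installed, fixed = "1.8-12") {
  package_version(installed) < package_version(fixed)
}

# Compare against the installed TraMineR, if there is one.
if (requireNamespace("TraMineR", quietly = TRUE)) {
  installed <- as.character(packageVersion("TraMineR"))
  if (seqdef_needs_update(installed)) {
    message("TraMineR ", installed, " is older than 1.8-12; ",
            "updating with install.packages(\"TraMineR\") may help.")
  }
}
```

After updating, restart the R session before calling seqdef("EX-EX") again so that the new version is the one actually loaded.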