I am analysing some sequence data and wish to be able to see missing states within all of my sequence plots. However, I have noticed that TraMineR's state distribution plot function seqdplot automatically removes missing sequence states. I have included a reproducible example below. As you can see, the missing data is visible in the plot and legend of the sequence index plot seqIplot. However, it is automatically removed from the state distribution plot seqdplot.
How do I stop seqdplot from removing these missing values?
Create & Format Data
# Import required libraries
library(TraMineR)
library(tidyverse)
# Set seed for reproducibility
set.seed(123)
# Read in TraMineR sample data
data(mvad)
# For each sequence column, set a random 10% of the rows to NA
for (col in 17:86) {
  mvad[sample(1:nrow(mvad), round(nrow(mvad) * 0.1)), col] <- NA
}
# Create sequence object
mvad.seq <- seqdef(mvad[, 17:86])
Sequence Index Plot (missing data visible)
# Create sequence index plot
seqIplot(mvad.seq, sortv = "from.start", with.legend = "right")
State Distribution Plot (missing data removed)
# Create state distribution plot
seqdplot(mvad.seq, sortv = "from.start", with.legend = "right")
To display missing values, simply use the argument with.missing=TRUE:
seqdplot(mvad.seq, sortv = "from.start", with.legend = "right",
with.missing=TRUE, border=NA)
By default, seqdef sets right missings as voids, i.e., it assumes sequences end at the last valid state. If you also want to treat (and display) right missings as missing tokens, set right=NA in the seqdef command (it is right="DEL" by default):
mvad.seq <- seqdef(mvad[, 17:86], right=NA)
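To see the difference between the two right settings in numbers rather than in a plot, here is a minimal sketch on a small made-up data set. The toy data frame and the object names (toy, seq.del, seq.na, sd.del, sd.na) are hypothetical and only for illustration. It assumes the default missing-state label "*" and uses seqstatd, which as far as I know computes the same cross-sectional distributions that seqdplot draws.

```r
library(TraMineR)

# Hypothetical toy data: three sequences of length 4.
# Row 2 has an internal gap at t2; row 3 has missings only at the end.
toy <- data.frame(
  t1 = c("A", "A", "B"),
  t2 = c("A", NA,  "B"),
  t3 = c("B", "B", NA),
  t4 = c("B", "B", NA),
  stringsAsFactors = FALSE
)

# Default: right missings are dropped (treated as void)
seq.del <- seqdef(toy, right = "DEL")
# Right missings kept as regular missing states
seq.na  <- seqdef(toy, right = NA)

# State distributions per position, with the missing state included
sd.del <- seqstatd(seq.del, with.missing = TRUE)
sd.na  <- seqstatd(seq.na,  with.missing = TRUE)

# Share of the missing state ("*" is the assumed default label) at each position
print(sd.del$Frequencies["*", ])
print(sd.na$Frequencies["*", ])
```

The internal gap at t2 should be counted as missing in both cases, while the trailing NAs of the third sequence should only show up as missing when right=NA is used.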