r sequence missing-data traminer

How to prevent TraMineR state distribution plot (seqdplot) from removing missing states


I am analysing some sequence data and wish to be able to see missing states within all of my sequence plots. However, I have noticed that TraMineR's state distribution plot function seqdplot automatically removes missing sequence states. I have included a reproducible example below. As you can see, the missing data is visible in the plot and legend of the sequence index plot seqIplot. However, it is automatically removed from the state distribution plot seqdplot.


How do I stop seqdplot from removing these missing values?


Create & Format Data

# Import required libraries
library(TraMineR)
library(tidyverse)

# Set seed for reproducibility
set.seed(123)

# Read in TraMineR sample data
data(mvad)

# For loop which generates missing data within the sequences
for (col in 17:86) {
  mvad[sample(1:nrow(mvad),(round(nrow(mvad)*0.1))),col] <- NA
}

# Create sequence object
mvad.seq <- seqdef(mvad[, 17:86])

Sequence Index Plot (missing data visible)

# Create sequence index plot
seqIplot(mvad.seq, sortv = "from.start", with.legend = "right")



State Distribution Plot (missing data removed)

# Create state distribution plot
seqdplot(mvad.seq, sortv = "from.start", with.legend = "right")



Solution

  • To display missing values, pass with.missing=TRUE to seqdplot (border=NA only suppresses the bar borders):

    seqdplot(mvad.seq, sortv = "from.start", with.legend = "right",
             with.missing=TRUE, border=NA)
    

    By default, seqdef treats missing values at the right end of a sequence as voids, i.e., it assumes each sequence ends at its last valid state. If you also want to treat (and display) these right-hand missing values as missing tokens, set right=NA in the seqdef call (the default is right="DEL"):

    mvad.seq <- seqdef(mvad[, 17:86], right=NA)
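To check numerically what the plot should show, the distribution behind seqdplot can be tabulated with seqstatd, which also accepts with.missing. The following is a minimal sketch that reuses the question's data preparation; it assumes the missing state appears as its own row of the frequency table, labelled with the sequence object's "nr" code ("*" by default). The names n_missing, dist, nr_code and missing_share are illustrative only.

```r
library(TraMineR)

set.seed(123)
data(mvad)

# Blank out roughly 10% of the cases in each monthly state column
n_missing <- round(nrow(mvad) * 0.1)
for (col in 17:86) {
  mvad[sample(nrow(mvad), n_missing), col] <- NA
}

# right = NA keeps trailing missing values as missing states instead of voids
mvad.seq <- seqdef(mvad[, 17:86], right = NA)

# Cross-sectional state distribution with the missing state included
dist <- seqstatd(mvad.seq, with.missing = TRUE)

# Assumption: the missing state is a row named after the "nr" attribute
nr_code <- attr(mvad.seq, "nr")
missing_share <- dist$Frequencies[nr_code, ]

# Share of missing states at the first few time points
print(round(head(missing_share), 3))
```

If the missing share is positive at every time point here, the same band should be visible in seqdplot(mvad.seq, with.missing=TRUE) built from this sequence object.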