I'm very new to TraMineR, but am trying to examine sequences of modes that patients used for clinical visits over time.
My data looks like this after setting it up for conversion from SPELL to STS format. You can see here that the begin and end variables are integers.
> df %>% head(20)
# A tibble: 20 x 6
id index begin end status status_1
<int> <int> <int> <int> <fct> <int>
1 1 1 1 1 Video 3
2 1 2 2 2 Video 3
3 2 1 1 1 Phone 2
4 2 2 2 2 Phone 2
5 2 3 3 3 Phone 2
6 3 1 1 1 Video 3
7 4 1 1 1 Video 3
8 5 1 1 1 Phone 2
9 6 1 1 1 Video 3
10 6 2 2 2 Video 3
11 6 3 3 3 Video 3
12 6 4 4 4 Video 3
13 6 5 5 5 Video 3
14 7 1 1 1 Phone 2
15 7 2 2 2 Phone 2
16 8 1 1 1 Video 3
17 9 1 1 1 Phone 2
18 10 1 1 1 Phone 2
19 10 2 2 2 Phone 2
20 10 3 3 3 InPerson 1
With a quick look using skim() from skimr, we can also see the variable types and that there is no missing data.
> df %>% skimr::skim()
-- Data Summary ------------------------
Values
Name Piped data
Number of rows 4530
Number of columns 6
_______________________
Column type frequency:
factor 1
numeric 5
________________________
Group variables None
-- Variable type: factor ----------------------------------------------------------------------------------------------------------------------------------------------
# A tibble: 1 x 6
skim_variable n_missing complete_rate ordered n_unique top_counts
* <chr> <int> <dbl> <lgl> <int> <chr>
1 status 0 1 FALSE 3 Pho: 2496, Vid: 1864, InP: 170
-- Variable type: numeric ---------------------------------------------------------------------------------------------------------------------------------------------
# A tibble: 5 x 11
skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
* <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 id 0 1 1203. 702. 1 584 1207 1817 2426 ▇▇▇▇▇
2 index 0 1 1.86 1.24 1 1 1 2 11 ▇▁▁▁▁
3 begin 0 1 1.86 1.24 1 1 1 2 11 ▇▁▁▁▁
4 end 0 1 1.86 1.24 1 1 1 2 11 ▇▁▁▁▁
5 status_1 0 1 2.37 0.556 1 2 2 3 3 ▁▁▇▁▆
However, when I attempt to use seqformat to convert my data from SPELL to state sequences using this code:
df_sts <- seqformat(df, id = "id", begin = "begin", end = "end", status = "status_1", from = "SPELL", to = "STS", process = FALSE)
I get this error:
Error in is.wholenumber(c(endcolumn, begincolumn)) :
'list' object cannot be coerced to type 'integer'
I'm trying to follow the steps outlined in the TraMineR User Guide, but I'm really not sure where this error is coming from, since both the begin and end variables are integers. Can someone help me understand what the issue is here and how to resolve it?
TraMineR does not seem to play well with tibbles. Converting the data to a plain data frame should do the trick.
df2 <- as.data.frame(df)
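With the data converted, the seqformat call from the question should then run unchanged. Here is a minimal sketch on a made-up six-row SPELL data set that only mimics the first rows shown in the question; the object names spell_tbl, spell_df and sts are illustrative, and the arguments are the ones used in the question:

```r
library(tibble)
library(TraMineR)

# Hypothetical SPELL data (a tibble), modelled on the head of the question's data
spell_tbl <- tibble(
  id       = c(1L, 1L, 2L, 2L, 2L, 3L),
  begin    = c(1L, 2L, 1L, 2L, 3L, 1L),
  end      = c(1L, 2L, 1L, 2L, 3L, 1L),
  status_1 = c(3L, 3L, 2L, 2L, 2L, 3L)
)

# Drop the tibble class before handing the data to TraMineR
spell_df <- as.data.frame(spell_tbl)

# Same call as in the question, now on a plain data.frame
sts <- seqformat(spell_df, id = "id", begin = "begin", end = "end",
                 status = "status_1", from = "SPELL", to = "STS",
                 process = FALSE)
sts
```

The result should contain one row per id, with one column per time position.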
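I checked the TraMineR code and found the cause of the error. Tibbles and data frames behave differently when a single column is extracted with df[, j]. With a data.frame we obtain a vector, whereas with a tibble the extracted column is still a tibble. Combining two such one-column tibbles with c() therefore yields a list rather than a vector, and the whole-number check inside seqformat cannot coerce that list to integer.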
library(tidyverse)
df <- tribble(
~id, ~index, ~begin, ~end, ~status, ~status_1,
1, 1, 1, 1, "Video", 3,
1, 2, 2, 2, "Video", 3,
2, 1, 1, 1, "Phone", 2,
2, 2, 2, 2, "Phone", 2,
2, 3, 3, 3, "Phone", 2,
3, 1, 1, 1, "Video", 3,
4, 1, 1, 1, "Video", 3,
5, 1, 1, 1, "Phone", 2,
6, 1, 1, 1, "Video", 3,
6, 2, 2, 2, "Video", 3,
6, 3, 3, 3, "Video", 3,
6, 4, 4, 4, "Video", 3,
6, 5, 5, 5, "Video", 3) |>
mutate(status = factor(status))
# Plain data.frame copy of the example data, used for comparison below
df2 <- as.data.frame(df)
# Subsetting a tibble
c(df[,3],df[,4])
$begin
[1] 1 2 1 2 3 1 1 1 1 2 3 4 5
$end
[1] 1 2 1 2 3 1 1 1 1 2 3 4 5
# Subsetting a data.frame
c(df2[,3],df2[,4])
[1] 1 2 1 2 3 1 1 1 1 2 3 4 5 1 2 1 2 3 1 1 1 1 2 3 4 5
is.wholenumber <- function(x){as.integer(x) == x}
# tibble --> error
all(is.wholenumber(c(df[,3],df[,4])))
Error in is.wholenumber(c(df[, 3], df[, 4])) :
'list' object cannot be coerced to type 'integer'
# data.frame -> works as expected
all(is.wholenumber(c(df2[,3],df2[,4])))
[1] TRUE
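If you ever need to pull columns out of a tibble yourself, extracting with [[ ]] (or $) sidesteps the difference, because it returns a plain vector for both classes. A small sketch, using made-up objects tb and dfb rather than the data above:

```r
library(tibble)

# Hypothetical toy data: the same columns as a tibble and as a data.frame
tb  <- tibble(begin = c(1, 2, 3), end = c(1, 2, 4))
dfb <- as.data.frame(tb)

# Double-bracket extraction returns an atomic vector in both cases
v_tb <- c(tb[["begin"]], tb[["end"]])
v_df <- c(dfb[["begin"]], dfb[["end"]])
identical(v_tb, v_df)

# Single-bracket extraction is where the two classes diverge
is.list(c(tb[, 1], tb[, 2]))    # columns stay tibbles, so c() builds a list
is.list(c(dfb[, 1], dfb[, 2]))  # columns are dropped to vectors

# The whole-number check from above now works on the tibble columns too
is.wholenumber <- function(x) as.integer(x) == x
all(is.wholenumber(v_tb))
```

This does not change what seqformat does internally, so converting with as.data.frame() remains the fix for the error in the question.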