I am trying to construct a data frame for multistate analysis in R with the mstate package, using the following code:
tmat <- transMat(x = list( c(2, 10), c(3, 10), c(4, 10), c(5, 10), c(6, 10), c(7, 10), c(8, 10), c(9, 10), c(10), c()),
names = c("start", "aki_1", "rec_1", "aki_2", "rec_2", "aki_3", "rec_3", "aki_4", "rec_4", "death"))
tmat
from  start aki_1 rec_1 aki_2 rec_2 aki_3 rec_3 aki_4 rec_4 death
start    NA     1    NA    NA    NA    NA    NA    NA    NA     2
aki_1    NA    NA     3    NA    NA    NA    NA    NA    NA     4
rec_1    NA    NA    NA     5    NA    NA    NA    NA    NA     6
aki_2    NA    NA    NA    NA     7    NA    NA    NA    NA     8
rec_2    NA    NA    NA    NA    NA     9    NA    NA    NA    10
aki_3    NA    NA    NA    NA    NA    NA    11    NA    NA    12
rec_3    NA    NA    NA    NA    NA    NA    NA    13    NA    14
aki_4    NA    NA    NA    NA    NA    NA    NA    NA    15    16
rec_4    NA    NA    NA    NA    NA    NA    NA    NA    NA    17
death    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
dlong <- msprep(time = c(NA, "aki_1_time", "rec_1_time", "aki_2_time", "rec_2_time",
"aki_3_time", "rec_3_time", "aki_4_time", "rec_4_time", "death_time"),
status = c(NA, "aki_1_status", "rec_1_status", "aki_2_status", "rec_2_status",
"aki_3_status", "rec_3_status", "aki_4_status", "rec_4_status", "death_status"),
data = d, id = "subject", trans = tmat)
However, msprep keeps returning: Error in time[, -startings] : incorrect number of dimensions. I checked, and all variables are present in the dataset and correctly spelled, and there are no missing values. I also believe the transition matrix is specified correctly.
I thought it could have to do with the starting state "start", for which I filled in NA for both time and status, but this is the way it should be done according to the R documentation.
The dataset looks like this:
subject aki_1_status aki_1_time rec_1_status rec_1_time aki_2_status aki_2_time rec_2_status rec_2_time aki_3_status aki_3_time rec_3_status rec_3_time
1 1 0 90.2 0 90.2 0 90.2 0 90.2 0 90.2 0 90.2
2 2 0 90.2 0 90.2 0 90.2 0 90.2 0 90.2 0 90.2
3 4 1 6.1 0 90.2 0 90.2 0 90.2 0 90.2 0 90.2
4 5 1 2.1 1 10.1 0 90.2 0 90.2 0 90.2 0 90.2
5 6 1 3.1 1 11.1 1 31.1 1 47.1 0 90.2 0 90.2
6 8 1 1.1 0 90.2 0 90.2 0 90.2 0 90.2 0 90.2
aki_4_status aki_4_time rec_4_status rec_4_time death_status death_time
1 0 90.2 0 90.2 0 90.2
2 0 90.2 0 90.2 0 90.2
3 0 90.2 0 90.2 1 11.2
4 0 90.2 0 90.2 0 90.2
5 0 90.2 0 90.2 0 90.2
6 0 90.2 0 90.2 1 2.2
Does anybody have a solution for this?
This could be a bug or a problem with your input format. I don't understand the structure of the input data, but I traced the error message and added a line to the function that seems to prevent it from happening, while still giving an output that looks like it matches your input data. So, if you're sure that the input format is correct and the output I got is correct, then it's likely a bug in the package. Otherwise, you'll need to revisit the documentation to make sure you're specifying the data correctly.
If you look at the code for mstate:::msprepEngine, it assumes the time argument is a matrix. However, in the final iteration, when there is only one row, time becomes a vector (representing the last row of the matrix).
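The dimension drop described above can be reproduced in base R without mstate. This is a minimal sketch, assuming the failure comes from R's default drop = TRUE behaviour when a single row is selected; time_demo and its values are made up for illustration and are not taken from the package or from your data.

```r
# A 2 x 3 matrix standing in for the `time` argument (hypothetical values).
time_demo <- matrix(c(1.1, 2.2, 3.3, 4.4, 5.5, 6.6), nrow = 2)

# Selecting a single row drops the dim attribute by default,
# so the result is a plain numeric vector.
one_row <- time_demo[2, ]
is.matrix(one_row)

# Two-index subsetting on a vector raises the same error message
# that msprep reports.
msg <- tryCatch(one_row[, -1], error = function(e) conditionMessage(e))
msg

# Option 1: keep the dimensions at the point of subsetting.
kept <- time_demo[2, , drop = FALSE]
dim(kept)

# Option 2: rebuild a one-row matrix afterwards.
rebuilt <- matrix(one_row, nrow = 1)
dim(rebuilt[, -1, drop = FALSE])
```

Either option keeps the object two-dimensional, so a later time[, -startings] style call no longer fails on a single remaining row.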
I can prevent the error by adding a line to the msprepEngine function that changes time back into a matrix just before the call to Recall, so the end of msprepEngine becomes:
if (!is.matrix(time))
    time <- matrix(time, nrow = 1)
Recall(time = time[, -startings], status = status[, -startings],
       id = id, starttime = newtime, startstate = newstate,
       trans = trans[-startings, -startings],
       originalStates = originalStates[-startings],
       longmat = longmat)
Then the function runs and I get:
> dlong
An object of class 'msdata'
Data:
subject from to trans Tstart Tstop time status
1 1 1 2 1 0.0 90.2 90.2 0
2 1 1 10 2 0.0 90.2 90.2 0
3 2 1 2 1 0.0 90.2 90.2 0
4 2 1 10 2 0.0 90.2 90.2 0
5 4 1 2 1 0.0 6.1 6.1 1
6 4 1 10 2 0.0 6.1 6.1 0
7 4 2 3 3 6.1 11.2 5.1 0
8 4 2 10 4 6.1 11.2 5.1 1
9 5 1 2 1 0.0 2.1 2.1 1
10 5 1 10 2 0.0 2.1 2.1 0
11 5 2 3 3 2.1 10.1 8.0 1
12 5 2 10 4 2.1 10.1 8.0 0
13 5 3 4 5 10.1 90.2 80.1 0
14 5 3 10 6 10.1 90.2 80.1 0
15 6 1 2 1 0.0 3.1 3.1 1
16 6 1 10 2 0.0 3.1 3.1 0
17 6 2 3 3 3.1 11.1 8.0 1
18 6 2 10 4 3.1 11.1 8.0 0
19 6 3 4 5 11.1 31.1 20.0 1
20 6 3 10 6 11.1 31.1 20.0 0
21 8 1 2 1 0.0 1.1 1.1 1
22 8 1 10 2 0.0 1.1 1.1 0
23 8 2 3 3 1.1 2.2 1.1 0
24 8 2 10 4 1.1 2.2 1.1 1
However, I have no idea if this is the correct output! The transitions with status==1 look like they match the transitions in your sample data frame, but I don't understand the rest of the input or output formats (it looks like there are nonsensical 'censored' transitions in there, but they could be OK; I don't know). You could check it yourself or contact the package authors.