I'm currently working on the pathways of inpatients during their hospital stay. These pathways are represented as state sequences (the current medical unit at each time unit), and I'm trying to find typical pathways through clustering algorithms.
I create the distance matrix using the `seqdist` function from the R package `TraMineR`, with the method `"OMspell"`. I've already read the R documentation and the related articles, but I can't find how to set the arguments `tpow` and `expcost`.
As the time unit is an hour, I don't want small differences in duration to have a big impact on the clustering result (unlike a transfer between medical units, for example). But I don't want duration to have no impact at all either...
Also, is there a proper way to choose their values? Or do I just keep experimenting until I find a good configuration? (I'm using the Dunn, Davies-Bouldin and Silhouette criteria to compare the results of hierarchical clustering, in addition to medical opinion on the resulting clusters.)
The parameter `tpow` is the exponent applied to the actual spell lengths (durations) to transform them. The default value is 1, in which case the spell lengths are taken as is. With `tpow=0`, you would just ignore spell durations, and with `tpow=0.5` you would consider the square root of the spell lengths.
The `expcost` parameter is the expansion cost, i.e. the cost of expanding a (transformed) spell length by one unit. In other words, when editing one sequence into the other requires expanding a spell of length `t1` to length `t2`, the cost is `expcost * |t2^tpow - t1^tpow|`. With `expcost=0`, spells in the same state (e.g. AA and AAAAA) would be equivalent whatever their lengths.
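To make the formula concrete, here is a minimal sketch in base R. The helper `expansion_cost` is hypothetical (it is not a TraMineR function); it only evaluates the expression quoted above, and its default `expcost = 0.5` is an assumption chosen for illustration.

```r
# Hypothetical helper, not part of TraMineR: cost of expanding a spell
# from length t1 to length t2, i.e. expcost * |t2^tpow - t1^tpow|.
expansion_cost <- function(t1, t2, tpow = 1, expcost = 0.5) {
  expcost * abs(t2^tpow - t1^tpow)
}

expansion_cost(2, 5)                # raw durations: 0.5 * |5 - 2| = 1.5
expansion_cost(2, 5, expcost = 0)   # 0: AA and AAAAA become equivalent
expansion_cost(2, 5, tpow = 0)      # 0: durations are ignored
expansion_cost(2, 5, tpow = 0.5)    # 0.5 * |sqrt(5) - sqrt(2)|
```

Both `tpow = 0` and `expcost = 0` remove the effect of duration, so the values of interest for hourly data lie strictly between those extremes.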
With `tpow=.5`, for example, increasing the spell length from 1 to 2 costs more than increasing a spell length from 3 to 4. If you do not want to give too much importance to small differences in spell lengths, use a low `expcost`. However, note that `expcost` applies to the transformed spell lengths, so you may want to adjust it when you change the `tpow` value.
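The following sketch illustrates both points: the decreasing marginal cost under `tpow = 0.5`, and why the same `expcost` weighs very differently on the raw and transformed scales. The helper `step_cost`, the toy sequences and the unit names (`ER`, `ICU`, `Ward`) are made up for illustration, and the constant substitution costs and the `expcost = 0.1` value are arbitrary assumptions, not recommendations.

```r
# Hypothetical helper: cost (per unit of expcost) of lengthening a spell
# of length t by one time unit on the transformed scale.
step_cost <- function(t, tpow) abs((t + 1)^tpow - t^tpow)

step_cost(1, 0.5)   # about 0.414: going from 1 to 2 hours
step_cost(3, 0.5)   # about 0.268: going from 3 to 4 hours

# The same 24-hour difference on the raw and on the square-root scale:
abs(48 - 24)              # 24 with tpow = 1
abs(sqrt(48) - sqrt(24))  # about 2.03 with tpow = 0.5

# Toy OMspell call, run only if TraMineR is installed.
if (requireNamespace("TraMineR", quietly = TRUE)) {
  library(TraMineR)
  toy <- data.frame(rbind(
    c("ER", "ER", "ICU", "ICU", "ICU", "Ward"),
    c("ER", "ICU", "ICU", "ICU", "ICU", "Ward"),
    c("ER", "ER", "Ward", "Ward", "Ward", "Ward")
  ))
  seqs <- seqdef(toy)
  # Square-root durations with a low expansion cost (arbitrary values).
  d <- seqdist(seqs, method = "OMspell", sm = "CONSTANT",
               tpow = 0.5, expcost = 0.1)
  print(d)
}
```

Since a given `expcost` weighs much less after the square-root transform, it can help to compare candidate (`tpow`, `expcost`) pairs by the cost they assign to a duration difference you consider meaningful, relative to the substitution cost of a unit transfer.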