I'd be grateful if someone could tell me why the following is happening and how to correct it.
I'm using the expss package to create a table as follows:
library(expss)

table <- dta %>%
    tab_cells(dta[["x"]]) %>%
    tab_rows(factor(dta[["y"]], ordered=TRUE)) %>%
    tab_weight(dta[["weight"]]) %>%
    tab_stat_cpct(total_statistic = "w_cpct") %>%
    tab_pivot() %>%
    split_columns()
I used factor(dta[["y"]], ordered=TRUE) so that the rows appear in factor-level order in the table. This has worked with my other variables, but not with this one.
If I enter only factor(dta[["y"]], ordered=TRUE) into the console, the levels are returned in the correct order:
Levels: 537 < 564 < 650 < 1010
However, when I build the table with the code above, the rows are ordered as follows:
1010 537 564 650
What can I do so that it's in the correct order?
This is a sample dataset to re-create the problem:
dta <- data.frame(x = c(1,1,1,2,1,1,1,1,1,1,1,2,1,2,2,2,1,1,2,2),
                  y = c(1010,650,650,537,650,650,650,650,564,650,650,650,564,564,564,564,650,650,564,564),
                  weight = c(42.066290,3.126177,3.808385,4.812877,8.093253,1.559941,6.168395,2.419531,3.937412,4.293246,20.445602,16.504405,1.314727,2.474295,2.274015,2.668155,3.864480,2.521209,2.605202,2.194348))
Thanks a lot in advance!
Yes, it's a bug in expss. You can use a sorting workaround, which reorders the table according to the numeric values of the row labels:
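sort_workaround = function(tbl){
    # split the row labels (first column of the table) into their separate parts
    separated_labels = as.data.frame(split_labels(tbl[[1]], remove_repeated = FALSE))
    # convert number-like parts to numeric so they sort by value, not as text
    # [,-ncol(separated_labels)] to keep total position
    separated_labels = type.convert(separated_labels, as.is = TRUE)[,-ncol(separated_labels)]
    # use every remaining label column as a sort key, in order
    new_order = do.call(order, separated_labels)
    tbl[new_order, ]
}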
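The ordering step inside the workaround can be tried on its own in base R, without expss. The sketch below uses a made-up data frame, labels_df, as a hypothetical stand-in for the split row labels (it is not what split_labels actually returns for your table). It only shows how type.convert and do.call(order, ...) combine to sort by numeric value.

```r
# Hypothetical stand-in for split row labels: every column starts as text
labels_df <- data.frame(
    group = c("y", "y", "y", "y"),
    value = c("1010", "537", "564", "650"),
    stringsAsFactors = FALSE
)

# Number-like columns become numeric; as.is = TRUE keeps real text as character
converted <- type.convert(labels_df, as.is = TRUE)

# Each column of the data frame is passed to order() as a separate sort key
new_order <- do.call(order, converted)

# Rows rearranged by group first, then by the numeric value
converted[new_order, ]
```

Because a data frame is a list of columns, do.call(order, converted) is equivalent to calling order(converted$group, converted$value), so it works for any number of label columns.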
table <- dta %>%
    tab_cells(x) %>%
    tab_rows(factor(y, ordered=TRUE)) %>%
    tab_weight(weight) %>%
    tab_stat_cpct(total_statistic = "w_cpct") %>%
    tab_pivot() %>%
    sort_workaround() %>%
    split_columns()
table
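As for why the rows came out as 1010 537 564 650: that is the order you get when the labels are compared as text rather than as numbers, since "1" sorts before "5". A minimal base R sketch of the difference follows (the variable names are only for illustration, and it shows the symptom only, not where in expss it arises):

```r
# The four labels from the question, stored as text
lab <- c("537", "564", "650", "1010")

# Text sort compares character by character, so "1010" lands first
char_sorted <- sort(lab)

# Numeric sort compares the values themselves
num_sorted <- sort(as.numeric(lab))

print(char_sorted)
print(num_sorted)
```

This is also why the workaround converts the label parts with type.convert before ordering them.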