I have been stuck on this problem for hours and it's becoming frustrating. I want to arrange some data so that the NAs appear first within a grouping structure. I can get part of the way there, but nothing I have tried produces the desired result.
With this code,
df <- df |>
  group_by(AESOC, AEPT) |>
  arrange(!is.na(AEPT), !is.na(Severity), .by_group = TRUE)
I have been able to achieve what is shown in the image.
But I would still like to arrange further so that rows 9-12 appear before row 1 and rows 25-28 appear before row 13 (i.e., at the very beginning of the groups determined by AESOC and AEPT).
This small data set is included here:
df <- structure(list(AESOC = c("Blood and lymphatic system disorders",
"Blood and lymphatic system disorders", "Blood and lymphatic system disorders",
"Blood and lymphatic system disorders", "Blood and lymphatic system disorders",
"Blood and lymphatic system disorders", "Blood and lymphatic system disorders",
"Blood and lymphatic system disorders", "Blood and lymphatic system disorders",
"Blood and lymphatic system disorders", "Blood and lymphatic system disorders",
"Blood and lymphatic system disorders", "Cardiac disorders",
"Cardiac disorders", "Cardiac disorders", "Cardiac disorders",
"Cardiac disorders", "Cardiac disorders", "Cardiac disorders",
"Cardiac disorders", "Cardiac disorders", "Cardiac disorders",
"Cardiac disorders", "Cardiac disorders", "Cardiac disorders",
"Cardiac disorders", "Cardiac disorders", "Cardiac disorders"
), AEPT = c(" Anaemia", " Anaemia", " Anaemia", " Anaemia",
" Lymphopenia", " Lymphopenia", " Lymphopenia", " Lymphopenia",
NA, NA, NA, NA, " Dizziness", " Dizziness", " Dizziness",
" Dizziness", " Palpitations", " Palpitations", " Palpitations",
" Palpitations", " Presyncope", " Presyncope", " Presyncope",
" Presyncope", NA, NA, NA, NA), Severity = c(" mild",
" moderate", " severe", NA, " mild", " moderate",
" severe", NA, " mild", " moderate", " severe",
NA, " mild", " moderate", " severe", NA,
" mild", " moderate", " severe", NA, " moderate",
" mild", " severe", NA, " moderate", " mild",
" severe", NA)), row.names = c(NA, -28L), class = c("tbl_df",
"tbl", "data.frame"))
Any help would be greatly appreciated.
You can use arrange in the following way:
library(dplyr)
df %>% arrange(AESOC, !is.na(AEPT), AEPT, !is.na(Severity), Severity)
which returns:
   AESOC                                 AEPT          Severity
 1 Blood and lymphatic system disorders  <NA>          <NA>
 2 Blood and lymphatic system disorders  <NA>          mild
 3 Blood and lymphatic system disorders  <NA>          moderate
 4 Blood and lymphatic system disorders  <NA>          severe
 5 Blood and lymphatic system disorders  Anaemia       <NA>
 6 Blood and lymphatic system disorders  Anaemia       mild
 7 Blood and lymphatic system disorders  Anaemia       moderate
 8 Blood and lymphatic system disorders  Anaemia       severe
 9 Blood and lymphatic system disorders  Lymphopenia   <NA>
10 Blood and lymphatic system disorders  Lymphopenia   mild
11 Blood and lymphatic system disorders  Lymphopenia   moderate
12 Blood and lymphatic system disorders  Lymphopenia   severe
13 Cardiac disorders                     <NA>          <NA>
14 Cardiac disorders                     <NA>          mild
15 Cardiac disorders                     <NA>          moderate
16 Cardiac disorders                     <NA>          severe
17 Cardiac disorders                     Dizziness     <NA>
18 Cardiac disorders                     Dizziness     mild
19 Cardiac disorders                     Dizziness     moderate
20 Cardiac disorders                     Dizziness     severe
21 Cardiac disorders                     Palpitations  <NA>
22 Cardiac disorders                     Palpitations  mild
23 Cardiac disorders                     Palpitations  moderate
24 Cardiac disorders                     Palpitations  severe
25 Cardiac disorders                     Presyncope    <NA>
26 Cardiac disorders                     Presyncope    mild
27 Cardiac disorders                     Presyncope    moderate
28 Cardiac disorders                     Presyncope    severe
and the same in base R:
df[with(df, order(AESOC, !is.na(AEPT), AEPT, !is.na(Severity), Severity)), ]
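As for why this works: !is.na(x) is FALSE for missing values and TRUE otherwise, and FALSE sorts before TRUE, so using it as a sort key ahead of the column itself pulls the NAs to the front. Here is a minimal sketch on a made-up vector (x, idx and sorted are just illustrative names, not part of the question's data):

```r
# Toy vector for illustration only; not taken from the question's data
x <- c("b", NA, "a", NA, "c")

# First key: !is.na(x) is FALSE for the NAs, and FALSE sorts before TRUE,
# so the missing values come first.
# Second key: x itself orders the remaining non-missing values.
idx <- order(!is.na(x), x)
sorted <- x[idx]

print(sorted)
```

This is probably also why the grouped attempt in the question did not get there: with .by_group = TRUE, arrange sorts by the grouping columns (AESOC, AEPT) first, and dplyr places missing values at the end, so the NA rows in AEPT land at the bottom of each AESOC block before the !is.na() keys are considered.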