I have a dataset with repeated measures over time, in which I am looking for predictors of the maximum tn value. I am not interested in measures which occur after this. The maximum values occur on different days for different patients.
ID day tn hb sofa
1 1 7 85 NA
1 2 NA NA NA
1 3 35 80 13
1 4 28 79 12
2 1 500 NA 12
2 2 280 80 9
2 3 140 90 8
2 4 20 90 7
3 1 60 80 12
3 2 75 75 10
3 3 NA 75 NA
3 4 55 84 7
I can find tn_max:
tn_max <- df %>% group_by(ID) %>% summarise(tn_max = max(tn, na.rm = TRUE))
How can I truncate the dataset after the maximum tn for each patient? I found the code below in a previous similar question, but I can't get it to work. It fails with: Error: unexpected ':' in "N_max = find(df(:"
mod_df = df;
N_max = find(df(:,3) == max(df(:,3)));
N_max(1);
for N=1:size(df,3)
if df(N,1) < N_max
mod_df (N,:)=0;
end
end
mod_data_1(all(mod_data_1==0,1),:) = []
Many thanks, Annemarie
First I would create a function that returns, for any vector, a logical vector of the same length whose elements are TRUE at or before the position of the maximum, and FALSE otherwise:
f <- function(x) seq_along(x) <= which.max(x)
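As a quick check of what such a function returns, here is a minimal sketch applied to the tn values of patients 1 and 2 from the question. It relies on which.max() discarding NA values and returning the position of the first maximum. A patient whose tn values are all NA is not handled here.

```r
# Helper as described above: TRUE up to and including the position of the maximum.
f <- function(x) seq_along(x) <= which.max(x)

# tn values of patient 1 (maximum on day 3, with an NA on day 2).
p1 <- f(c(7, NA, 35, 28))

# tn values of patient 2 (maximum on day 1).
p2 <- f(c(500, 280, 140, 20))

print(p1)
print(p2)
```

The NA on day 2 for patient 1 is kept because it occurs before the maximum.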
Then I would apply this function to each sub-vector of tn defined by the ID:
ind <- as.logical(ave(df$tn, df$ID, FUN=f))
Finally, all I have to do is to take the corresponding subset of the original data-frame:
df[ind, ]
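Putting the steps together on the sample data from the question, a self-contained sketch could look like the following. The data frame is typed in by hand from the table above, and the names truncated and truncated_dplyr are only illustrative. The dplyr variant is an assumption about how the same filter could be written in the style the question already uses, and it only runs if dplyr is installed.

```r
# Sample data copied from the question.
df <- data.frame(
  ID   = rep(1:3, each = 4),
  day  = rep(1:4, times = 3),
  tn   = c(7, NA, 35, 28, 500, 280, 140, 20, 60, 75, NA, 55),
  hb   = c(85, NA, 80, 79, NA, 80, 90, 90, 80, 75, 75, 84),
  sofa = c(NA, NA, 13, 12, 12, 9, 8, 7, 12, 10, NA, 7)
)

# TRUE up to and including the position of the maximum; which.max() skips NA.
f <- function(x) seq_along(x) <= which.max(x)

# ave() applies f within each ID and returns 0/1, hence as.logical().
ind <- as.logical(ave(df$tn, df$ID, FUN = f))
truncated <- df[ind, ]
print(truncated)

# Optional dplyr variant (assumes the dplyr package is installed).
if (requireNamespace("dplyr", quietly = TRUE)) {
  truncated_dplyr <- dplyr::ungroup(
    dplyr::filter(dplyr::group_by(df, ID), seq_along(tn) <= which.max(tn))
  )
}
```

Both versions assume the rows are already sorted by day within each ID, as they are in the sample data.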