Tags: r, dataframe, logical-operators, rowwise

Applying a function rowwise efficiently


I have a data frame with multiple columns that each contain information on one diagnosis. The entries are TRUE, FALSE or NA. I want to create a vector that summarizes those columns as follows: if a patient was diagnosed at some point (any TRUE), the result is TRUE; if the only valid entries are FALSE, the result is FALSE; and if there are only missing values, the result is NA. Expressed as code:

data.frame(a= c(FALSE, TRUE, NA, FALSE, TRUE, NA, FALSE, TRUE, NA),
           b= c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, NA, NA, NA),
           expected= c(FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, NA))

I need to go through all the columns rowwise, and I currently do so using split(). Unfortunately, my data is big and this takes a long time. What I do at the moment is:

library(magrittr)
# big example data
df <- expand.grid(c(FALSE, TRUE, NA), c(FALSE, TRUE, NA)) %>%
  .[rep(1:nrow(.), 50000), ] %>%
  as.data.frame() %>%
  setNames(., nm= c("a", "b"))

# My approach
df$res <- df %>%
  split(., 1:nrow(.)) %>%
  lapply(., function(row_i){
    ifelse(all(is.na(row_i)), NA,
           ifelse(any(row_i, na.rm= TRUE), TRUE,
                  ifelse(any(!row_i, na.rm= TRUE), FALSE,
                         row_i)))
  }) %>%
  unlist()

Is there a more efficient way to solve this task?


Solution

  • A vectorized solution using pmax():

    df$result <- as.logical(do.call(\(...) pmax(..., na.rm = TRUE), df[1:2]))
    
    df
    #       a     b expected result
    # 1 FALSE FALSE    FALSE  FALSE
    # 2  TRUE FALSE     TRUE   TRUE
    # 3    NA FALSE    FALSE  FALSE
    # 4 FALSE  TRUE     TRUE   TRUE
    # 5  TRUE  TRUE     TRUE   TRUE
    # 6    NA  TRUE     TRUE   TRUE
    # 7 FALSE    NA    FALSE  FALSE
    # 8  TRUE    NA     TRUE   TRUE
    # 9    NA    NA       NA     NA
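
    Why this works, as a minimal sketch: pmax() treats FALSE/TRUE as 0/1 and takes the element-wise maximum across the columns, so a single TRUE in a row wins. With na.rm = TRUE, NA values are ignored unless every value in that row is NA, in which case the result stays NA. The block below rebuilds the small example from the question and checks the result against the expected column. It uses function(...) instead of the \(...) shorthand, which requires R 4.1 or later.

    ```r
    # Small example data from the question
    df <- data.frame(
      a = c(FALSE, TRUE, NA, FALSE, TRUE, NA, FALSE, TRUE, NA),
      b = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, NA, NA, NA),
      expected = c(FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, NA)
    )

    # Parallel maximum over columns a and b, ignoring NA where possible;
    # as.logical() converts the result back to TRUE/FALSE/NA
    result <- as.logical(
      do.call(function(...) pmax(..., na.rm = TRUE), df[c("a", "b")])
    )

    # Passes silently if the result matches the expected column exactly
    stopifnot(identical(result, df$expected))
    ```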
    

    You can also pass na.rm together with the columns as a single argument list, which avoids the anonymous function in do.call(). Here it is wrapped in a function rowAnys() to complement rowSums()/rowMeans() in base R.

    rowAnys <- function(x) {
      as.logical(do.call(pmax, c(na.rm = TRUE, x)))
    }
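
    To get a rough feel for the speed difference, the sketch below times rowAnys() against a row-by-row reference on a reduced version of the question's example data. The helper slow() and the repeat count of 2000 are assumptions made for illustration. slow() is a simplified stand-in written in the spirit of the question's split()/lapply() approach, not a copy of it. Timings depend on the machine, so none are shown.

    ```r
    rowAnys <- function(x) {
      as.logical(do.call(pmax, c(na.rm = TRUE, x)))
    }

    # Reduced example data: 9 combinations repeated 2000 times (18000 rows)
    grid <- expand.grid(a = c(FALSE, TRUE, NA), b = c(FALSE, TRUE, NA))
    big <- grid[rep(seq_len(nrow(grid)), 2000), ]

    # Hypothetical row-by-row reference: one split() piece per row
    slow <- function(d) {
      unlist(lapply(split(d, seq_len(nrow(d))), function(row_i) {
        if (all(is.na(row_i))) NA else any(unlist(row_i), na.rm = TRUE)
      }), use.names = FALSE)
    }

    t_slow <- system.time(res_slow <- slow(big))
    t_fast <- system.time(res_fast <- rowAnys(big))

    # Both approaches should agree on every row
    stopifnot(identical(res_slow, res_fast))

    print(t_slow)
    print(t_fast)
    ```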
    

    Similarly, you can use pmin() to implement a rowwise all().

    rowAlls <- function(x) {
      as.logical(do.call(pmin, c(na.rm = TRUE, x)))
    }
    
    df$any <- rowAnys(df[1:2])
    df$all <- rowAlls(df[1:2])
    
    df
    #       a     b expected   any   all
    # 1 FALSE FALSE    FALSE FALSE FALSE
    # 2  TRUE FALSE     TRUE  TRUE FALSE
    # 3    NA FALSE    FALSE FALSE FALSE
    # 4 FALSE  TRUE     TRUE  TRUE FALSE
    # 5  TRUE  TRUE     TRUE  TRUE  TRUE
    # 6    NA  TRUE     TRUE  TRUE  TRUE
    # 7 FALSE    NA    FALSE FALSE FALSE
    # 8  TRUE    NA     TRUE  TRUE  TRUE
    # 9    NA    NA       NA    NA    NA
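
    Both helpers accept any number of logical columns. As a hedged sanity check, the sketch below compares them with a slow apply()-based reference over every FALSE/TRUE/NA combination of three columns. The column names x1, x2, x3 and the helper ref() are made up for illustration.

    ```r
    rowAnys <- function(x) as.logical(do.call(pmax, c(na.rm = TRUE, x)))
    rowAlls <- function(x) as.logical(do.call(pmin, c(na.rm = TRUE, x)))

    # All 27 combinations of FALSE/TRUE/NA across three hypothetical columns
    grid3 <- expand.grid(x1 = c(FALSE, TRUE, NA),
                         x2 = c(FALSE, TRUE, NA),
                         x3 = c(FALSE, TRUE, NA))

    # Slow reference: NA if the whole row is NA, otherwise f() ignoring NA
    ref <- function(x, f) {
      unname(apply(as.matrix(x), 1, function(r) {
        if (all(is.na(r))) NA else f(r, na.rm = TRUE)
      }))
    }

    # Pass silently if the vectorized helpers match the reference
    stopifnot(identical(rowAnys(grid3), ref(grid3, any)))
    stopifnot(identical(rowAlls(grid3), ref(grid3, all)))
    ```

    Note that c(na.rm = TRUE, x) relies on x being a data frame or list of columns. A matrix would need converting first, for example with as.data.frame().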