rmaxminnainf

Min and Max across multiple columns with NAs


For the following sample data dat, is there a way to calculate min and max while handling NAs. My input is:

dat <- read.table(text = "ID  Name   PM      TP2   Sigma
                          1   Tim    1       2    3
                          2   Sam    0       NA   1
                          3   Pam    2       1    NA
                          4   Ali    1       0    2
                          NA  NA     NA      NA   NA
                          6   Tim    2       0    7", header = TRUE)

My required output is:

ID  Name  PM      TP2   Sigma  Min  Max
1    Tim  1       2    3       1    3
2    Sam  0       NA   1       0    1
3    Pam  2       1    NA      1    2
4    Ali  1       0    2       0    2
NA   NA   NA      NA   NA      NA   NA
6    Tim  2       0    7       0    7

My Effort

1- I have seen similar posts but none of them has discussed issues where all entries in a column were NAs e.g., Get the min of two columns Based on this, I have tried pmin() and pmax(), but they do not work for me.

2- Another similar question is minimum (or maximum) value of each row across multiple columns. Again, there is no need to handle NAs.

3- Lastly, this question minimum (or maximum) value of each row across multiple columns talks about NA but not all elements in a column have missing values.

4- Also, some of the solutions require that the columns list to be included to be excluded is typed manually, my original data is quite wide, I want to have an easier solution where I can express columns by numbers rather than names.

Partial Solution

I have tried the following solution but Min column ends up having Inf and the Max column ends up having -Inf.

dat$min = apply(dat[,c(2:4)], 1, min, na.rm = TRUE)
dat$max = apply(dat[,c(2:4)], 1, max, na.rm = TRUE)

I can manually get rid of Inf by using something like:

dat$min[is.infinite(dat$min)] = NA

But I was wondering if there is a better way of achieving my desired outcome? Any advice would be greatly appreciated.

Thank you for your time.


Solution

  • You can use hablar's min_ and max_ function which returns NA if all values are NA.

    library(dplyr)
    library(hablar)
    
    dat %>%
      rowwise() %>%
      mutate(min = min_(c_across(-ID)), 
             max = max_(c_across(-ID)))
    

    You can also use this with apply -

    cbind(dat, t(apply(dat[-1], 1, function(x) c(min = min_(x), max = max_(x)))))
    
    #  ID PM TP2 Sigma min max
    #1  1  1   2     3   1   3
    #2  2  0  NA     1   0   1
    #3  3  2   1    NA   1   2
    #4  4  1   0     2   0   2
    #5 NA NA  NA    NA  NA  NA
    #6  5  2   0     7   0   7