rcol

Get colnames as a vector for columns where colSums is > 0


I'm trying to get the names of the columns whose totals are > 0. I have tried

df[sapply(df, function(x) any(x > 0))]

but this gives me the actual numeric values rather than the column names. I then tried

names(df)[colSums(df)>0]

but received NULL. Here is an example of my df, for which I would expect the output "SRR960396" and "SRR960403".

> dput(df)
structure(c(14, 0, 0, 0, 0, 0, 0, 0, 0, 13, 0, 0), dim = 4:3, dimnames = list(
    c("27b1da20c7a5364614a540521d5e38c0", "bd60888ac07bff9651845c6f6aca4fa8", 
    "e7474b1fd688aa54814538b69a56d202", "85907e71eb899b7edce9bf8d23746413"
    ), c("SRR960396", "SRR960402", "SRR960403")))

Solution

  • You should be aware that df is NOT a data.frame but a matrix, i.e.,

    > class(df)
    [1] "matrix" "array"
    

    so colnames(df), rather than names(df), should be used to fetch the column names. For a matrix like this one, names(df) returns NULL, which is why your second attempt failed.


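    A workaround is to combine colSums and which to locate the columns that have positive values. Note that colSums(df > 0) counts the positive entries in each column, so this selects columns with at least one positive value; for non-negative data that is the same as selecting columns whose totals are > 0, e.g.,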

    > names(which(colSums(df > 0) > 0))
    [1] "SRR960396" "SRR960403"
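
For completeness, here is a minimal sketch that rebuilds the matrix from the question's dput output and compares the answer's approach with the question's second attempt after swapping names for colnames. The variable names by_total and by_count are just illustrative; both approaches should agree here only because the data contain no negative values.

```r
# Rebuild the example matrix from the question (same values and dimnames).
df <- structure(
  c(14, 0, 0, 0, 0, 0, 0, 0, 0, 13, 0, 0),
  dim = 4:3,
  dimnames = list(
    c("27b1da20c7a5364614a540521d5e38c0", "bd60888ac07bff9651845c6f6aca4fa8",
      "e7474b1fd688aa54814538b69a56d202", "85907e71eb899b7edce9bf8d23746413"),
    c("SRR960396", "SRR960402", "SRR960403")
  )
)

# names() is NULL for this matrix; colnames() returns the column labels.
is.null(names(df))
colnames(df)

# The question's second attempt, with names() replaced by colnames():
# keep the labels of columns whose totals are > 0.
by_total <- colnames(df)[colSums(df) > 0]

# The approach from the answer: columns with at least one positive entry.
by_count <- names(which(colSums(df > 0) > 0))

# With non-negative data the two selections coincide.
identical(by_total, by_count)
```

If the matrix could contain negative values, the two selections can differ, so pick the one that matches what "total > 0" is meant to express.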