[SOLVED] How to find rows with most values filled in a matrix?

Given a matrix (mat1) like this:

mat1 <- matrix(c(1, "", 2, 3, 4, "", 2, 4, "", 5, 2, 1, 4, "", 3, 2, "", 3, "", ""), nrow = 4, ncol = 5)

How would I go about finding, say, the top 3 rows with the most non-empty string values? For example, in mat1, row 1 has 3 values, row 2 has 2 values, row 3 has 4 values, and row 4 has 4 values.

Is there a way to tabulate this in a frequency table of some sort, or at least return a vector of the top rows?

Solution

We can write a function that converts the matrix to 'long' format with as.data.frame.table, drops the blank elements, and then tabulates how often each row name appears among the remaining cells:

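f1 <- function(mat, n) {
  # Label rows by index so the counts can be traced back to them
  rownames(mat) <- seq_len(nrow(mat))
  # Long format: one row per cell (Var1 = row, Var2 = column, Freq = value)
  long <- as.data.frame.table(mat)
  # Drop blank cells, count what remains per row, and keep the top n
  head(sort(table(subset(long, Freq != "")$Var1), decreasing = TRUE), n)
}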

f1(mat1, 3)
#  3 4 1 
#  4 4 3
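
To see what each step inside f1 does, the pipeline can be unrolled on mat1. This is only an illustrative sketch; the intermediate names long, filled, and counts are introduced here and are not part of the original answer.

```r
mat1 <- matrix(c(1, "", 2, 3, 4, "", 2, 4, "", 5, 2, 1, 4, "", 3, 2, "", 3, "", ""),
               nrow = 4, ncol = 5)
rownames(mat1) <- seq_len(nrow(mat1))

# Long format: one row per cell, with columns Var1 (row), Var2 (column), Freq (value)
long <- as.data.frame.table(mat1)

# Keep only the cells that are not the empty string
filled <- subset(long, Freq != "")

# Count how many non-blank cells each matrix row contributes
counts <- table(filled$Var1)
```

Here long has one row for each of the 20 cells, filled keeps the 13 non-blank ones, and counts holds the per-row totals that f1 then sorts and truncates.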

The output of f1(mat1, 3) is a one-dimensional table that behaves like a named vector: the names are the row indices (or row names) and the values are the counts of non-blank entries. The n argument sets how many of the top rows are returned.
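
A shorter base R alternative, not taken from the original answer, is to count non-blank cells directly with rowSums. The helper name top_rows is hypothetical, and the sketch assumes the matrix contains no NA values (with NA present, mat != "" would yield NA counts unless na.rm = TRUE is passed to rowSums).

```r
mat1 <- matrix(c(1, "", 2, 3, 4, "", 2, 4, "", 5, 2, 1, 4, "", 3, 2, "", 3, "", ""),
               nrow = 4, ncol = 5)

# Hypothetical helper: count non-blank cells per row and return the top n
top_rows <- function(mat, n) {
  counts <- rowSums(mat != "")           # TRUE/FALSE matrix summed row by row
  names(counts) <- seq_len(nrow(mat))    # label each count with its row index
  head(sort(counts, decreasing = TRUE), n)
}

res <- top_rows(mat1, 3)

# If only the row indices are needed, take them from the names
top_idx <- as.integer(names(res))
```

For mat1 with n = 3, res contains rows 3 and 4 (four values each) and row 1 (three values), and top_idx holds those row indices as a plain integer vector.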