Given a matrix (mat1) like this:
mat1 <- matrix(c(1, "", 2, 3, 4, "", 2, 4, "", 5, 2, 1, 4, "", 3, 2, "", 3, "", ""), nrow = 4, ncol = 5)
How would I go about finding, say, the top 3 rows with the most non-empty string values? For example, in mat1, row 1 has 3 values, row 2 has 2 values, row 3 has 4 values, and row 4 has 4 values.
Is there a way to tabulate this in a frequency table of some sort, or at least return a vector of the top rows?
If we create a function, we can convert the matrix to 'long' format with as.data.frame.table, subset
out the blank elements, and tabulate the row-name column (Var1) to get the number of non-blank values per row
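f1 <- function(mat, n) {
  # label the rows 1..nrow so the counts can be traced back to row indices
  row.names(mat) <- seq_len(nrow(mat))
  # long format: Var1 = row, Var2 = column, Freq = cell value; drop blank cells
  long <- subset(as.data.frame.table(mat), Freq != "")
  # count the remaining cells per row and keep the n largest counts
  head(sort(table(long$Var1), decreasing = TRUE), n)
}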
f1(mat1, 3)
# 3 4 1
# 4 4 3
The output shown is a named vector (a table object) whose names are the row indices (or row names) and whose values are the counts of non-blank entries. The n argument specified by the user controls how many of the top non-blank rows are returned
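As a possible alternative, the same counts can probably be obtained more directly by comparing the matrix against "" and summing the resulting logical matrix by row. The sketch below assumes blanks are always empty strings (not NA); top_rows is a hypothetical helper name used only for illustration.

```r
# Rebuild the example matrix from the question (a character matrix).
mat1 <- matrix(c(1, "", 2, 3, 4, "", 2, 4, "", 5, 2, 1, 4, "", 3, 2, "", 3, "", ""),
               nrow = 4, ncol = 5)

# top_rows is a hypothetical helper, not part of the answer above.
top_rows <- function(mat, n) {
  # mat != "" is a logical matrix; rowSums counts the TRUEs in each row.
  counts <- rowSums(mat != "")
  # Name the counts by row index so the result can be read like f1's output.
  names(counts) <- seq_len(nrow(mat))
  # order() leaves ties in their original order, so tied rows stay in row order.
  head(counts[order(counts, decreasing = TRUE)], n)
}

rowSums(mat1 != "")   # non-blank count for every row
top_rows(mat1, 3)     # named vector for the top 3 rows
```

For mat1 this should pick rows 3, 4 and 1 with counts 4, 4 and 3, the same rows f1 returns. It yields a plain named numeric vector rather than a table, and it skips the conversion to long format, which may matter for large matrices.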