How to add per-row counts of each string as new columns in a data matrix in R


Suppose I have a 5 by 5 matrix of fruit names (five kinds of fruit). I want to add five new columns to this matrix, one per fruit, giving the number of times that fruit appears in each row, and finally one extra row giving the total for each fruit. The data matrix looks like this:

     [,1]   [,2]   [,3]    [,4]    [,5]
[1,] mango         banana          mango
[2,] apple  kiwi           banana
[3,]               mango
[4,] mango         apple
[5,]                       orange

I want to get output (a data frame) like this:

     [,1]   [,2]   [,3]    [,4]    [,5]   [apple] [banana] [kiwi] [mango] [orange]
[1,] mango         banana          mango  0       1        0      2       0
[2,] apple  kiwi           banana         1       1        1      0       0
[3,]               mango                  0       0        0      1       0
[4,] mango         apple                  1       0        0      1       0
[5,]                       orange         0       0        0      0       1
[6,]                                      2       2        1      4       1

I have tried grep, but it flattens the whole matrix into a single vector. I do not know how to do this for the whole data matrix in R. Here is my code:

fruits <- matrix(c("mango", "", "banana", "", "mango", "apple", "kiwi", "", "banana", "","", "", "mango", "", "", "mango", "", "apple", "", "", "", "", "", "orange", ""), nrow = 5, ncol = 5, byrow = TRUE)
fruits$apple <- length(grep("apple", fruits[1:nrow(fruits), 1:ncol(fruits)]))
fruits$banana <- length(grep("banana", fruits[1:nrow(fruits), 1:ncol(fruits)]))
fruits$kiwi <- length(grep("kiwi", fruits[1:nrow(fruits), 1:ncol(fruits)]))
fruits$mango <- length(grep("mango", fruits[1:nrow(fruits), 1:ncol(fruits)]))
fruits$orange <- length(grep("orange", fruits[1:nrow(fruits), 1:ncol(fruits)]))

Please help.


Solution

  • One approach is to melt the matrix into long format, cast it back to wide format with per-row counts, and then add a row of sums:

    library(reshape2)
    library(tidyr)
    
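    # melt: transpose first so that each data frame column (V1..V5) is one
    # row of fruits, then gather into long key/value pairs
    g <- gather(as.data.frame(t(fruits)))
    
    # cast wide: with no aggregation function given, dcast counts occurrences;
    # drop the key column and the empty-string column, then bind to the matrix
    d <- cbind(fruits, dcast(g, key~value)[-(1:2)])
    
    # add a row of sums, leaving the five fruit-name columns blank
    rbind(d,c(rep("", 5),colSums(d[-(1:5)])))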
    #       1    2      3      4     5 apple banana kiwi mango orange
    # 1 mango      banana        mango     0      1    0     2      0
    # 2 apple kiwi        banana           1      1    1     0      0
    # 3             mango                  0      0    0     1      0
    # 4 mango       apple                  1      0    0     1      0
    # 5                   orange           0      0    0     0      1
    # 6                                    2      2    1     4      1
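
  • For comparison, the same table can be sketched in base R without extra packages. This is a minimal sketch, assuming exact cell matches are what is wanted; the names `kinds`, `counts`, and `result` are illustrative and do not come from the code above. Unlike `grep`, the comparison `fruits == k` keeps the matrix shape, so `rowSums` can count matches row by row:

```r
# Rebuild the example matrix from the question.
fruits <- matrix(c("mango", "", "banana", "", "mango",
                   "apple", "kiwi", "", "banana", "",
                   "", "", "mango", "", "",
                   "mango", "", "apple", "", "",
                   "", "", "", "orange", ""),
                 nrow = 5, ncol = 5, byrow = TRUE)

# Distinct fruit names, ignoring empty cells (illustrative name: kinds).
kinds <- sort(unique(fruits[fruits != ""]))

# For each fruit, count exact matches in every row.
# sapply() returns one column per fruit, named after the fruit.
counts <- sapply(kinds, function(k) rowSums(fruits == k))

# Append the per-fruit totals as an extra row.
counts <- rbind(counts, colSums(counts))

# Pad the original matrix with a blank row and combine everything
# into one data frame (the name columns become X1..X5).
result <- data.frame(rbind(fruits, ""), counts, stringsAsFactors = FALSE)
result
```

    Here the count columns stay numeric, because the blank padding row is added to the character matrix rather than to the counts.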