r character matching map-matching

pmatch returning character match in wrong way


I have a 3 by 3 character matrix of words and a dictionary (n by 1). I want to replace each word that matches a dictionary entry with that entry's position in the dictionary, and turn every unmatched word into NA (or NULL). I tried a simple pmatch, but it does not give what I expect. I am looking for a result like this:

1    5    2
1    4    NA
3    NA   2

but I get this result instead: [1] 1 NA 3 5 4 NA 2 NA NA

I tried the code below:

a <- matrix(c("456_rr", "432_uu", "522_kk", "456_rr", "432_tt", "522", "456_ss", "432", "522_kk"), nrow = 3, ncol = 3, byrow = TRUE) # WordTable

b <- matrix(c("456_rr","522_kk","456_ss", "432_tt", "432_uu"), nrow = 5, ncol = 1) # Dictionary

c <- pmatch(a, b)

print(c)

Solution

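  • The nrow in 'b' was not correct; it should be 5. Also, pmatch returns a plain vector, so the dimensions of 'a' have to be assigned back to the result.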

    b <- matrix(c("456_rr","522_kk","456_ss", "432_tt", 
         "432_uu"), nrow = 5, ncol = 1) 
    m1 <- pmatch(a, b)
    m1
    #[1]  1 NA  3  5  4 NA  2 NA NA
    
    dim(m1) <- dim(a)
    m1
    #     [,1] [,2] [,3]
    #[1,]    1    5    2
    #[2,]   NA    4   NA
    #[3,]    3   NA   NA
    

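    By default each dictionary entry can be matched only once, which is why repeated words come back as NA. If we need the index for repeated words as well, use duplicates.ok = TRUE (the default is FALSE):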

    `dim<-`(pmatch(a, b, duplicates.ok=TRUE), dim(a))
    #     [,1] [,2] [,3]
    #[1,]    1    5    2
    #[2,]    1    4    2
    #[3,]    3   NA    2
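
Note that the output above still differs from the result asked for in the question at position [2,3]: pmatch partially matches "522" to "522_kk". If only exact matches are wanted (an assumption based on the expected output shown in the question), base R's match() may be a closer fit. A minimal sketch, reusing 'a' and 'b' from the question; the name m2 is only illustrative:

```r
# Same data as in the question
a <- matrix(c("456_rr", "432_uu", "522_kk",
              "456_rr", "432_tt", "522",
              "456_ss", "432",    "522_kk"),
            nrow = 3, ncol = 3, byrow = TRUE)   # word table
b <- matrix(c("456_rr", "522_kk", "456_ss", "432_tt", "432_uu"),
            nrow = 5, ncol = 1)                 # dictionary

# match() does exact matching only, and a dictionary entry
# can be returned for more than one word
m2 <- match(a, b)

# Like pmatch(), match() returns a plain vector, so restore the shape of 'a'
dim(m2) <- dim(a)
m2
#      [,1] [,2] [,3]
# [1,]    1    5    2
# [2,]    1    4   NA
# [3,]    3   NA    2
```

Here "522" and "432" stay NA because neither is an exact dictionary entry, while the repeated "456_rr" and "522_kk" both receive their dictionary position.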