Count misspelled words in R


Row<-c(1,2,3,4,5)
Content<-c("I love cheese", "whre is the fish", "Final Countdow", "show me your s", "where is what")
Data<-cbind(Row, Content)
View(Data)

I want to create a function that tells me how many words are misspelled in each row.

An intermediate step would be to have it look like this:

Row<-c(1,2,3,4,5)
Content<-c("I love cheese", "whre is the fs", "Final Countdow", "show me your s", "where is     what")
MisspelledWords<-c(NA, "whre, fs", "Countdow","s",NA)
Data<-cbind(Row, Content,MisspelledWords)

I know that I have to use aspell, but I'm having trouble running aspell on individual rows rather than on the whole file. In the end, I want to count how many words are misspelled in each row. For this I would use the code from: Count the number of words in a string in R?


Solution

  • Inspired by this article, here's an attempt using which_misspelled and check_spelling from the qdap package.

    library(qdap)
    
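    # which_misspelled: check each string separately and count the flagged words
    # (length() is 0 when nothing is flagged)
    n_misspelled <- sapply(Content, function(x) {
      length(which_misspelled(x, suggest = FALSE))
    })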
    
    data.frame(Content, n_misspelled, row.names = NULL)
    #             Content n_misspelled
    # 1     I love cheese            0
    # 2    whre is the fs            2
    # 3    Final Countdow            1
    # 4    show me your s            0
    # 5 where is     what            0
    
    
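    # check_spelling: check all strings at once; df$row gives the row of each flagged word
    df <- check_spelling(Content, n.suggest = 0)
    
    # factor(..., levels = Row) keeps rows without misspellings, so they are counted as 0
    n_misspelled <- as.vector(table(factor(df$row, levels = Row)))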
    
    data.frame(Content, n_misspelled)
    #             Content n_misspelled
    # 1     I love cheese            0
    # 2    whre is the fs            2
    # 3    Final Countdow            1
    # 4    show me your s            0
    # 5 where is     what            0
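
The intermediate step from the question (a MisspelledWords column next to the count) can be sketched in base R as follows. This is only an illustration of the per-row technique, not the qdap approach above: `toy_dictionary` and `find_misspelled` are hypothetical names, and the small word list stands in for a real spell checker. The toy dictionary deliberately leaves out "s", so row 4 is flagged as in the question's table, unlike the qdap output above. Punctuation handling is not covered.

```r
# Example data from the question
Row <- 1:5
Content <- c("I love cheese", "whre is the fs", "Final Countdow",
             "show me your s", "where is     what")

# Hypothetical toy dictionary; a real spell checker would replace this lookup
toy_dictionary <- c("i", "love", "cheese", "is", "the", "final", "countdown",
                    "show", "me", "your", "where", "what")

# Return the words of one string that are not in the dictionary
find_misspelled <- function(text, dictionary) {
  # Split on runs of whitespace so repeated spaces do not create empty words
  words <- strsplit(trimws(text), "\\s+")[[1]]
  words[!tolower(words) %in% dictionary]
}

# One character vector of flagged words per row
misspelled <- lapply(Content, find_misspelled, dictionary = toy_dictionary)

Data <- data.frame(
  Row = Row,
  Content = Content,
  # Comma-separated misspelled words, or NA when the row is clean
  MisspelledWords = vapply(misspelled, function(w) {
    if (length(w) == 0) NA_character_ else paste(w, collapse = ", ")
  }, character(1)),
  # Number of flagged words per row
  n_misspelled = lengths(misspelled),
  stringsAsFactors = FALSE
)

print(Data)
```

Keeping the flagged words as a list with one element per row makes both the MisspelledWords column and the count fall out of the same object. With qdap installed, the dictionary lookup inside `find_misspelled` could presumably be swapped for a call to which_misspelled.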