[SOLVED] dplyr filter using qdap::which_misspelled OR dplyr filter with a nested function

A small data frame:

words <- data.frame(terms = c("qhick brown fox",
          "tom dick harry", 
          "cats dgs"))

If I use qdap::which_misspelled I can find the misspelled words:

> which_misspelled(words)
      1       8 
"qhick"   "dgs"

But what I want is to subset the words data frame to the rows that contain a misspelling. The call above returns positions 1 and 8, which count words across the whole data frame and do not tell me which row each word came from.

How can I subset my data frame to only the rows that contain misspelled words?

(Bonus if it can be done with dplyr's filter.)

Solution

How about using check_spelling instead? It is vectorized, and its result contains a row column holding the row numbers of the misspelled entries, which you can use to subset the data frame:

library(qdap)
words[check_spelling(words$terms)$row, , drop = FALSE]

#            terms
#1 qhick brown fox
#3        cats dgs
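
For the dplyr bonus, the same row numbers can be fed into filter. This is only a sketch: it assumes dplyr is installed, and it sets stringsAsFactors = FALSE so that terms is a plain character column. The names bad_rows and misspelled_rows are made up for illustration.

```r
library(dplyr)
library(qdap)

words <- data.frame(terms = c("qhick brown fox",
                              "tom dick harry",
                              "cats dgs"),
                    stringsAsFactors = FALSE)

# Row numbers that contain at least one misspelled word.
# unique() guards against a row being listed once per misspelled word.
bad_rows <- unique(check_spelling(words$terms)$row)

# Keep only the rows whose position appears in bad_rows.
misspelled_rows <- words %>%
  filter(row_number() %in% bad_rows)

misspelled_rows
```

With the sample data this should keep the same two rows as the base R subset above.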

The which_misspelled function seems intended to check a single string rather than a data frame. From its documentation:

which_misspelled - Check the spelling for a string.
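
If you would rather stay with which_misspelled, one option is to call it once per row and keep the rows where it reports anything. A rough sketch, assuming terms is a character column; has_misspelling is a made-up name, and the length() check is used so that a NULL or empty result both count as "no misspellings":

```r
library(qdap)

words <- data.frame(terms = c("qhick brown fox",
                              "tom dick harry",
                              "cats dgs"),
                    stringsAsFactors = FALSE)

# Apply which_misspelled to each string separately and record
# whether it returned any misspelled words for that row.
has_misspelling <- vapply(
  words$terms,
  function(term) length(which_misspelled(term)) > 0,
  logical(1),
  USE.NAMES = FALSE
)

# Subset on the resulting logical vector.
words[has_misspelling, , drop = FALSE]
```

The same logical vector can also be passed to dplyr's filter, for example filter(words, has_misspelling).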