php stripos

Bad word check in PHP using stripos


I implemented this "bad word" check function in PHP:

# bad word detector
function check_badwords($string) {
    $badwords = array(/* a number of words some may find inappropriate for SE */);
    foreach($badwords as $item) {
        if(stripos($string, $item) !== false) return true;
    }
    return false;
}

It works all right, except for one problem. If $string is:

Who is the best guitarist ever?

...it returns true, because the bad word "ho" (in the $badwords array) matches inside "Who" (in $string). How can the function be modified so that it only matches complete words, not parts of words?

Thanks!


Solution

  • In order to check for complete words you should use regular expressions:

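    function check_badwords($string)
    {
        $badwords = array(/* the big list of words here */);
        // Escape regex metacharacters (and the '/' delimiter) in each word
        $quoted = array_map(function ($word) {
            return preg_quote($word, '/');
        }, $badwords);
        // Create the regex; the i modifier keeps the check case-insensitive, like stripos()
        $re = '/\b('.implode('|', $quoted).')\b/i';
        // Check if it matches the sentence; preg_match() returns 1 on a match
        return preg_match($re, $string) === 1;
    }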
    

    How the regex works

    The regular expression starts and ends with the special sequence \b, which matches a word boundary: a position where a word character is next to a non-word character, or to the start or end of the string. Word characters are letters, digits and the underscore.

    Between the two word boundaries there is a subpattern that contains all the bad words separated by |. The subpattern matches any of the bad words.
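
To make the difference concrete, here is a minimal sketch comparing the substring check with the word-boundary check on the sentence from the question. The two-word list is a hypothetical placeholder, and the i modifier is an assumption added here so that the regex stays case-insensitive like stripos():

```php
<?php
// Hypothetical sample list, chosen only to reproduce the question's example.
$badwords = array('ho', 'darn');

// Substring search: 'ho' is found inside 'Who'.
$substring_hit = stripos('Who is the best guitarist ever?', 'ho') !== false;

// Whole-word search: \b on both sides requires the word to stand alone.
$re = '/\b(' . implode('|', $badwords) . ')\b/i';
$word_hit_1 = preg_match($re, 'Who is the best guitarist ever?') === 1;
$word_hit_2 = preg_match($re, 'Darn, I missed it') === 1;

var_dump($substring_hit); // bool(true)
var_dump($word_hit_1);    // bool(false)
var_dump($word_hit_2);    // bool(true)
```

There is no boundary between "W" and "h" in "Who", so the pattern cannot match there, while "Darn" is bounded by the start of the string and a comma.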

    If you want to know which bad word was found, you can change the function:

    function check_badwords($string)
    {
        $badwords = array(/* the big list of words here */);
        // Escape regex metacharacters (and the '/' delimiter) in each word
        $quoted = array_map(function ($word) {
            return preg_quote($word, '/');
        }, $badwords);
        $re = '/\b('.implode('|', $quoted).')\b/i';
        // Check for matches, save the first match in $match
        $result = preg_match($re, $string, $match);
        // If $result is 1, $match[1] contains the first bad word found in $string
        return $result;
    }
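
As a possible variation, the matched word can be handed back to the caller instead of being left in a local variable. The sketch below is one way to do that, not part of the original answer: the name find_badword, the $badwords parameter and the sample list are all hypothetical, and the empty-list guard is an added assumption.

```php
<?php
// Hypothetical helper: returns the first bad word found, or null if there is none.
function find_badword($string, array $badwords)
{
    if (count($badwords) === 0) {
        return null; // an empty alternation would match at any word boundary
    }
    // Escape regex metacharacters (and the '/' delimiter) in every word.
    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $badwords);
    $re = '/\b(' . implode('|', $quoted) . ')\b/i';
    // $match[1] holds the text captured by the parenthesised subpattern.
    return preg_match($re, $string, $match) === 1 ? $match[1] : null;
}

$list = array('ho', 'darn'); // placeholder list
var_dump(find_badword('Who is the best guitarist ever?', $list)); // NULL
var_dump(find_badword('Well, DARN it.', $list));                  // string(4) "DARN"
```

The returned word keeps the casing it had in the input, because the capture comes from $string rather than from the list.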