
Remove whole blacklisted words from a string


The problem with the code below is that it removes matching character sequences from inside other words, rather than removing only whole words.

$str = "In a minute, remove all of the corks from these bottles in the cellar";

$useless_words = array("the", "of", "or", "in", "a");

$newstr = str_replace($useless_words, "", $str);

Output of above:

In mute, remove ll   cks from se bottles   cellr

I need the output to be:

minute, remove all corks from these bottles cellar

I'm assuming I can't use str_replace(). What can I do to achieve this?


Solution

  • preg_replace will do the job:

    $str = "The game start in a minute, remove all of the corks from these bottles in the cellar";
    $useless_words = array("the", "of", "or", "in", "a");
    // implode() takes the separator first; the reversed argument order was removed in PHP 8.0
    $pattern = '/\h+(?:' . implode('|', $useless_words) . ')\b/i';
    $newstr = preg_replace($pattern, "", $str);
    echo $newstr,"\n";
    

    Output:

    The game start minute, remove all corks from these bottles cellar
    

    Explanation:

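    The resulting pattern is: /\h+(?:the|of|or|in|a)\b/i

    Because it requires leading whitespace, a blacklisted word at the very start of the string (the "The" above) is left in place.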

    /                   : regex delimiter
      \h+               : 1 or more horizontal whitespace characters (such as spaces or tabs)
      (?:               : start non capture group
        the|of|or|in|a  : alternatives for all the useless words
      )                 : end group
      \b                : word boundary, make sure no word character follows the match
    /i                  : regex delimiter, case insensitive
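
The pattern above only matches a blacklisted word that is preceded by whitespace, so a word at the very start of the string survives, and the words are placed in the regex unescaped. A possible variant, sketched below, puts \b on both sides, escapes each word with preg_quote, and then tidies the leftover spaces. The function name remove_words is hypothetical and not part of the original answer; treat this as one way to do it under those assumptions rather than the definitive approach.

```php
<?php
// Hypothetical helper (not from the original answer): removes whole
// blacklisted words anywhere in the text, including at the very start.
function remove_words(string $text, array $words): string
{
    // Escape each word so any regex metacharacters are taken literally.
    $quoted = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words);

    // \b on both sides: match only whole words, case-insensitively.
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/i';
    $stripped = preg_replace($pattern, '', $text);

    // Collapse the runs of whitespace left behind and trim both ends.
    return trim(preg_replace('/\h+/', ' ', $stripped));
}

$str = "In a minute, remove all of the corks from these bottles in the cellar";
$useless_words = array("the", "of", "or", "in", "a");

echo remove_words($str, $useless_words), "\n";
// minute, remove all corks from these bottles cellar
```

One caveat: \b treats a hyphen or apostrophe as a word boundary, so with this blacklist "in-house" would become "-house". If that matters for your input, the boundaries would need to be replaced with stricter lookarounds.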