
Validate a string and return a dynamic number of isolated words


I would like to validate my input string and extract an unpredictable number of substrings from it -- with one regex pattern.

An example string:

location in [chambre, cuisine, salle-de-bain, jardin]

In a single step, I want to verify that the string has the shape word in [word, word, word...] and capture each word. (I want a single step for performance: my current code works in three steps, but it takes too long.)

My current regular expression is:

/([a-zA-Z]+)\s+in\s+\[\s*([a-zA-Z-]+)\s*(?:,\s*([a-zA-Z-]+)\s*)*\s*\]/

It captures location, chambre and jardin, but not cuisine or salle-de-bain.

$condition = 'location in [chambre, cuisine, salle-de-bain, jardin]';
preg_match('/([a-zA-Z]+)\s+in\s+\[\s*([a-zA-Z-]+)\s*(?:,\s*([a-zA-Z-]+)\s*)*\s*\]/', $condition, $matches);
var_dump($matches);
array:4 [
  0 => "location in [chambre, cuisine, salle-de-bain, jardin]"
  1 => "location"
  2 => "chambre"
  3 => "jardin"
]

I can't figure out what is wrong in my regular expression that makes it miss the two middle words. I only get the first and the last one in the array.


Solution

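  • In PHP (PCRE), a repeated capturing group keeps only the last substring it captured: each iteration of (?:,\s*([a-zA-Z-]+)\s*)* overwrites group 3, so only jardin is left when the match ends.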

    You can use preg_match_all with a regex like

    [a-zA-Z]+(?=\s+in\s+\[\s*[a-zA-Z-]+(?:\s*,\s*[a-zA-Z-]+)*\s*])|(?:\G(?!^)\s*,\s*|(?<=[a-zA-Z])\s+in\s+\[\s*)\K[a-zA-Z-]+(?=(?:\s*,\s*[a-zA-Z-]+)*\s*])
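
As a minimal sketch of how the pattern above might be used, the following wraps it in a helper. The function name extractWords is hypothetical (it is not part of the original answer), and treating "no matches" as "input is invalid" is an assumption about how validation would be handled.

```php
<?php
// Hypothetical helper: returns every word of "word in [word, word, ...]",
// or an empty array when the input does not have that shape.
function extractWords(string $input): array
{
    // First alternative: the leading word, accepted only if the whole
    // "in [ ... ]" list follows it (checked by the lookahead).
    // Second alternative: each list item. \G(?!^) continues right after the
    // previous match, \K drops the separator from the reported match, and the
    // lookahead checks that the rest of the list is well-formed up to "]".
    $pattern = '/[a-zA-Z]+(?=\s+in\s+\[\s*[a-zA-Z-]+(?:\s*,\s*[a-zA-Z-]+)*\s*])'
             . '|(?:\G(?!^)\s*,\s*|(?<=[a-zA-Z])\s+in\s+\[\s*)\K[a-zA-Z-]+'
             . '(?=(?:\s*,\s*[a-zA-Z-]+)*\s*])/';

    // preg_match_all returns the number of matches (0 if none, false on error).
    $count = preg_match_all($pattern, $input, $matches);

    return $count ? $matches[0] : [];
}

$condition = 'location in [chambre, cuisine, salle-de-bain, jardin]';
print_r(extractWords($condition));
```

With the example string this should list location, chambre, cuisine, salle-de-bain and jardin, in that order. An input with no closing bracket, such as 'location in [chambre, cuisine', yields an empty array, because every alternative is guarded by a lookahead that requires the closing ].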
    

    See the regex demo. Details: