regex, pcre, semgrep

Regex to match duplicates in a comma-separated list


I'm trying to write a regex that matches any duplicate words (alphanumeric, possibly containing dashes) in some YAML, using a PCRE-based tool.

I have found a regex that matches consecutive duplicates:

(?<=,|^)([^,]*)(,\1)+(?=,|$)

It will catch:

hello-world,hello-world,goodbye-world,goodbye-world

but not the two hello-world entries in

hello-world,goodbye-world,goodbye-world,hello-world

Could someone help me build a regex pattern for the second case (or both cases)?


Solution

  • You may use this regex:

    (?<=,|^)([^,]+)(?=(?>,[^,]*)*,\1(?>,|$)),
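
To try the pattern outside a PCRE tool, here is a minimal Python sketch. It is an adaptation, not the exact regex above: Python's re module rejects the variable-width lookbehind (?<=,|^), and atomic groups (?>...) are only available from Python 3.11, so both are replaced with equivalents that should behave the same on these inputs. The names DUPLICATE, find_duplicates and drop_earlier_duplicates are illustrative only.

```python
import re

# Python adaptation of the PCRE pattern above (an assumption of this sketch):
#   (?<=,|^)  ->  (?:(?<=,)|^)   because `re` needs a fixed-width lookbehind
#   (?>...)   ->  (?:...)        because atomic groups need Python 3.11+
DUPLICATE = re.compile(r"(?:(?<=,)|^)([^,]+)(?=(?:,[^,]*)*,\1(?:,|$)),")


def find_duplicates(line: str) -> list:
    """Return each item that occurs again later in the list, in first-seen order."""
    found = (m.group(1) for m in DUPLICATE.finditer(line))
    return list(dict.fromkeys(found))  # drop repeated hits, keep order


def drop_earlier_duplicates(line: str) -> str:
    """Remove every item that is repeated later, keeping its last occurrence."""
    # Each match is "item," (the trailing comma is consumed), so replacing
    # matches with the empty string leaves a well-formed list.
    return DUPLICATE.sub("", line)


if __name__ == "__main__":
    samples = (
        "hello-world,hello-world,goodbye-world,goodbye-world",
        "hello-world,goodbye-world,goodbye-world,hello-world",
    )
    for sample in samples:
        print(find_duplicates(sample), drop_earlier_duplicates(sample))
```

Both sample lines from the question report hello-world and goodbye-world as duplicates. Because each match covers an earlier occurrence plus its trailing comma, substituting matches with an empty string keeps only the last copy of each item.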
    

    RegEx Details: