Tags: regex, regex-lookarounds, wikipedia

Regex for removing duplicate values in a specific Wikipedia template


I am trying to remove duplicate values from one specific Wikipedia template (and only that template) with a regex in the AutoWikiBrowser bot, which uses the .NET regex flavour.

I want to find {{mul|fr|en|fr}} and replace it with {{mul|fr|en}}

\b(\w+)\s*\|\s*(?=.*\1) works, but it may also affect other templates that should not be modified.

I tried \{\{mul\|\b(\w+)\s*\|\s*(?=.*\1), but it doesn't work properly.

Note: a Wikipedia template is enclosed in double curly brackets, with its name followed by parameters and values separated by pipes. Here the parameters are unnamed, so only the values appear, and the template is named "mul", which gives {{mul|<foo>|<bar>|<baz>|<...>}}


Solution

  • You can use

    (?<={{mul\|(?:(?!{{|}}).)*?)\b(\w+)\|(?=(?:(?!{{|}}).)*\b\1\b)
    

    See the regex demo.
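
    To try the pattern outside AutoWikiBrowser, here is a minimal sketch in TypeScript. It assumes a JavaScript engine that supports variable-length lookbehind (ES2018 or later). The names `TEMPERED`, `DUPLICATE_IN_MUL` and `dedupeMul` are made up for this illustration, and the braces are escaped only to be safe in the JavaScript flavour. Each match is a value plus its trailing pipe, and it is replaced with an empty string. Because the lookahead searches for a later copy of the value, the earlier copy is the one removed and the last occurrence is kept.

    ```typescript
    // One character that does not start "{{" or "}}", so a scan stays inside one template.
    const TEMPERED = "(?:(?!\\{\\{|\\}\\}).)";

    // Port of the pattern above, assembled from a string so each part can be commented.
    const DUPLICATE_IN_MUL = new RegExp(
      "(?<=\\{\\{mul\\|" + TEMPERED + "*?)" + // lookbehind: we are inside a {{mul|...}} template
        "\\b(\\w+)\\|" +                      // a value followed by a pipe, captured as group 1
        "(?=" + TEMPERED + "*\\b\\1\\b)",     // lookahead: the same value occurs again before "}}"
      "g"
    );

    // Hypothetical helper: drop every value of {{mul|...}} that is repeated later on.
    function dedupeMul(text: string): string {
      return text.replace(DUPLICATE_IN_MUL, "");
    }

    console.log(dedupeMul("{{mul|fr|en|fr}}"));  // {{mul|en|fr}}
    console.log(dedupeMul("{{lang|fr|en|fr}}")); // {{lang|fr|en|fr}} (other templates are untouched)
    ```

    Note that the result is {{mul|en|fr}} rather than {{mul|fr|en}}: the duplicates are removed, but the order of the surviving values follows their last occurrence.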

    Details: