
Regex for a date not preceded by = or |


I'm looking for a regular expression to use with AutoWikiBrowser (.NET flavour) that lets me wrap plain dates in the French Wikipedia {{date}} template, but not when they're preceded by an = (which means they're probably a template parameter value) or a | (which means they're probably already inside a {{date}} template).

I've tried a negative lookaround without success: only the first digit is ignored, and the rest of the date is still matched.

I want this to be matched: 12 octobre 2012, but not this: foo=12 octobre 2012 or foo = 12 octobre 2012, nor this: bar|12 octobre 2012 or bar | 12 octobre 2012 (note that there may be a space before the date).

Here is my test on regex101: https://regex101.com/r/RS28vP/2


Solution

  • Thanks to @The fourth bird, who answered the question in the comments on the original post:

    (?<!(?:\w+ *[=|] *|{{\w+ *\| *)[\w ]*)(?:(\d+(?:er)?|\d+{{er}}|{{\d+er}}) *)?(janvier|f[ée]vrier|mars|avril|mai|juin|juillet|a[ôo][ûu]t|septembre|octobre|novembre|d[ée]cembre) *(\d{3,4})
    

    regex101.com/r/bqHMlI/1
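
For anyone who wants to check the behaviour outside AutoWikiBrowser, here is a minimal sketch in TypeScript that runs the same pattern against the examples from the question. It relies on variable-length lookbehind, which .NET supports and which modern JavaScript engines also accept. The names `DATE_RE` and `findDates` are hypothetical, and the braces are escaped here only to keep the pattern unambiguous in JavaScript. Note that without the `u` flag JavaScript's `\w` is ASCII-only, unlike .NET, which does not matter for these samples.

```typescript
// Pattern from the answer above, split into parts for readability.
// Assumption: the engine supports variable-length lookbehind.
const notAfterParam = String.raw`(?<!(?:\w+ *[=|] *|\{\{\w+ *\| *)[\w ]*)`;
const day = String.raw`(?:(\d+(?:er)?|\d+\{\{er\}\}|\{\{\d+er\}\}) *)?`;
const month =
  "(janvier|f[ée]vrier|mars|avril|mai|juin|juillet|a[ôo][ûu]t|" +
  "septembre|octobre|novembre|d[ée]cembre)";
const year = String.raw` *(\d{3,4})`;

const DATE_RE = new RegExp(notAfterParam + day + month + year, "g");

// Returns every date found in the text that is not preceded by "=" or "|".
function findDates(text: string): string[] {
  const found = text.match(DATE_RE);
  return found ? found.slice() : [];
}

const samples = [
  "12 octobre 2012",
  "foo=12 octobre 2012",
  "foo = 12 octobre 2012",
  "bar|12 octobre 2012",
  "bar | 12 octobre 2012",
];

for (const sample of samples) {
  console.log(JSON.stringify(sample), "->", JSON.stringify(findDates(sample)));
}
```

The `[\w ]*` at the end of the lookbehind is what keeps the engine from skipping the first digit and matching the rest of the date: every position inside the date is still reachable from the = or | through word characters and spaces, so every possible starting point is rejected.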