I'm running into an issue when trying to define an "identifier" with a pest.rs
grammar, where an identifier is a string that isn't in a set of reserved keywords.
I want to use an arbitrary matching rule (let's say it's ASCII_ALPHA_LOWER+), so there can be any non-zero number of lowercase letters one after another, but I want the rule to match only if the result is not equal to a specific string (say, "abc").
In regex, this can be achieved by using a negative lookahead combined with the anchor characters: ^(?!abc$)[a-z]+
The $ inside the lookahead means the lookahead only rejects the input when abc is immediately followed by the end of the string, so a string like abcd would still be matched.
However, pest doesn't have these anchor characters, and the best that can be achieved is something like:
identifier = {
    !("abc") ~ ASCII_ALPHA_LOWER+
}
But because there are no anchor characters, the negative lookahead fires on any string that begins with abc, so the rule rejects it regardless of the rest of its characters. That is not what I want: the rule should fail only if the input exactly matches one of the reserved keywords.
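To make the failure mode concrete, here is a minimal sketch in plain Rust that emulates the rule above by hand. It does not use the pest crate; the function name naive_identifier and the "length of the matched prefix" return value are my own choices for illustration, not anything pest generates.

```rust
/// Hand-rolled emulation of `!("abc") ~ ASCII_ALPHA_LOWER+`.
/// Returns the length of the matched prefix, or None if the rule fails.
/// (Hypothetical helper for illustration; not part of pest.)
fn naive_identifier(input: &str) -> Option<usize> {
    // `!("abc")`: negative predicate. It consumes nothing and fails
    // whenever the input starts with "abc", whatever comes after it.
    if input.starts_with("abc") {
        return None;
    }
    // `ASCII_ALPHA_LOWER+`: one or more lowercase ASCII letters.
    let len = input.bytes().take_while(|b| b.is_ascii_lowercase()).count();
    if len == 0 {
        None
    } else {
        Some(len)
    }
}
```

With this sketch, both "abc" and "abcd" come back as None, while "xyz" matches all three letters. The second result is the one I'm trying to get rid of.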
Does anyone know how I could achieve the same functionality as the regex with pest's grammars?
Instead of negative lookahead, use an optional modifier:
ident = { "abc"? ~ ASCII_ALPHA_LOWER+}
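This works because ? in pest is greedy and never backtracks: if the input starts with a keyword, the keyword is consumed, and ASCII_ALPHA_LOWER+ must then match at least one more letter. A bare abc therefore fails, while abcd is accepted.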
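As a rough check of that reasoning, the sketch below emulates the ident rule by hand in plain Rust, without the pest crate. The function name and the "length of the matched prefix" return value are assumptions made for illustration; the point is only to show the greedy, non-backtracking behaviour of the optional.

```rust
/// Hand-rolled emulation of `"abc"? ~ ASCII_ALPHA_LOWER+` under PEG rules.
/// Returns the length of the matched prefix, or None if the rule fails.
/// (Hypothetical helper for illustration; not part of pest.)
fn ident(input: &str) -> Option<usize> {
    // `"abc"?` is greedy: if the keyword is present it is always consumed,
    // and the sequence is never retried with the optional skipped.
    let prefix = if input.starts_with("abc") { 3 } else { 0 };
    // Slicing at `prefix` is safe: "abc" is ASCII, so byte 3 is a char boundary.
    let rest = &input[prefix..];
    // `ASCII_ALPHA_LOWER+`: at least one lowercase ASCII letter must remain.
    let letters = rest.bytes().take_while(|b| b.is_ascii_lowercase()).count();
    if letters == 0 {
        None
    } else {
        Some(prefix + letters)
    }
}
```

Here "abc" fails, "abcd" matches four characters, and "xyz" matches three. Note that a backtracking regex such as (abc)?[a-z]+ would behave differently and accept "abc", so this trick depends on PEG semantics. Also, if the grammar defines WHITESPACE, ~ in a normal rule allows implicit whitespace between "abc" and the following letters; declaring the rule atomic with @{ ... } avoids that.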