rust pest

How to parse with a matching rule as long as the result is not some particular string [pest.rs]


I'm running into an issue when trying to define an "identifier" in a pest.rs grammar, where an identifier is a string that isn't in a set of reserved keywords.

I want to use an arbitrary matching rule (let's say it's ASCII_ALPHA_LOWER+), so any non-zero number of consecutive lowercase letters, but I want the rule to match only if the result is not equal to a specific string (say, "abc").

In regex, this can be achieved by combining a negative lookahead with anchors: ^(?!abc$)[a-z]+. Because the $ sits inside the lookahead, the lookahead rejects only input that is exactly abc, so a string like abcd is still matched.

However, pest doesn't have these anchor characters, and the best that can be achieved is something like:

identifier = {
    !("abc") ~ ASCII_ALPHA_LOWER+
}

But because there are no anchors, the lookahead rejects any string that begins with abc, regardless of the characters that follow. That's not what I want: the rule should fail only when the input exactly matches one of the reserved keywords.
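
To make the problem concrete, here is a minimal hand-written sketch in plain Rust that mimics how I understand a PEG engine would evaluate the rule above. It does not use pest at all; `match_with_lookahead` is a hypothetical helper that returns how many bytes were matched from the start of the input.

```rust
/// Hand-rolled imitation of `!("abc") ~ ASCII_ALPHA_LOWER+` (not pest's API).
/// Returns the number of bytes matched from the start of `input`, or `None`.
fn match_with_lookahead(input: &str) -> Option<usize> {
    // `!("abc")`: fail if the input starts with "abc"; consumes nothing.
    if input.starts_with("abc") {
        return None;
    }
    // `ASCII_ALPHA_LOWER+`: one or more lowercase ASCII letters.
    let len = input.bytes().take_while(|b| b.is_ascii_lowercase()).count();
    if len == 0 { None } else { Some(len) }
}
```

With this sketch, `match_with_lookahead("abd")` succeeds, but both `match_with_lookahead("abc")` and `match_with_lookahead("abcd")` return `None`, which is exactly the over-rejection described above.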

Does anyone know how I could achieve the same functionality as the regex with pest's grammars?


Solution

  • Instead of negative lookahead, use an optional modifier:

    ident = { "abc"? ~ ASCII_ALPHA_LOWER+ }

    With this rule, an identifier that starts with the keyword must have at least one more lowercase letter after it, so abc on its own fails while abcd matches.
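
The behaviour of the optional-prefix rule can be sketched by hand in plain Rust. This is only an imitation of the PEG semantics, not pest's API, and `match_ident` is a hypothetical helper name. It relies on the fact that a PEG optional is greedy and the sequence does not backtrack into it, so once "abc" has been consumed, at least one more letter must follow.

```rust
/// Hand-rolled imitation of `"abc"? ~ ASCII_ALPHA_LOWER+` (not pest's API).
/// Returns the number of bytes matched from the start of `input`, or `None`.
fn match_ident(input: &str) -> Option<usize> {
    // `"abc"?`: consume the keyword if it is present. The optional is greedy,
    // and the sequence never retries with the keyword left unconsumed.
    let prefix = if input.starts_with("abc") { 3 } else { 0 };
    // `ASCII_ALPHA_LOWER+`: at least one lowercase letter must follow.
    let rest = input[prefix..]
        .bytes()
        .take_while(|b| b.is_ascii_lowercase())
        .count();
    if rest == 0 { None } else { Some(prefix + rest) }
}
```

Here `match_ident("abc")` returns `None`, while `match_ident("abcd")` and `match_ident("ab")` both succeed. One caveat: if the grammar defines a WHITESPACE rule, pest inserts implicit whitespace at `~` and inside repetitions, so the rule would probably need to be atomic (`ident = @{ ... }`) to keep something like "abc def" from being accepted as one identifier.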