java regex jflex word-boundary

Is there an alternative to \b and/or negative lookahead for JFlex?


I am building a scanner and can't find a way to recognize operators like "if" or "else" using JFlex regular expressions. Since JFlex doesn't fully conform to Java's regex syntax, I can't use a word boundary (\b) or lookarounds such as (?<=\s|^) and (?=\s|$), because neither ? nor $ is allowed. The idea is to match only the correctly written operators, not ifo or elso. Thanks in advance.


Solution

  • The idea is to find the correctly written operators not ifo or elso. Thanks in advance.

    Just use the patterns "if" and "else", and add another pattern that also matches inputs like ifo and elseo (for example, the identifier pattern). Place it after the rules for if and else:

    "if"                   { /* it's an if */ }
    "else"                 { /* it's an else */ }
    [a-zA-Z_][a-zA-Z_0-9]* { /* it's an identifier */ }
    

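    Following the maximal munch (longest match) rule, the scanner picks the identifier rule for inputs like ifo and elseo, because that rule matches more characters. For exactly if or else, the keyword rule and the identifier rule match the same number of characters, and JFlex then chooses the rule that appears first in the specification. That is why the "if" and "else" rules must come before the identifier rule.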

    If your language doesn't have identifiers and ifo and elseo are just supposed to be invalid tokens, you can keep the pattern and just change the action to treat it as an invalid token rather than an identifier.
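To see why rule order plus longest match is enough, here is a minimal Java sketch that imitates the selection logic with java.util.regex. It is not how JFlex works internally (JFlex generates a DFA). The class name MaximalMunchDemo, the token names, and the extra whitespace rule are all assumptions made for illustration. It also relies on these particular patterns being simple enough that Java's greedy matching returns the longest match for each rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; mirrors the three JFlex rules above plus a whitespace rule.
class MaximalMunchDemo {
    // Rule names and patterns, in the same priority order as the JFlex rules.
    private static final String[] NAMES = {"IF", "ELSE", "IDENTIFIER", "WHITESPACE"};
    private static final Pattern[] PATTERNS = {
        Pattern.compile("if"),
        Pattern.compile("else"),
        Pattern.compile("[a-zA-Z_][a-zA-Z_0-9]*"),
        Pattern.compile("\\s+")
    };

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int bestLen = 0;
            int bestRule = -1;
            for (int i = 0; i < PATTERNS.length; i++) {
                Matcher m = PATTERNS[i].matcher(input);
                m.region(pos, input.length());
                // lookingAt() only succeeds if the match starts at the current position.
                if (m.lookingAt()) {
                    int len = m.end() - pos;
                    // Strictly greater: on a tie, the earlier rule keeps priority.
                    if (len > bestLen) {
                        bestLen = len;
                        bestRule = i;
                    }
                }
            }
            if (bestRule < 0) {
                throw new IllegalArgumentException("No rule matches at position " + pos);
            }
            // Skip whitespace, report everything else as NAME(text).
            if (!NAMES[bestRule].equals("WHITESPACE")) {
                tokens.add(NAMES[bestRule] + "(" + input.substring(pos, pos + bestLen) + ")");
            }
            pos += bestLen;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [IF(if), IDENTIFIER(ifo), ELSE(else), IDENTIFIER(elseo)]
        System.out.println(tokenize("if ifo else elseo"));
    }
}
```

The strict `>` comparison is what gives earlier rules priority on ties. If the identifier pattern were listed first, plain if and else would come out as identifiers.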