I have the following JISON file (lite version of my actual file, but reproduces my problem):
%lex
%%
"do" return 'DO';
[a-zA-Z_][a-zA-Z0-9_]* return 'ID';
"::" return 'DOUBLECOLON';
<<EOF>> return 'ENDOFFILE';
/lex
%%
start
: ID DOUBLECOLON ID ENDOFFILE
{$$ = {type: "enumval", enum: $1, val: $3}}
;
It is for parsing something like "AnimalTypes::cat". It works fine for things like "AnimalTypes::cat", but when it sees dog instead of cat, it assumes it's a DO instead of an ID. I can see why it does that, but how do I get around it? I've been looking at other JISON documents, but can't seem to spot the difference that (I assume) makes those work.
This is the error I get:
JisonParserError: Parse error on line 1:
PetTypes::dog
----------^
Expecting "ID", "enumstr", "id", got unexpected "DO"
Repro steps:
minimal-repro.jison
jison -m es -o ./minimal.mjs ./minimal-repro.jison
to create parsertest.mjs
with code like:
import Parser from "./minimal.mjs";
Parser.parser.parse("PetTypes::dog")
node test.mjs
Edit: Updated with a reproducible example. Edit2: Simpler JISON
Unlike (f)lex, the jison lexer accepts the first matching pattern, even if it is not the longest matching pattern. You can get the (f)lex behaviour by using
%options flex
However, that significantly slows down the scanner.
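To make the difference concrete, here is a small JavaScript sketch of the two strategies. It is only a toy model under my own assumptions, not jison's actual implementation: the `rules` table and the `firstMatch` / `longestMatch` helpers are hypothetical names that mirror the three patterns from the question.

```javascript
// Toy model of the two matching strategies; this is NOT jison's real code.
// The rule order mirrors the lexer in the question.
const rules = [
  { token: "DO", pattern: /^do/ },
  { token: "ID", pattern: /^[a-zA-Z_][a-zA-Z0-9_]*/ },
  { token: "DOUBLECOLON", pattern: /^::/ },
];

// First rule that matches wins (jison's default behaviour).
function firstMatch(input) {
  for (const { token, pattern } of rules) {
    const m = pattern.exec(input);
    if (m) return { token, text: m[0] };
  }
  return null;
}

// Longest match wins; on a tie the earlier rule is kept ((f)lex behaviour).
function longestMatch(input) {
  let best = null;
  for (const { token, pattern } of rules) {
    const m = pattern.exec(input);
    if (m && (best === null || m[0].length > best.text.length)) {
      best = { token, text: m[0] };
    }
  }
  return best;
}

console.log(firstMatch("dog"));   // { token: 'DO', text: 'do' }
console.log(longestMatch("dog")); // { token: 'ID', text: 'dog' }
console.log(longestMatch("do"));  // { token: 'DO', text: 'do' }
```

The longest-match version has to try every rule at every position, which gives a rough idea of why the flex mode is slower.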
The original jison automatically added \b to the end of patterns which ended with a literal string matching an alphabetic character, to make it easier to match keywords without incurring this overhead. In jison-gho, this feature was turned off unless you specify
%options easy_keyword_rules
See https://github.com/zaach/jison/wiki/Deviations-From-Flex-Bison#user-content-literal-tokens.
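The effect of the trailing \b can be checked with plain JavaScript regular expressions. This is just an illustration of the word-boundary assertion, assuming the keyword pattern is anchored at the current input position; the names `plainKeyword` and `boundedKeyword` are made up for the example.

```javascript
// The keyword pattern as written in the grammar, anchored at the input start.
const plainKeyword = /^do/;
// The same pattern with a word boundary appended.
const boundedKeyword = /^do\b/;

console.log(plainKeyword.test("dog"));     // true: "do" is a prefix of "dog"
console.log(boundedKeyword.test("dog"));   // false: no boundary between "o" and "g"
console.log(boundedKeyword.test("do"));    // true: end of input is a boundary
console.log(boundedKeyword.test("do::x")); // true: ":" is not a word character
```

With the boundary in place the DO rule no longer matches the start of "dog", so the lexer falls through to the ID rule.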
So either of those options will achieve the behaviour you expect.