I have the following JISON file (lite version of my actual file, but reproduces my problem):
%lex
%%
"do" return 'DO';
[a-zA-Z_][a-zA-Z0-9_]* return 'ID';
"::" return 'DOUBLECOLON'
<<EOF>> return 'ENDOFFILE';
/lex
%%
start
: ID DOUBLECOLON ID ENDOFFILE
{$$ = {type: "enumval", enum: $1, val: $3}}
;
It is for parsing something like "AnimalTypes::cat". That input works fine, but when the parser sees dog instead of cat, it assumes the leading "do" is a DO token instead of the start of an ID. I can see why it does that, but how do I get around it? I've been looking at other JISON documents, but can't seem to spot the difference that (I assume) makes those work.
This is the error I get:
JisonParserError: Parse error on line 1:
PetTypes::dog
----------^
Expecting "ID", "enumstr", "id", got unexpected "DO"
Repro steps:
- Install jison-gho globally from npm (or modify code to use local version). I use Node v14.6.0.
- Save the JISON above as `minimal-repro.jison`
- Run `jison -m es -o ./minimal.mjs ./minimal-repro.jison` to create the parser
- Create a file named `test.mjs` with code like:
import Parser from "./minimal.mjs";
Parser.parser.parse("PetTypes::dog")
- Run `node test.mjs`
Edit: Updated with a reproducible example. Edit2: Simpler JISON
Unlike (f)lex, the jison lexer accepts the first matching pattern, even if it is not the longest matching pattern. You can get the (f)lex behaviour by using `%options flex`.
However, that significantly slows down the scanner.
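To make the difference between the two matching strategies concrete, here is a minimal sketch in plain JavaScript. It is not jison's implementation; the rule table and the `firstMatch` and `longestMatch` helpers are hypothetical names that only mirror the lexer rules from the question.

```javascript
// Toy rule table mirroring the lexer section of the question (not jison itself).
const rules = [
  { token: "DO", pattern: /^do/ },
  { token: "ID", pattern: /^[a-zA-Z_][a-zA-Z0-9_]*/ },
  { token: "DOUBLECOLON", pattern: /^::/ },
];

// First-match: return the first rule whose pattern matches at the start of the input.
function firstMatch(input) {
  for (const { token, pattern } of rules) {
    const m = pattern.exec(input);
    if (m) return { token, text: m[0] };
  }
  return null;
}

// Longest-match: try every rule and keep the longest lexeme.
// On a tie, the earlier rule wins, which is how (f)lex breaks ties.
function longestMatch(input) {
  let best = null;
  for (const { token, pattern } of rules) {
    const m = pattern.exec(input);
    if (m && (best === null || m[0].length > best.text.length)) {
      best = { token, text: m[0] };
    }
  }
  return best;
}

console.log(firstMatch("dog"));   // { token: 'DO', text: 'do' }
console.log(longestMatch("dog")); // { token: 'ID', text: 'dog' }
console.log(longestMatch("do"));  // { token: 'DO', text: 'do' }
```

The longest-match version has to try every rule at each position instead of stopping at the first hit, which gives a rough intuition for why that mode costs more.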
The original jison automatically added `\b` to the end of patterns which ended with a literal string matching an alphabetic character, to make it easier to match keywords without incurring this overhead. In jison-gho, this feature was turned off unless you explicitly enable it. See https://github.com/zaach/jison/wiki/Deviations-From-Flex-Bison#user-content-literal-tokens.
So either of those options will achieve the behaviour you expect.
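As a rough illustration of why a trailing `\b` is enough for a first-match lexer, the sketch below compares the keyword pattern with and without a word boundary using ordinary JavaScript regular expressions. The `firstToken` helper and the two rule lists are hypothetical and stand in for the lexer; they are not part of jison.

```javascript
// Hypothetical stand-ins for the "do" keyword rule, without and with a trailing \b.
const plainDo = /^do/;
const boundedDo = /^do\b/;
const id = /^[a-zA-Z_][a-zA-Z0-9_]*/;

// First-match over an ordered list of [token, regex] pairs.
function firstToken(rules, input) {
  for (const [token, re] of rules) {
    const m = re.exec(input);
    if (m) return token + ":" + m[0];
  }
  return null;
}

const withoutBoundary = [["DO", plainDo], ["ID", id]];
const withBoundary = [["DO", boundedDo], ["ID", id]];

console.log(firstToken(withoutBoundary, "dog")); // DO:do
console.log(firstToken(withBoundary, "dog"));    // ID:dog
console.log(firstToken(withBoundary, "do"));     // DO:do
console.log(firstToken(withBoundary, "do_it"));  // ID:do_it
```

With the boundary in place, the keyword rule only fires when "do" is a whole word, so the rule order in the lexer can stay as it is. Assuming the lexer accepts a `\b` after a quoted literal, the equivalent manual change in the grammar would be to append `\b` to the "do" pattern.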