JISON: How do I avoid "dog" being parsed as "do"?


I have the following JISON file (lite version of my actual file, but reproduces my problem):

%lex

%%

"do"                        return 'DO';
[a-zA-Z_][a-zA-Z0-9_]*      return 'ID';
"::"                        return 'DOUBLECOLON'
<<EOF>>                     return 'ENDOFFILE';

/lex

%%

start
    : ID DOUBLECOLON ID ENDOFFILE
    {$$ = {type: "enumval", enum: $1, val: $3}}
    ;

It is for parsing something like "AnimalTypes::cat", and it works fine for inputs like that, but when it sees dog instead of cat, it assumes it's a DO instead of an ID. I can see why it does that, but how do I get around it? I've been looking at other JISON documents, but I can't seem to spot the difference that (I assume) makes those work.

This is the error I get:

JisonParserError: Parse error on line 1:
PetTypes::dog
----------^
Expecting "ID", "enumstr", "id", got unexpected "DO"

Repro steps:

  1. Install jison-gho globally from npm (or modify the code to use a local version). I use Node v14.6.0.
  2. Save the JISON above as minimal-repro.jison
  3. Run: jison -m es -o ./minimal.mjs ./minimal-repro.jison to create the parser
  4. Create a file named test.mjs with code like:
import Parser from "./minimal.mjs";
Parser.parser.parse("PetTypes::dog")
  5. Run: node test.mjs

Edit: Updated with a reproducible example. Edit2: Simpler JISON

Accepted answer, by rici:

Unlike (f)lex, the jison lexer accepts the first matching pattern, even if it is not the longest matching pattern. You can get the (f)lex behaviour by using

 %option flex

However, that significantly slows down the scanner.
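The first-match behaviour described above can be sketched in plain JavaScript (a toy model with a hypothetical `nextToken` helper, not jison's actual scanner code):

```javascript
// jison tries lexer rules in order and takes the FIRST one that matches,
// unlike (f)lex, which takes the LONGEST match across all rules.
const rules = [
  { re: /^do/, tok: "DO" },                     // keyword rule, listed first
  { re: /^[a-zA-Z_][a-zA-Z0-9_]*/, tok: "ID" }  // identifier rule
];

function nextToken(input) {
  for (const { re, tok } of rules) {
    const m = re.exec(input);
    if (m) return { tok, text: m[0] };
  }
  return null;
}

console.log(nextToken("dog")); // first match wins: { tok: 'DO', text: 'do' }
console.log(nextToken("cat")); // { tok: 'ID', text: 'cat' }
```

A longest-match scanner would instead compare all matches and pick ID with text "dog", which is the behaviour the question expects.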

The original jison automatically added \b to the end of patterns which ended with a literal string matching an alphabetic character, to make it easier to match keywords without incurring this overhead. In jison-gho, this feature was turned off unless you specify

 %option easy_keyword_rules

See https://github.com/zaach/jison/wiki/Deviations-From-Flex-Bison#user-content-literal-tokens.

So either of those options will achieve the behaviour you expect.
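Putting it together, the lexer section fixed with the second option might look like this (a sketch; the manual alternative is to append \b to the keyword pattern yourself, e.g. "do"\b):

```
%lex

%option easy_keyword_rules

%%

"do"                        return 'DO';
[a-zA-Z_][a-zA-Z0-9_]*      return 'ID';
"::"                        return 'DOUBLECOLON';
<<EOF>>                     return 'ENDOFFILE';

/lex
```

With this option, the "do" rule effectively becomes "do"\b, so it no longer matches the first two characters of dog, and the input falls through to the ID rule.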