For XPath parsing, I need to distinguish the token type of "name" in the cases where it is followed by "::" (possibly after whitespace), followed by "(" (possibly after whitespace), or followed by neither.
In JLex we did this with a routine that read ahead in the yy_* buffer, but that isn't exposed in JFlex, and a lookahead RE should be a cleaner solution than handcoded lookahead.
Unfortunately, my first attempt isn't working: it falls into the third category (standalone name) far more often than it should.
What I'm trying is the following set of rules, where "self" is the name I'm trying to classify. Please pardon the "GONK_" prefix; that's my convention for debugging wrappers. The wrapped newSymbol() is itself a wrapper for new Symbol() that has a side effect; sorry about that.
"self/\s*::" { return GONK_newSymbol(sym.SELF); }
"self/\s*[(]" { return GONK_newSymbol(sym.SELF); }
"self" { return GONK_newSymbol(sym.QNAME,yytext()); }
As I understand it, JFlex's RE rules are "longest match wins, ties broken in favor of first match", so I expected the lookaheads to take precedence. But putting a trace printout in the GONK_ functions tells me that the third case (sym.QNAME) is almost always being taken.
I'm sure my error is obvious and I'm looking right at it, but... What did I miss?
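Found it. I shouldn't have quoted the RE syntax. In JFlex everything inside double quotes is taken literally, so the "/" and "\s*" in my first two rules were being matched as plain characters rather than as the lookahead operator and a whitespace pattern. Only the literals should be quoted (to emphasize that they are literals); the operators belong outside the quotes.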
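For anyone landing here later, this is roughly what the corrected rules look like: a sketch rather than a tested grammar. It assumes a JFlex version that supports the predefined "\s" class (if yours doesn't, substitute something like [ \t\r\n]), and it keeps my GONK_newSymbol() wrapper and the sym.* constants from above.

```
/* Quote only the literal text; "/" (lookahead) and \s* stay unquoted. */
"self" / \s* "::"   { return GONK_newSymbol(sym.SELF); }
"self" / \s* "("    { return GONK_newSymbol(sym.SELF); }
"self"              { return GONK_newSymbol(sym.QNAME, yytext()); }
```

With the quotes placed this way, the text after "/" is only checked and not consumed, so yytext() in the action is just "self" and the "::" or "(" is left for the next token.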
Knew I was looking right at it.
Leaving the question here because I didn't find a good example of using JFlex lookahead elsewhere.