I'm interested in using lex to tokenize my input string, but I do not want it to be possible to "fail". Instead, I want to have some type of DEFAULT or TEXT token, which would contain all the non-matching characters between recognized tokens.
Anyone have experience with something like this?
                        
To expand on @Chris Dodd's answer, the final rule in any lex script should be:
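As a hedged sketch only (an assumption based on the advice in this answer, not the answer's own text), a catch-all final rule that hands every otherwise-unmatched character to the parser as its own token could look like the following. The `NUMBER` token code, the whitespace rule, the `unsigned char` cast (added so bytes above 127 are not returned as negative values) and the `main` driver are all hypothetical and there for illustration.

```lex
%{
#include <stdio.h>

/* NUMBER is a hypothetical token code for illustration; normally it would
   come from the header generated by yacc. It is above 255 so it cannot
   collide with a single-character token. */
enum { NUMBER = 258 };
%}

%%
[0-9]+     { return NUMBER; }
[ \t\n]+   { /* skip whitespace */ }
.          { /* catch-all, kept last: return the character itself */
             return (unsigned char) yytext[0]; }
%%

int yywrap(void) { return 1; }

/* Hypothetical driver: print each token code until end of input. */
int main(void)
{
    int tok;
    while ((tok = yylex()) != 0)
        printf("%d\n", tok);
    return 0;
}
```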
and don't write any single-character rules like `"+" return PLUS;`. Just use the special characters you recognize directly in the grammar, e.g. `term: term '+' factor;`. This practice: