I am trying to create a parser for a simple subset of SQL using a grammar written with BNF in Brag. My Brag code looks like this:
#lang brag
statement : "select" fields "from" source joins* filters*
fields : field ("," field)*
field : WORD
source : WORD
joins : join*
join : "join" source "on" "(" condition ")"
filters : "where" condition ("and" | "or" condition)*
condition : field "=" field
But when I attempt to use that grammar to parse a basic SQL statement, I run into the following error:
> (parse-to-datum "select * from table")
Encountered unexpected token of type "s" (value "s") while parsing 'unknown [line=#f, column=#f, offset=#f]
I'm a total beginner to grammars and brag. Any ideas what I'm doing incorrectly?
You need to lex/tokenize the string first. The input to parse or parse-to-datum should be a list of tokens, not a raw string. Also, brag is case sensitive, meaning that the input should be "select" rather than "SELECT". After you do that, it should work.
The case sensitivity is in fact not a problem, as you can normalize the case during the tokenization phase.
Your grammar looks weird, however. You probably should not deal with whitespace in the grammar. Instead, whitespace should likewise be handled in the tokenization phase.
See https://beautifulracket.com/bf/the-tokenizer-and-reader.html for more information about tokenization.
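As a rough illustration of the tokenization step described above, here is a minimal sketch. It uses a plain regular expression instead of brag's lexer tools, and the names tokenize, keywords and punctuation are hypothetical helpers, not part of brag. It assumes that keywords and punctuation are emitted under their literal names (lowercased, which takes care of case normalization), that everything else becomes a WORD token, and that whitespace is simply skipped. Treating * as a WORD is a shortcut so that the query from the question tokenizes; a fuller grammar would probably give it its own rule.

```racket
#lang racket
(require brag/support) ; provides `token`

;; Hypothetical helper names used for this sketch; they are not part of brag.
(define keywords '("select" "from" "join" "on" "where" "and" "or"))
(define punctuation '("(" ")" "," "="))

;; tokenize : string -> (listof token-struct)
;; Splits the input into words and single-character symbols. Whitespace
;; (and any character the regexp does not match) is silently dropped.
(define (tokenize str)
  (for/list ([lexeme (in-list (regexp-match* #px"\\w+|[(),=*]" str))])
    ;; Normalize case so that SELECT and select yield the same token.
    (define lowered (string-downcase lexeme))
    (if (or (member lowered keywords) (member lowered punctuation))
        (token lowered lowered)    ; matches the quoted literals in the grammar
        (token 'WORD lexeme))))    ; assumption: anything else (including *) is a WORD
```

Assuming the grammar is saved in a file such as sql-grammar.rkt (a made-up name), you would require that file and call (parse-to-datum (tokenize "SELECT * FROM table")) so that the parser receives a list of tokens rather than a string.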
An alternative is to use a different parser library. https://docs.racket-lang.org/megaparsack/index.html, for instance, can parse a string to a datum (or syntax datum) right away, though it relies on some advanced concepts in functional programming, so in a way it might be more difficult to use.