I am trying to use attoparsec to parse a limited set of valid strings that share a common prefix. However, my attempts result in either a Partial result or a premature Done:
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative
import qualified Data.Attoparsec.Text as PT
-- Show is derived so that parse results can be printed in GHCi
data Thing = Foobar | Foobaz | Foobarz deriving Show
thingParser1 = PT.string "foobarz" *> return Foobarz
           <|> PT.string "foobaz" *> return Foobaz
           <|> PT.string "foobar" *> return Foobar
thingParser2 = PT.string "foobar" *> return Foobar
           <|> PT.string "foobaz" *> return Foobaz
           <|> PT.string "foobarz" *> return Foobarz
What I want is for "foobar" to result in Foobar, "foobarz" to result in Foobarz, and "foobaz" to result in Foobaz. However,
PT.parse thingParser1 "foobar"
results in a PT.Partial and
PT.parse thingParser2 "foobarz"
results in a PT.Done "z" Foobar.
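As you can see, the order of alternatives matters in the parsec family of parser combinator libraries. The choice operator first tries the parser on its left and only continues with the parser on its right if the left one fails. That is why thingParser2 stops at Foobar: its first alternative already succeeds on the "foobar" prefix of "foobarz", so the later alternatives are never tried.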
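The left-biased choice can be seen in isolation with a small sketch. The names shortFirst, longFirst and withRest are made up for this illustration and are not part of the question's code; parseOnly and takeText are used only to make the leftover input visible.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<|>))
import qualified Data.Attoparsec.Text as PT
import Data.Text (Text)

data Thing = Foobar | Foobaz | Foobarz deriving (Show, Eq)

-- Shorter literal first: it succeeds on the common prefix,
-- so the longer alternative is never tried.
shortFirst :: PT.Parser Thing
shortFirst = Foobar <$ PT.string "foobar" <|> Foobarz <$ PT.string "foobarz"

-- Longer literal first: the shorter one is only a fallback.
longFirst :: PT.Parser Thing
longFirst = Foobarz <$ PT.string "foobarz" <|> Foobar <$ PT.string "foobar"

-- Hypothetical helper: run a parser on complete input and return
-- the parsed value together with whatever input was left over.
withRest :: PT.Parser a -> Text -> Either String (a, Text)
withRest p = PT.parseOnly ((,) <$> p <*> PT.takeText)
```

In GHCi, withRest shortFirst "foobarz" gives Right (Foobar,"z"), while withRest longFirst "foobarz" gives Right (Foobarz,"").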
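Another thing to notice is that your parsers don't require the input to end after a successful parse. You can run the parser with parseOnly instead of parse: parseOnly treats the input as complete, so it never returns a Partial. To reject leftover input as well, follow the parser with endOfInput. Or you can use the maybeResult or eitherResult functions to convert the Result into a Maybe or Either respectively (both report a Partial as a failure).

That solution will work for thingParser1, but thingParser2 will still not work. This is because you need to have both the string parser and an endOfInput under a single try, so that a trailing z makes the whole "foobar" alternative fail and the next alternative gets its turn. In attoparsec, try is only there for Parsec compatibility, because attoparsec parsers always backtrack on failure.

A slightly better approach is to do a quick lookahead to see whether a z follows the foobar.

But this backtracking also degrades performance, so I would stick with the thingParser1 implementation.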
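For reference, here is a minimal sketch of the two alternatives described above. The names thingParserEnd and thingParserPeek are made up for this illustration, and using peekChar for the lookahead is my assumption about one way to do it; treat both as sketches rather than the definitive fix.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<|>))
import qualified Data.Attoparsec.Text as PT

data Thing = Foobar | Foobaz | Foobarz deriving (Show, Eq)

-- Variant A: each alternative must also reach the end of the input.
-- If "foobar" matches but a 'z' follows, endOfInput fails, the whole
-- alternative fails, and attoparsec backtracks to the next one.
thingParserEnd :: PT.Parser Thing
thingParserEnd =
      Foobar  <$ (PT.string "foobar"  <* PT.endOfInput)
  <|> Foobaz  <$ (PT.string "foobaz"  <* PT.endOfInput)
  <|> Foobarz <$ (PT.string "foobarz" <* PT.endOfInput)

-- Variant B: after "foobar", peek at the next character without
-- consuming it and decide between Foobar and Foobarz.
foobarOrFoobarz :: PT.Parser Thing
foobarOrFoobarz = do
  _ <- PT.string "foobar"
  next <- PT.peekChar            -- Nothing once the input has ended
  case next of
    Just 'z' -> Foobarz <$ PT.char 'z'
    _        -> pure Foobar

thingParserPeek :: PT.Parser Thing
thingParserPeek = foobarOrFoobarz <|> Foobaz <$ PT.string "foobaz"
```

Run with PT.parseOnly, both variants should map "foobar", "foobaz" and "foobarz" to Foobar, Foobaz and Foobarz respectively. Variant B does not check for the end of the input by itself, so add endOfInput after it if trailing input should be rejected.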