I'm a Haskell beginner, using Attoparsec to find some color expressions in a text. I want to be able to match, for example, "light blue-green" and "light blue green" in a text. But of course I need a generalized solution for any string like that. So I've been thinking that it would be something like
"light" >> sep >> "blue" >> sep >> "green"
where sep = inClass "\n\r- "
In other words, I think I need a way to intercalate >> sep >> to a list of words. Something like:
import qualified Data.Text as T
import Data.Attoparsec.Text
-- | Makes a parser from a list of words, accepting
-- spaces, newlines, and hyphens as separators.
wordListParser :: [T.Text] -> Parser
wordListParser wordList = -- Some magic here
Or maybe I'm thinking about this the wrong way entirely, and there's an easier way?
Edit: this minimal non-working example feels like it's almost there:
{-# LANGUAGE OverloadedStrings #-}
import Replace.Attoparsec.Text
import Data.Attoparsec.Text as AT
import qualified Data.Text as T
import Control.Applicative (empty)
wordListParser :: [T.Text] -> Parser T.Text
wordListParser (w:ws) = string w >> satisfy (inClass " -") >> wordListParser ws
wordListParser [w] = string w
wordListParser [] = empty -- or whatever the empty parser is
main :: IO ()
main = parseTest (wordListParser (T.words "light green blue")) "light green-blue"
which I think can be run with something like
stack runhaskell ThisFile.hs --package attoparsec replace-attoparsec text
Here is what I would do, assuming that you have a data type for your colours; if you don't, just substitute it for what you're using. The function
parseColourGentakes anyTextthat is space-separated, and generates a parser that accepts a colour where each word is separated by one or more legal separators.You can then feed your
wordListto the parser; however, it needs to be an association list:This way, you can configure all of your colours and their corresponding colour names in one place, and you can then run the parser like so:
EDIT
After the edit in your question, I think I understand what you want a little bit better. FWIW, I would still prefer the solution above, but here is how to fix your last block of code:
(w:ws)and[w]overlap, so if you want the runtime to catch the single-element pattern, you have to place it on top.a >> bmeans "run actiona, discard its result, then run actionband use that result". Which is why your parser (with the fix above) will outputDone "" "blue". A simple way to fix this is to usedonotation to bind the result of all three computations, and return their concatenation.Here is what your code now looks like:
One last thing: your current implementation will not parse Windows line breaks (
\n\r). I don't know if you dropped\nand\rfrom your separator characters, but if you haven't and Windows files are a possibility for you, it's something to keep in mind.