I am using attoparsec to parse a limited set of valid strings that share a common prefix. However, my attempts result in either a Partial result or a premature Done:
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative
import qualified Data.Attoparsec.Text as PT

data Thing = Foobar | Foobaz | Foobarz deriving Show

thingParser1 = PT.string "foobarz" *> return Foobarz
           <|> PT.string "foobaz" *> return Foobaz
           <|> PT.string "foobar" *> return Foobar

thingParser2 = PT.string "foobar" *> return Foobar
           <|> PT.string "foobaz" *> return Foobaz
           <|> PT.string "foobarz" *> return Foobarz
What I want is for "foobar" to result in Foobar, "foobarz" to result in Foobarz, and "foobaz" to result in Foobaz. However,
PT.parse thingParser1 "foobar"
results in a PT.Partial, and
PT.parse thingParser2 "foobarz"
results in PT.Done "z" Foobar.
As you can see, the order of alternatives matters in the parsec family of parser combinator libraries: the parser on the left is tried first, and the parser on the right is tried only if the left one fails.
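Another thing to notice is that parse is incremental: it does not assume that the input ends where the supplied chunk ends, which is why thingParser1 returns a Partial on "foobar" (more input could still turn it into "foobarz"). You can tell attoparsec that the input is complete by using parseOnly instead of parse to run the parser. Note that parseOnly does not force the parser to consume all of its input; leftover input is discarded, so add endOfInput to the parser if trailing input should be an error. The maybeResult and eitherResult functions convert a Result into a Maybe or Either respectively, but they treat a Partial as a failure, so feed the Partial an empty chunk first (with feed) to signal the end of input.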
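To make the difference between the runners concrete, here is a minimal sketch. It assumes that Thing derives Show and Eq (added here only so results can be printed and compared), and the names runComplete and runIncremental are hypothetical helpers, not part of attoparsec.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<|>))
import qualified Data.Attoparsec.Text as PT
import Data.Text (Text)

-- Show and Eq are derived only so results can be printed and compared.
data Thing = Foobar | Foobaz | Foobarz
  deriving (Show, Eq)

-- Longest alternative first, as in thingParser1.
thingParser1 :: PT.Parser Thing
thingParser1 = PT.string "foobarz" *> return Foobarz
           <|> PT.string "foobaz"  *> return Foobaz
           <|> PT.string "foobar"  *> return Foobar

-- parseOnly treats the given input as complete, so it never yields Partial.
runComplete :: Text -> Either String Thing
runComplete = PT.parseOnly thingParser1

-- The same idea with the incremental API: feed an empty chunk to signal
-- the end of input, then collapse the Result into an Either.
runIncremental :: Text -> Either String Thing
runIncremental input =
  PT.eitherResult (PT.feed (PT.parse thingParser1 input) "")
```

With these definitions, both runComplete "foobar" and runIncremental "foobar" should give Right Foobar, whereas PT.parse thingParser1 "foobar" on its own stops at a Partial.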
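That solution will work for thingParser1, but thingParser2 will still not work: it matches "foobar" first and leaves the "z" unconsumed. To fix it, each alternative needs an endOfInput after its string parser, so that a premature match fails and the next alternative is tried. (In attoparsec, parsers always backtrack on failure, so no explicit try is needed.) This would work: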
thingParser3 = Foobar <$ PT.string "foobar" <* PT.endOfInput
           <|> Foobaz <$ PT.string "foobaz" <* PT.endOfInput
           <|> Foobarz <$ PT.string "foobarz" <* PT.endOfInput
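As a quick check of the endOfInput approach, the following sketch runs the parser through parseOnly. The helper name parseThing is hypothetical, and Thing derives Show and Eq here only for printing and comparison.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<|>))
import qualified Data.Attoparsec.Text as PT
import Data.Text (Text)

data Thing = Foobar | Foobaz | Foobarz
  deriving (Show, Eq)

-- Shortest alternative first; endOfInput makes a premature match fail,
-- so the parser backtracks and tries the next alternative.
thingParser3 :: PT.Parser Thing
thingParser3 = Foobar  <$ PT.string "foobar"  <* PT.endOfInput
           <|> Foobaz  <$ PT.string "foobaz"  <* PT.endOfInput
           <|> Foobarz <$ PT.string "foobarz" <* PT.endOfInput

-- Hypothetical helper: parseOnly marks the input as complete.
parseThing :: Text -> Either String Thing
parseThing = PT.parseOnly thingParser3
```

Here parseThing "foobarz" should give Right Foobarz despite the ordering, and an input with trailing characters such as "foobarzz" is rejected with a Left.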
A slightly better approach is to do a quick lookahead to see whether a z follows the foobar. You can do that like this:
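-- requires: import Control.Monad (guard)
thingParser4 = Foobar <$ (do
        PT.string "foobar"
        c <- PT.peekChar
        -- succeed only at end of input or when the next character is not 'z'
        guard (maybe True (/= 'z') c))
    <|> Foobaz <$ PT.string "foobaz"
    <|> Foobarz <$ PT.string "foobarz"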
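The lookahead can also be pulled out into a small reusable combinator. This is only a sketch of that idea: notFollowedByChar is a hypothetical helper defined here, not an attoparsec function, and Thing derives Show and Eq only so results can be compared.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<|>))
import Control.Monad (guard)
import qualified Data.Attoparsec.Text as PT

data Thing = Foobar | Foobaz | Foobarz
  deriving (Show, Eq)

-- Hypothetical helper: succeeds without consuming input when the next
-- character is not `c`, or when the input has ended.
notFollowedByChar :: Char -> PT.Parser ()
notFollowedByChar c = do
  next <- PT.peekChar
  guard (next /= Just c)

-- "foobar" is accepted only when it is not immediately followed by 'z';
-- otherwise the parser backtracks and tries the remaining alternatives.
thingParser4 :: PT.Parser Thing
thingParser4 = Foobar  <$ (PT.string "foobar" <* notFollowedByChar 'z')
           <|> Foobaz  <$ PT.string "foobaz"
           <|> Foobarz <$ PT.string "foobarz"
```

Unlike the endOfInput version, this parser does not insist that the input ends after the match, so PT.parseOnly thingParser4 "foobarx" should still give Right Foobar, with the trailing "x" discarded.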
However, the backtracking in these last two parsers also degrades performance, so I would stick with the thingParser1 implementation, run with parseOnly.