parsing haskell parsec

Parsec try: should try to go to next option


Currently, I have the following code:

import Control.Applicative ((<|>))
import Text.Parsec (ParseError, endBy, sepBy, try)
import Text.Parsec.String (Parser)
import qualified Data.Char as Char
import qualified Text.Parsec as Parsec

data Operation = Lt | Gt deriving (Show)

data Value =
      Raw String
    | Op Operation
  deriving (Show)

sampleStr :: String
sampleStr =  unlines
  [ "#BEGIN#"
  , "x <- 3.14 + 2.72;"
  , "x < 10;"
  ]

gtParser :: Parser Value
gtParser = do
  Parsec.string "<"
  return $ Op Gt

ltParser :: Parser Value
ltParser = do
  Parsec.string ">"
  return $ Op Lt

opParser :: Parser Value
opParser = gtParser <|> ltParser

rawParser :: Parser Value
rawParser = do
  str <- Parsec.many1 $ Parsec.satisfy $ not . Char.isSpace
  return $ Raw str

valueParser :: Parser Value
valueParser = try opParser <|> rawParser

eolParser :: Parser Char
eolParser = try (Parsec.char ';' >> Parsec.endOfLine)
         <|> Parsec.endOfLine

lineParser :: Parser [Value]
lineParser = sepBy valueParser $ Parsec.many1 $ Parsec.char ' '

fileParser :: Parser [[Value]]
fileParser = endBy lineParser eolParser

parse :: String -> Either ParseError [[Value]]
parse = Parsec.parse fileParser "fail..."

main :: IO ()
main = print $ parse sampleStr

This fails with the message:

Left "fail..." (line 2, column 4):
unexpected "-"
expecting " ", ";" or new-line

My understanding is that, because I wrote try opParser, once Parsec sees that the token <- cannot be parsed by opParser, it should fall through to rawParser (try is essentially a lookahead).

What is my misunderstanding, and how do I fix this error?


Solution

  • You can replicate the problem with the smaller test case:

    > Parsec.parse fileParser "foo" "x <- 3.14"
    

    The problem is that fileParser first calls lineParser, which successfully parses "x <" into [Raw "x", Op Gt] and leaves "- 3.14" yet to be parsed. Unfortunately, fileParser now expects to parse something with eolParser, but eolParser can't parse "- 3.14" because it starts with neither a semicolon nor an endOfLine.

    Your try opParser has no effect here. try only rewinds the input when the parser it wraps fails, and opParser succeeds on <, so there's nothing to backtrack from.

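    There are many ways you might fix the problem. If <- is the only case where a < might be misparsed, you could exclude this case with notFollowedBy. When that check fails, gtParser fails after having consumed the <, and the try in valueParser then rewinds the input so that rawParser can parse <- as a single Raw token: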

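    gtParser :: Parser Value
    gtParser = do
      Parsec.string "<"
      -- Fail if the "<" is the start of "<-"; notFollowedBy consumes no input.
      -- Qualified because the question's import list does not bring it into scope.
      Parsec.notFollowedBy $ Parsec.string "-"
      return $ Op Gt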
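
For reference, here is a minimal self-contained sketch of the whole parser with that change applied, so the fix can be tried in isolation. It reuses the definitions from the question; the name parseFile, the source name "sample", the added Eq instances, and the two test inputs are my own assumptions for illustration. The constructor names are kept exactly as in the question, where "<" maps to Op Gt.

```haskell
import Text.Parsec (ParseError, endBy, sepBy, try, (<|>))
import Text.Parsec.String (Parser)
import qualified Data.Char as Char
import qualified Text.Parsec as Parsec

-- Eq is derived only so results can be compared; it is not in the question.
data Operation = Lt | Gt deriving (Show, Eq)

data Value
  = Raw String
  | Op Operation
  deriving (Show, Eq)

-- Accept "<" only when it is not the start of "<-".
-- As in the question's code, "<" is mapped to Gt.
gtParser :: Parser Value
gtParser = do
  _ <- Parsec.char '<'
  Parsec.notFollowedBy (Parsec.char '-')
  return (Op Gt)

ltParser :: Parser Value
ltParser = Op Lt <$ Parsec.char '>'

opParser :: Parser Value
opParser = gtParser <|> ltParser

rawParser :: Parser Value
rawParser = Raw <$> Parsec.many1 (Parsec.satisfy (not . Char.isSpace))

-- try matters now: gtParser consumes "<" before notFollowedBy fails,
-- so the input has to be rewound for rawParser to see "<-".
valueParser :: Parser Value
valueParser = try opParser <|> rawParser

eolParser :: Parser Char
eolParser = try (Parsec.char ';' >> Parsec.endOfLine) <|> Parsec.endOfLine

lineParser :: Parser [Value]
lineParser = sepBy valueParser (Parsec.many1 (Parsec.char ' '))

fileParser :: Parser [[Value]]
fileParser = endBy lineParser eolParser

-- Hypothetical helper name; "sample" is only the source name used in errors.
parseFile :: String -> Either ParseError [[Value]]
parseFile = Parsec.parse fileParser "sample"

main :: IO ()
main = do
  print (parseFile "x <- 3.14\n")
  print (parseFile "x < 10\n")
```

With this version, the first input should parse to a single line containing Raw "x", Raw "<-" and Raw "3.14", while the second should still produce Op Gt for the lone <. Note that rawParser accepts any non-space character, so a trailing ; directly after a token ends up inside that Raw value rather than being consumed by eolParser.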