I've written a few hundred lines of parser for a toy language I'm playing with, and thought I was starting to really grok Parsec. But now I'm stumbling over what appears to be a very simple parsing task, so I'm clearly missing some basic piece of understanding.
I want to match inputs like the ones in the example runs further down (not in my real language, but in this minimal example I do).
I've reduced it to a contrived minimal example (sitting in file MinEx.hs):
module MinEx where

import Text.Parsec
import Text.Parsec.Token
import Data.Char (isSpace)
import Data.Maybe (fromMaybe)
import System.Environment (getArgs)
import System.IO.Unsafe (unsafePerformIO)

myDef :: LanguageDef st
myDef = LanguageDef
  { commentStart = ""
  , commentEnd = ""
  , commentLine = "#"
  , nestedComments = True
  , identStart = letter
  , identLetter = alphaNum
  , opStart = opLetter myDef
  , opLetter = oneOf ":!#$%&*+./<=>?@\\^|-~"
  , reservedOpNames = []
  , reservedNames = []
  , caseSensitive = True
  }

TokenParser{ parens = myParens
           , identifier = myIdentifier
           , reservedOp = myReservedOp
           , reserved = myReserved
           , semiSep1 = mySemiSep1
           , whiteSpace = myWhiteSpace } = makeTokenParser myDef

simpleSpace :: Parsec String st ()
simpleSpace = skipMany1 (satisfy isSpace)

upperIdentifier :: Parsec String st String
upperIdentifier = lookAhead upper >> myIdentifier

x `uio` y = unsafePerformIO x `seq` y

nameThenEnd :: Parsec String st String
nameThenEnd = do
  print "at name" `uio` string "name"
  print "at spaces after name" `uio` simpleSpace
  maybeName <- print "at ident" `uio` optionMaybe upperIdentifier
  -- I only match this here and not above with `optionMaybe (upperIdentifier <* simpleSpace)` for debugging purposes.
  case maybeName of
    Nothing -> (print "at no spaces after no ident" >> print maybeName) `uio` return ()
    Just name -> (print "at spaces after ident" >> print maybeName) `uio` simpleSpace
  string "end"
  return (fromMaybe "" maybeName)

main :: IO ()
main = getArgs >>= \args -> print (parse (nameThenEnd <* eof) "" (args !! 0))
Example run where it works, when no ident is given:
> runhaskell MinEx.hs "name end"
"at name"
"at spaces after name"
"at ident"
"at no spaces after no ident"
Nothing
Right ""
Non-working example:
> runhaskell MinEx.hs "name Foo end"
"at name"
"at spaces after name"
"at ident"
"at spaces after ident"
Just "Foo"
Left (line 1, column 10):
unexpected "e"
Thanks. Apologies if I'm missing something very obvious.
The generated identifier parser is a lexeme parser, which means that it will eat any whitespace following the identifier. Thus your simpleSpace fails because there is no space left to consume, and you defined it with skipMany1.
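To see the lexeme behaviour in isolation, here is a minimal sketch that runs identifier and then inspects the remaining input with getInput. The lexer is built from emptyDef rather than your myDef, and the names lexer and restAfterIdentifier are made up for this illustration.

```haskell
module Main where

import Text.Parsec
import Text.Parsec.Language (emptyDef)
import qualified Text.Parsec.Token as Tok

-- Hypothetical lexer built from emptyDef, standing in for the question's myDef.
lexer :: Tok.TokenParser st
lexer = Tok.makeTokenParser emptyDef

-- Run the lexeme parser `identifier`, then return the name together with
-- whatever input is left once it has finished.
restAfterIdentifier :: String -> Either ParseError (String, String)
restAfterIdentifier = parse p ""
  where
    p = do
      name <- Tok.identifier lexer
      rest <- getInput
      return (name, rest)

main :: IO ()
main = print (restAfterIdentifier "Foo   end")
-- prints: Right ("Foo","end")
```

The spaces after Foo are already gone by the time identifier returns, so a following skipMany1 (satisfy isSpace) sees "e" and fails, which matches the error at column 10.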
You usually don't need to handle whitespace manually if you use the parsers generated by makeTokenParser (for example, use symbol or reserved instead of string). See the documentation for more information.
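As a rough sketch of what that could look like for your example, the version below uses reserved for the keywords and drops simpleSpace entirely. It is built on emptyDef instead of your myDef, and listing "name" and "end" in reservedNames is my assumption (your definition leaves that list empty); parseLine is a helper invented for the demo.

```haskell
module Main where

import Data.Maybe (fromMaybe)
import Text.Parsec
import Text.Parsec.Language (emptyDef)
import qualified Text.Parsec.Token as Tok

-- Assumed language definition: emptyDef plus "#" line comments and the two
-- keywords marked as reserved so that `identifier` never accepts them.
lexer :: Tok.TokenParser st
lexer = Tok.makeTokenParser emptyDef
  { Tok.commentLine   = "#"
  , Tok.reservedNames = ["name", "end"]
  }

upperIdentifier :: Parsec String st String
upperIdentifier = lookAhead upper >> Tok.identifier lexer

-- Every token parser here is a lexeme parser, so each one skips the
-- whitespace that follows it; no manual space handling is needed.
nameThenEnd :: Parsec String st String
nameThenEnd = do
  Tok.reserved lexer "name"
  maybeName <- optionMaybe upperIdentifier
  Tok.reserved lexer "end"
  return (fromMaybe "" maybeName)

-- Lexeme parsers only skip trailing whitespace, so leading whitespace is
-- skipped once at the start.
parseLine :: String -> Either ParseError String
parseLine = parse (Tok.whiteSpace lexer *> nameThenEnd <* eof) ""

main :: IO ()
main = mapM_ (print . parseLine) ["name end", "name Foo end"]
-- prints:
-- Right ""
-- Right "Foo"
```

One side effect of switching from string to reserved is that "nameFoo end" is now rejected, because reserved checks that the keyword is not followed by another identifier letter.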