I'm having a hard time defining a left-to-right arithmetic expression parser in Haskell. What I have done so far is defining a right-to-left parser, following the "G. Hutton, Programming in Haskell" book.
-- the aexpr integer parser
aexpr_int :: Parser Int
aexpr_int = do
a1 <- aterm_int
s <- sign -- return 1 for "+" and -1 for "-"
a2 <- aexpr_int
return (a1 + (s * a2))
<|>
aterm_int
-- the aterm integer parser
aterm_int :: Parser Int
aterm_int = do
a1 <- aterm_int
char '*'
a2 <- afactor_int
return (a1 * a2);
<|>
do
a1 <- afactor_int
char '/'
a2 <- aterm_int
return (div a1 a2)
<|>
afactor_int
-- afactor_int
afactor_int :: Parser Int
afactor_int = do
token (char '(')
e <- aexpr_int
token (char ')')
return e
<|>
do
s <- sign
ic <- aexpr_int
return (s * ic)
<|>
token int_const
So this parses 1 - 2 - 3 - 4
as 1 - (2 - (3 - 4))
, but I want it to parse as ((1 - 2) - 3) - 4
. How can I reach this result?
The trick here is to define an aexpr_int
parser that first parses an aterm_int
, and then recursively, using a helper function on an accumulating expression, checks for multiple additional occurrences of sign >> aterm_int
, adding the additional terms to the accumulator. It might look something like this:
-- the aexpr integer parser
aexpr_int :: Parser Int
aexpr_int = do
a1 <- aterm_int
go a1
where go expr =
do
s <- sign
a2 <- aterm_int
go (expr + (s * a2))
<|> return expr
Together with the following:
-- the aterm integer parser
aterm_int :: Parser Int
aterm_int = do
a1 <- afactor_int
go a1
where go expr =
do
char '*'
a2 <- afactor_int
go (expr * a2)
<|>
do
char '/'
a2 <- afactor_int
go (div expr a2)
<|> return expr
-- afactor_int
afactor_int :: Parser Int
afactor_int = do
token (char '(')
e <- aexpr_int
token (char ')')
return e
<|>
do
s <- sign
ic <- afactor_int
return (s * ic)
<|>
token int_const
this seems to work correctly:
> parseTest aexpr_int "1-2-3-4"
-8
Note that, if you're using a real parser library instead of trying to write your own parser code for learning purposes, you'll want to use the library's built-in expression parser or combinators with names like chainl
or sepBy
to accomplish this.