haskell parsec megaparsec

How to turn a function into a parser in Haskell (Megaparsec)


This is my current code for multiplying the first and last numbers in a string:

type Parser = Parsec Void String
parsefirst :: Parser Int 
parsefirst = do 
    getnotdigits 
    x <- some digitChar
    y <- finishstr
    let z = ranParserRight y
    return $ read x * getrightnum x z
parseright :: Parser Int 
parseright = do 
    getnotdigits 
    x <- some digitChar
    y <- finishstr 
    let z = ranParserRight y
    return $ getrightnum x z
ranParserRight = parseMaybe parseright
getrightnum = fromMaybe . read 
finishstr = many (satisfy (const True))
getnotdigits = many (satisfy (not . isDigit))

I want to do something like this:

type Parser = Parsec Void String
parsefirst :: Parser Int 
parsefirst = do 
    getnotdigits 
    x <- some digitChar
    z <- simpleParserRight
    return $ read x * getrightnum x z
parseright :: Parser Int 
parseright = do 
    getnotdigits 
    x <- some digitChar
    z <- simpleParserRight 
    return $ getrightnum x z
simpleParserRight = toParser (parseMaybe parseright)
getrightnum = fromMaybe . read 
getnotdigits = many (satisfy (not . isDigit))

However, I don't think I can. I searched Hoogle for a function with a matching type signature and came up empty.


Solution

  • The usual way to approach this sort of thing is to compose parsers together using the Applicative and Monad interfaces they provide.

    {- cabal:
    build-depends: base, megaparsec, parser-combinators
    -}
    
    import Data.Char
    
    import Text.Megaparsec
    import Data.Void
    
    import Control.Monad.Combinators (skipManyTill)
    
    
    type Parser = Parsec Void String
    
    ignoreUntil :: Parser a -> Parser a
    ignoreUntil = skipManyTill anySingle
    
    ignoreRemaining :: Parser ()
    ignoreRemaining = ignoreUntil eof
    
    
    num :: Parser Int
    num = read <$> some (satisfy isDigit)
    
    firstNum :: Parser Int
    firstNum = ignoreUntil num
    
    lastNum :: Parser Int
    lastNum = last <$> some (try (ignoreUntil num))
    
    
    multiplyFirstAndLast :: Parser Int
    multiplyFirstAndLast = (*) <$> firstNum <*> lastNum <* ignoreRemaining
    

    This is an example that can be loaded with cabal repl to experiment with. I've listed the parser-combinators library as a dependency so I can import skipManyTill from Control.Monad.Combinators directly. Note that megaparsec already depends on parser-combinators and re-exports many of its functions, so I'm not adding a new compile-time dependency. I just wanted skipManyTill without having to reimplement it myself.

    So if I load it and experiment with it a bit:

    $ cabal repl parse.hs
    Ok, one module loaded.
    ghci> parseMaybe multiplyFirstAndLast "blue, 2, 1, 5, fish"
    Just 10
    ghci> parseMaybe multiplyFirstAndLast "3blue, 2, 1, 5, fish7"
    Just 21
    ghci> parseMaybe multiplyFirstAndLast "The number is: 4"
    Nothing
    ghci> parseMaybe multiplyFirstAndLast "The numbers are: 4 and 3"
    Just 12
    ghci> parseMaybe multiplyFirstAndLast "The numbers are: 4, 25, and 3"
    Just 12
    

    That looks like it achieves your stated goal, if not by doing it exactly how you thought you would.
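For a shape closer to the code in the question, one option might be to build toParser on top of Megaparsec's takeRest, which consumes and returns whatever input is left. The sketch below is only an illustration under that assumption: toParser and simpleParserRight are the hypothetical names from the question, not library functions, and parseFirst and parseRight are renamed stand-ins for the question's parsers. It keeps the question's behaviour of falling back to the first number when no later number exists.

```haskell
import Control.Applicative (some)
import Data.Char (isDigit)
import Data.Maybe (fromMaybe)
import Data.Void (Void)
import Text.Megaparsec (Parsec, parseMaybe, takeRest, takeWhileP)
import Text.Megaparsec.Char (digitChar)

type Parser = Parsec Void String

-- Hypothetical helper (name taken from the question): run a plain
-- function over all remaining input, consuming that input.
toParser :: (String -> a) -> Parser a
toParser f = f <$> takeRest

-- Runs parseRight on the rest of the input as a nested parse.
-- Nothing means no further number was found.
simpleParserRight :: Parser (Maybe Int)
simpleParserRight = toParser (parseMaybe parseRight)

-- Last number in the input: skip non-digits, read a number, then
-- prefer any later number found in the remaining input.
parseRight :: Parser Int
parseRight = do
  _ <- takeWhileP Nothing (not . isDigit)
  x <- some digitChar
  z <- simpleParserRight
  pure (fromMaybe (read x) z)

-- First number times last number; with a single number, the last
-- number falls back to the first, as in the question's code.
parseFirst :: Parser Int
parseFirst = do
  _ <- takeWhileP Nothing (not . isDigit)
  x <- some digitChar
  z <- simpleParserRight
  pure (read x * fromMaybe (read x) z)
```

Because takeRest always consumes to the end of input, parseMaybe parseFirst needs no separate step to discard trailing text. Unlike multiplyFirstAndLast, this version succeeds on input containing a single number and multiplies that number by itself.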

    Some implementation notes:

    1. Parser combinators are intended to be a composable interface. They should be built from composable units which are then connected via the composition interfaces like Applicative or Monad. (I used only Applicative directly, but Monad is available for operations that are more complex or are context-sensitive.)
    2. I started with a couple of generic utilities. ignoreUntil throws away input until it finds something that matches the provided parser. ignoreRemaining throws away input until the end.
    3. num and firstNum are not especially interesting applications of existing utilities.
    4. lastNum separates finding the last occurrence of something from finding all occurrences. It's easy to use the provided some combinator to make a list of things that match. As a separate step, last is applied to the list it made. Separating these concerns reduces the amount I need to think about in a way I enjoy.
    5. On the other hand, lastNum also needs to use try, megaparsec's biggest API failure. For performance reasons, megaparsec commits to a parser as soon as that parser consumes input. If the parser then fails later, megaparsec will not try an alternative parse of the same data, so the whole parse fails. The try combinator marks a point to rewind to: if the wrapped parser fails, the input is restored as if nothing had been consumed, so an alternative parser can still be tried. (This is not the same as general backtracking. There is no way to make megaparsec do general backtracking. Between that fact and the necessity of using try, I often recommend against using parsec-like libraries. Unless you need performance at the cost of a good API, I recommend going elsewhere.)
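To make the point about try concrete, here is a minimal sketch that is not part of the program above. The names withoutTry and withTry are made up for illustration. Both parsers offer the same two alternatives, and only the second wraps the first alternative in try.

```haskell
import Control.Applicative ((<|>))
import Data.Void (Void)
import Text.Megaparsec (Parsec, parseMaybe, try)
import Text.Megaparsec.Char (char)

type Parser = Parsec Void String

-- Without try: on "ac" the first branch consumes 'a' and then fails
-- on 'b'. Input was consumed, so the second branch is never attempted.
withoutTry :: Parser Char
withoutTry = (char 'a' *> char 'b') <|> (char 'a' *> char 'c')

-- With try: a failure in the first branch rewinds the input, so the
-- second branch gets to see the 'a' again.
withTry :: Parser Char
withTry = try (char 'a' *> char 'b') <|> (char 'a' *> char 'c')
```

In ghci, parseMaybe withoutTry "ac" gives Nothing, while parseMaybe withTry "ac" gives Just 'c'. Both parsers accept "ab". The try in lastNum plays the same role: the final ignoreUntil num attempt consumes the trailing text before failing, and try lets some stop cleanly with the numbers collected so far.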