haskellmegaparsec

Megaparsec, backtracking user state with StateT and ParsecT


Using Megaparsec 5. Following this guide, I can achieve a back-tracking user-state by combining StateT and ParsecT (non-defined types should be obvious/irrelevant):

type MyParser a = StateT UserState (ParsecT Dec T.Text Identity) a

if I run a parser p :: MyParser a, like this:

parsed = runParser (runStateT p initialUserState) "" input

The type of parsed is:

Either (ParseError Char Dec) (a, UserState)

Which means, in case of error, the user state is lost.

Is there any way to have it in both cases?

EDIT: Could I perhaps, in case of error, use a custom error component instead of Dec (a feature introduced in 5.0) and encapsulate the user state in there?


Solution

  • You can use a custom error component combined with the observing function for this purpose (see this great post for more information):

    {-# LANGUAGE RecordWildCards #-}
    
    module Main where
    
    import Text.Megaparsec
    import qualified Data.Set as Set
    import Control.Monad.State.Lazy
    
    data MyState = MyState Int deriving (Ord, Eq, Show)
    data MyErrorComponent = MyErrorComponent (Maybe MyState) deriving (Ord, Eq, Show)
    
    instance ErrorComponent MyErrorComponent where
        representFail _ = MyErrorComponent Nothing 
        representIndentation _ _ _= MyErrorComponent Nothing 
    
    type Parser = StateT MyState (Parsec MyErrorComponent String)
    
    trackState :: Parser a -> Parser a
    trackState parser = do
        result <- observing parser -- run parser but don't fail right away
        case result of
            Right x -> return x -- if it succeeds we're done here
            Left ParseError {..} -> do
                state <- get -- read the current state to add it to the error component
                failure errorUnexpected errorExpected $
                    if Set.null errorCustom then Set.singleton (MyErrorComponent $ Just state) else errorCustom
    

    In the above snipped, observing functions a bit like a try/catch block that catches a parse error, then reads the current state and adds the it to the custom error component. The custom error component in turn is returned when runParser returns a ParseError.

    Here's a demonstration how this function could be used:

    a = trackState $ do
        put (MyState 6)
        string "foo"
    
    b = trackState $ do
        put (MyState 5)
        a
    
    main = putStrLn (show $ runParser (runStateT b (MyState 0)) "" "bar") 
    

    In reality you would probably want to do something more clever (for instance I imagine you could also add the entire stack of states you go through while traversing the stack).