
How to convert a generic type in Haskell into a String within a Monadic context


In the MVE code below I have tried to write a function collect that takes a RegModule monadic action, such as scanChar, as its argument. When scanChar (or any other RegModule) succeeds in scanning a char, as in the first branch of its case expression, collect should behave the same way: it should 'scan' the char as well and increment the associated i, as scanChar does. On top of that, it should also return a String, hence the return type RegModule d (String, a), where the String holds all of the Chars 'scanned' from the input string. This should apply not only to scanChar but more generally to other monadic functions built on RegModule, although as a start it would be fine if it only handled scanChar.

The problem is that when I try to return a String I get a type error, since the data d that I try to use in the function is not explicitly of that type. I have tried using show, but this requires changing the type signature of collect, which I would like to avoid. Any ideas on how to work around this without changing the type signature of any of the functions?

import qualified Data.Set as S

import Control.Monad

type CharSet = S.Set Char

data RE =
    RClass Bool CharSet

newtype RegModule d a =
  RegModule {runRegModule :: String -> Int -> d -> [(a, Int, d)]}

instance Monad (RegModule d) where
  return a = RegModule (\_s _i d -> return (a, 0, d))
  m >>= f =
    RegModule (\s i d -> do (a, j, d') <- runRegModule m s i d
                            (b, j', d'') <- runRegModule (f a) s (i + j) d'
                            return (b, j + j', d''))

instance Functor (RegModule d) where fmap = liftM
instance Applicative (RegModule d) where pure = return; (<*>) = ap

scanChar :: RegModule d Char
scanChar = RegModule (\s i d ->
  case drop i s of
    (c:cs) -> return (c, i+1, d)
    [] -> []
  )

regfail :: RegModule d a
regfail = RegModule (\_s _i _d -> [])

regEX :: RE -> RegModule [String] ()
regEX (RClass b cs) = do
  next <- scanChar  
  if (S.member next cs)
    then return ()
    else regfail

fetchData :: RegModule d d
fetchData = RegModule (\_s _i d -> [(d, 0, d)])

collect :: RegModule d a -> RegModule d (String, a)
collect m = do
  a <- m
  consumed <- fetchData
  let consumedStr = show consumed  -- type error: needs Show d, which the signature does not provide
  return (consumedStr, a)


 
runRegModuleThrice :: RegModule d a -> String -> Int -> d -> [(a, Int, d)]
runRegModuleThrice matcher input startPos state =
  let (result1, pos1, newState1) = head $ runRegModule matcher input startPos state
      (result2, pos2, newState2) = head $ runRegModule matcher input pos1 newState1
      (result3, pos3, newState3) = head $ runRegModule matcher input pos2 newState2
  in [(result1, pos1, newState1), (result2, pos2, newState2), (result3, pos3, newState3)]


Solution

  • Your monad seems a little buggy. In the signature:

    String -> Int -> d -> [(a, Int, d)]
    

    the implementation of >>= suggests that the String is the full, constant input string, the first Int is an offset into that String, and the second Int is the number of characters read by the operation, not the new offset. In particular, when the computation on the left-hand side of >>= starts at offset i and returns a count of characters scanned j, the computation on the right-hand side is run starting at offset i+j, not at offset j.

    However, your scanChar implementation doesn't appear to match this convention, since it starts scanning at offset i and then returns the new offset i+1 instead of the number of characters read, which should just be 1.

    The reason I bring all this up is that you probably want collect m to run m and then use the offset and number of characters scanned by m to directly extract the scanned substring and add it to the return value, something like:

    collect :: RegModule d a -> RegModule d (String, a)
    collect m = RegModule $ \s i d -> do
      (a, j, d') <- runRegModule m s i d
      pure ((take j (drop i s), a), j, d')
    

    In order for this to work with your scanChar, the definition will need to be fixed:

    scanChar :: RegModule d Char
    scanChar = RegModule (\s i d ->
      case drop i s of
        (c:cs) -> return (c, 1, d)   -- return "1" char scanned
        [] -> []
      )
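
To see the difference between the two conventions, here is a minimal self-contained sketch that puts the pieces above together. The names scanCharOld, twoChars, oldResult and fixedResult are hypothetical helpers introduced only for this illustration, and the monad instance is written with pure in Applicative rather than return in Monad, which should behave the same as the original on a recent GHC.

```haskell
import Control.Monad (ap, liftM)

-- Same shape as in the question: input string, start offset and state in;
-- a list of (result, number of characters consumed, new state) out.
newtype RegModule d a =
  RegModule { runRegModule :: String -> Int -> d -> [(a, Int, d)] }

instance Functor (RegModule d) where fmap = liftM

instance Applicative (RegModule d) where
  pure a = RegModule (\_s _i d -> [(a, 0, d)])  -- consumes nothing
  (<*>) = ap

instance Monad (RegModule d) where
  m >>= f =
    RegModule (\s i d -> do
      (a, j, d')   <- runRegModule m s i d
      (b, j', d'') <- runRegModule (f a) s (i + j) d'
      return (b, j + j', d''))

-- The question's version: puts the new offset i+1 in the "count" slot.
scanCharOld :: RegModule d Char
scanCharOld = RegModule (\s i d ->
  case drop i s of
    (c:_) -> [(c, i + 1, d)]
    []    -> [])

-- The corrected version: reports 1 character consumed.
scanChar :: RegModule d Char
scanChar = RegModule (\s i d ->
  case drop i s of
    (c:_) -> [(c, 1, d)]
    []    -> [])

-- Run m, then slice the consumed substring out of the input.
collect :: RegModule d a -> RegModule d (String, a)
collect m = RegModule $ \s i d -> do
  (a, j, d') <- runRegModule m s i d
  pure ((take j (drop i s), a), j, d')

-- Hypothetical helper: run the same scanner twice in sequence.
twoChars :: RegModule d Char -> RegModule d (Char, Char)
twoChars p = do
  x <- p
  y <- p
  return (x, y)

-- Starting at offset 1, the old scanner skips 'b' and reports 6 characters.
oldResult :: [((Char, Char), Int, ())]
oldResult = runRegModule (twoChars scanCharOld) "xabcd" 1 ()
-- [(('a','c'),6,())]

-- The fixed scanner reads 'a' and 'b'; collect returns the substring "ab".
fixedResult :: [((String, (Char, Char)), Int, ())]
fixedResult = runRegModule (collect (twoChars scanChar)) "xabcd" 1 ()
-- [(("ab",('a','b')),2,())]
```

Note that collect never touches the state d, so no Show constraint is needed and the type signature stays as you wrote it. It also works for any RegModule, not just scanChar, as long as every primitive reports the number of characters it consumed.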