In the MVE code below I have tried to write a function collect that takes a RegModule monadic action, such as scanChar, as its argument. When scanChar (or another RegModule action) succeeds in scanning a char, as seen in its case branch, collect should behave the same way: it should 'scan' the char as well and increment the associated i, as scanChar does. On top of the behavior of scanChar, however, it should also return a string, hence the return type RegModule d (String, a), where the String holds all of the Chars 'scanned' from the input string. This should apply not only to scanChar but more generally to other monadic functions using RegModule; as a start, though, an implementation that only takes scanChar into account would be fine.
The problem is that when I try to return a string I get a type error, since the data d' that I try to use in the function is not explicitly of that type. I have tried using show, but this requires changing the type signature of collect, which I would like to avoid. Any ideas about how to work around this without changing the type signature of any of the functions?
import qualified Data.Set as S
import Control.Monad

type CharSet = S.Set Char

data RE =
  RClass Bool CharSet

newtype RegModule d a =
  RegModule {runRegModule :: String -> Int -> d -> [(a, Int, d)]}

instance Monad (RegModule d) where
  return a = RegModule (\_s _i d -> return (a, 0, d))
  m >>= f =
    RegModule (\s i d -> do (a, j, d') <- runRegModule m s i d
                            (b, j', d'') <- runRegModule (f a) s (i + j) d'
                            return (b, j + j', d''))

instance Functor (RegModule d) where fmap = liftM
instance Applicative (RegModule d) where pure = return; (<*>) = ap

scanChar :: RegModule d Char
scanChar = RegModule (\s i d ->
  case drop i s of
    (c:cs) -> return (c, i+1, d)
    [] -> []
  )

regfail :: RegModule d a
regfail = RegModule (\_s _i d -> [])

regEX :: RE -> RegModule [String] ()
regEX (RClass b cs) = do
  next <- scanChar
  if (S.member next cs)
    then return ()
    else regfail

fetchData :: RegModule d d
fetchData = RegModule (\_s _i d -> [(d, 0, d)])

collect :: RegModule d a -> RegModule d (String, a)
collect module = do
  a <- module
  consumed <- fetchData
  let consumedStr = (show consumed)
  return (consumedStr, a)

runRegModuleThrice :: RegModule d a -> String -> Int -> d -> [(a, Int, d)]
runRegModuleThrice matcher input startPos state =
  let (result1, pos1, newState1) = head $ runRegModule matcher input startPos state
      (result2, pos2, newState2) = head $ runRegModule matcher input pos1 newState1
      (result3, pos3, newState3) = head $ runRegModule matcher input pos2 newState2
  in [(result1, pos1, newState1), (result2, pos2, newState2), (result3, pos3, newState3)]

Your monad seems a little buggy. In the signature:
String -> Int -> d -> [(a, Int, d)]
the implementation of >>= suggests that the String is the full, constant input string, the first Int is an offset into that String, and the second Int is the number of characters read by the operation, not the new offset. In particular, when the computation on the left-hand side of >>= starts at offset i and returns a count of scanned characters j, the computation on the right-hand side is run starting at offset i+j, not j.
However, your scanChar implementation doesn't appear to match this behavior of >>=, since it starts scanning at offset i and then returns the new offset i+1 instead of the number of characters read, which should just be 1.
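To make the mismatch concrete, here is a minimal sketch that runs two scans in sequence with both conventions. The monad is copied from the question, except that pure is defined directly in the Applicative instance. The names scanCharOffset, scanCharCount and twoChars are hypothetical and exist only for this comparison.

```haskell
import Control.Monad (ap, liftM)

-- Monad from the question; >>= treats the Int in each result as a count.
newtype RegModule d a =
  RegModule { runRegModule :: String -> Int -> d -> [(a, Int, d)] }

instance Functor (RegModule d) where fmap = liftM
instance Applicative (RegModule d) where
  pure a = RegModule (\_s _i d -> [(a, 0, d)])
  (<*>) = ap
instance Monad (RegModule d) where
  m >>= f = RegModule (\s i d -> do
    (a, j, d')   <- runRegModule m s i d
    (b, j', d'') <- runRegModule (f a) s (i + j) d'
    return (b, j + j', d''))

-- Hypothetical name: the question's version, which returns the new offset i + 1.
scanCharOffset :: RegModule d Char
scanCharOffset = RegModule (\s i d ->
  case drop i s of
    (c:_) -> [(c, i + 1, d)]
    []    -> [])

-- Hypothetical name: returns the number of characters read, which is 1.
scanCharCount :: RegModule d Char
scanCharCount = RegModule (\s i d ->
  case drop i s of
    (c:_) -> [(c, 1, d)]
    []    -> [])

-- Hypothetical helper: run the same scanner twice in a row.
twoChars :: RegModule d Char -> RegModule d (Char, Char)
twoChars p = do
  x <- p
  y <- p
  return (x, y)
```

Starting at offset 1 of "abc", runRegModule (twoChars scanCharCount) "abc" 1 () gives [(('b','c'),2,())], while the same call with scanCharOffset gives [] because the second scan is started at offset 1+2=3, past the end of the input. Starting at offset 0, the offset-returning version happens to succeed but reports 3 characters read instead of 2.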
The reason I bring all this up is that you probably want collect m to run m and then use the offset and the number of characters scanned by m to extract the scanned substring directly and add it to the return value, something like:
collect :: RegModule d a -> RegModule d (String, a)
collect m = RegModule $ \s i d -> do
  (a, j, d') <- runRegModule m s i d
  pure ((take j (drop i s), a), j, d')
In order for this to work with your scanChar, its definition will need to be fixed:
scanChar :: RegModule d Char
scanChar = RegModule (\s i d ->
  case drop i s of
    (c:cs) -> return (c, 1, d) -- return "1" char scanned
    [] -> []
  )
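Putting the pieces together, the following is a minimal self-contained sketch that can be loaded into GHCi to try collect out. The monad is the one from the question with pure defined directly in the Applicative instance; twoChars is a hypothetical helper added only to show that collect also works on composite actions.

```haskell
import Control.Monad (ap, liftM)

-- Monad from the question: the Int in each result is a count of characters read.
newtype RegModule d a =
  RegModule { runRegModule :: String -> Int -> d -> [(a, Int, d)] }

instance Functor (RegModule d) where fmap = liftM
instance Applicative (RegModule d) where
  pure a = RegModule (\_s _i d -> [(a, 0, d)])
  (<*>) = ap
instance Monad (RegModule d) where
  m >>= f = RegModule (\s i d -> do
    (a, j, d')   <- runRegModule m s i d
    (b, j', d'') <- runRegModule (f a) s (i + j) d'
    return (b, j + j', d''))

-- Fixed scanChar: reports one character read rather than the new offset.
scanChar :: RegModule d Char
scanChar = RegModule (\s i d ->
  case drop i s of
    (c:_) -> [(c, 1, d)]
    []    -> [])

-- Run m, then slice the j characters it consumed out of the input at offset i.
collect :: RegModule d a -> RegModule d (String, a)
collect m = RegModule $ \s i d -> do
  (a, j, d') <- runRegModule m s i d
  pure ((take j (drop i s), a), j, d')

-- Hypothetical helper: scan two characters in a row.
twoChars :: RegModule d (Char, Char)
twoChars = do
  x <- scanChar
  y <- scanChar
  return (x, y)
```

With these definitions, runRegModule (collect twoChars) "hello" 1 () evaluates to [(("el",('e','l')),2,())], and runRegModule (scanChar >> collect scanChar) "hello" 0 () evaluates to [(("e",'e'),2,())]. The signature of collect is unchanged and no Show constraint on d is needed, since the string comes from the input rather than from the state.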