haskell functional-programming monad-transformers state-monad io-monad

About the choice of where to apply the monad parameter of a monad transformer

Take the MaybeT monad transformer:

newtype MaybeT m a = MaybeT { runMaybeT :: m (Maybe a) }

I wouldn't have expected a different definition for it, because Maybe is just a kind of box with (optional) content in it (of type a above), so what could MaybeT do with the parameter m, if not wrap the whole Maybe in it? Writing Maybe (m a) would have made no sense, as that's just the type of someaction <$> theMaybe.
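To make the comparison concrete, here is a minimal sketch with m = IO. The function describe and the values mapped and wrapped are hypothetical names chosen for illustration; MaybeT is redefined locally so that only base is needed.

```haskell
-- Local copy of the definition above, so the sketch needs only base.
newtype MaybeT m a = MaybeT { runMaybeT :: m (Maybe a) }

-- Hypothetical effectful function, standing in for "someaction".
describe :: Int -> IO String
describe n = pure ("got " ++ show n)

-- Mapping an action over a Maybe already gives Maybe (m a);
-- no transformer is needed for this shape.
mapped :: Maybe (IO String)
mapped = describe <$> Just 3

-- The shape MaybeT wraps is m (Maybe a): the effect runs first,
-- and the Maybe is part of its result.
wrapped :: MaybeT IO String
wrapped = MaybeT (traverse describe (Just 3))
```

Note that traverse is what turns the Maybe (IO String) shape into the IO (Maybe String) shape that MaybeT expects.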

But in other cases, there could have been other choices.

Take the StateT monad transformer, defined as

newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }

This definition, if I look at it in light of the State monad's definition (as it would be if it were not defined on top of StateT via Identity, clearly),

newtype State s a = State { runState :: s -> (a, s) }

seems to be in line with the interpretation of "a function a -> b as a container with content b", which is how I originally understood the meaning of the Functor instance of (->) r. So fair enough, m is wrapping the "content" of the function (i.e. the "content" of the stateful computation).

But this is (or seems to me!) quite different from the case of MaybeT, where m does not wrap the inside of the Maybe, which is a, but the whole Maybe a.

If StateT had to resemble MaybeT (in the sense of using m to wrap the whole thing), it would have been like this

newtype StateT' s m a = StateT' { runStateT' :: m (s -> (a,s)) }

About this, some thoughts come to mind:

What would have been wrong with it?
If nothing would have been flat-out wrong, would such a monad transformer have been of little help?
What I see wrong in it is that it feels too unrestricted: running such a monad transformer on top of IO would mean that one can pick a whole arbitrary stateful computation, s -> (a, s), out of the IO monad, whereas if the actual StateT is stacked on top of IO, IO can only be used to get the new state and the result a. Still, I'm not sure I understand the difference fully.

And finally, what about this?

newtype StateT'' s m a = StateT'' { runStateT'' :: s -> (m a,s) }

This feels a bit useless, because... it feels like the stateful computation makes no sense, because, again with the example of m being IO, a will come from IO, but... Oh, I really don't understand.

The point is that I have experience of the usefulness of StateT as it is, but I don't really understand the reasons why it is useful, and other alternatives wouldn't have been. Or would they?

Solution

To see what can go wrong, take m = IO and think about which effects can be represented by your types, and which cannot.

newtype StateT' s m a = StateT' { runStateT' :: m (s -> (a,s)) }

Well, this becomes IO (s -> (a,s)), so it performs the IO effects before reading the current state to produce the new state (and the result a).

So, this cannot express the computation "read the current state and print it": the printing would have to happen before the state is available. Not that useful.
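A minimal sketch of the difference, with both types redefined locally so that only base is needed. The names printState and greetThenGet are hypothetical, chosen for illustration.

```haskell
-- The actual definition from the question.
newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }

-- The first alternative from the question.
newtype StateT' s m a = StateT' { runStateT' :: m (s -> (a, s)) }

-- With StateT, the IO action is built from the current state,
-- so "read the current state and print it" is easy to write.
printState :: Show s => StateT s IO ()
printState = StateT $ \s -> do
  print s
  pure ((), s)

-- With StateT', the IO part runs before any state is supplied.
-- The effect below is fixed in advance; the function it returns
-- is pure and can no longer perform IO.
greetThenGet :: StateT' s IO s
greetThenGet = StateT' $ do
  putStrLn "effects happen here, before the state is known"
  pure (\s -> (s, s))
```

In greetThenGet there is no place where s is in scope and IO is still available, which is exactly why printState has no counterpart for StateT'.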

newtype StateT'' s m a = StateT'' { runStateT'' :: s -> (m a,s) }

Here we get s -> (IO a,s). From this it's easy to obtain a function s -> s that computes the new state in a pure way, without running the IO action at all. This means that the IO cannot affect the new state.

So, this cannot express the computation "ask the user for keyboard input, and change the state depending on that". Also not that useful.
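The pure state transition can be extracted directly, as the following sketch shows. The names nextState and askAndBump are hypothetical; the type is redefined locally so that only base is needed.

```haskell
-- The second alternative from the question.
newtype StateT'' s m a = StateT'' { runStateT'' :: s -> (m a, s) }

-- The new state is recovered without running any action of m.
nextState :: StateT'' s m a -> s -> s
nextState t = snd . runStateT'' t

-- Hypothetical example: the action reads a line, but the new state
-- is already fixed as s + 1 before that line is ever read.
askAndBump :: StateT'' Int IO String
askAndBump = StateT'' $ \s -> (getLine, s + 1)
```

Since nextState never touches the m a component, whatever the user types in askAndBump has no way of influencing the state.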

On top of these concerns, we should note that the type we get by placing m in random places could even fail to be a monad! I am not sure we can define a sensible >>= for the types you propose. Attempting that could be a nice exercise, which might convince you that it cannot really work.
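As a starting point for that exercise, here is a sketch for StateT'. Functor can be written without trouble, but the obvious attempt at >>= gets stuck; this is not a proof that no lawful instance exists for any m, only an illustration of where the straightforward approach fails. The name bindAttempt is hypothetical, and it is left in comments because it does not typecheck.

```haskell
newtype StateT' s m a = StateT' { runStateT' :: m (s -> (a, s)) }

-- Mapping over the result works: fmap through m, then through the function.
instance Functor m => Functor (StateT' s m) where
  fmap g (StateT' mf) =
    StateT' (fmap (\f s -> let (a, s') = f s in (g a, s')) mf)

-- Attempted bind, kept as a comment because it cannot be completed this way:
--
-- bindAttempt :: Monad m
--             => StateT' s m a -> (a -> StateT' s m b) -> StateT' s m b
-- bindAttempt (StateT' mf) k = StateT' $ do
--   f <- mf
--   pure $ \s ->
--     let (a, s') = f s
--         next    = runStateT' (k a)  -- next :: m (s -> (b, s))
--     in _stuck  -- a (b, s) is needed here, but `next` lives in m and
--                -- cannot be run from inside this pure function
```

The value a only becomes available once s is known, and by then we are inside a pure function where the effects of k a can no longer be performed.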