Somewhat mystified by the following code. In non-toy version of the problem I'm trying to do a monadic computation in a monad Result, the values of which can only be constructed from within IO. Seems like the magic behind IO makes such computations strict, but I can't figure out how exactly that happens.
The code:
data Result a = Result a | Failure deriving (Show)
instance Functor Result where
fmap f (Result a) = Result (f a)
fmap f Failure = Failure
instance Applicative Result where
pure = return
(<*>) = ap
instance Monad Result where
return = Result
Result a >>= f = f a
Failure >>= _ = Failure
compute :: Int -> Result Int
compute 3 = Failure
compute x = traceShow x $ Result x
compute2 :: Monad m => Int -> m (Result Int)
compute2 3 = return Failure
compute2 x = traceShow x $ return $ Result x
compute3 :: Monad m => Int -> m (Result Int)
compute3 = return . compute
main :: IO ()
main = do
let results = mapM compute [1..5]
print $ results
results2 <- mapM compute2 [1..5]
print $ sequence results2
results3 <- mapM compute3 [1..5]
print $ sequence results3
let results2' = runIdentity $ mapM compute2 [1..5]
print $ sequence results2'
The output:
1
2
Failure
1
2
4
5
Failure
1
2
Failure
1
2
Failure
Nice test cases. Here's what's happening:
in mapM compute
we see laziness at work, as usual. No surprise here.
in mapM compute2
we work inside the IO monad, whose mapM
definition will demand the whole list: unlike Result
which skips the tail of the list as soon as Failure
is found, IO
will always scan the whole list. Note the code:
compute2 x = traceShow x $ return $ Result x
So, the above wil print the debug message as soon as each element of the list of IO actions is accessed. All are, so we print everything.
in mapM compute3
we now use, roughly:
compute3 x = return $ traceShow x $ Result x
Now, since return
in IO is lazy, it will not trigger the traceShow
when returning the IO action. So, when mapM compute3
is run, no message is seen. Instead, we see messages only when sequence results3
is run, which forces the Result
-- not all of them, but only as much as needed.
the final Identity
example is also quite tricky. Note this:
> newtype Id1 a = Id1 a
> data Id2 a = Id2 a
> Id1 (trace "hey!" True) `seq` 42
hey!
42
> Id2 (trace "hey!" True) `seq` 42
42
when using a newtype
, at runtime there is no boxing/unboxing (AKA lifting) involved, so forcing a Id1 x
value causes x
to be forced. With data
types this does not happen: the value is wrapped in a box (e.g. Id2 undefined
is not equivalent to undefined
).
In your example, you add an Identity
constructor, but that is from the newtype Identity
!! So, when calling
return $ traceShow x $ Result x
the return
here does not wrap anything, and the traceShow
is immediately triggered as soon as mapM
is run.