Considering the Haskell code:
do -- it's IO !!!
...
!x <- someFunction
... usage of x
How useful is ! here? x is the result of a monadic (IO!) evaluation, so my questions are:
If x is a primitive type like UTCTime (getCurrentTime for example), Int, Bool - does ! make sense?
If x has a "head", like String, Map k v, or maybe some record, can x then be lazy? How lazy? Does it mean that the real IO happens only when I use x later?
Does the IO monad have some more specific behavior in this context? For example, is it a little bit more strict than other (more "pure") monads?
By default I/O is lazy, so how does Haskell know when to call it immediately (when passing through this expression) and when to substitute it with a thunk? By the type of x? Something like: GHC has a list of primitive types, and everything outside the list is substituted?
Any explanations are appreciated.
Unfortunately, there simply isn't any possible blanket answer. In general, you must know the implementation of someFunction -- and details about how x is used -- to decide whether forcing x is a good idea.
Some commentary:
"By default I/O is lazy" is either extremely misleading or straight up wrong depending on what you meant by it. In m >>= f
, the effects of m always* happen before f is called with an argument.
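To make the ordering concrete, here is a minimal sketch. The names noisyAction and effectsThenContinuation are made up for illustration, and the IORef log is just one way to observe the order. The action's effect is recorded before the continuation runs, even though the value it returns is a thunk that is never forced.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)

-- Hypothetical action: records that it ran, then returns a thunk
-- that would throw if anybody ever forced it.
noisyAction :: IORef [String] -> IO Int
noisyAction logRef = do
  modifyIORef' logRef ("effect of m" :)
  pure (error "this thunk is never forced")

-- Runs noisyAction, ignores its result, and returns the log in
-- chronological order.
effectsThenContinuation :: IO [String]
effectsThenContinuation = do
  logRef <- newIORef []
  _unused <- noisyAction logRef       -- effect happens here; result stays a thunk
  modifyIORef' logRef ("inside f" :)  -- the continuation runs afterwards
  reverse <$> readIORef logRef
```

Running effectsThenContinuation in GHCi should give ["effect of m","inside f"]: the effect is sequenced strictly, while the returned value stays lazy.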
The type returned by an action isn't enough to decide whether forcing is worthwhile or not. For example, one could write:
badTriangle :: Int -> Int
badTriangle n = foldr (+) 0 [0..n]
ioTriangle :: Int -> IO Int
ioTriangle = pure . badTriangle
Although Int is a primitive type, do !x <- ioTriangle n; ... behaves very differently from do x <- ioTriangle n; ... : the first runs the whole fold before continuing, while the second merely binds x to an unevaluated thunk. A particularly relevant and somewhat common example here is the difference between getLine <&> read, which throws exceptions when its value is used (potentially much later in the program), and getLine >>= readIO (aka readLn), which throws exceptions when the bad value is entered.
In the other direction, getLine by itself returns a non-primitive type, but forcing its result does nothing relevant.
Knowing whether your action returns a thunk or not is not enough to decide whether forcing is a good idea. In some cases, you may know that the value will eventually be used and that the resulting value uses less memory than the thunk used to build it; then forcing is a good idea. But either of those conditions may fail. You may have a program which never uses the resulting value; then forcing immediately may require a lot of computation that never ends up being useful. Or you may have a very small thunk that describes a very large value; then forcing immediately may require a large allocation now that could have been delayed to a better time.
However, the other direction of reasoning is okay: if you know your action does not return a thunk, then you do not need to force its result.
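On the "how lazy?" part of the question, it may help to remember that a bang pattern only evaluates to weak head normal form, i.e. up to the outermost constructor. The sketch below uses a made-up action, partlyBad, whose list has a failing element; bangOnly and deepForce are likewise hypothetical names.

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical action returning a list whose second element is a failing thunk.
partlyBad :: IO [Int]
partlyBad = pure (1 : error "second element" : [])

-- The bang forces only the outermost (:) constructor, and length
-- walks the spine without looking at the elements.
bangOnly :: IO Int
bangOnly = do
  !xs <- partlyBad
  pure (length xs)

-- sum has to evaluate every element, so the hidden error surfaces
-- inside evaluate and is caught by try.
deepForce :: IO (Either SomeException Int)
deepForce = try $ do
  xs <- partlyBad
  evaluate (sum xs)
```

Here bangOnly returns 2 without trouble, while deepForce returns a Left. So for String, Map k v, or a record with lazy fields, !x alone says little about what is still unevaluated inside.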
The whole story gets even more interesting and complicated when you introduce threads. You can pass thunks between threads and control which thread forces them, and this is sometimes useful. So even knowing that a value will eventually be used and is small is no longer enough -- you may want to leave it unevaluated anyway and foist that work onto some other thread.
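A rough sketch of handing a thunk to another thread, reusing badTriangle from above. The name forceElsewhere is made up, and using an MVar to get the result back is just one possible choice.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)

-- Same deliberately lazy sum as badTriangle earlier in the answer.
badTriangle :: Int -> Int
badTriangle n = foldr (+) 0 [0..n]

-- Builds the thunk in the calling thread and lets a worker thread force it.
-- Caveat: if evaluation throws in the worker, takeMVar never gets a value.
forceElsewhere :: Int -> IO Int
forceElsewhere n = do
  let thunk = badTriangle n    -- nothing is computed here
  done <- newEmptyMVar
  _ <- forkIO $ do
    v <- evaluate thunk        -- the worker pays for the evaluation
    putMVar done v
  takeMVar done                -- the value arrives already evaluated
```

For example, forceElsewhere 100 yields 5050, with the fold having run on the forked thread rather than on the caller.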
* Okay, okay, there are a few exceptions, like readFile and getContents, all of which use unsafeInterleaveIO somewhere in their implementation. But this is not the normal situation.
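For the curious, the deferral can be seen directly with unsafeInterleaveIO. This is only an illustration of the mechanism, not how readFile is actually written; deferredEffect is a hypothetical name.

```haskell
import Control.Exception (evaluate)
import Data.IORef (newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Returns whether the deferred effect had run before and after
-- the result was forced.
deferredEffect :: IO (Bool, Bool)
deferredEffect = do
  ranRef <- newIORef False
  -- The write is postponed until x is demanded.
  x <- unsafeInterleaveIO (writeIORef ranRef True >> pure (42 :: Int))
  before <- readIORef ranRef
  _ <- evaluate x              -- demanding x triggers the deferred effect
  after <- readIORef ranRef
  pure (before, after)
```

deferredEffect should return (False,True): this is the one situation where using (or banging) the result really does decide when the I/O happens.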