I'm having difficulties in understanding Haskell lazy evaluation.
I wrote simple test program. It reads 4 lines of data and second and fourth input line has lots of numbers.
consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi" -- to generate heap debug
consumeList (x:xs) y = consumeList xs y
main = do
inputdata <- getContents
let (x:y:z:k:xs) = lines inputdata
s = map (read ::String->Int) $ words $ k
t = []
print $ consumeList s t
words
and map
is performed
on streams of characters lazily, this program use constant memory.
But As I add arguments t
, situation is changed.
My expectation is since t
is map
and words
on lazy stream,
and t
is not used in consumeList
, this change should not alter
memory usage. But no.
consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi" -- to generate heap debug
consumeList (x:xs) y = consumeList xs y
main = do
inputdata <- getContents
let (x:y:z:k:xs) = lines inputdata
s = map (read ::String->Int) $ words $ k
t = map (read ::String->Int) $ words $ y
print $ consumeList s t -- <-- t is not used
Q1) Why does this program keeps allocating memory when t
is not used at all?
I have another question. When I pattern match lazy stream with [,]
, not
with (:)
memory allocation behaviour is changed.
consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi" -- to generate heap debug
consumeList (x:xs) y = consumeList xs y
main = do
inputdata <- getContents
let [x,y,z,k] = lines inputdata -- <---- changed from (x:y:..)
s = map (read ::String->Int) $ words $ k
t = []
print $ consumeList s t
Q2) are (:)
and [,]
different in terms of lazy evalutation?
Any comments are welcomed. Thanks
[EDIT]
Q3) Then, is it possible to process 4th line first and the process 2nd line, without increasing memory consumption?
The experiment guided by Derek is as follows. With switching y and k from second example, I got same result:
consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi"
consumeList (x:xs) y = consumeList xs y
main = do
inputdata <- getContents
let (x:y:z:k:xs) = lines inputdata
s = map (read ::String->Int) $ words $ y -- <- swap with k
t = map (read ::String->Int) $ words $ k -- <- swap with y
print $ consumeList s t
To answer your first question, t
is live as far as the garbage collector is concerned until you reach the end of the consumeList
. That wouldn't be such a big deal since t
would be a thunk pointing at work to do, but the problem here is that thunk is now keeping y
alive and getContents
has to actually read in y
to get to k
. In your first example, y
could be garbage collected as it was being read in. (As an experiment, if you switched y
and k
in this example, I suspect you'd see behavior very much like your first example.)
For your second question, let [x,y,z,k] = ...
means "(irrefutably) match a list of exactly four elements". That means when you force k
it needs to (at that point) check that there are no further elements, which means it needs to read in all of the input corresponding to k
before it can start processing it. In the earlier case, let (x:y:z:k:xs) = ...
it can start processing k
immediately because it doesn't have to first check whether xs
is []
or (_:_)
.