haskelllazy-evaluationspace-leak

Confused: Haskell IO Laziness


I'm having difficulties in understanding Haskell lazy evaluation.

I wrote simple test program. It reads 4 lines of data and second and fourth input line has lots of numbers.

consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi" -- to generate heap debug
consumeList (x:xs) y = consumeList xs y   
main = do
    inputdata <- getContents
    let (x:y:z:k:xs) = lines inputdata
        s = map (read ::String->Int) $ words $ k
        t = []
    print $ consumeList s t

words and map is performed on streams of characters lazily, this program use constant memory. constant memory usage

But As I add arguments t, situation is changed. My expectation is since t is map and words on lazy stream, and t is not used in consumeList, this change should not alter memory usage. But no.

consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi" -- to generate heap debug
consumeList (x:xs) y = consumeList xs y
main = do
    inputdata <- getContents
    let (x:y:z:k:xs) = lines inputdata
        s = map (read ::String->Int) $ words $ k
        t = map (read ::String->Int) $ words $ y
    print $ consumeList s t    -- <-- t is not used

memory is increasing

Q1) Why does this program keeps allocating memory when t is not used at all?

I have another question. When I pattern match lazy stream with [,], not with (:) memory allocation behaviour is changed.

consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi" -- to generate heap debug
consumeList (x:xs) y = consumeList xs y   
main = do
    inputdata <- getContents
    let [x,y,z,k] = lines inputdata    -- <---- changed from (x:y:..)
        s = map (read ::String->Int) $ words $ k
        t = []
    print $ consumeList s t

memory keeps increasing

Q2) are (:) and [,] different in terms of lazy evalutation?

Any comments are welcomed. Thanks

[EDIT]

Q3) Then, is it possible to process 4th line first and the process 2nd line, without increasing memory consumption?

The experiment guided by Derek is as follows. With switching y and k from second example, I got same result:

consumeList :: [Int] -> [Int] -> [Int]
consumeList [] _ = error "hi"
consumeList (x:xs) y = consumeList xs y
main = do
    inputdata <- getContents
    let (x:y:z:k:xs) = lines inputdata
        s = map (read ::String->Int) $ words $ y  -- <- swap with k
        t = map (read ::String->Int) $ words $ k  -- <- swap with y
    print $ consumeList s t

enter image description here


Solution

  • To answer your first question, t is live as far as the garbage collector is concerned until you reach the end of the consumeList. That wouldn't be such a big deal since t would be a thunk pointing at work to do, but the problem here is that thunk is now keeping y alive and getContents has to actually read in y to get to k. In your first example, y could be garbage collected as it was being read in. (As an experiment, if you switched y and k in this example, I suspect you'd see behavior very much like your first example.)

    For your second question, let [x,y,z,k] = ... means "(irrefutably) match a list of exactly four elements". That means when you force k it needs to (at that point) check that there are no further elements, which means it needs to read in all of the input corresponding to k before it can start processing it. In the earlier case, let (x:y:z:k:xs) = ... it can start processing k immediately because it doesn't have to first check whether xs is [] or (_:_).