functional-programmingschemelispsicplazy-sequences

Space complexity of streams in Scheme

I am reading Structure and Interpretation of Computer Programs (SICP) and would like to make sure that my thinking is correct.

Consider the following simple stream using the recursive definition:

``````(define (integers-starting-from n)
(cons-stream n (integers-starting-from (+ n 1))))

(define ints (integers-starting-from 1))

(car (cdr-stream (cdr-stream (cdr-stream (cdr-stream ints)))))
``````

If we adopt the implementation in SICP, whenever we `cons-stream`, we are effectively consing a variable and a lambda function (for delayed evaluation). So as we `cdr-stream` along this stream, nested lambda functions are created and a chain of frames is stored for the evaluation of lambda functions. Those frames are necessary since lambda functions evaluate expressions and find them in the enclosing frame. Therefore, I suppose that in order to evaluate the n-th element of the stream, you need to store n extra frames that take up linear space.

This is different from the behavior of iterators in other languages. If you need to go far down the stream, much space will be taken. Of course, it is possible to only keep the direct enclosing frame and throw away all the other ancestral frames. Is this what the actual scheme implementation does?

Solution

• "nested lambda functions are created"

nope. There is no nested scope. In

``````(define integers-starting-from
(lambda (n)
(cons-stream n
(integers-starting-from (+ n 1)))))
``````

definition, in the `(integers-starting-from (+ n 1))` call, `integers-starting-from` is a function and `(+ n 1)` is its argument. So when that call is made, the value of `(+ n 1)` is found first and only then the execution enters the `integers-starting-from` definition body. Thus there's no more reference to `n` being held by anything, at that point.

Scheme is an eager programming language, not a lazy one.

So the lambda inside the result of `cons-stream` holds a reference to the call frame at first, yes, but there is no nesting of environments in the tail of that stream after that tail is produced. The value of `(+ n 1)` is already obtained, before the new lambda is created and returned as part of the next `cons` cell representing the stream's next state, i.e. the stream's tail.

Unfolding the calls to lambda functions as the equivalent `let` expressions, we get

``````(define ints (integers-starting-from 1))
=
(define ints (let ((n 1))
(cons-stream n (integers-starting-from (+ n 1)))))
=
(define ints (let ((n 1))
(cons n (lambda () (integers-starting-from (+ n 1))))))
``````

and the call proceeds as

``````(car (cdr-stream (cdr-stream ints)))
=
(let* ((ints       (let ((n 1))
(cons n
(lambda ()
(integers-starting-from (+ n 1))))))
(cdr-ints   ((cdr ints)))
(cddr-ints  ((cdr cdr-ints)))
(res        (car cddr-ints)))
res)
=
(let* ((ints       (let ((n 1))
(cons n
(lambda ()
(integers-starting-from (+ n 1))))))
(cdr-ints   ((cdr ints))
=
((let ((n 1))
(lambda () (integers-starting-from (+ n 1)))))
=
(let ((n 1))
((lambda () (integers-starting-from (+ n 1)))))
=
(let ((n 1))
(integers-starting-from (+ n 1)))
=
(integers-starting-from 2)   ;; args before calls!
=
(let ((n 2))
(cons n
(lambda ()
(integers-starting-from (+ n 1))))))
(cddr-ints  ((cdr cdr-ints)))
(res        (car cddr-ints)))
res)
=
(let* ((ints       (let ((n 1))
(cons n            ;; inde-
(lambda ()
(integers-starting-from (+ n 1))))))
(cdr-ints   (let ((n 2))
(cons n            ;;      pen-
(lambda ()
(integers-starting-from (+ n 1))))))
(cddr-ints  (let ((n 3))
(cons n            ;;           dent
(lambda ()
(integers-starting-from (+ n 1))))))
(res        (car cddr-ints)))
res)
=
(let* ((res        (let ((n 3))
n)))
res)
=
3
``````

So there's no nested lambdas here. Not even a chain of lambdas, because the implementation is non-memoizing. The values for `cdr-ints` and `cddr-ints` are ephemeral, liable to be garbage-collected while the 3rd element is being calculated. Nothing holds any reference to them.

Thus getting the nth element is done in constant space modulo garbage, since all the interim O(n) space entities are eligible to be garbage collected.

In (one possible) memoizing implementation, each `lambda` would be actually replaced by its result in the `cons` cell, and there'd be a chain of three -- still non-nested -- lambdas, congruent to an open-ended list

``````(1 . (2 . (3 . <procedure-to-go-next>)))
``````

In programs which do not hold on to the top entry of such chains, all the interim `cons`es would be eligible for garbage collection as well.

One such example, even with the non-memoizing SICP streams, is the sieve of Eratosthenes. Its performance characteristics are consistent with no memory retention of the prefix portions of its internal streams.