streamschemelazy-evaluationsicphamming-numbers

# Trouble understanding / visualising SICP streams Hamming numbers program

I'm basically stuck at excercise 3.56 in SICP. The problem goes like this:

Exercise 3.56. A famous problem, first raised by R. Hamming, is to enumerate, in ascending order with no repetitions, all positive integers with no prime factors other than 2, 3, or 5. One obvious way to do this is to simply test each integer in turn to see whether it has any factors other than 2, 3, and 5. But this is very inefficient, since, as the integers get larger, fewer and fewer of them fit the requirement. As an alternative, let us call the required stream of numbers S and notice the following facts about it.

• S begins with 1.
• The elements of (scale-stream S 2) are also elements of S.
• The same is true for (scale-stream S 3) and (scale-stream 5 S).
• These are all the elements of S.

Now all we have to do is combine elements from these sources. For this we define a procedure merge that combines two ordered streams into one ordered result stream, eliminating repetitions:

``````(define (merge s1 s2)
(cond ((stream-null? s1) s2)
((stream-null? s2) s1)
(else
(let ((s1car (stream-car s1))
(s2car (stream-car s2)))
(cond ((< s1car s2car)
(cons-stream s1car (merge (stream-cdr s1) s2)))
((> s1car s2car)
(cons-stream s2car (merge s1 (stream-cdr s2))))
(else
(cons-stream s1car
(merge (stream-cdr s1)
(stream-cdr s2)))))))))
``````

Then the required stream may be constructed with merge, as follows:

`(define S (cons-stream 1 (merge <??> <??>)))`

Fill in the missing expressions in the places marked above.

Before this particular problem, I've been able to visualize and understand these implicit stream definitions using a signal processing block diagram with the original stream being fed back to the procedure.

But I've basically hit a wall with this particular problem, I've looked up the solution, but I'm finding it impossible to visualize how the solution would work in my head/paper.

Is there a trick for understanding and coming up with solutions for these sort of problems?

This is the solution that works:

``````(define S
(cons-stream 1 (merge (scale-stream S 2)
(merge (scale-stream S 3)
(scale-stream S 5)))))
``````

Solution

• As a matter of proper naming, `merge` shouldn't be removing duplicates, as its name suggests its being part of `mergesort` which ought to preserve them. `Union` is a better name for such operation, which sees sets represented (here) by increasing lists of unique numbers, which constraint it ought to preserve by removing the duplicates which can only come from both of its arguments.

Back to the problem itself, let's write it symbolically as

`S235 = {1} ∪ 2S235 ∪ 3S235 ∪ 5*S235`

Premature implementation is the mother of all evil! (wait, what?) We won't even yet try to establish how exactly those `∪`s do their job, not even in which order. Or even how many of the terms there are there:

`S23 = {1} ∪ 2S23 ∪ 3S23`

or even

`S2 = {1} ∪ 2*S2`

Now this looks simple enough. We can even fake-implement the union of `A` and `B` here simply as, first, taking all the elements of `A`, and then -- of `B`. And it will work just fine, here, because there's only one element in this `∪`'s left input:

`````` {1} ----∪-->--->--S₂--.--->S₂
/               \
\______*2_______/
---<----<---
``````

How does this work? `1` enters the `∪` combiner, exits it first, unconditionally (NB this discovered requirement is important, for if `∪` had to examine both of its arguments right away we'd get ourselves an infinite loop, a black hole in Haskell argot), is split in two by the `.` splitter, then the first copy of `1` continues forth to the output point while the second copy of `1` goes back through the `*2` multiplier, the resulting `2` enters back the `∪` this time on the right, unopposed by anything on the left (which is at this time already empty), and continues on in the same fashion so `2` goes to the output point, then `4`, then `8`, etc. etc..

To put it differently, `S₂` contains all elements of `{1}`; plus all elements of `{1}` that went through the `*2` multiplier once; and twice; and three times; and so on and so forth -- all the powers of 2 in increasing order:

``````S2 = {1} ∪ 2*{1} ∪ 2*2*{1}                ;; == {1, 2, 4, 8, 16, 32, ...}
∪ 2*2*2*{1}
∪ 2*2*2*2*{1}
∪ ..........
``````

The two `S₂`'s in the diagram are the same because whatever we siphon from it at the splitter point does not affect it.

Wasn't this fun?

So how do we go about adding the multiples of `3` to it? One way to do it is

`S23 = S2 ∪ 3*S23`

`````` {1} ----∪-->--->--S₂--.---S₂----∪-->--->--S₂₃--.--->S₂₃
/               \       /                \
\______*2_______/       \______*3________/
---<----<---            ---<----<---
``````

Here `1` from `S₂` enters the second `∪` combiner and proceeds to the output point `S₂₃` as well as back through the `*3` multiplier, turning into `3`. Now the second `∪` has `2,4,8,...` and `3,...` as its inputs; `2` goes through as well as turning into `6`. Next, `∪` has `4,8,16,...` and `3,6,...`; `3` goes through. Next, `4`; etc., etc., and so on and so forth.

Thus all elements of `S₂` are part of `S₂₃`, but so are also all elements of `S₂` that went through the `*3` multiplier once, and twice, etc., -- all the powers of 2 and 3 multiplied together, in increasing order:

``````S23 = S2 ∪ 3*S2 ∪ 3*3*S2                   ;; = S2 ∪ 3*( S2 ∪ 3*S2
∪ 3*3*3*S2                 ;;               ∪ 3*3*S2
∪ 3*3*3*3*S2               ;;               ∪ 3*3*3*S2
∪ ..........               ;;               ∪ ........ )   !!``````

Why the increasing order? How? Why, that is the responsibility of `∪`! Hello, another discovered requirement. Whatever enters it on either side, it must produce the smaller element before the larger one.

And what is it to do in the event the two are equal? Do we even need to concern ourselves with this question in this here scheme? Can this ever happen, here?

It can't. And so we can implement the `∪` here as a `merge`, not as a `union` (but remember the first discovered requirement! -- is it still valid? needed? with the addition of new cases). `Merge` ought to be more efficient than `union` as it doesn't concern itself with the case of equals.

And for the multiples of 5 also? We continue, as

`S235 = S23 ∪ 5*S235`

`````` {1} ----∪-->--->--S₂--.---S₂----∪-->--->--S₂₃--.---S₂₃----∪-->--->--S₂₃₅--.--->S₂₃₅
/               \       /                \         /                 \
\______*2_______/       \______*3________/         \_______*5________/
---<----<---            ---<----<---                ---<----<---
``````
• Does this describe the code from the book? _______
• Does this describe a code which is about twice faster than the one from the book? _______
• Why is it about twice faster than the code from the book? _______

(fill in the blanks).

And the signal processing block diagram for the book's code is:

``````                                  1 --->---\
cons-stream ->-- S ---.---> S
/----->---->--- *2 --->---\            /                       |
/                            union ->--/                        /
.-->-- *3 -->--\            /                                   /
|                union ->--/                                   /
.-->-- *5 -->--/                                              /
\                                                            /
\__________<__________<__________<_________<_______________/
``````

where the duplicates-removing "union" is called `merge` in the book.