scheme combinators y-combinator the-little-schemer

Y combinator discussion in "The Little Schemer"

So, I've spent a lot of time reading and re-reading the ending of chapter 9 in The Little Schemer, where the applicative Y combinator is developed for the length function. I think my confusion boils down to a single statement that contrasts two versions of length (before the combinator is factored out):

A:
  ((lambda (mk-length)
     (mk-length mk-length))
   (lambda (mk-length)
     (lambda (l)
       (cond
         ((null? l) 0 )
         (else (add1
                ((mk-length mk-length)
                 (cdr l))))))))

B:
((lambda (mk-length)
      (mk-length mk-length))
    (lambda (mk-length)
      ((lambda (length)
         (lambda (l)
           (cond
             ((null? l) 0)
             (else (add1 (length (cdr l)))))))
       (mk-length mk-length))))

Page 170 (4th ed.) states that A

returns a function when we applied it to an argument

while B

does not return a function

thereby producing an infinite regress of self-applications. I'm stumped by this. If B is plagued by this problem, I don't see how A avoids it.

Solution

Great question. For the benefit of those without a functioning DrRacket installation ~~(myself included)~~ I'll try to answer it.

First, let's use some sane (short) variable names, easily trackable by a human eye/mind:

((lambda (h)     ; A.   
     (h h))            ; apply h to h
 (lambda (g)
     (lambda (lst)
       (if (null? lst) 0
         (add1 
               ((g g) (cdr lst)))))))

The first lambda term is what's known as little omega, or U combinator. When applied to something, it causes that term's self-application. Thus the above is equivalent to

(let ((h (lambda (g)        ; A1. 
           (lambda (lst)
             (if (null? lst) 0
               (add1 ((g g) (cdr lst))))))))
  (h h))

When h is applied to h, new binding is formed:

(let ((h (lambda (g)        ; A2. 
           (lambda (lst)
             (if (null? lst) 0
               (add1 ((g g) (cdr lst))))))))
  (let ((g h))
    (lambda (lst)                 ; INNER LAMBDA
            (if (null? lst) 0
              (add1 ((g g) (cdr lst)))))))

Now there's nothing to apply anymore, so the inner lambda function is returned — along with the hidden linkages to the environment frames (i.e. those let bindings) up above it.

This pairing of a lambda expression with its defining environment is known as closure. To the outside world it is just another function of one parameter, (lambda (lst) ...). No more reduction steps left to perform there at the moment.

Now, when that closure — our list-length function — will be called, execution will eventually get to the point of (g g) self-application, and the same reduction steps as outlined above will again be performed (recalculating the same closure). But not earlier.

Now, the authors of that book want to arrive at the Y combinator, so they apply some code transformations to the first expression, to somehow arrange for that self-application (g g) to be performed automatically — so we may write the recursive function application in the normal manner, (f x), instead of having to write it as ((g g) x) for all recursive calls:

((lambda (h)     ; B.
     (h h))            ; apply h to h
 (lambda (g)
   ((lambda (f)           ; 'f' to become bound to '(g g)',
      (lambda (lst) 
        (if (null? lst) 0
          (add1 (f (cdr lst))))))  ; here: (f x) instead of ((g g) x)!
    (g g))))                       ; (this is not quite right)

Now after few reduction steps we arrive at

(let ((h (lambda (g)
           ((lambda (f)    
              (lambda (lst) 
                (if (null? lst) 0
                  (add1 (f (cdr lst))))))
            (g g)))))
  (let ((g h))
    ((lambda (f)
       (lambda (lst)
            (if (null? lst) 0
              (add1 (f (cdr lst))))))
     (g g))))

which is equivalent to

(let ((h (lambda (g)
           ((lambda (f)    
              (lambda (lst) 
                (if (null? lst) 0
                  (add1 (f (cdr lst))))))
            (g g)))))
  (let ((g h))
    (let ((f (g g)))           ; problem! (under applicative-order evaluation)
       (lambda (lst)
            (if (null? lst) 0
              (add1 (f (cdr lst))))))))

And here comes trouble: the self-application of (g g) is performed too early, before that inner lambda can be even returned, as a closure, to the run-time system. We only want it to be reduced when the execution gets to that point inside the lambda expression, after the closure was called. To have it reduced before the closure is even created is ridiculous. A subtle error. :)

Of course, since g is bound to h, (g g) is reduced to (h h) and we're back again where we started, applying h to h. Looping.

Of course the authors are aware of this. They want us to understand it too.

So the culprit is simple — it is the applicative order of evaluation: evaluating the argument before the binding is formed of the function's formal parameter and its argument's value.

That code transformation wasn't quite right, then. It would've worked under normal order where arguments aren't evaluated in advance.

This is remedied easily enough by "eta-expansion", which delays the application until the actual call point: (lambda (x) ((g g) x)) actually says: "will call ((g g) x) when called upon with x as an argument".

And this is actually what that code transformation should have been in the first place:

((lambda (h)     ; C.
     (h h))            ; apply h to h
 (lambda (g)
   ((lambda (f)           ; 'f' to become bound to '(lambda (x) ((g g) x))',
      (lambda (lst) 
        (if (null? lst) 0
          (add1 (f (cdr lst))))))  ; here: (f x) instead of ((g g) x)
    (lambda (x) ((g g) x)))))

Now that next reduction step can be performed:

(let ((h (lambda (g)
           ((lambda (f)    
              (lambda (lst) 
                (if (null? lst) 0
                  (add1 (f (cdr lst))))))
            (lambda (x) ((g g) x))))))
  (let ((g h))
    (let ((f (lambda (x) ((g g) x))))       ; here it's OK
      (lambda (lst)
            (if (null? lst) 0
              (add1 (f (cdr lst))))))))

and the closure (lambda (lst) ...) is formed and returned without a problem, and when (f (cdr lst)) is called (inside the closure) it is reduced to ((g g) (cdr lst)) just as we wanted it to be.

Lastly, we notice that (lambda (f) (lambda (lst ...)) expression in C. doesn't depend on any of the h and g. So we can take it out, make it an argument, and be left with ... the Y combinator:

( ( (lambda (rec)            ; D.
      ( (lambda (h) (h h))  
        (lambda (g)
          (rec  (lambda (x) ((g g) x))))))   ; applicative-order Y combinator
    (lambda (f)
        (lambda (lst) 
          (if (null? lst) 0
            (add1 (f (cdr lst)))))) )
  (list 1 2 3) )                            ; ==> 3

So now, calling Y on a function is equivalent to making a recursive definition out of it:

( y (lambda (f)  (lambda (x) .... (f x) .... )) ) 
===  define  f = (lambda (x) .... (f x) .... )
===  define (f x)  =  .... (f x) ....

... but using letrec (or named let) is better — more efficient, defining the closure in self-referential environment frame. The whole Y thing is a theoretical exercise for the systems where that is not possible — i.e. where it is not possible to name things, to create bindings with names "pointing" to things, referring to things.

Incidentally, the ability to point to things is what distinguishes the higher primates from the rest of the animal kingdom ⁄ living creatures, or so I hear. :)

Here's a new variation. Starting from same

(let ((h (lambda (g)        ; A1. 
           (lambda (lst)
             (if (null? lst) 0
               (add1 ((g g) (cdr lst))))))))
  (h h))

we again get A2,

(let ((h (lambda (g)        ; A2. 
           (lambda (lst)
             (if (null? lst) 0
               (add1 ((g g) (cdr lst))))))))
  (let ((g h))
    (lambda (lst)                 ; INNER LAMBDA
            (if (null? lst) 0
              (add1 ((g g) (cdr lst)))))))

and this is actually equivalent to

(let ((h (lambda (g)        ; A3. 
           (lambda (lst)
             (if (null? lst) 0
               (add1 ((g g) (cdr lst))))))))
 
    (lambda (lst)                 ; INNER LAMBDA
            (if (null? lst) 0
              (add1 ((h h) (cdr lst))))))

Thus (h h) in A1 is equivalent to A3's inner lambda, which has (h h) deep inside it. Which will again turn into the same inner lambda, when called.

The same inner lambda. This is expressed, if one persists in avoiding any use of letrec, with

(let (#;(h (lambda (g)        ; A4. 
           (lambda (lst)
             (if (null? lst) 0
               (add1 ((g g) (cdr lst)))))))
      (r #f))                          ; --<---------.
  (set! r                              ;             |
    (lambda (lst)                 ; INNER LAMBDA     |
            (if (null? lst) 0          ;             |
              (add1 (r (cdr lst))))))  ; --->--------*
  r)

and h isn't needed at all anymore: its only use was in (h h) and we know the result of it, in advance -- it's that inner lambda.

So then, this sets up the inner lambda to refer to itself by name: r.

And before, it was getting itself as an argument: (h h). Which would recreate the same value, again and again, so we might as well record it under that name: r, create it once, and reuse it.

Either way avoids placing the lambda inside itself as a value -- which would create an infinite-sized lambda expression! Or actually, no expression at all, since the substitution process would never finish.

Either of the two ways avoids the infinite unraveling by introducing a delay via an intermediary step: either recreating it by calling the seed (computational step) with the same seed as an argument, or referring to self by name.