I don't understand the desugaring of let p = e1 in e0 into let p = fix (\~p -> e1) in e0, or how the latter actually works in Haskell

According to the Haskell 2010 Report and GHC documentation, a let expression like:

let p = e1 in e0

can be desugared to something like:

let p = fix (\~p -> e1) in e0

This translation models recursive bindings using fix, with an irrefutable pattern to preserve laziness. However, this raises confusion in practice.

I tried to simulate this desugaring manually with the following example:

let myId = fix (\ ~myId -> (\x -> x)) in (myId 1, myId "abc")

To my surprise, this works flawlessly and returns (1, "abc").

Now I’m puzzled on two fronts:

- How is `fix (\ ~myId -> (\x -> x))` actually evaluated?
- How is myId still polymorphic?

Solution

How is fix (\ ~myId -> (\x -> x)) actually evaluated?

How it is actually evaluated is an implementation detail, but the model you should have in mind is the following equational reasoning chain:

  fix (\ ~myId -> (\x -> x))
= let y = (\ ~myId -> (\x -> x)) y in y
= let y = \x -> x in y
= \x -> x

Note that since myId is a variable, it is already an irrefutable pattern, so this is equivalent to using \ myId.

How is myId still polymorphic?

Note that only the outer myId has a polymorphic type; the one accessible inside fix is monomorphic, as one would expect from the type of fix:

let myId = fix (\ myId -> let _ = (myId (), myId True) in \x -> x)
in (myId 1, myId "abc")
-- Couldn't match type ‘()’ with ‘Bool’

It goes something like this:

fix :: forall a. (a -> a) -> a is being used, so let's instantiate its type with fresh metavariables: (α -> α) -> α.
fix is applied to a λ-expression, so let's check that it has type α -> α. To that end, assume that (the inner) myId has type α and check that the body of the λ-expression has type α.
The body is another λ-expression, so we learn that α := β -> γ. Checking \ x -> x :: β -> γ, we learn that γ := β.
The return type of fix is α, so the whole let-bound myId expression has type β -> β. Since let-bound expressions are generalisable and β is a fresh metavariable, we generalise this to the polymorphic type forall b. b -> b.
Hence myId has type forall b. b -> b in the body of the let-expression.

Metavariables stand for monomorphic types, so if we tried to use the inner myId at two different types like in my example, we would first have to solve β := () and then β := Bool, a contradiction.