haskell ghc data-sharing monomorphism-restriction

NoMonomorphismRestriction helps preserve sharing?


I was trying to answer another question about polymorphism vs sharing when I stumbled upon this strange behaviour.

In GHCi, when I explicitly define a polymorphic constant, it does not get any sharing, which is understandable:

> let fib :: Num a => [a]; fib = 1 : 1 : zipWith (+) fib (tail fib)
> fib !! 30
1346269
(5.63 secs, 604992600 bytes)

On the other hand, if I try to achieve the same by omitting the type signature and disabling the monomorphism restriction, my constant suddenly gets shared!

> :set -XNoMonomorphismRestriction
> let fib = 1 : 1 : zipWith (+) fib (tail fib)
> :t fib
fib :: Num a => [a]
> fib !! 50
20365011074
(0.00 secs, 2110136 bytes)

Why?!

Ugh... When compiled with optimisations, it is fast even with monomorphism restriction disabled.


Solution

  • By giving an explicit type signature, you prevent GHC from making certain assumptions about your code. I'll show an example (taken from this question):

    foo (x:y:_) = x == y
    foo [_]     = foo []
    foo []      = False
    

    According to GHCi, the type of this function is Eq a => [a] -> Bool, as you'd expect. However, if you declare foo with this signature, you'll get an "ambiguous type variable" error.

    This function typechecks only without a type signature because of how type inference works in GHC. When you omit the signature, foo is assumed to have the monotype [a] -> Bool for some fixed type a throughout its binding group. Only once the binding group has been typechecked are the types generalized. That's where you get the forall a. ....

    On the other hand, when you declare a polymorphic type signature, you explicitly state that foo is polymorphic, so the type of [] in the recursive call doesn't have to match the type of the first argument. Nothing then determines which Eq instance foo [] should use, and boom, you get an ambiguous type variable.
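
    As a rough sketch of the two situations (fooInferred and fooSigned are names invented for this illustration, they are not from the linked question): the first definition relies on inference, the second keeps the signature and pins the element type of [] with ScopedTypeVariables. That annotation is just one possible fix, assumed here for demonstration.

    ```haskell
    {-# LANGUAGE ScopedTypeVariables #-}

    -- No signature: inside its own binding group fooInferred is monomorphic,
    -- so the [] in the recursive call is forced to have the argument's type.
    -- GHC generalizes to Eq a => [a] -> Bool only afterwards.
    fooInferred (x:y:_) = x == y
    fooInferred [_]     = fooInferred []
    fooInferred []      = False

    -- With a signature the recursive call could be at any element type, so
    -- a bare `fooSigned []` would be ambiguous. The explicit forall brings
    -- `a` into scope and the annotation picks the same element type again.
    fooSigned :: forall a. Eq a => [a] -> Bool
    fooSigned (x:y:_) = x == y
    fooSigned [_]     = fooSigned ([] :: [a])
    fooSigned []      = False
    ```

    Loaded into GHCi, both versions should agree on every input; removing the ([] :: [a]) annotation from fooSigned brings the ambiguity error back.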

    Now, knowing this, let's compare the Core. First, without a type signature:

    fib = 0:1:zipWith (+) fib (tail fib)
    -----
    fib :: forall a. Num a => [a]
    [GblId, Arity=1]
    fib =
      \ (@ a) ($dNum :: Num a) ->
        letrec {
          fib1 [Occ=LoopBreaker] :: [a]
          [LclId]
          fib1 =
            break<3>()
            : @ a
              (fromInteger @ a $dNum (__integer 0))
              (break<2>()
               : @ a
                 (fromInteger @ a $dNum (__integer 1))
                 (break<1>()
                  zipWith
                    @ a @ a @ a (+ @ a $dNum) fib1 (break<0>() tail @ a fib1))); } in
        fib1
    

    And for the second one, with the explicit type signature:

    fib :: Num a => [a]
    fib = 0:1:zipWith (+) fib (tail fib)
    -----
    Rec {
    fib [Occ=LoopBreaker] :: forall a. Num a => [a]
    [GblId, Arity=1]
    fib =
      \ (@ a) ($dNum :: Num a) ->
        break<3>()
        : @ a
          (fromInteger @ a $dNum (__integer 0))
          (break<2>()
           : @ a
             (fromInteger @ a $dNum (__integer 1))
             (break<1>()
              zipWith
                @ a
                @ a
                @ a
                (+ @ a $dNum)
                (fib @ a $dNum)
                (break<0>() tail @ a (fib @ a $dNum))))
    end Rec }
    

    With an explicit type signature, as with foo above, GHC has to treat fib as a potentially polymorphically recursive value. We could pass some different Num dictionary to fib in zipWith (+) fib ..., and at that point we would have to throw most of the list away, since a different Num means a different (+). Of course, once you compile with optimizations, GHC notices that the Num dictionary never changes during the "recursive calls" and optimizes it away.

    In the Core above, you can see that GHC indeed gives fib a Num dictionary (named $dNum) again and again: every recursive reference is written as fib @ a $dNum, which builds the list anew.

    Because fib without a type signature was assumed to be monomorphic until the generalization of the entire binding group was finished, the fib subparts were given exactly the same type as the whole fib. Thanks to this, fib behaves as if it had been written like this:

    {-# LANGUAGE ScopedTypeVariables #-}
    fib :: forall a. Num a => [a]
    fib = fib'
      where
        fib' :: [a]
        fib' = 0:1:zipWith (+) fib' (tail fib')
    

    And because the type stays fixed, you can use just the one dictionary given at the start.
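
    The dictionary passing visible in the two Core dumps can be imitated by hand, which may make the loss of sharing easier to see. This is only a sketch: NumDict, integerDict, fibUnshared and fibShared are hypothetical names, and NumDict is a cut-down stand-in for the real Num dictionary, not what GHC actually generates.

    ```haskell
    -- Hypothetical, minimal stand-in for a Num dictionary.
    data NumDict a = NumDict
      { plus    :: a -> a -> a
      , fromInt :: Integer -> a
      }

    -- The dictionary for Integer, built from the real Num methods.
    integerDict :: NumDict Integer
    integerDict = NumDict { plus = (+), fromInt = fromInteger }

    -- Shape of the Core for the version WITH a signature: every recursive
    -- reference applies the function to the dictionary again, so each one
    -- is a fresh list and nothing is shared (without optimizations).
    fibUnshared :: NumDict a -> [a]
    fibUnshared d =
      fromInt d 0 : fromInt d 1
        : zipWith (plus d) (fibUnshared d) (tail (fibUnshared d))

    -- Shape of the Core for the version WITHOUT a signature: the dictionary
    -- is taken once, and the local list refers to itself, so it is shared.
    fibShared :: NumDict a -> [a]
    fibShared d = go
      where
        go = fromInt d 0 : fromInt d 1 : zipWith (plus d) go (tail go)
    ```

    In unoptimized GHCi, fibShared integerDict !! 50 should come back immediately, while fibUnshared integerDict slows down sharply as the index grows, much like the two fib definitions in the question.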