haskelltypeshigher-rank-typesrank-n-typesscott-encoding

Why are explicit forall quantifiers necessary for rank-n types?


When I declare this newtype:

newtype ListScott a = 
  ListScott { 
  unconsScott :: (a -> ListScott a -> r) -> r -> r 
}

which would define the hypothetical rank-2 type ListScott :: ((a -> ListScott a -> r) -> r -> r) -> ListScott a, the compiler complains about r not being in scope. Isn't it apparent from the type signature that I want to pass a first class polymorphic function to ListScott?

Why do I need an explicit type quantifier for r for cases like this?

I am not a type theorist and have probably overlooked something...


Solution

  • This is a question of programming language design. It could be inferred in the way you suggest, but I argue that it is a bad idea.

    Isn't it apparent from the type signature that I want to pass a first class polymorphic function to ListScott?

    I don't think that we can obviously tell that much from this definition.

    Universal or existential? Conflict with GADT notation

    Here's something we can write with the GADTs extension:

    data ListScott a where
      ListScott :: { unconsScott :: (a -> ListScott a -> r) -> r -> r } -> ListScott a
    

    Here r is quantified existentially in the unconsScott field, so the constructor has the first type below:

    ListScott :: forall a r. ((a -> ListScott a -> r) -> r -> r) -> ListScott a
    -- as opposed to
    ListScott :: forall a. (forall r. (a -> ListScott a -> r) -> r -> r) -> ListScott a
    

    Inference disables error detection

    What if r is instead meant to be a parameter to ListScott, but we simply forgot to add it? I believe that is a reasonably probable mistake, because both the hypothetical ListScott r a and ListScott a can serve as representations of lists in some ways. Then inference of binders would result in an erroneous type definition being accepted, and errors being reported elsewhere, once the type gets used (hopefully not too far, but that would still be worse than an error at the definition itself).

    Explicitness also prevents typos where a type constructor gets mistyped as a type variable:

    newtype T = T { unT :: maybe int }
    -- unlikely to intentionally mean "forall maybe int. maybe int"
    

    There is not enough context in a type declaration alone to confidently guess the meaning of variables, thus we should be forced to bind variables properly.

    Readability

    Consider record of functions:

    data R a = R
      { f :: (r -> r) -> r -> r
      , g :: r -> r
      }
    
    data R r a = R
      { f :: (r -> r) -> r -> r
      , g :: r -> r
      }
    

    We have to look to the left of = to determine whether r is bound there, and if not we have to mentally insert binders in every field. I find that makes the first version hard to read because the r variable in the two fields would actually not be under the same binder, but it certainly looks otherwise at a glance.

    Comparison to similar constructs

    Note that something similar to what you suggested happens with type classes, which can be seen as a sort of record:

    class Functor f where
      fmap :: (a -> b) -> f a -> f b
    

    Most arguments above apply as well, and I would thus prefer to write that class as:

    class Functor f where
      fmap :: forall a b. (a -> b) -> f a -> f b
    

    A similar thing could be said of local type annotations. However, top-level signatures are different:

    id :: a -> a
    

    That unambiguously means id :: forall a. a -> a, because there is no other level where a could be bound.