haskellhindley-milner

Haskell type checking and determinism


According to the Haskell 2010 language report, its type checker is based on Hindley-Milner. So consider a function f of this type,

f :: forall a. [a] -> Int

It could be the length function for instance. According to Hindley-Milner, f [] type checks to Int. We can prove this by instantiating the type of f to [Int] -> Int, and the type of [] to [Int], then conclude that the application ([Int] -> Int) [Int] is of type Int.

In this proof, I chose to instantiate types forall a. [a] -> Int and forall a. [a] by substituting Int to a. I can substitute Bool instead, the proof works too. Isn't it strange in Hindley-Milner that we can apply a polymorphic type to another, without specifying which instances we use ?

More specifically, what in Haskell prevents me from using the type a in the implementation of f ? I could imagine that f is a function that equals 18 on any [Bool], and equals the usual length function on all other types of lists. In this case, would f [] be 18 or 0 ? The Haskell report says "the kernel is not formally specified", so it's hard to tell.


Solution

  • During type inference, such type variables can indeed instantiated to any type. This may be seen as a highly non deterministic step.

    GHC, for what it is worth, uses the internal Any type in such cases. For instance, compiling

    {-# NOINLINE myLength #-}
    myLength :: [a] -> Int
    myLength = length
    
    test :: Int
    test = myLength []
    

    results in the following Core:

    -- RHS size: {terms: 3, types: 4, coercions: 0}
    myLength [InlPrag=NOINLINE] :: forall a_aw2. [a_aw2] -> Int
    [GblId, Str=DmdType]
    myLength =
      \ (@ a_aP5) -> length @ [] Data.Foldable.$fFoldable[] @ a_aP5
    
    -- RHS size: {terms: 2, types: 6, coercions: 0}
    test :: Int
    [GblId, Str=DmdType]
    test = myLength @ GHC.Prim.Any (GHC.Types.[] @ GHC.Prim.Any)
    

    where GHC.Prim.Any occurs in the last line.

    Now, is that really not deterministic? Well, it does involve a kind of non deterministic step "in the middle" of the algorithm, but the final resulting (most general) type is Int, and deterministically so. It does not matter what type we choose for a, we always get type Int at the end.

    Of course, getting the same type is not enough: we also want to get the same Int value. I conjecture that it can be proven that, given

    f :: forall a. T a
    g :: forall a. T a -> U
    

    then

    g @ V (f @ V) :: U
    

    is the same value whatever type V is. This should be a consequence of parametricity applied to those polymorphic types.