ocamlgadtpeano-numbers

Why Peano numbers in OCaml not working due to scope error?


I have the following peano number written with GADTs:

type z = Z of z

type 'a s = Z | S of 'a

type _ t = Z : z t | S : 'n t -> 'n s t

module T = struct
  type nonrec 'a t = 'a t
end

type 'a nat = 'a t

type e = T : 'n nat -> e

The following function to decode a 'a nat (or 'a t) into the number it encoded, works:

let to_int : type n. n t -> int =
  let rec go : type n. int -> n t -> int =
   fun acc n -> match n with Z -> acc | S n -> go (acc + 1) n
  in
  fun x -> go 0 x

but if I try to rewrite it almost exactly the same this way:

let to_int2 (type a) (a: a nat) : int =
  let rec go (type a) (acc : int) (x : a nat) : int =
    match x with
    | Z -> acc
    | S v -> go (acc + 1) v
  in
  go 0 a

I get a scope error. What's the difference between the two functions?

138 |     | S v -> go (acc + 1) v
                                ^
Error: This expression has type $0 t but an expression was expected of type
         'a
       The type constructor $0 would escape its scope

Solution

  • The root issue is polymorphic recursion, GADTs are a red herring here.

    Without an explicit annotation, recursive functions are not polymorphic in their own definition. For instance, the following function has type int -> int

    let rec id x =
       let _discard = lazy (id 0) in
       x;;
    

    because id is not polymorphic in

       let _discard = lazy (id 0) in
    

    and thus id 0 implies that the type of id is int -> 'a which leads to id having type int -> int. In order to define polymorphic recursive function, one need to add an explicit universally quantified annotation

    let rec id : 'a. 'a -> 'a = fun x ->
      let _discard = lazy (id 0) in
      x
    

    With this change, id recovers its expected 'a -> 'a type. This requirement does not change with GADTs. Simplifying your code

    let rec to_int (type a) (x : a nat) : int =
        match x with
        | Z -> 0
        | S v ->  1 + to_int v
    

    the annotation x: a nat implies that the function to_int only works with a nat, but you are applying to an incompatible type (and ones that lives in a too narrow scope but that is secondary).

    Like in the non-GADT case, the solution is to add an explicit polymorphic annotation:

    let rec to_int: 'a. 'a nat -> int = fun (type a) (x : a nat) ->
        match x with
        | Z -> 0
        | S v ->  1 + to_int v
    

    Since the form 'a. 'a nat -> int = fun (type a) (x : a nat) -> is both a mouthful and quite often needed with recursive function on GADTs, there is a shortcut notation available:

    let rec to_int: type a. a nat -> int = fun x ->
        match x with
        | Z -> 0
        | S v ->  1 + to_int v
    

    For people not very familiar with GADTs, this form is the one to prefer whenever one write a GADT function. Indeed, not only this avoids the issue with polymorphic recursion, writing down the explicit type of a function before trying to implement it is generally a good idea with GADTs.

    See also https://ocaml.org/manual/polymorphism.html#s:polymorphic-recursion , https://ocaml.org/manual/gadts-tutorial.html#s%3Agadts-recfun , and https://v2.ocaml.org/manual/locallyabstract.html#p:polymorpic-locally-abstract .