typesocamlhigher-order-functionsgadtlocally-abstract-type

Using GADTs with higher order functions


I'm trying to model a "heterogeneous tree", ie. a tree where the nodes have different "kinds" and each "kind" are restricted in the "kind" of children they may contain:

type id = string
type block
type inline

type _ node =
  | Paragraph : id * inline node list -> block node
  | Strong : id * inline node list -> inline node
  | Text : id * string -> inline node

A tree can then be defined like this:

let document =
    Paragraph ("p1", [
      Text ("text1", "Hello ");
      Strong ("strong1", [
        Text ("text2", "Glorious")
      ]);
      Text ("text3", " World!")
  ])

Usually this would be done using separate variants for each "kind" of node, but I'm trying to define it as a GADT in order to be able to manipulate the tree using higher-order functions that pattern match on every kind of node:

function
  | Text ("text2", _) ->
    Some (Text ("text2", "Dreadful"))
  | _ ->
    None

The problem I have is in defining the function that accepts the above higher-order function and apply it to every node. So far I have this:

let rec replaceNode (type a) (f: a node -> a node option) (node: a node): a node =
  match f node with
  | Some otherNode -> otherNode
  | None ->
    match node with
    | Paragraph (id, children) -> 
      Paragraph (id, (List.map (replaceNode f) children))
    | Strong (id, children) ->
      Strong (id, (List.map (replaceNode f) children))
    | Text (_, _) -> node

But the compiler gives me the following error on the highlighted line

This expression has type block node -> a node option but an expression was expected of type block node -> a node option This instance of block is ambiguous: it would escape the scope of its equation

Or if I change the type of f to 'a node -> 'a node option I get this error instead

This expression has type a node but an expression was expected of type a node The type constructor a would escape its scope

Clearly I don't fully understand locally abstract types (or GADTs really, for that matter), but from what I do understand these errors seem to arise because the type is, as the name suggests, "local", and while polymorphic on the outside, passing it on would "leak" it, I guess?

So my question is, first and foremost: Is this even possible to do (and by "this" I think I mean to pattern match on a GADT in a higher-order function, but I'm not even sure that's the actual problem)?

Playground with all the code here


Solution

  • There are two root problems here ( which are a bit muddled by the presence of GADTs). The first issue is that replaceNode is a second rank polymorphic function. Indeed, in the first match, f is applied to a node of type a node, but inside the Paragraph branch, it is applied to a node of type inline node. The type-checker error here is a bit complicated by the List.map call, but rewriting the function as

    let rec replaceNode (type a) (f:a node -> a node option) 
    (node:a node): a node =
      match f node with
      | Some otherNode -> otherNode
      | None ->
        match node with
        | Paragraph(id, []) -> Paragraph(id,[])
        | Paragraph (id, a :: children) -> 
          Paragraph (id, f a :: (List.map (replaceNode f) children))
        | Strong (id, children) ->
          Strong (id, (List.map (replaceNode f) children))
        | Text (_, _) -> node;;
    

    yields a more straightforward error:

    Error: This expression has type inline node
    but an expression was expected of type a node
    Type inline is not compatible with type a

    The problem is thus that we need to reassure the type-checker that f works for any type a and not only the original type a. In other words, the type of f should be 'a. 'a node -> 'a node option (aka forall 'a. 'a -> 'a node option). Unfortunalety, explicit polymorphic annotation are only possible in first position (prenex) in OCaml, thus we cannot just change the type of f in replaceNode. However, it is possible to works around this issue by using polymorphic record field or method.

    For instance, using the record path, we can define a record type mapper:

    type mapper = { f:'a. 'a node -> 'a node option } [@@unboxed]
    

    of which the field f has the right explicit polymorphic notation (aka universal quantification) and then use it in replaceNode:

    let rec replaceNode (type a) {f} (node: a node): a node =
      match f node with
      | Some otherNode -> otherNode
      | None ->
        match node with
        | Paragraph (id, children) -> 
          Paragraph (id, (List.map (replaceNode {f}) children))
        | Strong (id, children) ->
          Strong (id, (List.map (replaceNode {f}) children))
        | Text (_, _) -> node
    

    But then the second issue pops up: this replaceNode function has for type mapper -> inline node -> inline node. Where does the inline type comes from? Ths time the issue is polymorhpic recursion. Without an explicit polymorphic annotation, the type of replaceNode is considered constant inside its recursive definition. In other words, the type-checker considers that replaceNode has for type mapper -> 'elt node -> 'elt node for a given 'elt. And in the paragraph and strong branchs, the children list is a list of inline node. Thus List.map (replaceNode {f}) children implies for the type-checker that 'elt=inline and thus the type of replaceNode becomes mapper -> inline node -> inline node.

    To fix this issue, we need to add another polymorphic annotation. Fortunately, this time, we can add it directly:

    let rec replaceNode: type a. mapper -> a node -> a node =
      fun {f} node -> match f node with
      | Some otherNode -> otherNode
      | None ->
        match node with
        | Paragraph (id, children) -> 
          Paragraph (id, (List.map (replaceNode {f}) children))
        | Strong (id, children) ->
          Strong (id, (List.map (replaceNode {f}) children))
        | Text (_, _) -> node;;
    

    At last, we obtain a function of type mapper -> 'a node -> 'a node. Note that let f: type a.… is a shortcut notation for combining locally abstract type and explicit polymorphic annotation.

    Completing the explanation, the locally abstract (type a) is needed here because only abstract types can be refined when pattern matching over GADTs. In other words, we need it to precise that the type a in the Paragraph, Strong and Text obeys different equalities: a = block in the Paragraph branch, a = inline in the Strong and Text branch.

    EDIT: How to define a mapper?

    This locally abstract type bit is in fact important when defining a mapper. For instance, defining

    let f = function
      | Text ("text2", _) -> Some (Text ("text2", "Dreadful"))
      | _ -> None
    

    yields a type inline node -> inline node option for f, since matching over the constructor Text yields the equality 'type_of_scrutinee=inline.

    To correct this point, one need to add a locally abstract type annotation to make the type-checker able to refine the type of the scrutinee branch-by-branch:

     let f (type a) (node:a) : a node option= match node with
     | Text ("text2", _) -> Some (Text ("text2", "Dreadful"))
     | _ -> None
    

    Then this f has the right type and can be wrapped inside a mapper record:

    let f = { f }
    

    Advertisement: All this is detailed in the OCaml manual starting at version 4.06 .