I'm trying to model a "heterogeneous tree", ie. a tree where the nodes have different "kinds" and each "kind" are restricted in the "kind" of children they may contain:
type id = string
type block
type inline
type _ node =
| Paragraph : id * inline node list -> block node
| Strong : id * inline node list -> inline node
| Text : id * string -> inline node
A tree can then be defined like this:
let document =
Paragraph ("p1", [
Text ("text1", "Hello ");
Strong ("strong1", [
Text ("text2", "Glorious")
]);
Text ("text3", " World!")
])
Usually this would be done using separate variants for each "kind" of node, but I'm trying to define it as a GADT in order to be able to manipulate the tree using higher-order functions that pattern match on every kind of node:
function
| Text ("text2", _) ->
Some (Text ("text2", "Dreadful"))
| _ ->
None
The problem I have is in defining the function that accepts the above higher-order function and apply it to every node. So far I have this:
let rec replaceNode (type a) (f: a node -> a node option) (node: a node): a node = match f node with | Some otherNode -> otherNode | None -> match node with | Paragraph (id, children) -> Paragraph (id, (List.map (replaceNode f) children)) | Strong (id, children) -> Strong (id, (List.map (replaceNode f) children)) | Text (_, _) -> node
But the compiler gives me the following error on the highlighted line
This expression has type block node -> a node option but an expression was expected of type block node -> a node option This instance of block is ambiguous: it would escape the scope of its equation
Or if I change the type of f
to 'a node -> 'a node option
I get this error instead
This expression has type a node but an expression was expected of type a node The type constructor a would escape its scope
Clearly I don't fully understand locally abstract types (or GADTs really, for that matter), but from what I do understand these errors seem to arise because the type is, as the name suggests, "local", and while polymorphic on the outside, passing it on would "leak" it, I guess?
So my question is, first and foremost: Is this even possible to do (and by "this" I think I mean to pattern match on a GADT in a higher-order function, but I'm not even sure that's the actual problem)?
There are two root problems here ( which are a bit muddled by the presence of GADTs). The first issue is that replaceNode
is a second rank polymorphic function. Indeed, in the first match, f
is applied to a node of type a node
, but inside the Paragraph
branch, it is applied to a node of type inline node
. The type-checker error here is a bit complicated by the List.map
call, but rewriting the function as
let rec replaceNode (type a) (f:a node -> a node option)
(node:a node): a node =
match f node with
| Some otherNode -> otherNode
| None ->
match node with
| Paragraph(id, []) -> Paragraph(id,[])
| Paragraph (id, a :: children) ->
Paragraph (id, f a :: (List.map (replaceNode f) children))
| Strong (id, children) ->
Strong (id, (List.map (replaceNode f) children))
| Text (_, _) -> node;;
yields a more straightforward error:
Error: This expression has type inline node
but an expression was expected of type a node
Type inline is not compatible with type a
The problem is thus that we need to reassure the type-checker that f
works for any type a
and not only the original type a
. In other words, the type of f
should be 'a. 'a node -> 'a node option
(aka forall 'a. 'a -> 'a node option
). Unfortunalety, explicit polymorphic annotation are only possible in first position (prenex) in OCaml, thus we cannot just change the type of f
in replaceNode
. However, it is possible to works around this issue by using polymorphic record field or method.
For instance, using the record path, we can define a record type mapper
:
type mapper = { f:'a. 'a node -> 'a node option } [@@unboxed]
of which the field f
has the right explicit polymorphic notation (aka universal quantification) and then use it in replaceNode
:
let rec replaceNode (type a) {f} (node: a node): a node =
match f node with
| Some otherNode -> otherNode
| None ->
match node with
| Paragraph (id, children) ->
Paragraph (id, (List.map (replaceNode {f}) children))
| Strong (id, children) ->
Strong (id, (List.map (replaceNode {f}) children))
| Text (_, _) -> node
But then the second issue pops up: this replaceNode
function has for type
mapper -> inline node -> inline node
. Where does the inline type comes from? Ths time the issue is polymorhpic recursion. Without an explicit polymorphic annotation, the type of replaceNode
is considered constant inside its recursive definition. In other words, the type-checker considers that replaceNode
has for type mapper -> 'elt node -> 'elt node
for a given 'elt
. And in the paragraph
and strong
branchs, the children
list is a list of inline node
. Thus List.map (replaceNode {f}) children
implies for the type-checker that 'elt
=inline
and thus the type of replaceNode
becomes mapper -> inline node -> inline node
.
To fix this issue, we need to add another polymorphic annotation. Fortunately, this time, we can add it directly:
let rec replaceNode: type a. mapper -> a node -> a node =
fun {f} node -> match f node with
| Some otherNode -> otherNode
| None ->
match node with
| Paragraph (id, children) ->
Paragraph (id, (List.map (replaceNode {f}) children))
| Strong (id, children) ->
Strong (id, (List.map (replaceNode {f}) children))
| Text (_, _) -> node;;
At last, we obtain a function of type mapper -> 'a node -> 'a node
.
Note that let f: type a.…
is a shortcut notation for combining locally abstract type and explicit polymorphic annotation.
Completing the explanation, the locally abstract (type a)
is needed here because only abstract types can be refined when pattern matching over GADTs. In other words, we need it to precise that the type a
in the Paragraph
, Strong
and Text
obeys different equalities: a
= block
in the Paragraph branch, a
= inline
in the Strong
and Text
branch.
EDIT: How to define a mapper?
This locally abstract type bit is in fact important when defining a mapper. For instance, defining
let f = function
| Text ("text2", _) -> Some (Text ("text2", "Dreadful"))
| _ -> None
yields a type inline node -> inline node option
for f
, since matching over the constructor Text
yields the equality 'type_of_scrutinee=inline
.
To correct this point, one need to add a locally abstract type annotation to make the type-checker able to refine the type of the scrutinee branch-by-branch:
let f (type a) (node:a) : a node option= match node with
| Text ("text2", _) -> Some (Text ("text2", "Dreadful"))
| _ -> None
Then this f has the right type and can be wrapped inside a mapper record:
let f = { f }
Advertisement: All this is detailed in the OCaml manual starting at version 4.06 .