ocamlgadtheterogeneous-list

Is it possible to have elements of a heterogeneous list depend on the type of preceding elements?


Context

I want to model the routes of a web application such that it satisfies the following requirements:

  1. Can enforce complete definitions
  2. Can create incomplete definitions
  3. Can check if an incomplete definition "matches" (i.e. is contained in) a complete definition.

As an example to work with:

type root =
  | Fruit of fruit
  | Veggie of veggie
  
and fruit =
  | Apple of apple
  | Banana of banana
  
and veggie =
  | Carrot of carrot
  
and apple = { diameter: int; cultivar: string; }
and banana = { length: int }
and carrot = { length: int; color: [`Orange | `Purple] }

With this, we can easily create and enforce complete definitions:

let complete = Fruit (Apple { diameter = 8; cultivar = "Golden Delicious" })

But cannot create incomplete definitions

let incomplete = Fruit Apple
                       ^^^^^
Error: The constructor Apple expects 1 argument(s),
       but is applied here to 0 argument(s)

And therefore also cannot match incomplete and complete definitions, but we can at least verbosely implement a partial equality function that ignores the parameters:

let equal a b = match a, b with
  | Fruit (Apple _), Fruit (Apple _ ) -> true
  | Fruit (Apple _), _ -> false
  | Fruit (Banana _), Fruit (Banana _ ) -> true
  | Fruit (Banana _), _ -> false
  | Veggie (Carrot _) , Veggie (Carrot _) -> true
  | Veggie (Carrot _) , _ -> false

A vague idea towards a possible solution

So I had this idea of using a tree of GADTs and a heterogeneous list to make the definitions more flexible by having the routes defined as lists, e.g.:

let route = [Fruit; Apple; { diameter = 6; cultivar = "Granny Smith" }]

they could then be used with pattern matching and recursion to more easily destructure and compare them.

Unfortunately implementing this isn't quite so easy. This is what I have so far:

type _ root =
  | Fruit : _ fruit root
  | Veggie : _ veggie root

and _ fruit =
  | Apple : apple fruit
  | Banana : banana fruit

and _ veggie =
  | Carrot : carrot veggie

and apple = { diameter: int; cultivar: string; }
and banana = { length: int }
and carrot = { length: int; color: [`Orange | `Purple] }

type 'a t =
  | [] : _ root t
  | ( :: ) : 'b * 'a t -> 'b t

Two problems that I see here:

  1. 'b isn't constrained by 'a, so anything can be put into the list, as long as it starts with a root t, and there probably isn't a way to recover the type of the elements either. I think this would require Higher-Kinded Types, but maybe there's a way around that?
  2. Even if I was able to solve that, I'm not sure how I'd be able to terminate it. Perhaps the params could be made into GADTs too, and terminate with unit.

Solution

  • It is possible to have heterogeneous lists for which the type of each element depends on the previous element and impose the constraint on the following type. The core idea is to realize that each element need to define in which context it is allowed and which context, and it is then a matter of chaining matching context:

     type ('a,'b) t =
      | [] : ('a,'a) t
      | ( :: ) : ('a, 'b) element * ('b,'c) t -> ('a,'c) t
    

    Here the type ('a,'b) t describes a heterogeneous list which start at the context type 'a and stop at the context type 'b. And it the type definition of ('a,'b) element which determine which transitions is allowed.

    In your case, the element type could be defined as something like

    module Tag = struct
      type final = Done
      type root = Root
      type fruit = Fruit
      type veggie = Veggie
    end
    type (_,_) element=
      | Fruit : (Tag.root, Tag.fruit) element
      | Veggie : (Tag.root, Tag.veggie) element
      | Apple : (Tag.fruit, apple) element
      | Banana : (Tag.fruit, banana) element
      | Carrot: (Tag.veggie, carrot) element
      | End: 'a -> ('a, Tag.final) element
    

    It is important to notice that the module Tag only provides type level tags(indices) that are not associated to any value.

    With this definition:

    let fruit = [Fruit]
    

    is a (Tag.root,Tag.fruit) element: the element is only allowed to the top and requires that the following element is allowed in the Tag.fruit context. A valid next element would then be

    let apple = [Fruit;Apple]
    

    which is a (Tag.root,Tag.apple) t path. Finally, it is possible to close a path with the End constructor once we are in a context that maps to concrete type:

    type complete = (Tag.root,Tag.final) t
    let full_apple : complete =
      [Fruit; Apple; End { diameter=0; cultivar="apple"}]
    

    And this construction is still static enough that it is generally possible to recover enough type information for handling partial paths at the price of some redundancy:

    let rec prefix: type a b c d. (a,b) t -> (c,d) t -> bool = fun pre x ->
      match pre, x with
      | [], _ -> true
      | Fruit :: q, Fruit :: r -> prefix q r
      | Veggie :: q, Veggie :: r -> prefix q r
      | [Apple], Apple :: r -> true
      | [Banana], Banana :: r -> true
      | [Carrot], Carrot :: r -> true
      | [Apple; End x], [Apple; End y] -> x = y
      | [Banana; End x], [Banana; End y] -> x = y 
      | [Carrot; End x], [Carrot; End y] -> x = y
      | _ -> false