module ocaml functor first-class-modules

How to create a set of elements without knowing the type of the element?


I'm running into problems around recursive/mutually referential module definitions trying to use Caml's Map/Set stuff. I really want ones that just work on types, not modules. I feel like it should be possible to do this with first-class modules, but I'm failing to make the syntax work.

The signature I want is:

module type NonFunctorSet = sig
  type 'a t
  val create : ('a -> 'a -> int) -> 'a t
  val add : 'a t -> 'a -> 'a t
  val remove : 'a t -> 'a -> 'a t
  val elements : 'a t -> 'a list
end

Possibly with other Caml.Set functions included. My idea for how this would work is something like:

type 'a t = {
 m : (module Caml.Set.S with type elt = 'a);
 set : m.t
}

let create (compare : 'a -> 'a -> t) =
    module m = Caml.Set.Make(struct type t = 'a let compare = compare end) in
    let set = m.empty in
    {m = m; set = set;}
end

But that doesn't work for a number of reasons; 'a isn't exposed in the right places, I can't reference m.t in the same record where m was defined, etc.

Is there a version of this that works?

Adding more context about my use case:

I have two modules, Region and Tribe. Tribe needs access to a lot of the interface of Region, so I am currently creating Tribe as a functor, MakeTribe(Region : RegionT). Region mostly doesn't need to know about Tribe, but it does need to be able to store a mutable collection of Tribe.t that represent the tribes living in that region.

So, somehow or other, I need a RegionT like

module type RegionT = sig
  type <region>
  val get_local_tribes : <region> -> <tribes>
  val add_tribe : <region> -> <tribe> -> unit
  ...
end

I don't really care about the specific syntax of <tribe>, <tribes> and <region> in this, so long as the fully built Tribe module can know that Region.get_local_tribes, etc., will yield an actual Tribe.t. The circular dependency problem is that the type <tribe> does not exist until the module Tribe is created. My idea so far has been to have RegionT.t actually be 'a RegionT.t, and then Tribe could simply refer to Tribe.t Region.t. This is all fine if I'm satisfied with keeping a <tribe> list inside Region, but I want it to be a set.

I feel this should be possible based on the following example code :

module Example : sig
  type t
  val compare : t -> t -> int
end = struct
  type t = int
  let compare = Int.compare
end

module ExampleSet = Caml.Set.Make(struct type t = Example.t let compare = Example.compare end)

All that Example exposes in its interface is a type and a function from two instances of that type to an int; why is that more than having a 'a -> 'a -> int, which has the same things?


Solution

  • Using Polymorphic Sets and Maps from the Base Library

    In the Base and Core libraries from Jane Street, ordered data structures such as maps, sets, hash tables, and hash sets are all implemented as polymorphic data structures, instead of the functorized versions found in the vanilla OCaml standard library.

    You can read more about them in the Maps and Hashtables chapter of Real World OCaml, but here are some quick recipes. When you see a comparator in a function's interface, e.g., in Map.empty, what it actually wants is a module that implements the comparator interface. The good news is that most of the modules in Base/Core implement it, so you don't have to worry or know anything about this to use it, e.g.,

    # open Base;;
    # let empty = Map.empty (module Int);;
    val empty : (Base.Int.t, 'a, Base.Int.comparator_witness) Base.Map.t =
      <abstr>
    # Map.add empty ~key:1 ~data:"one";;
    - : (Base.Int.t, string, Base.Int.comparator_witness) Base.Map.t
        Base.Map.Or_duplicate.t
    = `Ok <abstr>
    

    So the simple rule is: if you want a set, map, hash table, or hash set where the key element has type foo, just pass (module Foo) as the comparator.

    Now, what if you want to make a mapping from your custom type? E.g., a pair of ints that you would like to compare in lexicographical order.

    First of all, we need to define the sexp_of and compare functions for our type. We will use ppx derivers for this, but it is easy to write them manually if you need to.

     module Pair = struct
       type t = int * int [@@deriving compare, sexp_of]
     end
    

    Now, to create a comparator, we just need to use the Base.Comparator.Make functor, e.g.,

     module Lexicographical_order = struct 
        include Pair
        include Base.Comparator.Make(Pair)
     end
    

    So now we can do,

    
    # let empty = Set.empty (module Lexicographical_order);;
    val empty :
      (Lexicographical_order.t, Lexicographical_order.comparator_witness)
      Base.Set.t = <abstr>
    # Set.add empty (1,2);;
    - : (Lexicographical_order.t, Lexicographical_order.comparator_witness)
        Base.Set.t
    = <abstr>
    

    Although Base's data structures are polymorphic, they strictly require that the module that provides the comparator is instantiated and known. You can't just use the compare function to create a polymorphic data structure, because Base instantiates a witness type for each defined compare function and captures it in the data structure's type to enable binary methods. Anyway, it is a complex issue; read on for easier (and harder) solutions.
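
    To make the role of the witness type more concrete, here is a minimal sketch that uses only the standard library. It is not Base's implementation: the names comparator, Comparator, Make_comparator, and set are hypothetical, and the set is a naive sorted list. It only illustrates how a phantom witness in the set's type lets the compiler reject a union of sets that were built with different orderings.

```ocaml
(* A comparison function tagged with a phantom "witness" type 'w. *)
type ('a, 'w) comparator = { compare : 'a -> 'a -> int }

module type Comparator = sig
  type t
  type witness
  val comparator : (t, witness) comparator
end

(* Hypothetical helper: wraps a compare function and exposes an
   abstract witness type that identifies this particular ordering. *)
module Make_comparator (M : sig type t val compare : t -> t -> int end) : sig
  type witness
  val comparator : (M.t, witness) comparator
end = struct
  type witness
  let comparator = { compare = M.compare }
end

(* A toy set: a sorted, duplicate-free list plus its comparator. *)
type ('a, 'w) set = { cmp : ('a, 'w) comparator; elts : 'a list }

let empty (type a w)
    (module C : Comparator with type t = a and type witness = w) : (a, w) set =
  { cmp = C.comparator; elts = [] }

let add s x =
  let rec ins = function
    | [] -> [x]
    | y :: ys as l ->
      let c = s.cmp.compare x y in
      if c = 0 then l else if c < 0 then x :: l else y :: ins ys
  in
  { s with elts = ins s.elts }

(* Both arguments must carry the same witness 'w, i.e. the same ordering. *)
let union (a : ('a, 'w) set) (b : ('a, 'w) set) : ('a, 'w) set =
  List.fold_left add a b.elts

module Int_asc = struct
  type t = int
  include Make_comparator (struct type t = int let compare = Int.compare end)
end

module Int_desc = struct
  type t = int
  include Make_comparator (struct
      type t = int
      let compare a b = Int.compare b a
    end)
end

let asc = add (add (empty (module Int_asc)) 2) 1
let desc = add (add (empty (module Int_desc)) 1) 2
let both = union asc (add (empty (module Int_asc)) 3)

(* [union asc desc] does not type check: Int_asc.witness and
   Int_desc.witness are different abstract types. *)
```

    A bare compare function carries no such tag in its type, which is why it is not enough on its own for statically checked binary operations.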

    Instantiating Sets on mutually dependent modules

    In fact, OCaml supports mutually recursive functors, and although I would suggest breaking the recursion by introducing a common abstraction on which both Region and Tribe depend, you can still encode your problem in OCaml, e.g.,

    module rec Tribe : sig
      type t
      val create : string -> t
      val compare : t -> t -> int
      val regions : t -> Region.t list
    end = struct
      type t = string * Region.t list
      let create name = name,[]
      let compare (x,_) (y,_) = String.compare x y
      let regions (_,r) = r
    end
    and Region : sig
      type t
      val empty : t
      val add_tribe : Tribe.t -> t -> t
      val tribes : t -> Tribe.t list
    end = struct
      module Tribes = Set.Make(Tribe)
      type t = Tribes.t
      let empty = Tribes.empty
      let add_tribe = Tribes.add
      let tribes = Tribes.elements
    end
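
    The same technique can be bent toward the RegionT shape from the question, where a region holds a mutable set of tribes. The following is only a sketch under assumptions: the record fields, the home accessor, and the tribe names are made up for illustration, and tribes are compared by name alone.

```ocaml
module rec Tribe : sig
  type t
  val create : string -> Region.t -> t
  val name : t -> string
  val home : t -> Region.t
  val compare : t -> t -> int
end = struct
  type t = { name : string; home : Region.t }
  let create name home = { name; home }
  let name t = t.name
  let home t = t.home
  (* Assumption: a tribe is identified by its name. *)
  let compare a b = String.compare a.name b.name
end

and Region : sig
  type t
  val create : unit -> t
  val add_tribe : t -> Tribe.t -> unit
  val get_local_tribes : t -> Tribe.t list
end = struct
  (* The standard functor can be applied to the recursive sibling. *)
  module Tribes = Set.Make (Tribe)
  type t = { mutable tribes : Tribes.t }
  let create () = { tribes = Tribes.empty }
  let add_tribe region tribe =
    region.tribes <- Tribes.add tribe region.tribes
  let get_local_tribes region = Tribes.elements region.tribes
end

let region = Region.create ()

let () =
  List.iter
    (fun name -> Region.add_tribe region (Tribe.create name region))
    ["picts"; "gauls"; "picts"]

(* The duplicate "picts" is dropped; elements come back sorted by name. *)
let names = List.map Tribe.name (Region.get_local_tribes region)
```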
    

    Breaking the Dependency Loop

    A much better solution would be to redesign your modules and break the dependency loop. The simplest approach would be just to choose some identifier that will be used to compare tribes, e.g., by their unique names,

    module Region : sig
      type 'a t
      val empty : 'a t
      val add_tribe : string -> 'a -> 'a t -> 'a t
      val tribes : 'a t -> 'a list
    end = struct
      module Tribes = Map.Make(String)
      type 'a t = 'a Tribes.t
      let empty = Tribes.empty
      let add_tribe = Tribes.add
      let tribes r = Tribes.bindings r |> List.map snd
    
    end
    
    module Tribe : sig
      type t
      val create : string -> t
      val name : t -> string
      val regions : t -> t Region.t list
      val conquer : t Region.t -> t -> t Region.t
    end = struct
      type t = Tribe of string * t Region.t list
      let create name = Tribe (name,[])
      let name (Tribe (name,_)) = name
      let regions (Tribe (_,r)) = r
      let conquer region tribe =
        Region.add_tribe (name tribe) tribe region
    end
    

    There are also tons of other options and in general, when you have mutual dependencies it is actually an indicator of a problem in your design. So, I would still revisit the design stage and eschew the circular dependencies.

    Creating Polymorphic Sets using the Vanilla OCaml Standard Library

    It is not an easy task, especially if you need to handle operations that involve several sets, e.g., Set.union. The problem is that Set.Make generates a new set type for each compare function, so when we need to take the union of two sets it is hard to prove to the OCaml compiler that they were created from the same type. It is possible but really painful; I am showing how to do it only to discourage you from doing so (and to showcase OCaml's dynamic typing capabilities).
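
    As a small illustration of that point, the two sets below have the same element type but come from different applications of Set.Make, so their types are unrelated. The module names Asc and Desc are chosen here only for the example.

```ocaml
module Asc = Set.Make (struct
    type t = int
    let compare = Int.compare
  end)

module Desc = Set.Make (struct
    type t = int
    let compare a b = Int.compare b a
  end)

let asc = Asc.of_list [1; 3; 2]
let desc = Desc.of_list [1; 3; 2]

(* Each set lists its elements according to its own ordering. *)
let asc_elements = Asc.elements asc
let desc_elements = Desc.elements desc

(* Rejected by the type checker, because Asc.t and Desc.t are
   distinct abstract types:
   let _ = Asc.union asc desc *)
```

    This static separation is exactly what is lost once the compare function is only known at run time.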

    First of all we need a witness type that will reify an OCaml type for the set into a concrete value.

    type _ witness = ..
    
    
    module type Witness = sig
      type t
      type _ witness += Id : t witness
    end
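
    To see what such a witness gives us, here is a hedged sketch of how two witnesses can be compared at run time to recover a type equality. The eq type and the fresh and same helpers are hypothetical names introduced for this illustration; they are not part of the standard library.

```ocaml
type _ witness = ..

module type Witness = sig
  type t
  type _ witness += Id : t witness
end

(* The usual type-equality GADT. *)
type (_, _) eq = Refl : ('a, 'a) eq

(* Hypothetical helper: every call defines a new Id constructor. *)
let fresh (type a) () : (module Witness with type t = a) =
  (module struct
    type t = a
    type _ witness += Id : t witness
  end)

(* Matching one Id against the other succeeds only when both come
   from the same Witness module, and then proves that a = b. *)
let same (type a b)
    (module A : Witness with type t = a)
    (module B : Witness with type t = b) : (a, b) eq option =
  match A.Id with
  | B.Id -> Some Refl
  | _ -> None

let w1 : (module Witness with type t = int) = fresh ()
let w2 : (module Witness with type t = int) = fresh ()

let same_witness = Option.is_some (same w1 w1)    (* true *)
let other_witness = Option.is_some (same w1 w2)   (* false *)
```

    Note that w1 and w2 both describe int, yet they do not match: the check is about the identity of the witness, not about the structure of the type.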
    

    Now we can define our polymorphic set as an existential that holds the set itself and the module with operations. It also holds the tid (for type identifier) that we will later use to recover the type 's of the set.

    type 'a set = Set : {
        set : 's;
        ops : (module Set.S with type elt = 'a and type t = 's);
        tid : (module Witness with type t = 's);
      } -> 'a set
    

    Now we can write the create function that will take the compare function and turn it into a set,

    let create : type a. (a -> a -> int) -> a set =
      fun compare ->
      let module S = Set.Make(struct
          type t = a
          let compare = compare
        end) in
      let module W = struct
        type t = S.t
        type _ witness += Id : t witness
      end in
      Set {
        set = S.empty;
        ops = (module S);
        tid = (module W);
      }
    

    The caveat here is that each call to create generates a new instance of the set type 's, so we can only compare/union/etc. two sets that descend from the same call to create. In other words, all sets that we combine in our implementation must share the same ancestor. But before that, let's take the pain and implement at least two operations, add and union,

    let add : type a. a -> a set -> a set =
      fun elt (Set {set; tid; ops=(module Set)}) -> Set {
          set = Set.add elt set;
          ops = (module Set);
          tid;
        }
    
    let union : type a. a set -> a set -> a set =
      fun (Set {set=s1; tid=(module W1); ops=(module Set)})
        (Set {set=s2; tid=(module W2)}) ->
        match W1.Id with
        | W2.Id -> Set {
            set = Set.union s1 s2;
            tid = (module W1);
            ops = (module Set);
          }
        | _ -> failwith "sets are potentially using different types"
    

    Now, we can play with it a bit,

    
    # let empty = create compare;;
    val empty : '_weak1 set = Set {set = <poly>; ops = <module>; tid = <module>}
    # let x1 = add 1 empty;;
    val x1 : int set = Set {set = <poly>; ops = <module>; tid = <module>}
    # let x2 = add 2 empty;;
    val x2 : int set = Set {set = <poly>; ops = <module>; tid = <module>}
    # let x3 = union x1 x2;;
    val x3 : int set = Set {set = <poly>; ops = <module>; tid = <module>}
    # let x4 = create compare;;
    val x4 : '_weak2 set = Set {set = <poly>; ops = <module>; tid = <module>}
    # union x3 x4;;
    Exception: Failure "sets are potentially using different types".
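
    If only single-set operations are needed, as in the NonFunctorSet signature from the question (create, add, remove, elements), the witness can be left out entirely. The following is a minimal sketch under that assumption; the constructor name T and the by_length example are made up for illustration, and binary operations such as union are deliberately not covered.

```ocaml
(* An existential package: the hidden set type 's together with
   the module that knows how to operate on it. *)
type 'a t = T : {
    set : 's;
    ops : (module Set.S with type elt = 'a and type t = 's);
  } -> 'a t

let create : type a. (a -> a -> int) -> a t =
  fun compare ->
  let module S = Set.Make (struct
      type t = a
      let compare = compare
    end) in
  T { set = S.empty; ops = (module S) }

let add : type a. a t -> a -> a t =
  fun (T { set; ops = (module S) }) x ->
  T { set = S.add x set; ops = (module S) }

let remove : type a. a t -> a -> a t =
  fun (T { set; ops = (module S) }) x ->
  T { set = S.remove x set; ops = (module S) }

let elements : type a. a t -> a list =
  fun (T { set; ops = (module S) }) -> S.elements set

(* Strings ordered (and identified) by their length only. *)
let by_length =
  create (fun a b -> Int.compare (String.length a) (String.length b))

let s = add (add (add by_length "ccc") "a") "bb"

(* "zz" has the same length as "bb", so this removes "bb". *)
let remaining = elements (remove s "zz")
```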