ocamlocaml-batteries

Use of "%identity" in mutable accumulator


To enable tail recursion, the ocaml batteries library uses a mutable accumulator in the list module:

type 'a mut_list =  {
  hd: 'a;
  mutable tl: 'a list
}

external inj : 'a mut_list -> 'a list = "%identity"

module Acc = struct
  let dummy () =
    { hd = Obj.magic (); tl = [] }
  let create x =
    { hd = x; tl = [] }
  let accum acc x =
    let cell = create x in
    acc.tl <- inj cell;
    cell
end

It seems like the mutable list is simply cast to the immutable list type, but mut_list and list do not have the same type definition.

Is this safe? How and why does this code work?


Solution

  • The OCaml manual describes the memory layout for various types. For records:

    Records are also represented by zero-tagged blocks. The ordering of labels in the record type declaration determines the layout of the record fields: the value associated to the label declared first is stored in field 0 of the block, the value associated to the second label goes in field 1, and so on.

    And for variants:

    Constructed terms are represented either by unboxed integers (for constant constructors) or by blocks whose tag encode the constructor (for non-constant constructors). The constant constructors and the non-constant constructors for a given concrete type are numbered separately, starting from 0, in the order in which they appear in the concrete type declaration. A constant constructor is represented by the unboxed integer equal to its constructor number. A non-constant constructor declared with n arguments is represented by a block of size n, tagged with the constructor number; the n fields contain its arguments.

    From this we can infer that the :: constructor is laid out as

    | 0 | <value in current element> | <pointer to next element> |
    

    This is because the :: constructor is the first non-constant constructor, thus tagged "0", followed by its arguments.

    The mutable record is laid out the same way, since records are always tagged "0" followed by the contents in order.

    From these, we can infer that this use of %identity does what we want.