ocamllazy-evaluationmemory-layoutdata-representation

How does OCaml represent lazy values at runtime?


This chapter in Real World OCaml describes the runtime memory layout of different data types. However, there is no discussion on lazy values.

Note: This question is about the OCaml compiler/runtime. I understand that there is no standard specification for how lazy values should be implemented by an OCaml compiler/runtime.


Solution

  • In simple terms, a lazy value that needs computation is represented as a thunk which is rewritten by a reference to the computed value (if such exists) once the value is forced. A lazy value that doesn't need a computation (and is not a float) is represented as it is.

    First, let's focus on values that do not require computation. Those are constants, functions (not their applications, nor partial applications), or identifiers. They are represented without any extra boxing and have the same representation as their eager counterparts, e.g.,

    # Obj.repr (lazy 42) == Obj.repr 42;;
    - : bool = true
    
    # Obj.tag (Obj.repr sin) = (Obj.tag (Obj.repr (lazy sin)));;
    - : bool = true
    
    # Obj.closure_tag = (Obj.tag (Obj.repr (lazy sin)));;
    - : bool = true
    

    The same is true for types which usually have a boxed representation, e.g., strings,

    let s = "hello" in
    Obj.repr s == Obj.repr (lazy s);;
    - : bool = true
    

    The only exception is the float type (because of another optimization which enables unboxed storage of arrays or records of floats, which would be broken otherwise). Floats are stored in the forwarded notation, as a boxed value with a header indicating the Forward_tag and the only field being the stored value.

    Values that are classified as computations are stored as thunks. If we will speak OCaml (note, that is not the actual implementation, but the concept is the same)

    type 'a value = Deferred of (unit -> 'a) | Ready of 'a 
    type 'a lazy_t = {
      mutable lazy : 'a value;
    }
    

    and the lazy operator captures the enclosed expression, i.e., on the syntactic level of the language, it translates something like this:

    lazy x => {lazy = Deferred (fun () -> x)
    

    Here are some interactions with OCaml which showcase the representation:

    let x = lazy (2+2) in
    Obj.lazy_tag = Obj.tag (Obj.repr x);;
    - : bool = true
    
    let x = lazy (2+2) in
    let _ = Lazy.force x in
    Obj.forward_tag = Obj.tag (Obj.repr x);;
    - : bool = true
    

    As we can see a computation is stored as a thunk (and uses 4 words)

    let x = lazy (2+2) in
    Obj.reachable_words (Obj.repr x);;
      - : int = 4
    

    while after we force the computation it will be stored as a forwarded (boxed) int,

    let x = lazy (2+2) in
    let _ = Lazy.force x in
    Obj.reachable_words (Obj.repr x);;
    - : int = 2
    

    ) There is also a special case for exceptions, which are computations that diverge and therefore do not have values, thus the couldn't be translated to the forwarded form. As a result, exceptions remain lazy values even after being forced, e.g.,

    let x = lazy (raise Not_found) in 
    Obj.lazy_tag = Obj.tag (Obj.repr x);;
    - : bool = true
    
    let x = lazy (raise Not_found) in 
    try Lazy.force x with Not_found -> 
    Obj.lazy_tag = Obj.tag (Obj.repr x)
    

    Implementation wise, a computation which raises an exception is substituted by a function that raises this exception. So there is still some memoization happening, in other words, if you had lazy (x (); y (); z ()) and y () is raising an exception E, then the lazy value payload will be substituted by a function fun () -> raise E, i.e., it will never repeat x (), nor it will ever reach z ().

    Lazy values in Multicore

    Laziness is a restricted form of mutability and, like any other mutability, it complicates things when parallel computations come into play.

    In OCaml implementation, lazy values are not only changing their value throughout time but also type and representation. The representation of a value in OCaml is dictated by the header. For performance reasons, the OCaml Multicore Team decided to forbid any changed to the header, thus values can no longer change their representations (otherwise, if they would allow changing the header, each access to the header field would require an expensive synchronization).

    A solution to this problem introduces a new level of indirection, where the state of the lazy value is stored in its payload (which actually makes the new lazy representation even closer to our conceptual view).

    Before we delve into the implementation there is also one more thing to be explained about the lazy values in OCaml. When a lazy value is forced, it is not immediately updated to the result of the computation, since the computation itself could be recursive and reference the lazy value. That's why on the first step before the computation attached to the lazy value is called, the payload of a lazy function is substituted with a function that raises the Lazy.Undefined exception, so that not well-formed recursive expressions still terminate nicely.

    This trick was hijacked and reused by the Multicore Team to make lazy values safe in the presence of multiple threads trying to force it at the same time. When a lazy value is being forced they substitute its payload with a function called bomb which checks whether the lazy value is referenced again (either because the computation recurses, or because it is shared with another thread) during the evaluation, and if the reference is from the same domain then it triggers the Undefined exception, indicating that this is not a well-formed lazy value, or if the domain is different, then it raises theRacyLazy exception, that indicates that there is an unserialized access to the same lazy value from different domains.

    The crucial moment here is to understand that since lazy is a mutable value, it is still the responsibility of a user to properly serialize accesses to it. How to do this properly and efficiently is still in the Future Work section.

    References to implementation

    This is