I’ve been struggling with a design problem for quite a while now. It comes down to cyclic dependencies, and I can’t find an elegant way to resolve them. I’m coming from C, where cyclic dependencies are both possible and quite easy to resolve.
The following is a very simplified picture of the files in the project that are of interest:
ast.ml (doesn’t actually have an interface, I’m not too keen on copying the whole type)
type loc = string * (int * int) * (int * int)
and id = string * loc
and decl =
| Decl_Func of decl_func
and decl_func = {
df_Name: id;
mutable df_SymTab: sym_tab option;
}
(* goes on for about 100 more types *)
symtab.mli
type t
type symbol =
| Sym_Func of Ast.decl_func
val lookup_by_id: Ast.id -> symbol
(there are more files to be added in the future)
In C I'd simply make the symbol table a pointer, and forward declare it. Problem solved. This, unfortunately, isn't possible in OCaml.
Each of the implementations is quite large, so I absolutely do not want to make everything recursive modules: that would mean an implementation file of 10kloc or even more, with a ton of code that is not really related (beyond the big recursive type).
How would I solve this, while still maintaining a somewhat modular design?
You're not the first to have that problem and there are numerous different solutions depending on workflow, taste and needs.
Here is a good way to think about it.
First, pull the leaves out of the recursive definition. By leaves, I mean types like loc or id that do not depend on any other type. They don't need to be in your recursive type definition and therefore shouldn't be.
Moreover, you'll probably have specific functions to handle locations and identifiers, and keeping those functions close to the type definition is good practice. So you can create an ast_loc.ml and an ast_id.ml file with the appropriate definitions and basic functions.
This may seem like little, but it will actually help make your code clearer with the added bonus of lightening up ast.ml.
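As a rough sketch of that split, the two files could look something like the following. Submodules stand in for ast_loc.ml and ast_id.ml so the snippet fits in one file. The helper names (to_string, name, loc) and the reading of the loc tuple as file, start position and end position are my assumptions, not something taken from your code.

```ocaml
(* Stand-in for ast_loc.ml: the location type together with its helpers. *)
module Ast_loc = struct
  (* Assumed meaning: file name, (start line, start col), (end line, end col). *)
  type t = string * (int * int) * (int * int)

  (* Hypothetical helper: render a location for error messages. *)
  let to_string ((file, (l1, c1), (l2, c2)) : t) =
    Printf.sprintf "%s:%d.%d-%d.%d" file l1 c1 l2 c2
end

(* Stand-in for ast_id.ml: depends on Ast_loc only, never on the AST. *)
module Ast_id = struct
  type t = string * Ast_loc.t

  (* Hypothetical accessors. *)
  let name ((n, _) : t) = n
  let loc ((_, l) : t) = l
end

let example_id : Ast_id.t = ("main", ("foo.c", (1, 0), (1, 4)))
```

ast.ml can then refer to Ast_loc.t and Ast_id.t, and anything that only needs locations or identifiers no longer has to depend on the whole AST.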
Another option is to use type parameters. I do not recommend using them extensively, as the extra indirection tends to make code harder to read. Check this out:
type 't v = Thing of 't
(* potentially in a different later file *)
type t = Stuff of t v
By using a type parameter, you can delay the use of recursion in your type definition. Note that I do not recommend using it for your whole AST, as it will make maintenance a pain, but if you have some middle nodes that behave quite independently of the rest, this may help.
These, for instance, are often useful:
type 'a named = { id : id; v : 'a; }
type 'a located = { loc : loc; v: 'a; }
This method is particularly useful if it helps factorize your type definition. But, as I have already stated: don't abuse it! It is easy to do, but hard to maintain.
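Applied to the types in the question, the idea might look like the sketch below. The list-based representation of Symtab.t, the extra table argument to lookup_by_id and its option result are assumptions made to keep the example small; your real symbol table will differ.

```ocaml
type loc = string * (int * int) * (int * int)
type id = string * loc

(* Would live in ast.ml: the symbol table type is left as a parameter,
   so this file does not need to know about Symtab at all. *)
type 'symtab decl_func = {
  df_Name : id;
  mutable df_SymTab : 'symtab option;
}

type 'symtab decl =
  | Decl_Func of 'symtab decl_func

(* Would live in symtab.ml: this is where the knot is tied. *)
module Symtab = struct
  (* Assumed representation: a plain association list. *)
  type t = { symbols : (string * symbol) list }
  and symbol =
    | Sym_Func of t decl_func

  let lookup_by_id (tab : t) ((name, _) : id) : symbol option =
    List.assoc_opt name tab.symbols
end

(* Usage: a function declaration that points back at its own table. *)
let f : Symtab.t decl_func =
  { df_Name = ("f", ("foo.c", (1, 0), (1, 1))); df_SymTab = None }

let tab = { Symtab.symbols = [ ("f", Symtab.Sym_Func f) ] }
let () = f.df_SymTab <- Some tab
```

The cost is visible right away: every type that mentions decl_func now carries the 'symtab parameter, which is why I would keep this to a few nodes.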
As of today, the Parsetree file of the OCaml compiler has 958 lines. That's what it's supposed to have. It is a complex tree structure and that should be visible.
Note that the file is just a type definition. Subsequent files contain the code to manipulate that definition (and usually don't introduce new types that are necessary outside their module).
In a way, arguing that you should separate the type definition from the code contradicts the point I made about loc and id, but this is a different case: loc and id are simple types that can be manipulated independently, whereas symbol only makes sense within your AST definition. Also, nothing keeps you from creating a symbol.ml file that manipulates that part of the AST without containing the type definition (comments are your friends, Merlin is a must).
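A minimal sketch of that layout, again with submodules standing in for ast.ml and symbol.ml. The sym_tab record and the functions in Symbol are hypothetical; only the shape matters: one recursive type definition, and a separate module that contains nothing but code.

```ocaml
(* Stand-in for ast.ml: types only, in a single recursive definition. *)
module Ast = struct
  type loc = string * (int * int) * (int * int)
  type id = string * loc

  type decl =
    | Decl_Func of decl_func
  and decl_func = {
    df_Name : id;
    mutable df_SymTab : sym_tab option;
  }
  (* Assumed representation of the symbol table. *)
  and sym_tab = { symbols : (string * symbol) list }
  and symbol =
    | Sym_Func of decl_func
end

(* Stand-in for symbol.ml: functions over Ast.symbol, no new types. *)
module Symbol = struct
  open Ast

  let name = function
    | Sym_Func f -> fst f.df_Name

  let lookup_by_id (tab : sym_tab) ((n, _) : id) : symbol option =
    List.assoc_opt n tab.symbols
end

(* Usage: the cycle between declarations and tables is now just data. *)
let f = { Ast.df_Name = ("f", ("foo.c", (1, 0), (1, 1))); df_SymTab = None }
let tab = { Ast.symbols = [ ("f", Ast.Sym_Func f) ] }
let () = f.Ast.df_SymTab <- Some tab
```

Since the cycle lives entirely inside the type definition, the modules that hold the code depend on Ast but never on each other.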
Also, recursive functors are not something I'd advise unless you really need them.