Tags: mongodb, tree, aggregation-framework, ancestor, descendant

MongoDB Tree Model: Get all ancestors, Get all descendants


I have an arbitrary tree structure.

Example data structure:

root
  |--node1
  |     |--node2
  |     |     |--leaf1
  |     |
  |     |--leaf2
  |
  |--node3
        |--leaf3

Each node and leaf has 2 properties: id and name.


The important queries:

1.: A leaf id is given. The query should return the whole path from root to that leaf, with each node's id and name properties.

It's not important whether the return value is a sorted array of nodes or an object where the nodes are nested.

Example: If the id of leaf2 is given, the query should return: root(id, name), node1(id, name), leaf2(id, name).


2.: Given any node id: Get the whole (sub)tree. Here it would be nice to retrieve a single object where each node has a children array.


Thoughts, trials and errors:

1.: First I tried to simply model the tree as a single JSON document, but then the query would become impossible: there's no way to find out at which nesting level the leaf is. And even if I knew the whole path of ids from root to the leaf, I'd have to use a projection with multiple positional operators, which MongoDB doesn't support at the moment. Additionally, it's not possible to index the leaf ids because the nesting depth is unbounded.

2.: Next idea was to use a flat data design, where each node has an array which contains the node's ancestor ids:

{
  id: ...,
  name: ...,
  ancestors: [ rootId, node1Id, ... ]
}

This way I'd need only 2 queries to get the whole path from root to some node or leaf, which is quite nice.
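
For illustration, the two queries could look roughly like the sketch below. A plain array stands in for the collection so the snippet runs without a database; the collection name `nodes` in the comments, the sample names, and the helper `pathFromRoot` are made up for this example.

```javascript
// In-memory stand-in for a hypothetical `nodes` collection (data model 2,
// ancestors stored root-first).
const nodes = [
  { id: "root",  name: "Root",   ancestors: [] },
  { id: "node1", name: "Node 1", ancestors: ["root"] },
  { id: "node2", name: "Node 2", ancestors: ["root", "node1"] },
  { id: "leaf1", name: "Leaf 1", ancestors: ["root", "node1", "node2"] },
  { id: "leaf2", name: "Leaf 2", ancestors: ["root", "node1"] },
  { id: "node3", name: "Node 3", ancestors: ["root"] },
  { id: "leaf3", name: "Leaf 3", ancestors: ["root", "node3"] },
];

function pathFromRoot(collection, targetId) {
  // Query 1, in the mongo shell: db.nodes.findOne({ id: targetId })
  const target = collection.find((doc) => doc.id === targetId);
  if (!target) return [];

  // Query 2, in the mongo shell: db.nodes.find({ id: { $in: target.ancestors } })
  const ancestorDocs = collection.filter((doc) =>
    target.ancestors.includes(doc.id)
  );

  // $in does not guarantee result order, so re-order the documents
  // according to the target's ancestors array.
  const byId = new Map(ancestorDocs.map((doc) => [doc.id, doc]));
  const path = target.ancestors.map((id) => byId.get(id)).filter(Boolean);
  path.push(target);

  // Keep only the id and name properties.
  return path.map(({ id, name }) => ({ id, name }));
}

const leaf2Path = pathFromRoot(nodes, "leaf2").map((n) => n.id);
console.log(leaf2Path); // → [ 'root', 'node1', 'leaf2' ]
```

The re-ordering step matters because the second query returns the ancestors in whatever order the server finds them, not in path order.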

Questions:

If I choose data model 2.: How can I get the whole tree, or a subtree?

Getting all descendants is easy: find({ancestors:"myStartingNodeId"}). But those will of course not be sorted or nested.

Is there a way using the aggregation framework or a completely different data model to solve this problem?

Thank you!


Solution

  • Here's the data structure I finally came up with. It's optimized for read queries. Some write queries (like moving subtrees) can be painful.

    {
      id: "...",
      ancestors: ["parent_node_id", ..., "root_node_id"], // order is important!
      children: ["child1_id", "child2_id", ...]
    }
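
A minimal sketch of how documents in this shape could be turned into a root-to-node path and a nested subtree on the client. It assumes the documents were already fetched (for example the starting node plus the result of find({ancestors: startId})); the helper names `rootToNodeIds` and `buildSubtree` and the sample ids are hypothetical.

```javascript
// Sample documents in the structure above (ancestors stored parent-first).
const docs = [
  { id: "root",  ancestors: [],                         children: ["node1", "node3"] },
  { id: "node1", ancestors: ["root"],                   children: ["node2", "leaf2"] },
  { id: "node2", ancestors: ["node1", "root"],          children: ["leaf1"] },
  { id: "leaf1", ancestors: ["node2", "node1", "root"], children: [] },
  { id: "leaf2", ancestors: ["node1", "root"],          children: [] },
];

// Path from root to a node: the ancestors array is parent-first,
// so reverse a copy of it and append the node itself.
function rootToNodeIds(doc) {
  return [...doc.ancestors].reverse().concat(doc.id);
}

// Nest a flat list of documents into one object with children arrays.
// The order of each children array is taken from the stored `children` ids.
function buildSubtree(flatDocs, startId) {
  const byId = new Map(flatDocs.map((doc) => [doc.id, doc]));
  const expand = (id) => {
    const doc = byId.get(id);
    if (!doc) return null; // child id whose document was not fetched
    return {
      id: doc.id,
      children: doc.children.map(expand).filter(Boolean),
    };
  };
  return expand(startId);
}

const leaf1Doc = docs.find((doc) => doc.id === "leaf1");
const leaf1PathIds = rootToNodeIds(leaf1Doc);
console.log(leaf1PathIds); // → [ 'root', 'node1', 'node2', 'leaf1' ]

const node1Tree = buildSubtree(docs, "node1");
console.log(JSON.stringify(node1Tree));
```

Because the ancestors array is ordered, the path needs no sorting, and because each node lists its children, the nesting can be done in a single pass over the fetched documents.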
    

    Benefits:

    How to use it:

    Drawbacks: