json postgresql adjacency-list

Adjacency List to JSON graph with Postgres


I have the following schema for the tags table:

CREATE TABLE tags (
    id integer NOT NULL,
    name character varying(255) NOT NULL,
    parent_id integer
);

I need to build a query that returns the following structure (represented here as YAML for readability):

- name: Ciencia
  parent_id: 
  id: 7
  children:
  - name: Química
    parent_id: 7
    id: 9
    children: []
  - name: Biología
    parent_id: 7
    id: 8
    children:
    - name: Botánica
      parent_id: 8
      id: 19
      children: []
    - name: Etología
      parent_id: 8
      id: 18
      children: []

After some trial and error and looking at similar questions on SO, I came up with this query:

    WITH RECURSIVE tagtree AS (
      SELECT tags.name, tags.parent_id, tags.id, json '[]' children
      FROM tags
      WHERE NOT EXISTS (SELECT 1 FROM tags tt WHERE tt.parent_id = tags.id)

      UNION ALL

      SELECT (tags).name, (tags).parent_id, (tags).id, array_to_json(array_agg(tagtree)) children FROM (
        SELECT tags, tagtree
        FROM tagtree
        JOIN tags ON tagtree.parent_id = tags.id
      ) v
      GROUP BY v.tags
    )

    SELECT array_to_json(array_agg(tagtree)) json
    FROM tagtree
    WHERE parent_id IS NULL

But it returns the following result when converted to YAML:

- name: Ciencia
  parent_id: 
  id: 7
  children:
  - name: Química
    parent_id: 7
    id: 9
    children: []
- name: Ciencia
  parent_id: 
  id: 7
  children:
  - name: Biología
    parent_id: 7
    id: 8
    children:
    - name: Botánica
      parent_id: 8
      id: 19
      children: []
    - name: Etología
      parent_id: 8
      id: 18
      children: []

The root node is duplicated. I could merge the results into the expected structure in my app code, but I feel I am close and that it could all be done from PG.

Here's an example with SQL Fiddle: http://sqlfiddle.com/#!15/1846e/1/0

Expected output: https://gist.github.com/maca/e7002eb10f36fcdbc51b

Actual output: https://gist.github.com/maca/78e84fb7c05ff23f07f4
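For reference, the app-side merge mentioned above could look roughly like the following. This is only a sketch in plain JavaScript: `mergeNodes` is a hypothetical helper name, and it assumes the query result has already been parsed into an array of objects shaped like the actual output shown earlier.

```javascript
// Merge sibling nodes that share an id, concatenating their children,
// then repeat the same merge one level down.
function mergeNodes(nodes) {
  const byId = new Map();
  for (const node of nodes) {
    const seen = byId.get(node.id);
    if (seen) {
      // Duplicate of a node we already have: keep one copy, pool the children.
      seen.children = seen.children.concat(node.children);
    } else {
      byId.set(node.id, { ...node, children: node.children.slice() });
    }
  }
  // A Map iterates in insertion order, so first-seen order is preserved.
  return Array.from(byId.values(), (node) => ({
    ...node,
    children: mergeNodes(node.children),
  }));
}

// The duplicated-root result from the question, as parsed JSON.
const actual = [
  { name: 'Ciencia', parent_id: null, id: 7, children: [
    { name: 'Química', parent_id: 7, id: 9, children: [] },
  ] },
  { name: 'Ciencia', parent_id: null, id: 7, children: [
    { name: 'Biología', parent_id: 7, id: 8, children: [
      { name: 'Botánica', parent_id: 8, id: 19, children: [] },
      { name: 'Etología', parent_id: 8, id: 18, children: [] },
    ] },
  ] },
];

const merged = mergeNodes(actual);
console.log(JSON.stringify(merged, null, 2));
```

With the sample data, `merged` holds a single Ciencia root whose children are Química and Biología, which matches the expected structure.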


Solution

  • Here's a solution using PLV8 for your schema.

    First, build a materialized path using a PL/pgSQL function and a recursive CTE.

    CREATE OR REPLACE FUNCTION get_children(tag_id integer)
    RETURNS json AS $$
    DECLARE
      result json;
    BEGIN
      SELECT array_to_json(array_agg(row_to_json(t))) INTO result
      FROM (
        -- Walk the tree from the roots, recording each node's ancestor path.
        WITH RECURSIVE tree AS (
          SELECT id, name, ARRAY[]::INTEGER[] AS ancestors
          FROM tags
          WHERE parent_id IS NULL

          UNION ALL

          SELECT tags.id, tags.name, tree.ancestors || tags.parent_id
          FROM tags
          JOIN tree ON tags.parent_id = tree.id
        )
        -- Keep only the nodes whose last ancestor (direct parent) is $1, i.e. tag_id.
        SELECT id, name, ARRAY[]::INTEGER[] AS children
        FROM tree
        WHERE $1 = tree.ancestors[array_upper(tree.ancestors, 1)]
      ) t;
      RETURN result;
    END;
    $$ LANGUAGE plpgsql;
    

    Then, build the tree from the output of the above function.

    CREATE OR REPLACE FUNCTION get_tree(data json) RETURNS json AS $$
    
    var root = [];
    
    for(var i in data) {
      build_tree(data[i]['id'], data[i]['name'], data[i]['children']);
    }
    
    function build_tree(id, name, children) {
      var exists = getObject(root, id);
      if(exists) {
           exists['children'] = children;
      }
      else {
        root.push({'id': id, 'name': name, 'children': children});
      }
    }
    
    
    function getObject(theObject, id) {
        var result = null;
        if(theObject instanceof Array) {
            for(var i = 0; i < theObject.length; i++) {
                result = getObject(theObject[i], id);
                if (result) {
                    break;
                }   
            }
        }
        else
        {
            for(var prop in theObject) {
                if(prop == 'id') {
                    if(theObject[prop] === id) {
                        return theObject;
                    }
                }
                if(theObject[prop] instanceof Object || theObject[prop] instanceof Array) {
                    result = getObject(theObject[prop], id);
                    if (result) {
                        break;
                    }
                } 
            }
        }
        return result;
    }
    
        return JSON.stringify(root);
    $$ LANGUAGE plv8 IMMUTABLE STRICT;
    

    This will yield the required JSON mentioned in your question. Hope that helps.

    I've written a detailed post/breakdown of how this solution works here.
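As a point of comparison, the tree-building step can also be sketched as a single pass over flat `(id, name, parent_id)` rows, with no recursive lookup. This is plain JavaScript rather than a tested plv8 function body; `buildTree` and the sample `rows` are hypothetical names, and the rows are assumed to come from something like `SELECT id, name, parent_id FROM tags`.

```javascript
// Build a nested tree from adjacency-list rows.
function buildTree(rows) {
  // First pass: create one node per row, indexed by id.
  const nodes = new Map();
  for (const row of rows) {
    nodes.set(row.id, {
      name: row.name,
      parent_id: row.parent_id,
      id: row.id,
      children: [],
    });
  }

  // Second pass: attach each node to its parent.
  // Nodes with no parent (or with a parent_id not present in rows) become roots.
  const roots = [];
  for (const node of nodes.values()) {
    const parent = node.parent_id === null ? undefined : nodes.get(node.parent_id);
    if (parent) {
      parent.children.push(node);
    } else {
      roots.push(node);
    }
  }
  return roots;
}

// Sample rows matching the data in the question.
const rows = [
  { id: 7, name: 'Ciencia', parent_id: null },
  { id: 9, name: 'Química', parent_id: 7 },
  { id: 8, name: 'Biología', parent_id: 7 },
  { id: 19, name: 'Botánica', parent_id: 8 },
  { id: 18, name: 'Etología', parent_id: 8 },
];

const tree = buildTree(rows);
console.log(JSON.stringify(tree, null, 2));
```

Because every node is indexed by id before any linking happens, the rows do not need to arrive in parent-before-child order; children simply appear in the order their rows were given.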