postgresqlfunctionplpgsqldynamic-sqlmultiple-tables

Get IDs from multiple columns in multiple tables as one set or array


I have multiple tables with each two rows of interest: connection_node_start_id and connection_node_end_id. My goal is to get a collection of all those IDs, either as a flat ARRAY or as a new TABLE consisting of one row.

Example output ARRAY:

result = {1,4,7,9,2,5}

Example output TABLE:

IDS
-------
1
4
7
9
2 
5

My fist attempt is somewhat clumsy and does not work properly as the SELECT statement just returns one row. It seems there must be a simple way to do this, can someone point me into the right direction?

CREATE OR REPLACE FUNCTION get_connection_nodes(anyarray)
  RETURNS anyarray AS
$$
DECLARE
  table_name varchar;
  result integer[];
  sel integer[];
BEGIN
  FOREACH table_name IN ARRAY $1
  LOOP
     RAISE NOTICE 'table_name(%)',table_name;
     EXECUTE 'SELECT ARRAY[connection_node_end_id, 
                           connection_node_start_id] FROM ' || table_name INTO sel;
    RAISE NOTICE 'sel(%)',sel;
    result  := array_cat(result, sel);  
  END LOOP;
  RETURN result;            
END
$$
  LANGUAGE 'plpgsql';

Test table:

connection_node_start_id | connection_node_end_id
--------------------------------------------------
 1                       | 4 
 7                       | 9 

Call:

SELECT get_connection_nodes(ARRAY['test_table']);

Result:

{1,4}  -- only 1st row, rest is missing

Solution

  • CREATE OR REPLACE FUNCTION get_connection_nodes(text[])
      RETURNS TABLE (ids int)
      LANGUAGE plpgsql AS
    $func$
    DECLARE
       _tbl text;
    BEGIN
       FOREACH _tbl IN ARRAY $1
       LOOP
          RETURN QUERY EXECUTE format('
             SELECT t.id
             FROM   %I, LATERAL (VALUES (connection_node_start_id)
                                      , (connection_node_end_id)) t(id)'
           , _tbl);
       END LOOP;
    END
    $func$;
    

    Related answer on dba.SE:

    Or drop the loop and concatenate a single query. Probably fastest:

    CREATE OR REPLACE FUNCTION get_connection_nodes2(text[])
      RETURNS TABLE (ids int)
      LANGUAGE plpgsql AS
    $func$
    BEGIN
    RETURN QUERY EXECUTE (    
       SELECT string_agg(format(
                 'SELECT t.id FROM %I, LATERAL (VALUES (connection_node_start_id)
                                                     , (connection_node_end_id)) t(id)'
               , tbl), ' UNION ALL ')
       FROM   unnest($1) tbl
       );
    END
    $func$;
    

    Related:

    LATERAL was introduced with Postgres 9.3.

    For very old Postgres version

    You can use the set-returning function unnest() in the SELECT list, too:

    CREATE OR REPLACE FUNCTION get_connection_nodes2(text[])
      RETURNS TABLE (ids int)
      LANGUAGE plpgsql AS
    $func$
    BEGIN
       RETURN QUERY EXECUTE (
       SELECT string_agg(
                'SELECT unnest(ARRAY[connection_node_start_id
                                   , connection_node_end_id]) FROM ' || tbl
              , ' UNION ALL '
              )
       FROM (SELECT quote_ident(tbl) AS tbl FROM unnest($1) tbl) t
       );
    END
    $func$;
    

    Should work with pg 8.4+ (or maybe even older). Works with current Postgres (9.4) as well, but LATERAL is much cleaner.

    Or make it very simple:

    CREATE OR REPLACE FUNCTION get_connection_nodes3(text[])
      RETURNS TABLE (ids int)
      LANGUAGE plpgsql AS
    $func$
    BEGIN
       RETURN QUERY EXECUTE (
       SELECT string_agg(format(
                 'SELECT connection_node_start_id FROM %1$I
                  UNION ALL
                  SELECT connection_node_end_id FROM %1$I'
               , tbl), ' UNION ALL ')
       FROM   unnest($1) tbl
       );
    END
    $func$;
    

    format() was introduced with pg 9.1.

    Might be a bit slower with big tables because each table is scanned once for every column (so 2 times here). Sort order in the result is different, too - but that does not seem to matter for you.

    Be sure to sanitize escape identifiers to defend against SQL injection and other illegal syntax. Details: