arrays postgresql polymorphism plpgsql postgresql-9.4

How do I get the type of an array's elements?


I'm writing a polymorphic PL/pgSQL function that iterates over an array. I would like to use FOREACH, but I cannot figure out how to declare a temporary variable with the right type.

My function is below; see the comment on line 4 for details.

CREATE OR REPLACE FUNCTION uniq(ary anyarray) RETURNS anyarray AS $$
DECLARE
  ret ary%TYPE := '{}';
  v ???; -- how do I get the element type of @ary@?
BEGIN
  IF ary IS NULL THEN
    return NULL;
  END IF;

  FOREACH v IN ARRAY ary LOOP
    IF NOT v = any(ret) THEN
      ret = array_append(ret, v);
    END IF;
  END LOOP;

  RETURN ret;
END;
$$ LANGUAGE plpgsql;

Solution

  • Answer to primary question

    You cannot declare a variable of a polymorphic type without a "template" variable or parameter.

    There are related examples in the manual at the end of the chapter Declaring Function Parameters, but this trick is not covered:

    Add another IN, INOUT or OUT parameter with data type ANYELEMENT to the function definition. It resolves to the matching element type automatically and can be (ab)used as a variable inside the function body directly, or as a template for more variables:

    CREATE OR REPLACE FUNCTION uniq1(ary ANYARRAY, v ANYELEMENT = NULL)
      RETURNS anyarray
      LANGUAGE plpgsql AS
    $func$
    DECLARE
       ret      ary%TYPE := '{}';
       some_var v%TYPE;  -- we could declare more variables now
                         -- but we don't need to
    BEGIN
       IF ary IS NULL THEN
          RETURN NULL;
       END IF;
    
       FOREACH v IN ARRAY ary LOOP  -- instead, we can use v directly
          IF NOT v = any(ret) THEN
             ret := array_append(ret, v);
          END IF;
       END LOOP;
    
       RETURN ret;
    END
    $func$;

    Copying types like that only works in the DECLARE section and is different from type casting. It is explained in the manual, in the section "Copying Types" of the PL/pgSQL chapter on declarations.

    Assign a default value so that the added parameter does not have to be included in the function call: v ANYELEMENT = NULL

    Call (unchanged):

    SELECT uniq1('{1,2,1}'::int[]);
    SELECT uniq1('{foo,bar,bar}'::text[]);
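
    To check what the extra parameter resolves to, a small probe function can report the type of a variable declared with v%TYPE. This is only a sketch: the function name elem_type_of is hypothetical and not part of the original answer.

```sql
-- Hypothetical helper: reports the element type that v resolves to.
CREATE OR REPLACE FUNCTION elem_type_of(ary anyarray, v anyelement = NULL)
  RETURNS regtype
  LANGUAGE plpgsql AS
$func$
DECLARE
   probe v%TYPE;  -- copies the resolved element type of ary
BEGIN
   RETURN pg_typeof(probe);
END
$func$;

SELECT elem_type_of('{1,2,1}'::int[]);         -- integer
SELECT elem_type_of('{foo,bar,bar}'::text[]);  -- text
```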
    

    Better function

    I would actually use an OUT parameter for convenience and invert the test logic:

    CREATE OR REPLACE FUNCTION uniq2(ary ANYARRAY, elem ANYELEMENT = NULL
                                   , OUT ret ANYARRAY)
      RETURNS anyarray
      LANGUAGE plpgsql AS
    $func$
    BEGIN
       IF ary IS NULL
          THEN RETURN;
          ELSE ret := '{}';  -- init
       END IF;
    
       FOREACH elem IN ARRAY ary LOOP
          IF elem = ANY(ret) THEN  -- do nothing
          ELSE
             ret := array_append(ret, elem);
          END IF;
       END LOOP;
    END
    $func$;
    

    But this still does not cover all cases containing null elements.
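
    The reason is SQL's three-valued logic: a comparison with NULL using = never yields true, so elem = ANY(ret) evaluates to NULL for a NULL element, the ELSE branch runs, and every NULL is appended again. A quick illustration with literal values:

```sql
SELECT 1         = ANY ('{1,NULL}'::int[]) AS one_found    -- true
     , 2         = ANY ('{1,NULL}'::int[]) AS two_found    -- NULL (unknown), not false
     , NULL::int = ANY ('{1,NULL}'::int[]) AS null_found;  -- NULL, although the array contains NULL
```

    As a consequence, uniq2('{1,NULL,NULL}'::int[]) keeps both NULL elements.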

    Proper function

    To work for null elements as well:

    CREATE OR REPLACE FUNCTION uniq3(ary ANYARRAY, elem ANYELEMENT = NULL
                                   , OUT ret ANYARRAY)
      RETURNS anyarray
      LANGUAGE plpgsql AS
    $func$
    BEGIN
       IF ary IS NULL
          THEN RETURN;
          ELSE ret := '{}';  -- init
       END IF;
    
       FOREACH elem IN ARRAY ary LOOP
          IF elem IS NULL THEN  -- special test for NULL
             -- array_length() returns NULL for an empty array, so compare NULL-safely
             IF array_length(array_remove(ret, NULL), 1) IS NOT DISTINCT FROM array_length(ret, 1) THEN
                ret := array_append(ret, NULL);
             END IF;
          ELSIF elem = ANY(ret) THEN  -- do nothing
          ELSE
             ret := array_append(ret, elem);
          END IF;
       END LOOP;
    END
    $func$;
    

    Checking for NULL in an array is a bit of a pain.

    All of these functions are just proof of concept. I would use none of them. Instead:

    Superior solutions with plain SQL

    In Postgres 9.4 or later, use WITH ORDINALITY to preserve the original order of elements.

    Basic code for a single value:

    SELECT ARRAY (
       SELECT elem
       FROM  (
          SELECT DISTINCT ON (elem) elem, i
          FROM   unnest('{1,2,1,NULL,4,NULL}'::int[]) WITH ORDINALITY u(elem, i)
          ORDER  BY elem, i
          ) sub
       ORDER  BY i) AS uniq;
    

    Returns:

    uniq
    ------------
    {1,2,NULL,4}
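
    If a function is still wanted, the same query could be wrapped in a polymorphic SQL function. This is a minimal sketch, not part of the original answer: the name uniq_sql is hypothetical, it assumes Postgres 9.4 or later (WITH ORDINALITY) and an element type with a default sort order (required by DISTINCT ON and ORDER BY). The CASE guard returns NULL for NULL input rather than an empty array.

```sql
-- Hypothetical wrapper around the plain SQL query above.
CREATE OR REPLACE FUNCTION uniq_sql(ary anyarray)
  RETURNS anyarray
  LANGUAGE sql IMMUTABLE AS
$func$
SELECT CASE WHEN ary IS NOT NULL THEN   -- NULL in, NULL out
   ARRAY (
      SELECT elem
      FROM  (
         SELECT DISTINCT ON (elem) elem, ord  -- first occurrence of each value
         FROM   unnest(ary) WITH ORDINALITY u(elem, ord)
         ORDER  BY elem, ord
         ) sub
      ORDER  BY ord                           -- restore original order
      )
   END
$func$;

SELECT uniq_sql('{1,2,1,NULL,4,NULL}'::int[]);  -- {1,2,NULL,4}
```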
    

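    About DISTINCT ON: it keeps only the first row for each distinct value of elem according to the ORDER BY, here the row with the smallest ordinal position.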

    Built into a query:

    SELECT *
    FROM   test t
         , LATERAL (
       SELECT ARRAY (
          SELECT elem
          FROM  (
             SELECT DISTINCT ON (elem) elem, i
             FROM   unnest(t.arr) WITH ORDINALITY u(elem, i)
             ORDER  BY elem, i
             ) sub
          ORDER BY i) AS arr
       ) a;
    

    This has a tiny corner case: it returns an empty array for a NULL array. To cover all bases:

    SELECT t.*, CASE WHEN t.arr IS NULL THEN NULL ELSE a.arr END AS arr
    FROM   test t
         , LATERAL (
       SELECT ARRAY (
          SELECT elem
          FROM  (
             SELECT DISTINCT ON (elem) elem, ord
             FROM   unnest(t.arr) WITH ORDINALITY u(elem, ord)
             ORDER  BY elem, ord
             ) sub
          ORDER BY ord) AS arr
       ) a;
    

    Or:

    SELECT *
    FROM   test t
    LEFT   JOIN LATERAL (
       SELECT ARRAY (
          SELECT elem
          FROM  (
             SELECT DISTINCT ON (elem) elem, i
             FROM   unnest(t.arr) WITH ORDINALITY u(elem, i)
             ORDER  BY elem, i
             ) sub
          ORDER BY i) AS arr
       ) a ON t.arr IS NOT NULL;
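
    The corner case can be reproduced with a small sample table. The table definition and data below are assumptions for illustration, since the original does not show the table test:

```sql
-- Hypothetical sample data: a regular array, an empty array and a NULL array.
CREATE TEMP TABLE test (id int, arr int[]);
INSERT INTO test VALUES
   (1, '{1,2,1,NULL,4,NULL}')
 , (2, '{}')
 , (3, NULL);

SELECT t.id
     , a.arr                                       AS plain    -- {} for the NULL row
     , CASE WHEN t.arr IS NOT NULL THEN a.arr END  AS guarded  -- NULL for the NULL row
FROM   test t
     , LATERAL (
   SELECT ARRAY (
      SELECT elem
      FROM  (
         SELECT DISTINCT ON (elem) elem, ord
         FROM   unnest(t.arr) WITH ORDINALITY u(elem, ord)
         ORDER  BY elem, ord
         ) sub
      ORDER  BY ord) AS arr
   ) a
ORDER  BY t.id;
```

    Only the row with id 3 differs between the two columns: plain is an empty array, guarded is NULL.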
    

    In Postgres 9.3, where WITH ORDINALITY is not yet available, you can substitute generate_subscripts(). Note that LATERAL itself requires Postgres 9.3 or later:

    SELECT *
    FROM   test t
         , LATERAL (
       SELECT ARRAY (
          SELECT elem
          FROM  (
             SELECT DISTINCT ON (t.arr[i]) t.arr[i] AS elem, i
             FROM   generate_subscripts(t.arr, 1) i
             ORDER  BY t.arr[i], i
             ) sub
          ORDER  BY i
          ) AS arr
       ) a;
    

    This variant is needed for SQL Fiddle, which currently only supports Postgres 9.3, so WITH ORDINALITY is not available there:

    SQL Fiddle.