I'm writing a polymorphic PL/pgSQL function that iterates over an array. I would like to use FOREACH, but I cannot figure out how to declare a temporary variable with the right type.
My function is below; for more information, see the comment on line 4.
CREATE OR REPLACE FUNCTION uniq(ary anyarray) RETURNS anyarray AS $$
DECLARE
ret ary%TYPE := '{}';
v ???; -- how do I get the element type of @ary@?
BEGIN
IF ary IS NULL THEN
return NULL;
END IF;
FOREACH v IN ARRAY ary LOOP
IF NOT v = any(ret) THEN
ret = array_append(ret, v);
END IF;
END LOOP;
RETURN ret;
END;
$$ LANGUAGE plpgsql;
You cannot declare a variable of a polymorphic type without a "template" variable or parameter.
There are related examples in the manual at the end of the chapter Declaring Function Parameters, but this trick is not covered:
Add another IN, INOUT or OUT parameter with data type ANYELEMENT to the function definition. It resolves to the matching element type automatically and can be (ab)used as a variable inside the function body directly, or as a template for more variables:
CREATE OR REPLACE FUNCTION uniq1(ary ANYARRAY, v ANYELEMENT = NULL)
RETURNS anyarray
LANGUAGE plpgsql AS
$func$
DECLARE
ret ary%TYPE := '{}';
some_var v%TYPE; -- we could declare more variables now
-- but we don't need to
BEGIN
IF ary IS NULL THEN
RETURN NULL;
END IF;
FOREACH v IN ARRAY ary LOOP -- instead, we can use v directly
IF NOT v = any(ret) THEN
ret := array_append(ret, v);
END IF;
END LOOP;
RETURN ret;
END
$func$;
Copying types like that only works in the DECLARE section and is different from type casting. The manual explains it under "Copying Types".
Assign a default value, so the added parameter does not have to be included in the function call: ANYELEMENT = NULL
Call (unchanged):
SELECT uniq1('{1,2,1}'::int[]);
SELECT uniq1('{foo,bar,bar}'::text[]);
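To see that the added parameter really resolves to the element type, here is a minimal sketch. The function name elem_type_demo is a hypothetical helper, not part of the original answer; it only reports the type that v was resolved to.

```sql
-- Hypothetical helper: returns the data type the ANYELEMENT parameter resolved to.
CREATE OR REPLACE FUNCTION elem_type_demo(ary ANYARRAY, v ANYELEMENT = NULL)
  RETURNS regtype
  LANGUAGE plpgsql AS
$func$
BEGIN
   RETURN pg_typeof(v);  -- v takes the element type of ary
END
$func$;

SELECT elem_type_demo('{1,2,1}'::int[]);     -- integer
SELECT elem_type_demo('{foo,bar}'::text[]);  -- text
```

The second argument is never passed; its only job is to carry the element type into the function body.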
I would actually use an OUT parameter for convenience and invert the test logic:
CREATE OR REPLACE FUNCTION uniq2(ary ANYARRAY, elem ANYELEMENT = NULL
, OUT ret ANYARRAY)
RETURNS anyarray
LANGUAGE plpgsql AS
$func$
BEGIN
IF ary IS NULL
THEN RETURN;
ELSE ret := '{}'; -- init
END IF;
FOREACH elem IN ARRAY ary LOOP
IF elem = ANY(ret) THEN -- do nothing
ELSE
ret := array_append(ret, elem);
END IF;
END LOOP;
END
$func$;
But this still does not cover all cases containing null elements.
To work for null elements as well:
CREATE OR REPLACE FUNCTION uniq3(ary ANYARRAY, elem ANYELEMENT = NULL
, OUT ret ANYARRAY)
RETURNS anyarray
LANGUAGE plpgsql AS
$func$
BEGIN
IF ary IS NULL
THEN RETURN;
ELSE ret := '{}'; -- init
END IF;
FOREACH elem IN ARRAY ary LOOP
IF elem IS NULL THEN -- special test for NULL
IF array_length(array_remove(ret, NULL), 1) IS NOT DISTINCT FROM array_length(ret, 1) THEN -- NULL-safe: array_length() is NULL for an empty array
ret := array_append(ret, NULL);
END IF;
ELSIF elem = ANY(ret) THEN -- do nothing
ELSE
ret := array_append(ret, elem);
END IF;
END LOOP;
END
$func$;
Checking for NULL in an array is a bit of a pain, because NULL = ANY(ret) evaluates to NULL rather than true.
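A short sketch of the problem and two possible workarounds, using a literal array for illustration. array_position() assumes Postgres 9.5 or later; the unnest() variant works in older versions as well.

```sql
-- Equality against NULL never returns true, so this cannot detect a NULL element:
SELECT NULL = ANY ('{1,NULL}'::int[]);                        -- NULL

-- Postgres 9.5+: array_position() compares with IS NOT DISTINCT FROM semantics.
SELECT array_position('{1,NULL}'::int[], NULL) IS NOT NULL;   -- true

-- Older versions: unnest and test each element explicitly.
SELECT EXISTS (
   SELECT 1
   FROM   unnest('{1,NULL}'::int[]) AS u(e)
   WHERE  e IS NULL
   );                                                         -- true
```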
All of these functions are just proof of concept. I would use none of them. Instead:
In Postgres 9.4 or later, use WITH ORDINALITY to preserve the original order of elements.
Basic code for single value:
SELECT ARRAY (
SELECT elem
FROM (
SELECT DISTINCT ON (elem) elem, i
FROM unnest('{1,2,1,NULL,4,NULL}'::int[]) WITH ORDINALITY u(elem, i)
ORDER BY elem, i
) sub
ORDER BY i) AS uniq;
Returns:
uniq
------------
{1,2,NULL,4}
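If a function is still wanted, the same query could be wrapped in a polymorphic SQL function. This is only a sketch: the name uniq_ord is hypothetical, and it assumes Postgres 9.4 or later and an element type that DISTINCT ON can sort.

```sql
-- Hypothetical wrapper around the basic query.
-- STRICT makes the function return NULL for NULL input without running the body.
CREATE OR REPLACE FUNCTION uniq_ord(ary ANYARRAY)
  RETURNS anyarray
  LANGUAGE sql IMMUTABLE STRICT AS
$func$
SELECT ARRAY (
   SELECT elem
   FROM  (
      SELECT DISTINCT ON (elem) elem, i   -- keep the first occurrence of each value
      FROM   unnest(ary) WITH ORDINALITY u(elem, i)
      ORDER  BY elem, i
      ) sub
   ORDER  BY i                            -- restore original element order
   );
$func$;

SELECT uniq_ord('{1,2,1,NULL,4,NULL}'::int[]);  -- {1,2,NULL,4}
```

No template parameter is needed here, since the function body declares no variables.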
Built into a query:
SELECT *
FROM test t
, LATERAL (
SELECT ARRAY (
SELECT elem
FROM (
SELECT DISTINCT ON (elem) elem, i
FROM unnest(t.arr) WITH ORDINALITY u(elem, i)
ORDER BY elem, i
) sub
ORDER BY i) AS arr
) a;
This has a tiny corner case: it returns an empty array for a NULL array. To cover all bases:
SELECT t.*, CASE WHEN t.arr IS NULL THEN NULL ELSE a.arr END AS arr
FROM test t
, LATERAL (
SELECT ARRAY (
SELECT elem
FROM (
SELECT DISTINCT ON (elem) elem, ord
FROM unnest(t.arr) WITH ORDINALITY u(elem, ord)
ORDER BY elem, ord
) sub
ORDER BY ord) AS arr
) a;
Or:
SELECT *
FROM test t
LEFT JOIN LATERAL (
SELECT ARRAY (
SELECT elem
FROM (
SELECT DISTINCT ON (elem) elem, i
FROM unnest(t.arr) WITH ORDINALITY u(elem, i)
ORDER BY elem, i
) sub
ORDER BY i) AS arr
) a ON t.arr IS NOT NULL;
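To try the queries above, a test table along these lines should do. Only the table name test and the column arr come from the queries; the id column and the sample rows are assumptions made for illustration.

```sql
-- Assumed sample data: a regular array, an empty array, and a NULL array.
CREATE TEMP TABLE test (
   id  int PRIMARY KEY,  -- assumed key column, not referenced by the queries
   arr int[]
);

INSERT INTO test (id, arr) VALUES
   (1, '{1,2,1,NULL,4,NULL}'),
   (2, '{}'),
   (3, NULL);
```

The row with the NULL array is the one that separates the plain LATERAL query from the two variants that cover the corner case.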
In Postgres 9.3 or older you can substitute generate_subscripts() for WITH ORDINALITY (the LATERAL join in this query itself requires 9.3):
SELECT *
FROM test t
, LATERAL (
SELECT ARRAY (
SELECT elem
FROM (
SELECT DISTINCT ON (t.arr[i]) t.arr[i] AS elem, i
FROM generate_subscripts(t.arr, 1) i
ORDER BY t.arr[i], i
) sub
ORDER BY i
) AS arr
) a;
We need this in sqlfiddle, which currently only supports Postgres 9.3, so WITH ORDINALITY is not available there.