I have a table as following:
create table test
(
id varchar(20),
text varchar(20)
);
insert into test values(1, 'a')
insert into test values(2, 'b')
insert into test values(3, 'c')
To check multi-column in where condition, I can do this:
SELECT * FROM test
WHERE (id, text) = ANY (VALUES('1', 'a')
, ('2', 'b')
, ('3', 'c'));
I have a function that takes 2 arrays as input:
create or replace function test_function
(
p_ids in varchar[],
p_texts in varchar[]
)
returns table (p_id varchar,
p_text varchar) as $$
begin
return query
SELECT id, text
FROM test
WHERE (id, text) = ANY (VALUES('1', 'a')
, ('2', 'b')
, ('3', 'c'));
end;
$$ LANGUAGE plpgsql;
My question is, what is the most efficient (performance-wise) way to to replace ANY (VALUES('1', 'a'), ('2', 'b'), ('3', 'c'))
with the actual array input (p_ids
and p_texts
)?
Some of the choices are:
INNER JOIN
with variadic unnest()
instead of filtering in WHERE
unnest()
in a row()
constructor, inside an array()
constructorunnest()
in that or an IN
subqueryThe first one is not only the shortest, but also the quickest, as demonstrated lower in this post.
create or replace function test_function
( p_ids in varchar[],
p_texts in varchar[]
)
returns table (p_id varchar,
p_text varchar) as $f$
begin
return query
SELECT id, text
FROM test
JOIN unnest(p_ids,p_texts)AS u(id,text) USING(id,text);
--NATURAL JOIN unnest(p_ids,p_texts)AS u(id,text); --discouraged syntax
--WHERE (id, text) = ANY(ARRAY(select (unnest(p_ids),unnest(p_texts))));
--WHERE (id, text) = ANY(ARRAY(select u from unnest(p_ids,p_texts)u));
--WHERE (id, text) IN (select*from unnest(p_ids,p_texts));
end;
$f$ LANGUAGE plpgsql;
select*from test_function('{1,2,3,8,9}','{a,b,c,x}');
p_id | p_text |
---|---|
1 | a |
2 | b |
3 | c |
Watch out for null-padding in case they are of different length
select row( unnest(array['a','b','c','d'])
,unnest(array[1,2])
);
row |
---|
(a,1) |
(b,2) |
(c,) |
(d,) |
The join
with variadic unnest()
seems to perform best here (execution times in seconds, lower is better):
demo2 at db<>fiddle
variant | avg | a | min | mi | max | mx | sum | sm | stddev | st | mode | md |
---|---|---|---|---|---|---|---|---|---|---|---|---|
join_variadic_unnest | 0.000297 | 1 | 0.000240 | 2 | 0.000469 | 1 | 0.006236 | 1 | 0.000080 | 1 | 0.000242 | 2 |
in_variadic_unnest | 0.001278 | 2 | 0.000162 | 1 | 0.022200 | 2 | 0.026829 | 2 | 0.004795 | 2 | 0.000170 | 1 |
array_row_2unnest | 0.469599 | 3 | 0.459486 | 4 | 0.492036 | 4 | 0.861575 | 3 | 0.009940 | 3 | 0.459486 | 4 |
array_variadic_unnest | 0.469757 | 4 | 0.442742 | 3 | 0.490769 | 3 | 0.864888 | 4 | 0.013638 | 4 | 0.442742 | 3 |
Above are timings of each method searching a few times for 11 rows that match two 20-element arrays in a 50k sample.
Here's the two winners searching 100 times for 262 rows based on two 50-element arrays in a 500k sample:
variant | avg | a | min | mi | max | mx | sum | sm | stddev | st | mode | md |
---|---|---|---|---|---|---|---|---|---|---|---|---|
join_variadic_unnest | 0.000497 | 1 | 0.000314 | 1 | 0.003766 | 1 | 0.050234 | 1 | 0.000565 | 1 | 0.000316 | 1 |
in_variadic_unnest | 0.000558 | 2 | 0.000343 | 2 | 0.006439 | 2 | 0.056383 | 2 | 0.000793 | 2 | 0.000345 | 2 |
Note that you can use your target table as if it was a type:
demo3 at db<>fiddle
create or replace function test_function
( tests in test[]
)returns setof test as
$f$ SELECT id
, text
FROM test
NATURAL JOIN unnest(tests);
$f$ language SQL;
Then
select*from test_function(array[ (1,'a')::test
,(2,'b')::test
,(3,'c')::test])
Or
select*from test_function('{"(1,a)","(2,b)","(3,c)"}'::test[])