I have a SQL query where the list of strings is repeated twice: ('foo', 'bar', 'baz', 'bletch')
. What is the most maintainable method to avoid such repetition and make this code more DRY?
I am not optimizing query performance, only maintainability.
select
*
from
gizmo_details
where
_gizmo_name in ('foo', 'bar', 'baz', 'bletch')
or gizmo_id in
(select id from gizmos where
gizmo_name in ('foo', 'bar', 'baz', 'bletch'));
DRY - Don't Repeat Yourself.
Multiple rows in gizmo_details
can match the same input string. Such duplicates are preserved.
But the same row in gizmo_details
can match multiple input strings (if there can be duplicate input), or the same gizmo_details.gizmo_id
can qualify multiple times. Such duplicates are removed.
This may or may not be an issue. It's just something that tends to get overlooked. Several of the solutions posted so far are not true to your original in this respect.
If there can be dupes in any of the three sets (tables, input), you might want to spell out how to deal with dupes. There are fitting query styles for any possible specification.
You did not ask for optimizing query performance (which may be possible on top of anything else here), but correctness always comes first.
A CTE is one way to keep a single query DRY:
WITH input(gizmo_name) AS (
SELECT unnest('{foo, bar, baz, bletch}'::text[]) -- your input as array
)
SELECT *
FROM gizmo_details
WHERE _gizmo_name = ANY (TABLE input)
OR gizmo_id IN (SELECT g.id FROM gizmos g JOIN input USING (gizmo_name));
Much of this comes down to the question of defining constants in SQL. See:
To repeat this query over and over, prepared statements are one way to keep it DRY:
PREPARE qr1 (text[]) AS
SELECT *
FROM gizmo_details gd
WHERE gd._gizmo_name = ANY ($1)
OR gd.gizmo_id IN (SELECT g.id FROM gizmos g WHERE g.gizmo_name = ANY ($1));
Call:
EXECUTE qr1('{foo, bar, baz, bletch}'::text[]); -- array input
This (also) prevents repeating the input inside the query.
Prepared statements are built into many client libraries as standard feature.
A server-side function keeps that "DRY". A VARIADIC
input parameter is just an optional added feature. See:
CREATE OR REPLACE FUNCTION f_my_func(VARIADIC _gizmo_names text[])
RETURNS SETOF gizmo_details
LANGUAGE sql STABLE PARALLEL SAFE AS
$func$
SELECT *
FROM gizmo_details gd
WHERE gd._gizmo_name = ANY (_gizmo_names)
OR gd.gizmo_id IN (SELECT g.id FROM gizmos g WHERE g.gizmo_name = ANY (_gizmo_names));
$func$
Call:
SELECT * FROM f_my_func('foo', 'bar', 'baz', 'bletch'); -- list input
Or:
SELECT * FROM f_my_func(VARIADIC '{foo, bar, baz, bletch}'); -- back to array input
A lot more versatile than prepared statements, plus it can be nested in outer queries (as opposed to a prepared statement).
Aside: using _
as prefix for variables and/or parameters but never for column names is a common naming convention. Optional, but I stick to it.