Lets say I have a table named static
in two different databases houses
and apartments
.
static
table contains the static information of houses such as house_size
, no_rooms
, pool
, spa
.
static
table in houses
database has columns like this:
pool spa house_size sauna no_rooms
1 1 25 1 2
1 0 35 1 3
static
table in apartments
database has columns like this:
pool spa house_size sauna
1 1 25 1
1 0 35 1
I want to run below query without raising any error. Currently, I get error as no_rooms
column does not exists in apartments.public.static
.
select pool, case when spa = 1 then 1 else 0 end as has_spa,
sauna, house_size, case when no_rooms > 2 then 1 else 0 end as rooms
from static;
Solution I tried:
WITH static_new AS (SELECT s.*
FROM (SELECT 0 AS no_rooms) AS dummy
LEFT JOIN LATERAL
( SELECT
pool, spa, sauna, house_size, no_rooms
FROM static
) AS s on true)
SELECT * FROM static_new;
It works, but this query becomes messy when there are more columns involved.
What I am looking for:
Create a function that takes column names and table name and then does the join which I am doing in above query and returns a table. (Should be generic and work for given column names and table name in parameter and return a table.)
Any other nice and tidy solution?
SQL is a strictly typed language and Postgres functions must declare their return type. Returning a variable number of columns from a function is only possible with workarounds, like polymorphic types. See:
But we can't work with the row type in your case, as that varies from database to database. The remaining option: return anonymous records and provide a column definition list with every call. I don't usually recommend this, as providing a column definition list with every call can be tedious - and often pointless. But yours might be one of the rare use cases where it makes sense.
Still, you have to know the data type of possibly missing columns. I'll assume integer
for the purpose of this demo. Else you have to pass data types additionally and build the query accordingly.
CREATE OR REPLACE FUNCTION f_dynamic_select(_tbl regclass
, _cols VARIADIC text[]) -- ①
RETURNS SETOF record -- ② anonymous records
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE -- ③ dynamic SQL
format(
'SELECT %s FROM %s' -- ④ safe against SQLi
, (
SELECT string_agg(COALESCE(quote_ident(a.attname)
, '0 AS ' || quote_ident(t.col) -- assuming int!
), ', ' ORDER BY t.ord) -- ⑤
FROM unnest(_cols) WITH ORDINALITY t(col, ord) -- ⑤
LEFT JOIN pg_attribute a ON a.attrelid = _tbl -- ⑥
AND a.attnum > 0
AND NOT a.attisdropped
AND a.attname = t.col
)
, _tbl
);
END
$func$;
Call (important!)
SELECT *
FROM f_dynamic_select('static', 'pool', 'spa', 'sauna', 'house_size', 'no_rooms')
AS t(pool int, spa int, house_size int, sauna int, no_rooms int); -- ② column definition list
Your example call, with expressions based on these columns:
SELECT pool, case when spa = 1 then 1 else 0 end as has_spa -- ⑦ expressions
, sauna, house_size
, case when no_rooms > 2 then 1 else 0 end as rooms
FROM f_dynamic_select('static', 'pool', 'spa', 'sauna', 'house_size', 'no_rooms')
AS t(pool int, spa int, house_size int, sauna int, no_rooms int);
db<>fiddle here
① The function takes a table name as regclass
type. See:
... followed by an arbitrary list of column names - in meaningful order. VARIADIC
should be convenient for this. See:
Pass column names as case-sensitive, single-quoted strings. Not (double-quoted) identifiers.
② This may be the first time ever I recommend returning anonymous records from a function - after over 1000 answers on the plpgsql tag. The manual:
If the function has been defined as returning the
record
data type, then an alias or the key wordAS
must be present, followed by a column definition list in the form( column_name data_type [, ... ])
. The column definition list must match the actual number and types of columns returned by the function.
④ Safe against SQL injection, because the table name is passed as regclass
, and the SELECT
list is concatenated using quote_ident()
carefully. See:
⑤ Use WITH ORDINALITY
to preserve original order of columns. See:
⑥ LEFT JOIN
to the system catalog pg_attribute
to identify existing columns. See:
⑦ Move expressions building on the passed columns to the outer SELECT
.
Disclaimer: I would only introduce this level of sophistication if I had to. Maybe you can work with simple views in each database after all?