Given the following query:
WITH t as (
SELECT name, array_agg(DISTINCT("age", "gender")) as "ages_and_genders"
FROM (
SELECT * FROM (VALUES ('bob', 33, 'm'), ('bob', 33, 'f'), ('alice', 30, 'f')) AS t ("name","age", "gender")
) as t
GROUP BY name
)
SELECT name, "ages_and_genders"[1]
FROM t
WHERE array_length("ages_and_genders", 1) = 1
How do I go about breaking apart the record/tuple returned into the "age" and "gender" as separate columns?
I expect to get back a result:
name | age | gender
-------------------
"alice" | 30 | 'f'
Postgres cannot decompose anonymous record types. To access individual fields, the nested structure must be known. Cast to a well-known row type. If no matching type exists, yet, register one first. There are various ways. For ad-hoc use, a "temporary" type (undocumented) is advisable. (For repeated use, register a plain type.)
By default, the value created by a
ROW
expression is of an anonymous record type. If necessary, it can be cast to a named composite type — either the row type of a table, or a composite type created withCREATE TYPE AS
. An explicit cast might be needed to avoid ambiguity.
Run once in your session:
CREATE TYPE pg_temp.my_int_txt AS (age int, gender text);
Then, with a drop-in fix (preserving other awkward syntax):
WITH t as (
SELECT name, array_agg(DISTINCT("age", "gender")::pg_temp.my_int_txt) as "ages_and_genders"
FROM (
SELECT * FROM (VALUES ('bob', 33, 'm'), ('bob', 33, 'f'), ('alice', 30, 'f')) AS t ("name","age", "gender")
) as t
GROUP BY name
)
SELECT name, ("ages_and_genders"[1]).age, ("ages_and_genders"[1]).gender
FROM t
WHERE array_length("ages_and_genders", 1) = 1;
The same, consolidated:
WITH cte AS (
SELECT name, array_agg(DISTINCT ROW(age, gender)::pg_temp.my_int_txt) AS ag
FROM (
VALUES
('bob' , 33, 'm')
, ('bob' , 33, 'f')
, ('alice', 30, 'f')
) AS t (name, age, gender)
GROUP BY name
)
SELECT name, (ag[1]).*
FROM cte
WHERE cardinality(ag) = 1;
Of course, I still wouldn't use that.
Typically, there are much simpler and faster solutions. Like:
SELECT name, min(age) AS age, min(gender) AS gender
FROM (
VALUES
('bob' , 33, 'm')
, ('bob' , 33, 'f')
, ('alice', 30, 'f')
) AS t (name, age, gender)
GROUP BY name
HAVING count(DISTINCT (age, gender)) = 1;
Or:
WITH cte(name, age, gender) AS (
VALUES
('bob' , 33, "char" 'm')
, ('bob' , 33, 'f')
, ('alice', 30, 'f')
)
SELECT SELECT DISTINCT ON (name) *
FROM cte t
WHERE NOT EXISTS (
SELECT FROM cte t1
WHERE t1.name = t.name
AND (t1.age, t1.gender) IS DISTINCT FROM (t.age, t.gender)
);
If age
and gender
are NOT NULL
, you can simplify.
You may not be aware of all the things going on in your code example.
You don't need SELECT * FROM (VALUES ...
in this context. The VALUES
expression can stand on its own.
The simple string and numeric constants default to the basic types text
and integer
. For other data you may want explicit type declarations.
For example, it's typically better to work with birthdays instead of age. The type date
would need an explicit cast or declaration in the first input row (or any, really) like:
WITH t(name, birthday, gender) AS (
VALUES
('bob', date '1990-06-01', 'm') -- !
, ('bob', '1990-06-01', 'f')
...
DISTINCT
is not a function. In your query, it's a syntax element of the aggregate function.
And ("age", "gender")
is a ROW constructor (with redundant double-quotes).
Clear syntax for both: array_agg(DISTINCT ROW(age, gender))
.
If all values are NOT NULL
, simpler expressions are possible. My given queries are null-safe.
Postgres' handling of anonymous records is a bit lacking in corner case situations. Not least because few care, as there are almost always better solutions.
For functions returning anonymous records, you can append a column definition list. Not applicable here.