The goal is a custom aggregate function made with CREATE AGGREGATE called string_agg_oxford
; it is an aggregate function that works similar to string_agg
except it is smart enough to know how many items it is aggregating so that it can place "and" in front of the last item.
So where string_agg(items, ', ')
would return "item1, item2, item3"
, string_agg_oxford(items)
will return "item1, item2, and item3"
.
My failed attempt starts with a type for our accumulator that includes the total number of rows and the index for the current row:
CREATE TYPE oxford_accumulator as (
row_count numeric,
i numeric,
acc text
);
Now we need our accumulator function:
CREATE OR REPLACE FUNCTION oxford_acc (acc oxford_accumulator, curr text)
RETURNS oxford_accumulator
LANGUAGE PLPGSQL
AS $$
BEGIN
IF acc.i + 1 = acc.row_count THEN
RETURN (acc.row_count, acc.i + 1, acc.acc || curr);
END IF;
IF (acc.i + 2 = acc.row_count) AND (acc.row_count = 2) THEN
RETURN (acc.row_count, acc.i + 1, acc.acc || curr || ' and ');
END IF;
IF (i + 2 = acc.row_count) THEN
RETURN (acc.row_count, acc.i + 1, acc.acc || curr || ', and ');
END IF;
RETURN (acc.row_count, acc.i + 1, acc.acc || curr || ', ');
END;
$$;
because the accumulator has swallowed up the total count and the index we have to release this information when the accumulator is finished with an ffunc
.
CREATE OR REPLACE FUNCTION oxford_final (acc oxford_accumulator)
RETURNS text
LANGUAGE PLPGSQL
AS $$
BEGIN
RETURN acc.acc;
END;
$$;
My idea falls apart here where we need to wire it all up because there does not seem to be a way to parametrize the total row count... so fail.
CREATE OR REPLACE AGGREGATE string_agg_oxford (text, row_count numeric) (
INITCOND = (row_count, 0, ''),
-- ^^^ fail
STYPE = oxford_accumulator,
SFUNC = oxford_acc,
FINALFUNC = oxford_final
);
I know something similar can be achieved with a regular function, but I'm not ready to give up yet if there's a way to do this as an aggregator that could be used in a select statement like SELECT string_agg_oxford(clients.full_name) FROM matters GROUP BY matters.matter_id;
Initial condition must be "a string constant in the form accepted for the data type state_data_type" so no way to pass a dynamic value. Credit to @a_horse_with_no_name
For efficiency reasons, it would be easier to "peel the last value out of the accumulator." Credit to @jjanes.
Lets use a unique separator to alleviate any issues matching on ", " because that is fairly common - so we'll use a unique separator and then replace those in the ffunc as is appropriate.
Postgres does not support having capture groups within its lookahead assertions, so instead of a fancy regular expression to find the last occurrence of our unique separator, we will reverse the string and tackle things upside down.
CREATE OR REPLACE FUNCTION oxford_acc (acc text, curr text)
RETURNS text
LANGUAGE PLPGSQL
AS $$
BEGIN
RETURN acc || curr || '@$@$';
-- ^^^ unique separator, can be anything unique
END;
$$;
CREATE OR REPLACE FUNCTION oxford_final (acc text)
RETURNS text
LANGUAGE PLPGSQL
AS $$
DECLARE
my_result text;
my_counter numeric;
BEGIN
SELECT left(acc, -4) INTO my_result;
-- ^^^ removes the last separator
SELECT count(*) FROM regexp_matches(my_result, '@\$@\$', 'g') INTO my_counter;
IF my_counter = 0 THEN
RETURN my_result;
END IF;
IF my_counter = 1 THEN
RETURN regexp_replace(my_result, '@\$@\$', ' and ');
END IF;
SELECT reverse(my_result) INTO my_result;
SELECT regexp_replace(my_result, '\$@\$@', ' dna ,') INTO my_result;
SELECT reverse(my_result) INTO my_result;
RETURN regexp_replace(
my_result,
'@\$@\$',
', ',
'g'
);
END;
$$;
CREATE OR REPLACE AGGREGATE oxford_agg (text) (
INITCOND = '',
STYPE = text,
SFUNC = oxford_acc,
FINALFUNC = oxford_final
);
Suggestions for improvements welcome!