[SOLVED] In Postgres, does percentile_disc reuse calculations from other percentile

In Postgres, does percentile_disc reuse calculations from other percentile_disc in the same query?

Lets say you have some table you're querying percentiles from, with the same grouping used for both columns:

SELECT percentile_disc(0.25) WITHIN GROUP (ORDER BY my_col) as p25,
       percentile_disc(0.75) WITHIN GROUP (ORDER BY my_col) as p75 from my_table

Does Postgres reuse the grouping calculated for the first column when calculating the second?

Solution

It does. You can see that in explain analyse verbose: no matter how many of those you add, you only loop once.
_{demo at db<>fiddle}

explain analyse verbose
SELECT percentile_disc(0.25) WITHIN GROUP (ORDER BY my_col) as p25
      ,percentile_disc(0.75) WITHIN GROUP (ORDER BY my_col) as p75
      ,percentile_disc(0.3) WITHIN GROUP (ORDER BY my_col) as p
    ,percentile_disc(0.4) WITHIN GROUP (ORDER BY my_col)
    ,percentile_disc(0.5) WITHIN GROUP (ORDER BY my_col)
    ,percentile_disc(0.6) WITHIN GROUP (ORDER BY my_col)
    ,percentile_disc(0.7) WITHIN GROUP (ORDER BY my_col)
from my_table;


Aggregate (cost=8464.74..8464.75 rows=1 width=56) (actual time=343.493..343.495 rows=1 loops=1)
Output: percentile_disc('0.25'::double precision) WITHIN GROUP (ORDER BY my_col), percentile_disc('0.75'::double precision) WITHIN GROUP (ORDER BY my_col), percentile_disc('0.3'::double precision) WITHIN GROUP (ORDER BY my_col), percentile_disc('0.4'::double precision) WITHIN GROUP (ORDER BY my_col), percentile_disc('0.5'::double precision) WITHIN GROUP (ORDER BY my_col), percentile_disc('0.6'::double precision) WITHIN GROUP (ORDER BY my_col), percentile_disc('0.7'::double precision) WITHIN GROUP (ORDER BY my_col)
-> Seq Scan on public.my_table (cost=0.00..7214.38 rows=500138 width=8) (actual time=0.016..39.669 rows=500000 loops=1)
Output: my_col
Planning Time: 0.122 ms
Execution Time: 344.358 ms

It also seems like it doesn't really matter how many of those you request. In this test I didn't see any clear correlation between exec time and the number of different percentile_disc() calls in the query. It ran 600 queries against a table with 100k rows, requesting 1-301 different, random percentiles. Those that asked for single percentile averaged around 32ms, same as those requesting 131 or 81 different percentiles.
Whatever differences I saw initially must've been just noise.