I have two PostgreSQL instances on different computers (X and Y) on the same network. The Y server has pgAdmin installed and connected to both instances.
On the Y instance, I created an FDW connection (postgres_fdw) to a database on X.
My problem is that a SELECT query over the foreign table takes over 30 seconds. If I run the same query directly against the X server from pgAdmin, it takes under 1 second, so I don't think the slow performance is a network issue.
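For context, the foreign table was set up with the usual postgres_fdw statements, roughly like the sketch below (the server name, connection details, and column types are placeholders rather than the real ones; IMPORT FOREIGN SCHEMA would work as well):

CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- "x_server" and the connection options are placeholders
CREATE SERVER x_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'x-host', port '5432', dbname 'x_db');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER x_server
    OPTIONS (user 'remote_user', password 'remote_password');

-- column list and types are simplified placeholders
CREATE FOREIGN TABLE foreign_table (
    q_num      integer,
    time_start timestamp with time zone,
    wait       interval,
    poll       interval,
    na_code    integer,
    fail_code  integer
)
SERVER x_server
OPTIONS (schema_name 'public', table_name 'original_table');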
I tried use_remote_estimate both on and off, but since my foreign table is continuously updated with new data, I decided to keep it off.
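(By toggling it I mean statements along these lines; "x_server" is a placeholder name, and the option can also be overridden per foreign table:)

-- server-wide setting (use SET instead of ADD if the option already exists)
ALTER SERVER x_server OPTIONS (ADD use_remote_estimate 'true');

-- per-table override
ALTER FOREIGN TABLE foreign_table OPTIONS (ADD use_remote_estimate 'false');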
My query is below:
SELECT q_num,
count(*)::integer AS total_calls,
count(1) FILTER (WHERE na_code = 0 AND fail_code = 0)::integer AS answered,
count(1) FILTER (WHERE na_code = 0 AND fail_code = 0 AND (wait + poll) <= '00:00:15'::interval)::integer AS answered_15,
count(1) FILTER (WHERE na_code = 0 AND fail_code = 0 AND (wait + poll) <= '00:00:20'::interval)::integer AS answered_20,
count(1) FILTER (WHERE na_code = 0 AND fail_code = 0 AND (wait + poll) <= '00:00:30'::interval)::integer AS answered_30,
count(1) FILTER (WHERE na_code <> 0 OR fail_code <> 0)::integer AS missed,
count(1) FILTER (WHERE na_code <> 0 OR fail_code <> 0 AND (wait + poll) > '00:00:15'::interval)::integer AS missed_15,
count(1) FILTER (WHERE na_code <> 0 OR fail_code <> 0 AND (wait + poll) > '00:00:20'::interval)::integer AS missed_20,
count(1) FILTER (WHERE na_code <> 0 OR fail_code <> 0 AND (wait + poll) > '00:00:30'::interval)::integer AS missed_30,
EXTRACT(epoch FROM sum(wait + poll))::integer AS total_waiting,
EXTRACT(epoch FROM max(wait + poll))::integer AS max_waiting
FROM foreign_table
WHERE time_start > (CURRENT_TIMESTAMP - '03:15:00'::interval)
GROUP BY q_num
The time_start column has an index on the remote server.
How can I improve the performance?
EDIT: Just tried to increase fetch_size but it still takes more than 20 seconds.
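(For reference, fetch_size can be raised at the server or the foreign-table level; this is a sketch with a placeholder server name and an arbitrary value, the default being 100 rows per round trip:)

ALTER SERVER x_server OPTIONS (ADD fetch_size '10000');

-- or per foreign table:
ALTER FOREIGN TABLE foreign_table OPTIONS (ADD fetch_size '10000');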
@Frank Heikens Here is the index:
CREATE INDEX IF NOT EXISTS original_table_time_start_idx
ON public.original_table USING btree
(time_start ASC NULLS LAST)
TABLESPACE pg_default;
PostgreSQL version on the X instance:
PostgreSQL 11.9, compiled by Visual C++ build 1914, 64-bit
PostgreSQL version on the Y instance:
PostgreSQL 14.5, compiled by Visual C++ build 1914, 64-bit
EXPLAIN (ANALYZE, VERBOSE, BUFFERS) on the X instance:
GroupAggregate (cost=8.51..8.91 rows=2 width=49) (actual time=0.148..0.150 rows=1 loops=1)
Output: q_num, (count(*))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0))))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0) AND ((wait + poll) <= '00:00:15'::interval))))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0) AND ((wait + poll) <= '00:00:20'::interval))))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0) AND ((wait + poll) <= '00:00:30'::interval))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR (fail_code <> 0))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR ((fail_code <> 0) AND ((wait + poll) > '00:00:15'::interval)))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR ((fail_code <> 0) AND ((wait + poll) > '00:00:20'::interval)))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR ((fail_code <> 0) AND ((wait + poll) > '00:00:30'::interval)))))::integer, (date_part('epoch'::text, sum((wait + poll))))::integer, (date_part('epoch'::text, max((wait + poll))))::integer
Group Key: original_table.q_num
Buffers: shared hit=4
-> Sort (cost=8.51..8.51 rows=3 width=45) (actual time=0.099..0.101 rows=3 loops=1)
Output: q_num, na_code, fail_code, wait, poll
Sort Key: original_table.q_num
Sort Method: quicksort  Memory: 25kB
Buffers: shared hit=4
-> Index Scan using original_table_time_start_idx on public.original_table (cost=0.43..8.48 rows=3 width=45) (actual time=0.052..0.060 rows=3 loops=1)
Output: q_num, na_code, fail_code, wait, poll
Index Cond: (original_table.time_start > (CURRENT_TIMESTAMP - '00:15:00'::interval))
Buffers: shared hit=4
Planning Time: 1.383 ms
Execution Time: 0.638 ms
EXPLAIN (ANALYZE, VERBOSE, BUFFERS) on the Y instance:
GroupAggregate (cost=48287.41..48295.41 rows=6 width=49) (actual time=22846.842..22846.847 rows=1 loops=1)
Output: q_num, (count(*))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0))))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0) AND ((wait + poll) <= '00:00:15'::interval))))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0) AND ((wait + poll) <= '00:00:20'::interval))))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0) AND ((wait + poll) <= '00:00:30'::interval))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR (fail_code <> 0))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR ((fail_code <> 0) AND ((wait + poll) > '00:00:15'::interval)))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR ((fail_code <> 0) AND ((wait + poll) > '00:00:20'::interval)))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR ((fail_code <> 0) AND ((wait + poll) > '00:00:30'::interval)))))::integer, (EXTRACT(epoch FROM sum((wait + poll))))::integer, (EXTRACT(epoch FROM max((wait + poll))))::integer
Group Key: foreign_table.q_num
-> Sort (cost=48287.41..48287.59 rows=72 width=45) (actual time=22846.755..22846.759 rows=1 loops=1)
Output: q_num, na_code, fail_code, wait, poll
Sort Key: foreign_table.q_num
Sort Method: quicksort  Memory: 25kB
-> Foreign Scan on public.foreign_table (cost=100.00..48285.19 rows=72 width=45) (actual time=10927.554..22846.717 rows=1 loops=1)
Output: q_num, na_code, fail_code, wait, poll
Filter: (foreign_table.time_start > (CURRENT_TIMESTAMP - '03:15:00'::interval))
Rows Removed by Filter: 720077
Remote SQL: SELECT q_num, time_start, wait, poll, na_code, fail_code FROM public.original_table
Planning Time: 0.522 ms
Execution Time: 22849.716 ms
I hope this information helps.
Expressions using CURRENT_TIMESTAMP are not considered shippable to the foreign side. You could argue that they should be shippable, since the value could just be materialized into a literal and the literal shipped, but that is not how it currently works. So instead all rows need to be read back and have their timestamps tested locally.
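One workaround is to materialize the cutoff yourself on the local side and send it as a constant, so the condition becomes shippable and the remote index on time_start can be used. A sketch using a psql client-side variable, abbreviated to a single aggregate (any mechanism that inlines a literal would do):

-- compute the cutoff locally and store it in a psql variable
SELECT CURRENT_TIMESTAMP - '03:15:00'::interval AS cutoff \gset

-- :'cutoff' is substituted as a quoted literal before the query is sent,
-- so the WHERE clause can be pushed down to the remote server
SELECT q_num,
       count(*)::integer AS total_calls
FROM foreign_table
WHERE time_start > :'cutoff'
GROUP BY q_num;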
EDIT: Just tried to increase fetch_size but it still takes more than 20 seconds.
What did you increase it to? Maybe increase it yet more.