postgresqlpartitioningdatabase-partitioningpruning

When does Postgresql do partition pruning with JOIN columns?


I have two tables in a Postgres 11 database:

client table
--------
client_id    integer
client_name  character_varying

file table
--------
file_id      integer
client_id    integer
file_name    character_varying

The client table is not partitioned, the file table is partitioned by client_id (partition by list). When a new client is inserted into the client table, a trigger creates a new partition for the file table. The file table has a foreign key constraint referencing the client table on client_id.

When I execute this SQL (where c.client_id = 1), everything seems fine:

explain  
select *
from client c
join file f using (client_id)
where c.client_id = 1;

Partition pruning is used, only the partition file_p1 is scanned:

Nested Loop  (cost=0.00..3685.05 rows=100001 width=82)
  ->  Seq Scan on client c  (cost=0.00..1.02 rows=1 width=29)
        Filter: (client_id = 1)
  ->  Append  (cost=0.00..2684.02 rows=100001 width=57)
        ->  Seq Scan on file_p1 f  (cost=0.00..2184.01 rows=100001 width=57)
              Filter: (client_id = 1)

But when I use a where clause like "where c.client_name = 'test'", the database scans in all partitions and does not recognize, that client_name "test" is equal to client_id 1:

explain  
select *
from client c
join file f using (client_id)
where c.client_name = 'test';

Execution plan:

Hash Join  (cost=1.04..6507.57 rows=100001 width=82)
  Hash Cond: (f.client_id = c.client_id)
  ->  Append  (cost=0.00..4869.02 rows=200002 width=57)
        ->  Seq Scan on file_p1 f  (cost=0.00..1934.01 rows=100001 width=57)
        ->  Seq Scan on file_p4 f_1  (cost=0.00..1934.00 rows=100000 width=57)
        ->  Seq Scan on file_pdefault f_2  (cost=0.00..1.00 rows=1 width=556)
  ->  Hash  (cost=1.02..1.02 rows=1 width=29)
        ->  Seq Scan on client c  (cost=0.00..1.02 rows=1 width=29)
              Filter: ((name)::text = 'test'::text)

So for this SQL, alle partitions in the file-table are scanned.

So should every select use the column on which the tables are partitioned by? Is the database not able to deviate the partition pruning criteria?


Edit: To add some information:

In the past, I have been working with Oracle databases most of the time.

The execution plan there would be something like

  1. Do a full table scan on client table with the client name to find out the client_id.
  2. Do a "PARTITION LIST" access to the file table, where SQL Developer states PARTITION_START = KEY and PARTITION_STOP = KEY to indicate the exact partition is not known when calculating the execution plan, but the access will be done to only a list of paritions, which are calculated on the client_id found in the client table.

This is what I would have expected in Postgresql as well.


Solution

  • The documentation states that dynamic partition pruning is possible

    (...) During actual execution of the query plan. Partition pruning may also be performed here to remove partitions using values which are only known during actual query execution. This includes values from subqueries and values from execution-time parameters such as those from parameterized nested loop joins.

    If I understand it correctly, it applies to prepared statements or queries with subqueries which provide the partition key value as a parameter. Use explain analyse to see dynamic pruning (my sample data contains a million rows in three partitions):

    explain analyze
    select *
    from file
    where client_id = (
        select client_id
        from client
        where client_name = 'test');
    
    Append  (cost=25.88..22931.88 rows=1000000 width=14) (actual time=0.091..96.139 rows=333333 loops=1)
      InitPlan 1 (returns $0)
        ->  Seq Scan on client  (cost=0.00..25.88 rows=6 width=4) (actual time=0.040..0.042 rows=1 loops=1)
              Filter: (client_name = 'test'::text)
              Rows Removed by Filter: 2
      ->  Seq Scan on file_p1  (cost=0.00..5968.66 rows=333333 width=14) (actual time=0.039..70.026 rows=333333 loops=1)
            Filter: (client_id = $0)
      ->  Seq Scan on file_p2  (cost=0.00..5968.68 rows=333334 width=14) (never executed)
            Filter: (client_id = $0)
      ->  Seq Scan on file_p3  (cost=0.00..5968.66 rows=333333 width=14) (never executed)
            Filter: (client_id = $0)
    Planning Time: 0.423 ms
    Execution Time: 109.189 ms
    

    Note that scans for partitions p2 and p3 were never executed.

    Answering your exact question, the partition pruning in queries with joins described in the question is not implemented in Postgres (yet?)