sqlpostgresqlpostgres15

Execution plan for postgres view combining 2 partitioned tables is accessing all partitions


I have two partitioned tables with similar range partitions (where date_key is a date in integer form -- like 20230901 for Sep 1st, for example:

create table header (
   date_key int,
   header_key bigint,
   other_header_fields character varying(255),
   constraint pk_header primary key (date_key, header_key)
) partition by range (date_key);

create table detail (
   date_key int,
   header_key bigint,
   detail_key bigint,
   other_detail_fields character varying(255), -- [. . .] more fields
   constraint pk_detail primary key (date_key, header_key, detail_key)
) partition by range (date_key);

Partitions are set up with monthly ranges (i.e):

create table header_p20230900 partition of header
   for values from (20230900) to (20231000);

These tables are used in a combined view like so:

create view v_record_details as
select 
   d.date_key,
   h.header_key,
   h.other_header_fields,
   d.detail_key,
   d.other_detail_fields
from detail       d
inner join header h on d.date_key = h.date_key and d.header_key = h.header_key;

When I select from this view, the execution plan does not limit itself to only required partitions from both tables:

select * 
from v_record_details
where date_key = 20230901;

This would limit partitions for the detail table (presumably since I'm selecting the date_key from the detail table); however, it's doing seq scan on all partitions from the header table before the index scan on the partitioned table.

I can't seem to figure out how to get the view to only access the single partition as desired. Is there any way out of this predicament without requiring something goofy like including the date_key from both tables and expecting users to put criteria on both fields?

I have tried modifying the view to include where h.date_key = d.date_key in the view, to no affect.

I've also tried putting explicit criteria in the select statement for the view -- when the view's select statement is executed directly with criteria, (i.e.):

select 
   d.date_key,
   h.header_key,
   h.other_header_fields,
   d.detail_key,
   d.other_detail_fields
from detail       d
inner join header h on d.date_key = h.date_key and d.header_key = h.header_key
where d.date_key = 20230901;

... the behavior is as intended and the execution plan only includes the correct partitions from each table.

It is only when written as a view that the execution plan is wrong.

UPDATE: Based on Richard Huxton's comment, I re-tested and discovered that the issue doesn't happen as stated above, when the select query from the view uses = in the where statement. It is only occurring when trying to put criteria on a range of dates, i.e.:

select *
from v_record_details
where d.date_key between 20230901 and 20230930

In that case, the select statement itself has the same problem as the view, it is scanning to access all partitions from the header table. Plan Analysis


Solution

  • You didn't share the entire query plan, but I get something comparable: The partition pruning for one of the tables works fine when using between (in my case on the table detail) but not on the other one.

    When I change the configuration item enable_partitionwise_join to on, both tables benefit from the partition pruning. The default (at least in all versions up to version 16) is off.

    SHOW enable_partitionwise_join;
    
    SET enable_partitionwise_join TO on;
    
    EXPLAIN __your_query__;
    

    And when everything works fine:

    ALTER SYSTEM SET enable_partitionwise_join TO on; -- superuser needed
    
    SELECT pg_reload_conf(); -- superuser needed