Join performance in postgres

I have 2 tables scheduling_flownode and xact_message with weak relation between them. I am trying to execute the following query

set search_path='ad_96d5be';
explain analyze 
SELECT f.id, f.target_object_id 
FROM "scheduling_flownode" f, 
     "xact_message" m 
where f.target_object_id = m.id 
and f.root_node=True 
AND f.state=1 
and m.state=4 
and m.templatelanguage_id IN (17, 18, 19, 20, 21, 22, 23, 24);

On execution, I get the following query plan

  Gather  (cost=252701.26..1711972.04 rows=374109 width=8) (actual time=17737.908..164181.063 rows=441130 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=35705 read=1346425, temp read=18190 written=18148
   ->  Hash Join  (cost=251701.26..1673561.14 rows=155879 width=8) (actual time=18805.587..163991.468 rows=147043 loops=3)
         Hash Cond: (f.target_object_id = m.id)
         Buffers: shared hit=35705 read=1346425, temp read=18190 written=18148
         ->  Parallel Bitmap Heap Scan on scheduling_flownode f  (cost=124367.21..1523127.76 rows=2061083 width=8) (actual time=963.910..155466.840 rows=1642157 loops=3)
               Recheck Cond: (state = 1)
               Rows Removed by Index Recheck: 44
               Filter: root_node
               Rows Removed by Filter: 12406874
               Heap Blocks: exact=10570 lossy=427078
               Buffers: shared read=1328631
               ->  Bitmap Index Scan on "root-node-and-state"  (cost=0.00..123130.57 rows=4946600 width=0) (actual time=955.044..955.045 rows=4926472 loops=1)
                     Index Cond: ((root_node = true) AND (state = 1))
                     Buffers: shared read=13464
         ->  Hash  (cost=120677.64..120677.64 rows=405712 width=4) (actual time=7124.131..7124.131 rows=441128 loops=3)
               Buckets: 131072  Batches: 8  Memory Usage: 2966kB
               Buffers: shared hit=35591 read=17793, temp written=3384
               ->  Bitmap Heap Scan on xact_message m  (cost=7893.56..120677.64 rows=405712 width=4) (actual time=61.307..6925.456 rows=441128 loops=3)
                     Recheck Cond: (state = 4)
                     Filter: (templatelanguage_id = ANY ('{17,18,19,20,21,22,23,24}'::integer[]))
                     Rows Removed by Filter: 4
                     Heap Blocks: exact=16585
                     Buffers: shared hit=35591 read=17793
                     ->  Bitmap Index Scan on "state-index"  (cost=0.00..7792.13 rows=421826 width=0) (actual time=58.781..58.781 rows=441132 loops=3)
                           Index Cond: (state = 4)
                           Buffers: shared hit=2420 read=1209
 Planning time: 1.382 ms
 Execution time: 164289.481 ms
(31 rows)

scheduling_flownode here has over 400,00,000 entries and xact_message has about 50,00,000 rows. Working on postgres 10, am I wrong to believe that this much load should be handled easily by postgres ? If it can, am I doing something wrong with the query here ?

Solution

You didn't show what indexes you have, but I strongly recommend that your indexes will cover all of the columns that you are filtering on.

In Postgres 11 that can be done by using COVERING indexes, so for example on table scheduling_flownode you will have an index like:

CREATE INDEX ix_scheduling_flownode_target_object_id 
  ON scheduling_flownode(target_object_id) 
    INCLUDE (state, root_node);

In Postgres 10 simply include the columns in the index:

CREATE INDEX ix_scheduling_flownode_target_object_id 
  ON scheduling_flownode(target_object_id, state, root_node);

Do the same for table xact_message with templatelanguage_id and state.