postgresqlopenvas

PostgreSQL slow query


Having OpenVAS (backed by PostgreSQL) instance that is slow when opening 'Tasks' tab.

Following query runs 22 seconds in PostgreSQL. Any suggestions how this can be optimized?

SELECT id, host,
       iso_time (start_time), iso_time (end_time),
       current_port, max_port, report,
       (SELECT uuid FROM reports WHERE id = report),
       (SELECT uuid FROM hosts
        WHERE id = (SELECT host FROM host_identifiers
                    WHERE source_type = 'Report Host'
                      AND name = 'ip'
                      AND source_id = (SELECT uuid FROM reports
                                       WHERE id = report)
                      AND value = report_hosts.host
                    LIMIT 1)
       )
FROM report_hosts
WHERE report = 702;

The plan is

 Index Scan using report_hosts_by_report on report_hosts  (cost=0.42..1975570.99 rows=447 width=38) (actual time=50.042..22979.257 rows=1206 loops=1)
   Index Cond: (report = 702)
   SubPlan 1
     ->  Index Scan using reports_pkey on reports  (cost=0.28..2.49 rows=1 width=37) (actual time=0.004..0.004 rows=1 loops=1206)
           Index Cond: (id = report_hosts.report)
   SubPlan 4
     ->  Index Scan using hosts_pkey on hosts  (cost=4414.37..4416.59 rows=1 width=37) (actual time=0.001..0.001 rows=0 loops=1206)
           Index Cond: (id = $4)
           InitPlan 3 (returns $4)
             ->  Limit  (cost=2.49..4414.09 rows=1 width=4) (actual time=18.998..18.998 rows=0 loops=1206)
                   InitPlan 2 (returns $2)
                     ->  Index Scan using reports_pkey on reports reports_1  (cost=0.28..2.49 rows=1 width=37) (actual time=0.001..0.001 rows=1 loops=1206)
                           Index Cond: (id = report_hosts.report)
                   ->  Seq Scan on host_identifiers  (cost=0.00..4411.60 rows=1 width=4) (actual time=18.997..18.997 rows=0 loops=1206)
                         Filter: ((source_type = 'Report Host'::text) AND (name = 'ip'::text) AND (source_id = $2) AND (value = report_hosts.host))
                         Rows Removed by Filter: 99459
 Planning time: 0.531 ms
 Execution time: 22979.575 ms

Solution

  • All the time is spent in the 1206 sequential scans of host_identifiers.

    Try to replace the subqueries with joins:

    SELECT rh.id, rh.host,
           iso_time(rh.start_time), iso_time(rh.end_time),
           rh.current_port, rh.max_port, rh.report,
           r.uuid,
           h.uuid
    FROM report_hosts AS rh
       LEFT JOIN reports AS r
          ON rh.report = r.id
       LEFT JOIN host_identifiers AS hi
          ON hi.source_id = r.uuid
             AND hi.value = rh.host
             AND hi.source_type = 'Report Host'
             AND hi.name = 'ip'
       LEFT JOIN hosts AS h
          ON h.id = hi.host
    WHERE rh.report = 702;
    

    This is not exactly the same because it does not account for the LIMIT 1 that makes little sense without an ORDER BY, but it should be close to the truth.

    Proper indexes will make it fast (if they don't yet exist):

    Your query is hard to read because you don't qualify the columns with the table name.