Having OpenVAS (backed by PostgreSQL) instance that is slow when opening 'Tasks' tab.
Following query runs 22 seconds in PostgreSQL. Any suggestions how this can be optimized?
SELECT id, host,
iso_time (start_time), iso_time (end_time),
current_port, max_port, report,
(SELECT uuid FROM reports WHERE id = report),
(SELECT uuid FROM hosts
WHERE id = (SELECT host FROM host_identifiers
WHERE source_type = 'Report Host'
AND name = 'ip'
AND source_id = (SELECT uuid FROM reports
WHERE id = report)
AND value = report_hosts.host
LIMIT 1)
)
FROM report_hosts
WHERE report = 702;
The plan is
Index Scan using report_hosts_by_report on report_hosts (cost=0.42..1975570.99 rows=447 width=38) (actual time=50.042..22979.257 rows=1206 loops=1)
Index Cond: (report = 702)
SubPlan 1
-> Index Scan using reports_pkey on reports (cost=0.28..2.49 rows=1 width=37) (actual time=0.004..0.004 rows=1 loops=1206)
Index Cond: (id = report_hosts.report)
SubPlan 4
-> Index Scan using hosts_pkey on hosts (cost=4414.37..4416.59 rows=1 width=37) (actual time=0.001..0.001 rows=0 loops=1206)
Index Cond: (id = $4)
InitPlan 3 (returns $4)
-> Limit (cost=2.49..4414.09 rows=1 width=4) (actual time=18.998..18.998 rows=0 loops=1206)
InitPlan 2 (returns $2)
-> Index Scan using reports_pkey on reports reports_1 (cost=0.28..2.49 rows=1 width=37) (actual time=0.001..0.001 rows=1 loops=1206)
Index Cond: (id = report_hosts.report)
-> Seq Scan on host_identifiers (cost=0.00..4411.60 rows=1 width=4) (actual time=18.997..18.997 rows=0 loops=1206)
Filter: ((source_type = 'Report Host'::text) AND (name = 'ip'::text) AND (source_id = $2) AND (value = report_hosts.host))
Rows Removed by Filter: 99459
Planning time: 0.531 ms
Execution time: 22979.575 ms
All the time is spent in the 1206 sequential scans of host_identifiers
.
Try to replace the subqueries with joins:
SELECT rh.id, rh.host,
iso_time(rh.start_time), iso_time(rh.end_time),
rh.current_port, rh.max_port, rh.report,
r.uuid,
h.uuid
FROM report_hosts AS rh
LEFT JOIN reports AS r
ON rh.report = r.id
LEFT JOIN host_identifiers AS hi
ON hi.source_id = r.uuid
AND hi.value = rh.host
AND hi.source_type = 'Report Host'
AND hi.name = 'ip'
LEFT JOIN hosts AS h
ON h.id = hi.host
WHERE rh.report = 702;
This is not exactly the same because it does not account for the LIMIT 1
that makes little sense without an ORDER BY
, but it should be close to the truth.
Proper indexes will make it fast (if they don't yet exist):
reports(id)
host_identifiers(source_id, value)
hosts(id)
Your query is hard to read because you don't qualify the columns with the table name.