So I have a table, hugedata, with roughly a million rows (the plan below estimates 1,000,001), and I simply run SELECT * FROM hugedata LIMIT 1.
The query plan shows a Seq Scan with a Limit node on top of it.
Attached is the query plan:
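For reference, JSON output with the buffer, WAL and settings sections shown below is typically produced by an invocation along these lines (the exact options are an assumption, not copied from the original session):

-- Hypothetical EXPLAIN invocation; the original options were not recorded,
-- but VERBOSE, BUFFERS, WAL and SETTINGS match the sections present in the output.
EXPLAIN (ANALYZE, VERBOSE, BUFFERS, WAL, SETTINGS, FORMAT JSON)
SELECT * FROM hugedata LIMIT 1;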
[
  {
    "Plan": {
      "Node Type": "Limit",
      "Parallel Aware": false,
      "Async Capable": false,
      "Startup Cost": 0.00,
      "Total Cost": 0.02,
      "Plan Rows": 1,
      "Plan Width": 41,
      "Actual Startup Time": 0.007,
      "Actual Total Time": 0.007,
      "Actual Rows": 1,
      "Actual Loops": 1,
      "Output": ["pk", "description", "flags"],
      "Shared Hit Blocks": 1,
      "Shared Read Blocks": 0,
      "Shared Dirtied Blocks": 0,
      "Shared Written Blocks": 0,
      "Local Hit Blocks": 0,
      "Local Read Blocks": 0,
      "Local Dirtied Blocks": 0,
      "Local Written Blocks": 0,
      "Temp Read Blocks": 0,
      "Temp Written Blocks": 0,
      "WAL Records": 0,
      "WAL FPI": 0,
      "WAL Bytes": 0,
      "Plans": [
        {
          "Node Type": "Seq Scan",
          "Parent Relationship": "Outer",
          "Parallel Aware": false,
          "Async Capable": false,
          "Relation Name": "hugedata",
          "Schema": "public",
          "Alias": "hugedata",
          "Startup Cost": 0.00,
          "Total Cost": 15406.01,
          "Plan Rows": 1000001,
          "Plan Width": 41,
          "Actual Startup Time": 0.006,
          "Actual Total Time": 0.006,
          "Actual Rows": 1,
          "Actual Loops": 1,
          "Output": ["pk", "description", "flags"],
          "Shared Hit Blocks": 1,
          "Shared Read Blocks": 0,
          "Shared Dirtied Blocks": 0,
          "Shared Written Blocks": 0,
          "Local Hit Blocks": 0,
          "Local Read Blocks": 0,
          "Local Dirtied Blocks": 0,
          "Local Written Blocks": 0,
          "Temp Read Blocks": 0,
          "Temp Written Blocks": 0,
          "WAL Records": 0,
          "WAL FPI": 0,
          "WAL Bytes": 0
        }
      ]
    },
    "Settings": {
    },
    "Planning": {
      "Shared Hit Blocks": 0,
      "Shared Read Blocks": 0,
      "Shared Dirtied Blocks": 0,
      "Shared Written Blocks": 0,
      "Local Hit Blocks": 0,
      "Local Read Blocks": 0,
      "Local Dirtied Blocks": 0,
      "Local Written Blocks": 0,
      "Temp Read Blocks": 0,
      "Temp Written Blocks": 0
    },
    "Planning Time": 0.035,
    "Triggers": [
    ],
    "Execution Time": 0.069
  }
]
The question is: how is the LIMIT clause applied in PostgreSQL? Does it
1. scan the whole table and only then discard everything after the first row, OR
2. stop fetching rows from the scan as soon as the limit is reached?
From my searching it sounds like the Limit operator works as the 1st point suggests. If that is true, why such an inefficient implementation, or am I missing something?
The LIMIT documentation says the limit is taken into account when generating the query plan, but whether execution yields (stops) once the N+1-th row is requested, or "taken into account" means something else, is not specified: https://www.postgresql.org/docs/current/queries-limit.html
The PostgreSQL executor works on demand and top down.
To calculate the first result row, it fetches the first row from the Limit node. That in turn fetches the first row from the Seq Scan.
When PostgreSQL asks the Limit node for the second result row, the Limit node reports that execution is done, without fetching anything further from the scan.
In sum, only a single row is fetched from the sequential scan.
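This is already visible in the plan above: the Seq Scan node reports "Actual Rows": 1 and a single shared buffer hit, even though it was estimated to return 1,000,001 rows. To reproduce it yourself, a sketch along these lines should show the same pattern (table name, column types and row count are made up for illustration):

-- Hypothetical test setup, not taken from the original post.
CREATE TABLE hugedata_demo (
    pk          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    description text,
    flags       integer
);

INSERT INTO hugedata_demo (description, flags)
SELECT 'row ' || g, g % 7
FROM generate_series(1, 1000000) AS g;

-- The Seq Scan node in this plan should report "Actual Rows": 1,
-- because the Limit node stops pulling rows after the first one.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM hugedata_demo LIMIT 1;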
You can find this documented in the executor README:
The executor processes a tree of "plan nodes". The plan tree is essentially a demand-pull pipeline of tuple processing operations. Each node, when called, will produce the next tuple in its output sequence, or NULL if no more tuples are available. If the node is not a primitive relation-scanning node, it will have child node(s) that it calls in turn to obtain input tuples.
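One consequence of this demand-pull model: the scan can only stop early if every node between the Limit and the scan produces rows incrementally. As a hedged illustration (assuming there is no index on description), a query like the following puts a Sort below the Limit, and the sort has to consume the entire sequential scan before it can emit its first row:

-- Hypothetical contrast case, assuming no index on "description":
-- the Sort node must read the whole Seq Scan output before returning
-- its first row, so "Actual Rows" on the Seq Scan equals the table size.
EXPLAIN (ANALYZE)
SELECT * FROM hugedata ORDER BY description LIMIT 1;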