sqlormselect-n-plus-1

Why is the n+1 selects pattern slow?


I'm rather inexperienced with databases and have just read about the "n+1 selects issue". My follow-up question: Assuming the database resides on the same machine as my program, is cached in RAM and properly indexed, why is the n+1 query pattern slow?

As an example let's take the code from the accepted answer:

SELECT * FROM Cars;

/* for each car */
SELECT * FROM Wheel WHERE CarId = ?

With my mental model of the database cache, each of the SELECT * FROM Wheel WHERE CarId = ? queries should need:

Even if we multiply that by a small constant factor for an additional overhead because of the internal memory structure, it still should be unnoticeably fast. Is the interprocess communication the bottleneck?


Edit: I just found this related article via Hacker News: Following a Select Statement Through Postgres Internals. - HN discussion thread.

Edit 2: To clarify, I do assume N to be large. A non-trivial overhead will add up to a noticeable delay then, yes. I am asking why the overhead is non-trivial in the first place, for the setting described above.


Solution

  • You are correct that avoiding n+1 selects is less important in the scenario you describe. If the database is on a remote machine, communication latencies of > 1ms are common, i.e. the cpu would spend millions of clock cycles waiting for the network.

    If we are on the same machine, the communication delay is several orders of magnitude smaller, but synchronous communication with another process necessarily involves a context switch, which commonly costs > 0.01 ms (source), which is tens of thousands of clock cycles.

    In addition, both the ORM tool and the database will have some overhead per query.

    To conclude, avoiding n+1 selects is far less important if the database is local, but still matters if n is large.