[SOLVED] For instanced draw calls, does the vertex count make a difference to performance?

For instanced draw calls, does the vertex count make a difference to performance?

Example: suppose you want to render a 100,000 x 100,000 grid. For the sake of argument, the grid cannot be loaded as a model and must be generated at runtime, e.g. with a nested for-loop.

Obviously instancing would be the fastest way to render this. But is it better to create a 10x10 mesh and instance it 100,000,000 times, or is it faster to create a 10,000x10,000 mesh and instance it 100 times?

There's the cost of the draw call to consider, but there's also the one-time startup cost of generating the mesh.

Solution

Broadly speaking, you want instances that are neither too small nor too large. Every instance carries some overhead, and instances cannot share vertices with one another, so the vertices along the edges of adjacent 10x10 tiles are processed once per tile. With 100,000,000 such tiles, that adds up to a lot of redundant computation.
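To put a rough number on that redundancy, here is a back-of-the-envelope sketch in Python. It assumes "10x10" means 10x10 quads per tile (so 11x11 vertices), indexed drawing with ideal vertex reuse inside a tile, and no reuse across instances. The function names are made up for illustration, and the result only counts vertex shader invocations; it says nothing about per-instance or per-draw overhead.

```python
def vertex_invocations(grid_cells, tile_cells):
    """Return (instance count, total vertices processed) when a
    grid_cells x grid_cells grid is drawn as instanced square tiles of
    tile_cells x tile_cells quads. Assumes each tile's (tile_cells + 1)^2
    vertices are processed once per instance."""
    if grid_cells % tile_cells != 0:
        raise ValueError("tile size must divide the grid size evenly")
    instances = (grid_cells // tile_cells) ** 2
    per_instance = (tile_cells + 1) ** 2
    return instances, instances * per_instance


def redundancy(grid_cells, tile_cells):
    """Fraction of extra vertex work compared with a single mesh in which
    every one of the (grid_cells + 1)^2 vertices is processed exactly once."""
    _, total = vertex_invocations(grid_cells, tile_cells)
    unique = (grid_cells + 1) ** 2
    return total / unique - 1.0


if __name__ == "__main__":
    for tile in (10, 10_000):
        instances, total = vertex_invocations(100_000, tile)
        extra = redundancy(100_000, tile)
        print(f"tile {tile}: {instances} instances, "
              f"{total} vertices, {extra:.4%} redundant")
```

Under these assumptions the 10x10 tile processes roughly a fifth more vertices than strictly necessary, while the 10,000x10,000 tile wastes almost nothing on shared edges.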

That being said, if you really are drawing a grid, you don't need instancing at all. The work of that "manual for loop" could probably be done in the vertex shader. On current GPUs, reading memory is almost always slower than computing values. Even if the whole computation can't move into the shader, you may be able to compute X and Y from gl_VertexID alone, so only the Z value has to be read from storage.
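As an illustration of the gl_VertexID idea, here is the index arithmetic written out in Python so it can be checked on the CPU. This is only a sketch under assumptions: a non-indexed triangle list with six vertices per grid cell, cells laid out row by row, and `grid_xy` and `CORNERS` are hypothetical names. In a real vertex shader the same integer division and modulo would be applied to gl_VertexID, and Z would be fetched separately.

```python
# Corner offsets for the two triangles of one cell, in counter-clockwise
# order (x to the right, y up). Six vertices per cell, no index buffer.
CORNERS = [(0, 0), (1, 0), (0, 1),
           (0, 1), (1, 0), (1, 1)]


def grid_xy(vertex_id, cells_per_row):
    """Map a running vertex index (what gl_VertexID would provide in a
    non-indexed draw) to integer grid coordinates (x, y)."""
    cell = vertex_id // 6            # which cell this vertex belongs to
    corner = vertex_id % 6           # which of the cell's six vertices
    cell_x = cell % cells_per_row    # column of the cell
    cell_y = cell // cells_per_row   # row of the cell
    off_x, off_y = CORNERS[corner]
    return cell_x + off_x, cell_y + off_y


if __name__ == "__main__":
    # A 2x2-cell grid needs 2 * 2 * 6 = 24 vertices and no vertex buffer
    # for X/Y at all.
    for vid in range(24):
        print(vid, grid_xy(vid, 2))
```

One caveat: gl_VertexID is a 32-bit integer, so a grid with 10^10 cells could not be covered by a single draw with this scheme and would have to be split into several draws, each with its own base offset.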