While trying to write a compositing window manager for X11 with OpenGL as the backend, I've run into a nasty situation: glXSwapBuffers() blocks until vblank, which leaves the compositor unresponsive to X events, so windows being dragged around lag behind the cursor by roughly one frame. I've tried multithreading but it didn't work well, so I decided the only proper solution is to make glXSwapBuffers() asynchronous. Ideally I would send the drawing commands to the GPU and return immediately without waiting for the operations to actually finish, and AFAIK this is possible under modern Linux with DRI2. So what should I do?
Actually, glXSwapBuffers should return immediately. What blocks, however, is the very next OpenGL command that introduces a so-called synchronization point. Usually this is the glClear that follows the call to glXSwapBuffers.
Note that it's actually desirable to synchronize with the V-Blank somehow, otherwise nasty tearing artifacts happen. But you're right that in a naive implementation this introduces about one display refresh interval of latency.
The big problem here is that double-buffered windows redirected to an off-screen surface may still be subjected to the active swap interval (i.e. the V-Sync setting); and of course double buffering itself doesn't make a lot of sense in a composited setting.
So here's what you can do: use a swap interval extension to set your compositor's swap interval to 0 (no V-Sync). Depending on your system's settings this choice may not actually be honored (for example, when the user has configured the driver to force V-Sync for all applications). Unfortunately there are several swap interval extensions, and what works with one display driver doesn't work with another. I suggest you look at the swap interval example programs in Mesa and at the source of Mesa's glxgears, which contain code that deals with pretty much every situation you may encounter.
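To illustrate the "several extensions" problem, here is a minimal sketch of how a compositor might decide which swap control extension to use. The names `has_extension`, `pick_swap_control` and `swap_ctl_t` are made up for this example; the extension string is assumed to come from glXQueryExtensionsString. Note the whole-token match: a plain strstr would mistake GLX_EXT_swap_control_tear for GLX_EXT_swap_control.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enum for this sketch: which swap control extension to use. */
typedef enum {
    SWAP_CTL_NONE,  /* no known swap control extension advertised */
    SWAP_CTL_EXT,   /* GLX_EXT_swap_control:  glXSwapIntervalEXT(dpy, drawable, n) */
    SWAP_CTL_MESA,  /* GLX_MESA_swap_control: glXSwapIntervalMESA(n) */
    SWAP_CTL_SGI    /* GLX_SGI_swap_control:  glXSwapIntervalSGI(n) */
} swap_ctl_t;

/* Return 1 if `name` appears as a complete space-separated token in `list`. */
int has_extension(const char *list, const char *name)
{
    size_t len = strlen(name);
    const char *p = list;

    if (len == 0)
        return 0;
    while ((p = strstr(p, name)) != NULL) {
        int starts = (p == list) || (p[-1] == ' ');
        int ends = (p[len] == ' ') || (p[len] == '\0');
        if (starts && ends)
            return 1;
        p += len;  /* matched only a prefix of a longer name, keep looking */
    }
    return 0;
}

/* Pick an extension from the string returned by glXQueryExtensionsString.
 * The preference order is an assumption of this sketch. The SGI variant
 * comes last because its specification only allows positive intervals,
 * so asking it for 0 may be rejected. */
swap_ctl_t pick_swap_control(const char *glx_extensions)
{
    if (has_extension(glx_extensions, "GLX_EXT_swap_control"))
        return SWAP_CTL_EXT;
    if (has_extension(glx_extensions, "GLX_MESA_swap_control"))
        return SWAP_CTL_MESA;
    if (has_extension(glx_extensions, "GLX_SGI_swap_control"))
        return SWAP_CTL_SGI;
    return SWAP_CTL_NONE;
}
```

Once you know which one is there, fetch the matching entry point with glXGetProcAddressARB and call it with an interval of 0. Even then the driver may silently ignore the request, as mentioned above.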
It's also desirable to turn off V-Sync in the clients somehow. I don't see a better way than injecting a shared object into them that hooks glXSwapBuffers, glXCreateContext and the swap interval extensions to override them.
Finally, you must use one of the available video synchronization GLX extensions to implement a timed buffer swap in your compositor (i.e. call an "unsynchronized" glXSwapBuffers at just the right moment, when the V-Blank happens). With a direct OpenGL context and a realtime scheduling policy applied to the compositor process you can do that.
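The timing part of such a timed swap is plain arithmetic. Here is a rough sketch, assuming you have a timestamp of a recent V-Blank (for example taken right after a wait call from a video sync extension returned) and know the refresh period. The function name and parameters are made up for illustration, and all times are nanoseconds on one monotonic clock.

```c
#include <stdint.h>

/* Hypothetical helper: absolute time at which to issue the unsynchronized
 * buffer swap.
 *
 * last_vblank_ns  timestamp of some past V-Blank
 * period_ns       display refresh period, must be > 0
 * margin_ns       how long before the V-Blank to wake up, 0 <= margin < period
 * now_ns          current time, assumed >= last_vblank_ns
 */
int64_t next_swap_time_ns(int64_t last_vblank_ns, int64_t period_ns,
                          int64_t margin_ns, int64_t now_ns)
{
    /* Whole refresh periods completed since the reference V-Blank, plus one,
     * gives the first V-Blank strictly after `now`. */
    int64_t periods = (now_ns - last_vblank_ns) / period_ns + 1;
    int64_t next_vblank = last_vblank_ns + periods * period_ns;
    int64_t wake = next_vblank - margin_ns;

    /* If the margin for that V-Blank has already started, aim for the
     * following one instead of swapping late. */
    if (wake <= now_ns)
        wake += period_ns;
    return wake;
}
```

The compositor would then sleep until the returned time (an absolute sleep such as clock_nanosleep with TIMER_ABSTIME on CLOCK_MONOTONIC fits well) and swap. This is where the realtime scheduling policy matters: without it the process may simply be woken up too late.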
Note that all these issues are shortcomings of the existing X11 protocol. But if you think Wayland would get rid of them, think again. While Wayland was originally intended to make "every frame perfect" and do away with the synchronization issues, in practice I encountered many of the same problems again. In this blog post the creator of Wayland talks about roundtrips and overhead, but he completely avoids the point of pipeline synchronization and buffer swap latency. The problems are inherent to the concept of stacked composition and buffer-swap-based V-Sync. To really solve the issue there must be some kind of screen-associated V-Sync event that is independent of graphics operations and to which an arbitrary phase offset can be applied, so that applications can synchronize their rendering loops with the display refresh. And there should be an additional "framebuffer commit" function that makes the whole composition chain consider the newly arrived frame. This would allow the compositor to sync applications to a few hundred µs before the V-Blank happens, so that composition can happen in the margin between framebuffer commit and V-Blank.