I'm using Direct2D (flip model) to render two images. My swap chain config is standard except that it uses 3 buffers, no VSync, and a 2-frame queue with Present(0, 0). I noticed that in windowed mode I get around 3k FPS (measured with the Nvidia Experience overlay), but when I switch to fullscreen mode (using IDXGISwapChain::SetFullscreenState or by mixing window styles and sizes) the FPS drops to 500-600 frames/sec. I also noticed that the frame time doesn't change - it's still around 0.1-0.2 ms - but the GPU load plummets from 60% to 17%.
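For reference, here is a minimal sketch of a comparable swap-chain setup (device creation and error handling are omitted, and the names are illustrative assumptions; the real project drives Direct2D on top of this):

#include <windows.h>
#include <d3d11.h>
#include <dxgi1_2.h>

// Sketch of the configuration described above: flip model, 3 buffers,
// no VSync (sync interval 0), and a 2-frame latency queue.
IDXGISwapChain1* CreateFlipSwapChain(ID3D11Device* device, HWND hwnd)
{
    IDXGIDevice1* dxgiDevice = nullptr;
    device->QueryInterface(IID_PPV_ARGS(&dxgiDevice));
    dxgiDevice->SetMaximumFrameLatency(2); // allow up to 2 queued frames

    IDXGIAdapter* adapter = nullptr;
    dxgiDevice->GetAdapter(&adapter);
    IDXGIFactory2* factory = nullptr;
    adapter->GetParent(IID_PPV_ARGS(&factory));

    DXGI_SWAP_CHAIN_DESC1 desc = {};           // width/height 0 = use the window's client area
    desc.Format = DXGI_FORMAT_B8G8R8A8_UNORM;  // BGRA, as typically required for D2D interop
    desc.SampleDesc.Count = 1;
    desc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
    desc.BufferCount = 3;                              // triple buffering
    desc.SwapEffect = DXGI_SWAP_EFFECT_FLIP_DISCARD;   // flip model (Windows 10)

    IDXGISwapChain1* swapChain = nullptr;
    factory->CreateSwapChainForHwnd(device, hwnd, &desc, nullptr, nullptr, &swapChain);

    factory->Release();
    adapter->Release();
    dxgiDevice->Release();
    return swapChain;
}

// Per frame: no VSync, no wait.
// swapChain->Present(0, 0);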
But I found that if I move the window 1 px up and to the left (x: -1, y: -1) and add 1 px to its size on each axis, I still get fullscreen but it keeps 3k FPS.
The code I'm using for this case:
// Clear the decoration styles from the current window style.
SetWindowLong(mainWinHandle, GWL_STYLE,
              GetWindowLong(mainWinHandle, GWL_STYLE) &
              ~(WS_CAPTION | WS_THICKFRAME | WS_MINIMIZE |
                WS_MAXIMIZE | WS_SYSMENU | WS_VSCROLL | WS_HSCROLL));
MoveWindow(mainWinHandle, -1, -1, monitorRes.cx + 1, monitorRes.cy + 1, true);
I know that in DX12 there is no longer an exclusive fullscreen mode, and I found this answer about fullscreen mode. Microsoft said:
we enhanced the DWM to recognize when a game is running in a borderless full screen window with no other applications on the screen. In this circumstance, the DWM gives control of the display and almost all the CPU/GPU power to the game.
and:
If you find that you are having trouble with Full Screen Optimizations, such as performance regression or input lag, we have some steps that can be useful. This includes how to disable the feature for any specific game, but also how to provide us with feedback regarding your gaming experience.
Disabling that "optimization" doesn't work for my app. So I guess that moving the borderless fullscreen-sized window 1 px off the monitor is a kind of trick to deceive the DWM.
Does anyone know what is going wrong and how to fix it legitimately?
EDIT: I played around with that MoveWindow trick:
swapChain->SetFullscreenState(true, NULL);
MoveWindow(mainWinHandle, -1, -1, monitorRes.cx + 1, monitorRes.cy + 1, true);
Actually, it doesn't matter which x, y values MoveWindow uses. They can also be 0;0, and the size can be the real display resolution - no window moving at all.
And this code not only fixes the FPS and GPU load drops, it also increases performance: GPU usage goes to 100% and the framerate doubles from 3k up to 6k.
So this leaves me with another question: why does IDXGISwapChain::SetFullscreenState by default (without MoveWindow) cause a performance drop?
So, in the MSDN samples I found how they handle fullscreen mode. Summarizing all the info I found:
DX12 doesn't support Fullscreen Exclusive mode (FSE). The method IDXGISwapChain::SetFullscreenState puts the window into a borderless maximized state, BUT it has some issues.
If you want to use tearing support to get unlimited FPS, you MUST NOT use IDXGISwapChain::SetFullscreenState. Why? Because of the horrible design of several Windows components. For DX12 they reworked the DWM, which manages every application window, and now the DWM can detect when a window goes fullscreen - you just change its styles to remove the borders and so on:
m_windowStyle = WS_OVERLAPPEDWINDOW; // the default style
m_windowStyle &= ~(WS_CAPTION | WS_MAXIMIZEBOX | WS_MINIMIZEBOX | WS_SYSMENU | WS_THICKFRAME);
So if you are using tearing support, you MUST use the styles above - see their example. Despite the fact that DX12 doesn't have FSE, logically IDXGISwapChain::SetFullscreenState should make the window borderless fullscreen and "emulate" FSE for backward compatibility, but that doesn't happen, or doesn't happen quite right - a classic Windows situation.
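For context, tearing support (which this whole setup relies on for uncapped FPS) is queried like this - a sketch, assuming an IDXGIFactory5 is available:

#include <windows.h>
#include <dxgi1_5.h>

// Ask whether the OS/driver support tearing in windowed (borderless) mode.
// Only then may Present be called with DXGI_PRESENT_ALLOW_TEARING, and the
// swap chain must have been created with DXGI_SWAP_CHAIN_FLAG_ALLOW_TEARING.
bool IsTearingSupported(IDXGIFactory5* factory)
{
    BOOL allowTearing = FALSE;
    if (FAILED(factory->CheckFeatureSupport(
            DXGI_FEATURE_PRESENT_ALLOW_TEARING,
            &allowTearing, sizeof(allowTearing))))
    {
        allowTearing = FALSE;
    }
    return allowTearing == TRUE;
}

// Per frame, with tearing (and never while SetFullscreenState(TRUE) is active,
// where DXGI_PRESENT_ALLOW_TEARING is an invalid call):
// swapChain->Present(0, DXGI_PRESENT_ALLOW_TEARING);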
Here is how I fixed the performance drop. This is the piece of code that triggers on ALT + ENTER, following the MS sample from the link above:
if (enableFullscreen)
{
    // Save the old window rect so we can restore it when exiting fullscreen mode.
    GetWindowRect(_winHandle, &m_windowRect);

    // Make the window borderless so that the client area can fill the screen.
    SetWindowLong(_winHandle, GWL_STYLE, m_windowStyle &
                  ~(WS_CAPTION | WS_MAXIMIZEBOX |
                    WS_MINIMIZEBOX | WS_SYSMENU | WS_THICKFRAME));

    // Get the settings of the display on which the app's window is currently displayed.
    IDXGIOutput* pOutput;
    _swapChain->GetContainingOutput(&pOutput);
    DXGI_OUTPUT_DESC Desc;
    pOutput->GetDesc(&Desc);

    // SetWindowPos takes a width and height, not a right/bottom edge, so
    // compute the size from the desktop coordinates (this also keeps it
    // correct on secondary monitors that don't start at 0,0).
    SetWindowPos(_winHandle, HWND_TOPMOST,
                 Desc.DesktopCoordinates.left,
                 Desc.DesktopCoordinates.top,
                 Desc.DesktopCoordinates.right - Desc.DesktopCoordinates.left,
                 Desc.DesktopCoordinates.bottom - Desc.DesktopCoordinates.top,
                 SWP_FRAMECHANGED | SWP_NOACTIVATE);
    ShowWindow(_winHandle, SW_MAXIMIZE);

    pOutput->Release();
}
else
{
    // Restore the window's attributes and size.
    SetWindowLong(_winHandle, GWL_STYLE, m_windowStyle);
    SetWindowPos(_winHandle, HWND_NOTOPMOST,
                 m_windowRect.left, m_windowRect.top,
                 m_windowRect.right - m_windowRect.left,
                 m_windowRect.bottom - m_windowRect.top,
                 SWP_FRAMECHANGED | SWP_NOACTIVATE);
    ShowWindow(_winHandle, SW_NORMAL);
}
Then call IDXGISwapChain::ResizeBuffers.
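A sketch of what that call might look like (the render-target release/recreate details depend on your D2D/D3D setup):

// After the window style/size change, resize the swap chain buffers.
// All outstanding buffer references (render targets, D2D bitmaps) must be
// released first, and the tearing flag must match the one used at creation
// (pass DXGI_SWAP_CHAIN_FLAG_ALLOW_TEARING only if you created with it).
void OnFullscreenToggled(IDXGISwapChain1* swapChain, UINT width, UINT height)
{
    // ... release D2D/D3D views that reference the back buffers here ...

    HRESULT hr = swapChain->ResizeBuffers(
        0,                   // keep the existing buffer count
        width, height,       // new client-area size
        DXGI_FORMAT_UNKNOWN, // keep the existing format
        DXGI_SWAP_CHAIN_FLAG_ALLOW_TEARING);

    // ... recreate render targets from the new back buffers, and handle
    //     DXGI_ERROR_DEVICE_REMOVED / DXGI_ERROR_DEVICE_RESET on failure ...
    (void)hr;
}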
Following this practice, you will get at least the same performance in windowed and fullscreen modes. Now my GPU load is 77-80% in both modes.
UPDATE 1:
I tested the solution for two days and found that sometimes it can still have an FPS drop in fullscreen. I also implemented the WM_SIZE case in my WndProc, plus the SetWindowBounds and OnSizeChanged functions, as in the MS example. Now it seems to work.
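The WM_SIZE handling looks roughly like this (a sketch following the MS sample; SetWindowBounds and OnSizeChanged are the app's own functions, named as in that sample):

#include <windows.h>

// The app's own handlers, as in the MS D3D12Fullscreen sample.
void SetWindowBounds(int left, int top, int right, int bottom);
void OnSizeChanged(UINT width, UINT height, bool minimized);

LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    switch (msg)
    {
    case WM_SIZE:
    {
        // Track the full window rect so it can be restored later.
        RECT windowRect = {};
        GetWindowRect(hwnd, &windowRect);
        SetWindowBounds(windowRect.left, windowRect.top,
                        windowRect.right, windowRect.bottom);

        // Resize the swap chain to the new client area.
        RECT clientRect = {};
        GetClientRect(hwnd, &clientRect);
        OnSizeChanged(clientRect.right - clientRect.left,
                      clientRect.bottom - clientRect.top,
                      wParam == SIZE_MINIMIZED);
        return 0;
    }
    }
    return DefWindowProc(hwnd, msg, wParam, lParam);
}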
It can't handle the window modes correctly if you are using PerMonitorHighDPIAware through the build manifest in Visual Studio.
I needed to recreate the project to make it run properly. The old project still has the FPS drop in fullscreen. It also triggers memory-leak debug events from DX (I didn't write any Release calls before the app exits), BUT the new project just closes the app without the memory-leak crash in debug mode. I compared the two projects line by line in git, made some changes in the old project, and even made the projects identical, but either VS won't open the project or the changes have no effect at runtime.
Thanks a lot, Microsoft, for this bullshit.
My suggestion for reaching 100% GPU usage and boosting your FPS is to optimize your CPU multithreading, because right now I have only 2 of 8 threads running at 80%. Of course there are some GPU-side limits like the frame queue and so on. I haven't studied GPU architecture in depth, but I guess there are some bottlenecks; for example, using 128-bit color will load the graphics card's data bus. Also, given that frame queue (which syncs the GPU with the CPU and balances the delay between the two chips), we also have something like a "frame consumption" rate: if the GPU takes ~16 ms to display a frame, it consumes frames at 60 frames/sec; if it takes <= 1 ms, that's roughly 1,000-10,000 frames/sec. NOTE that I mean the delay in the Present method.
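To make that arithmetic concrete, here is a minimal sketch of measuring the Present delay and converting it into a frames-per-second estimate (illustrative only; in practice you would average over many frames):

#include <windows.h>
#include <dxgi1_2.h>

// Measure how long one Present call blocks and convert that delay into an
// estimated frame-consumption rate. With no VSync, Present blocks mainly
// when the frame queue is full.
double MeasurePresentFps(IDXGISwapChain1* swapChain)
{
    LARGE_INTEGER freq, t0, t1;
    QueryPerformanceFrequency(&freq);

    QueryPerformanceCounter(&t0);
    swapChain->Present(0, 0);   // sync interval 0 = no VSync
    QueryPerformanceCounter(&t1);

    double seconds = double(t1.QuadPart - t0.QuadPart) / double(freq.QuadPart);
    return 1.0 / seconds;       // e.g. ~16 ms -> ~60 FPS, ~1 ms -> ~1000 FPS
}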
If I'm wrong about anything, please correct me.