I am trying to profile CPU/GPU applications using the Nsight suite.
While trying to understand a stuttering problem, I added a range around the simulation step (which runs on the CPU):
#include "3rd/nvToolsExt.h"
int main()
{
// ...
nvtxRangePush("Simulation");
scene.update(gSimulationDelta);
nvtxRangePop();
// ...
return 0;
}
After configuring the VS solution and copying the DLL next to the .exe, the application compiles, links, and runs as expected.
Using the Visual Studio extension NVIDIA Nsight Integration 2020.2.0.0, I launch an Nsight Systems 2022.3.4 trace.
In the prepopulated project, I check Collect NVTX trace.
I click Start and get a report. Yet the NVTX markers are absent from the Timeline View, and I have several NVTX-related warnings in the Diagnostics Summary.
Notably:
NVTX_INJECTION64_PATH variable is missing from the environment variables of the process. Make sure the process was appropriately launched.
No NVTX events collected. Does the process use NVTX?
As a last resort, I added the system-wide environment variable NVTX_INJECTION64_PATH with the value C:\Program Files\NVIDIA Corporation\Nsight Systems 2022.3.4\target-windows-x64\ToolsInjection64.dll, but after relaunching everything the issue remains and all the warnings are still present.
How can I get Nsight Systems to show the NVTX markers?
I work on Nsight Systems and NVTX. A few things:
I don't see anything wrong with what you're doing, so it's likely a bug in the tools. Can you try launching your program directly from the standalone Nsight Systems GUI program instead of using the Visual Studio integration? That will determine if the problem is with Nsight Systems itself or the Visual Studio integration. Be sure to check the box in the Nsight Systems project to "Collect NVTX Trace", as you mentioned doing in VS.
You should not set NVTX_INJECTION64_PATH yourself. Nsight Systems will set this environment variable automatically for any processes it launches with NVTX capture enabled. The only time you'd ever want to set this variable manually is if you write your own tool that captures NVTX calls.
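To see whether the variable actually reaches the profiled process, one option is to log it at startup. This is only a minimal diagnostic sketch: the helper name describeNvtxInjection is hypothetical, and it assumes nothing beyond the standard library.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of NVTX or Nsight Systems): reports whether
// NVTX_INJECTION64_PATH is visible in this process's environment.
std::string describeNvtxInjection()
{
    const char* path = std::getenv("NVTX_INJECTION64_PATH");
    if (path == nullptr || *path == '\0')
        return "NVTX_INJECTION64_PATH is not set";
    return std::string("NVTX_INJECTION64_PATH=") + path;
}
```

Printing the returned string at the top of main() (for example to std::cerr) shows what the process was launched with. If it reports "not set" when the program is started through the profiler, the launch path is the likely culprit rather than the NVTX calls in the code.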
And one last thing: This shouldn't affect the problem you are having here, but it looks like you are using an older version of NVTX that required linking against the nvToolsExt DLL. While the old version works fine and is still supported, I recommend switching to version 3, which is a header-only implementation of the same API, so linking against the DLL is no longer required. The new version's headers are located in the nvtx3 subdirectory, so to ensure you're using v3, the best practice is to include the directory name in your #include:
#include <nvtx3/nvToolsExt.h>
...and adjust your project's include search paths appropriately. The NVTX v3 headers ship with Nsight Systems and the CUDA Toolkit, so you probably already have them. You can also download them from GitHub:
https://github.com/NVIDIA/NVTX/tree/release-v3/c/include
The old NVTX version that requires a DLL will be deprecated in CUDA 12, so it's a good idea to migrate any projects using NVTX to v3 soon.
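For illustration, the range from the question could be written against the v3 header roughly as follows. This is a sketch under a few assumptions: ScopedNvtxRange, its openCount() counter, and updateScene are hypothetical names introduced here (they are not part of NVTX), and the no-op stubs exist only so the snippet still compiles on a machine where the nvtx3 headers are not on the include path.

```cpp
#if defined(__has_include)
#  if __has_include(<nvtx3/nvToolsExt.h>)
#    include <nvtx3/nvToolsExt.h>  // header-only NVTX v3, no DLL to link
#    define HAVE_NVTX3 1
#  endif
#endif

#ifndef HAVE_NVTX3
// Fallback no-op stubs (assumption: used only when the headers are missing).
inline int nvtxRangePushA(const char*) { return 0; }
inline int nvtxRangePop() { return 0; }
#endif

// Hypothetical RAII guard: pushes a range on construction and pops it on
// destruction, so early returns and exceptions cannot leave a range open.
class ScopedNvtxRange {
public:
    explicit ScopedNvtxRange(const char* name)
    {
        nvtxRangePushA(name);  // explicit ASCII variant of nvtxRangePush
        ++counter();
    }
    ~ScopedNvtxRange()
    {
        nvtxRangePop();
        --counter();
    }
    ScopedNvtxRange(const ScopedNvtxRange&) = delete;
    ScopedNvtxRange& operator=(const ScopedNvtxRange&) = delete;

    // Number of guards currently alive on this thread (debugging aid only).
    static int openCount() { return counter(); }

private:
    static int& counter()
    {
        thread_local int count = 0;
        return count;
    }
};

// Hypothetical stand-in for the question's scene.update(gSimulationDelta).
inline bool updateScene(float delta)
{
    ScopedNvtxRange range("Simulation");
    if (delta <= 0.0f)
        return false;  // the range is still popped on this early return
    // ... simulation work would go here ...
    return true;
}
```

The guard is a design convenience rather than a requirement: the plain nvtxRangePush/nvtxRangePop pair from the question works the same way with the v3 header, as long as every push is matched by a pop.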