cuda nsight-compute

Using ncu to profile page faults in unified memory


Is there any option for profiling a CUDA application that uses unified virtual memory with Nsight Compute (NCU)? For example, I want to know how much time is spent handling page faults and migrations.


Solution

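  • I eventually figured out a solution myself. Note that the command below uses Nsight Systems (nsys), not ncu. With the unified-memory page-fault options enabled, you just need to specify --export=json so that the profiling result is written to a JSON file, which contains the detailed page-fault data. The full profiling command looks like this: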

    nsys profile \
         --force-overwrite=true \
         --cuda-um-gpu-page-faults=true \
         --cuda-um-cpu-page-faults=true \
         --export=json \
         ./yourapplication
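
Once the JSON file exists, a small script can give a first overview of what it contains. The following is only a minimal sketch, not an official parser: it assumes the export is either a single JSON document or one JSON object per line, and it simply counts every key whose name contains "fault". The function names `load_records` and `count_keys_matching` and the search string are my own choices for illustration; check the key names in your own export before relying on the result.

```python
import json
import sys
from collections import Counter


def load_records(text):
    """Parse text as one JSON document, or else as one JSON value per line."""
    try:
        doc = json.loads(text)
        return doc if isinstance(doc, list) else [doc]
    except json.JSONDecodeError:
        # Assumption: the file holds one JSON object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]


def count_keys_matching(records, needle="fault"):
    """Count dictionary keys, at any nesting depth, whose name contains needle."""
    counts = Counter()

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if needle in key.lower():
                    counts[key] += 1
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    for record in records:
        walk(record)
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: pass the path of the exported JSON file as the first argument.
    with open(sys.argv[1], encoding="utf-8") as handle:
        records = load_records(handle.read())
    for key, count in count_keys_matching(records).most_common():
        print(f"{key}: {count}")
```

Run it with the path to the exported file as the only argument. It prints each matching key name with the number of times it occurs, which should make it easier to find the page-fault records worth inspecting in detail.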