I am having trouble finding out where the data for local memory usage is. Right now, I only know to look for STL instructions in the source. I wish I could find concrete numbers.
The very short answer is apparently that Nsight Compute currently doesn't show local memory spills.
However:
- Compile with `-Xptxas="-v"`, i.e. turn on verbose output from the assembler.
- Use the `cuFuncGetAttribute` API with the `CU_FUNC_ATTRIBUTE_LOCAL_SIZE_BYTES` attribute if you have a handle to the function.
- Read the `local_size_bytes` attribute, which is automagically populated after compilation.

[answer assembled from comments and added as a community wiki entry]
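For code built against the runtime API, where there is no `CUfunction` handle to hand, the same number should be available through `cudaFuncGetAttributes`. The sketch below is a rough illustration, not the driver-API call named above: it uses the runtime counterpart, and `scratch_kernel` is a made-up kernel written only so that the compiler is likely to allocate some local memory. The kernel is never launched; the attributes are fixed at compile time.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration only. A per-thread array indexed with
// a value known only at run time usually cannot be kept in registers, so the
// compiler tends to place it in local memory.
__global__ void scratch_kernel(const int *idx, float *out)
{
    float scratch[64];
    for (int i = 0; i < 64; ++i)
        scratch[i] = static_cast<float>(i) * out[i];
    out[threadIdx.x] = scratch[idx[threadIdx.x] & 63];
}

int main()
{
    // Query the compile-time resource usage of the kernel (no launch needed).
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, scratch_kernel);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaFuncGetAttributes failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    // localSizeBytes is the per-thread local memory size in bytes.
    std::printf("local memory per thread: %zu bytes\n", attr.localSizeBytes);
    std::printf("registers per thread:    %d\n", attr.numRegs);
    return 0;
}
```

Building it with something like `nvcc -Xptxas=-v example.cu -o example` (the file name is arbitrary) also prints the assembler's own stack-frame and spill figures at compile time, so the two sources can be compared.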