Tags: pytorch, conv-neural-network, tensor, automatic-mixed-precision

How can I run inference of a pretrained full-precision CNN model strictly on Tensor Cores in PyTorch?


I have been analyzing the maximum throughput I can get from my GPU for a specific CNN model. My GPU has both CUDA cores and Tensor Cores, so I want to run the model on both types of cores simultaneously and check the maximum possible throughput.

I used with torch.cuda.amp.autocast() to make sure the model leverages automatic mixed precision (AMP), which should in turn let it use Tensor Cores. However, when I ran full-precision inference and AMP inference at the same time, the combined throughput was higher than full precision run on its own, but lower than AMP run on its own. To me this means PyTorch is surely not using the Tensor Cores separately from the CUDA cores: if it were, I would have expected a throughput roughly equal to the sum of the two separate cases.
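
For reference, the two separate measurements described above could be set up roughly as follows. This is only a minimal sketch, not the exact benchmark I ran: the tiny model, the batch shape, and the helper name measure_throughput are placeholders made up for illustration, and the code falls back to CPU (with bfloat16 autocast) when no GPU is present.

```python
import time

import torch
import torch.nn as nn


def measure_throughput(model, batch, use_amp, iters=10, warmup=2):
    """Return images/second for repeated forward passes (hypothetical helper)."""
    device_type = batch.device.type
    # float16 is the usual autocast dtype on CUDA; bfloat16 is used on CPU.
    amp_dtype = torch.float16 if device_type == "cuda" else torch.bfloat16
    model.eval()
    with torch.inference_mode(), torch.autocast(
        device_type=device_type, dtype=amp_dtype, enabled=use_amp
    ):
        for _ in range(warmup):
            model(batch)
        # CUDA kernels launch asynchronously, so synchronize around the timer.
        if device_type == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(batch)
        if device_type == "cuda":
            torch.cuda.synchronize()
        elapsed = time.perf_counter() - start
    return iters * batch.shape[0] / elapsed


device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder CNN standing in for the pretrained model.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 32 * 32, 10),
).to(device)
batch = torch.randn(8, 3, 32, 32, device=device)

fp32_ips = measure_throughput(model, batch, use_amp=False)
amp_ips = measure_throughput(model, batch, use_amp=True)
print(f"FP32: {fp32_ips:.1f} img/s, AMP: {amp_ips:.1f} img/s")
```

The synchronize calls matter on a GPU: without them the timer would mostly measure kernel launch time rather than execution time.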

Is there a way I could toggle the use of Tensor Cores so that Pytorch uses them?


Solution

  • I am answering my own question months after figuring it out, in case someone else comes looking for an answer to this silly-sounding yet important question.

    AMP, or automatic mixed precision, as the name suggests, runs each operation in a precision suited to it (FP32, FP16, etc.), which lets the GPU use whichever of its execution units support that precision. So, to answer the question: unless you write CUDA code yourself and assign the work directly to a particular kind of hardware unit, you cannot restrict a workload to just one kind of unit. The only thing you can do is specify the minimum precision you want; NVIDIA's software stack (the libraries PyTorch calls into, such as cuDNN and cuBLAS) then decides by itself which compatible hardware executes the workload at the required precision.

    Final verdict: you cannot individually toggle portions of the NVIDIA hardware on or off from PyTorch.
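
To make the "you only choose the precision" point concrete, here is a minimal sketch. It assumes a single placeholder convolution layer rather than a real pretrained model, and it falls back to CPU with bfloat16 when no GPU is available. It shows that autocast changes the dtype an operation runs in, while the choice of kernel (Tensor Core or not) stays with the underlying libraries. The two allow_tf32 flags at the end are existing PyTorch settings, but note that they are precision switches too, not hardware switches, and they only have an effect on GPUs that support TF32 (Ampere or newer).

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
# float16 is the usual autocast dtype on CUDA; bfloat16 is used on CPU.
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

# Placeholder layer standing in for a pretrained FP32 CNN.
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1).to(device).eval()
x = torch.randn(4, 3, 32, 32, device=device)

with torch.inference_mode():
    # Plain call: the convolution runs in the weights' dtype (float32).
    y_fp32 = conv(x)
    # Under autocast: the convolution is cast to the reduced-precision dtype.
    # Whether a Tensor Core kernel is used is decided by the backend library.
    with torch.autocast(device_type=device, dtype=amp_dtype):
        y_amp = conv(x)

print(y_fp32.dtype, y_amp.dtype)

# Precision-level settings for FP32 math on TF32-capable GPUs.
# They allow reduced-precision internals; they do not pin work to any unit.
torch.backends.cudnn.allow_tf32 = True
torch.backends.cuda.matmul.allow_tf32 = True
```

In both cases the request is about numerical precision; there is no argument anywhere that names CUDA cores or Tensor Cores directly.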