Tags: python, cuda, pytorch, nvprof

nvprof warning on CUDA_VISIBLE_DEVICES


When I set os.environ['CUDA_VISIBLE_DEVICES'] in a PyTorch script and profile it with nvprof, I get the following warning:

Warning: Device on which events/metrics are configured are different than the device on which it is being profiled. One of the possible reason is setting CUDA_VISIBLE_DEVICES inside the application.

What does this actually mean? How can I avoid it while still using CUDA_VISIBLE_DEVICES (rather than torch.cuda.set_device())?

Here is the code in test.py:

import torch
import torch.nn as nn   # needed: nn.Sequential, nn.Conv2d, nn.ReLU are used below
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '1'
g = 1
c1 = 512
c2 = 512
input = torch.randn(64, c1, 28, 28).cuda()
model = nn.Sequential(
      nn.Conv2d(c1,c2,1,groups=g),
      nn.ReLU(),
      nn.Conv2d(c1,c2,1,groups=g),
      nn.ReLU(),
      nn.Conv2d(c1,c2,1,groups=g),
      nn.ReLU(),
      nn.Conv2d(c1,c2,1,groups=g),
      nn.ReLU(),
      nn.Conv2d(c1,c2,1,groups=g),
      nn.ReLU(),
    ).cuda()
out = model(input)

and the command:

nvprof --analysis-metrics -o metrics python test.py

Solution

  • What does this actually mean?

    It means that nvprof started profiling your code on a GPU context which you made unavailable by setting CUDA_VISIBLE_DEVICES.

  • How can I avoid this by using CUDA_VISIBLE_DEVICES (not torch.cuda.set_device())?

    Probably like this:

    import os
    os.environ['CUDA_VISIBLE_DEVICES'] = '1'  # set before torch is imported
    
    import torch
    ....
    

    I know nothing about pytorch, but I would guess that importing the library triggers a lot of CUDA activity you don't see. If you import the library after you set CUDA_VISIBLE_DEVICES, I suspect the whole problem will disappear.
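    The import-order effect can be sketched without a GPU. Here `make_fake_cuda` is a hypothetical stand-in (not a real library) for any initialization that snapshots CUDA_VISIBLE_DEVICES once, the way the CUDA runtime does when a context is first created:

    ```python
    import os
    import types

    def make_fake_cuda():
        # Stand-in for CUDA/torch initialization: the environment variable
        # is captured once, at "import"/context-creation time.
        mod = types.ModuleType("fake_cuda")
        mod.visible = os.environ.get("CUDA_VISIBLE_DEVICES", "all")
        return mod

    # Wrong order: initialize first, set the variable afterwards -> too late.
    os.environ.pop("CUDA_VISIBLE_DEVICES", None)
    cuda_late = make_fake_cuda()
    os.environ["CUDA_VISIBLE_DEVICES"] = "1"
    print(cuda_late.visible)   # prints "all" - the later setting is ignored

    # Right order: set the variable first, then initialize -> it takes effect.
    cuda_early = make_fake_cuda()
    print(cuda_early.visible)  # prints "1"
    ```

    The same logic is why moving os.environ[...] above `import torch` is worth trying: whatever CUDA state the import (or the first .cuda() call) sets up will then see the restricted device list.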

    If that doesn't work then you would have no choice but to not set CUDA_VISIBLE_DEVICES within the python code at all, and instead do this:

    CUDA_VISIBLE_DEVICES=1 nvprof --analysis-metrics -o metrics python test.py
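
    With the shell-level form, the variable is already in place before nvprof and the CUDA runtime start, so there is no ordering problem at all. A quick GPU-free check (a hypothetical one-liner, not part of the original answer) that the child process really inherits the value:

    ```shell
    # The variable set on the command line is visible inside the child process;
    # the one physical GPU it selects will appear as device ordinal 0 there.
    CUDA_VISIBLE_DEVICES=1 python3 -c "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"
    ```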