I am using a Google Deep Learning VM from the Google marketplace, and I opted for an NVIDIA K80 GPU. I am trying to train an object detection model using the Object Detection API. However, I notice that TensorFlow is not using the GPU by default (the code I used to check is below).
My assumption here is that this instance comes with all the required NVIDIA drivers so it's not a driver related problem.
Further investigation showed that I had two installations of TensorFlow (tensorflow 1.12.0 and tensorflow-gpu 1.12.0), so I uninstalled the CPU version. However, that did not help.
I used the code below to check whether TensorFlow is using the GPU:
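As a rough sketch of how the duplicate installation could be spotted from Python rather than from `pip list`: the helper names `installed_packages` and `tensorflow_builds` are made up for illustration, and the code assumes Python 3.8 or newer, where `importlib.metadata` is in the standard library.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_packages():
    """Return {normalized name: version} for every installed distribution."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with missing metadata
            found[name.lower().replace("_", "-")] = dist.version
    return found


def tensorflow_builds(packages):
    """Pick the CPU and GPU TensorFlow packages out of a name -> version map."""
    return {name: version for name, version in packages.items()
            if name in ("tensorflow", "tensorflow-gpu")}


if __name__ == "__main__":
    builds = tensorflow_builds(installed_packages())
    print(builds)
    if len(builds) > 1:
        print("Both the CPU and GPU builds are installed side by side.")
```

If both names show up, uninstalling both and reinstalling only tensorflow-gpu is probably safer than removing just one of them, because the two packages share the same `tensorflow` import directory.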
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
For reference, I am using the command below for object detection training. It runs fine on the Deep Learning VM but does not use the GPU.
python $Tensor_path/legacy/train.py --logtostderr \
    --train_dir=$Train_path/training/ \
    --pipeline_config_path=$Train_path/training/ssd_inception_v2_pets.config
Output (I would have expected the GPU device to be listed here as well):
[name: "/cpu:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 18292259467280600161
]
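The device listing above can be narrowed to GPU entries only, which makes the check easier to read. This is a minimal sketch: `gpu_device_names` is a hypothetical helper, and the `FakeDevice` stand-in is used only when TensorFlow cannot be imported.

```python
from collections import namedtuple


def gpu_device_names(devices):
    """Return the names of all entries whose device_type is "GPU"."""
    return [d.name for d in devices if d.device_type == "GPU"]


if __name__ == "__main__":
    try:
        from tensorflow.python.client import device_lib
        devices = device_lib.list_local_devices()
    except ImportError:
        # Stand-in entry mirroring the CPU-only output shown above.
        FakeDevice = namedtuple("FakeDevice", ["name", "device_type"])
        devices = [FakeDevice("/cpu:0", "CPU")]
    gpus = gpu_device_names(devices)
    print("GPUs visible to TensorFlow:", gpus or "none")
```

An empty list here means TensorFlow sees no GPU. That is consistent with a CPU-only build being the one that actually gets imported, though it does not by itself rule out a driver problem.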
I was able to resolve this by deleting the old instance and starting fresh with a new one. My guess is that the tensorflow-gpu installation got corrupted while I was installing the Object Detection API. I followed the steps here to install it: https://cloud.google.com/solutions/creating-object-detection-application-tensorflow
Most likely this line is the culprit, since it installs a CPU-only TensorFlow 1.1.0 wheel over whatever is already there:
pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.1.0-cp27-none-linux_x86_64.whl
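One way to catch this kind of line before running it is to look at the wheel URL itself. The sketch below uses a hypothetical helper, `wheel_variant`, and assumes the build type appears as its own `cpu` or `gpu` path segment, as it does in the URL above. Other wheel hosts may not follow that layout.

```python
from urllib.parse import urlparse


def wheel_variant(url):
    """Guess "cpu" or "gpu" from a wheel URL's path, or None if unclear."""
    parts = urlparse(url).path.split("/")
    for variant in ("cpu", "gpu"):
        if variant in parts:
            return variant
    return None


url = ("https://storage.googleapis.com/tensorflow/linux/cpu/"
       "tensorflow-1.1.0-cp27-none-linux_x86_64.whl")
print(wheel_variant(url))  # prints "cpu"
```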