tensorflow keras nvidia cudnn

Tensorflow not running on GPU


I have already spent a considerable amount of time digging around on Stack Overflow and elsewhere looking for the answer, but couldn't find anything.

Hi all,

I am running TensorFlow with Keras on top. I am 90% sure I installed the GPU version of TensorFlow. Is there any way to check which install I did?

I was trying to run some CNN models from a Jupyter notebook and noticed that Keras was running the model on the CPU (I checked Task Manager, and the CPU was at 100%).

I tried running this code from the tensorflow website:

import tensorflow as tf

# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))

And this is what I got:

MatMul: (MatMul): /job:localhost/replica:0/task:0/cpu:0
2017-06-29 17:09:38.783183: I c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\common_runtime\simple_placer.cc:847] MatMul: (MatMul)/job:localhost/replica:0/task:0/cpu:0
b: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-06-29 17:09:38.784779: I c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\common_runtime\simple_placer.cc:847] b: (Const)/job:localhost/replica:0/task:0/cpu:0
a: (Const): /job:localhost/replica:0/task:0/cpu:0
2017-06-29 17:09:38.786128: I c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\common_runtime\simple_placer.cc:847] a: (Const)/job:localhost/replica:0/task:0/cpu:0
[[ 22.  28.]
 [ 49.  64.]]

This tells me that, for some reason, everything is running on my CPU.

I have a GTX 1050 (driver version 382.53). I installed CUDA and cuDNN, and TensorFlow installed without any problems. I also installed Visual Studio 2015, since it was listed as a compatible version.

I remember CUDA mentioning something about an incompatible driver being installed, but if I recall correctly CUDA should have installed its own driver.

Edit: I ran these commands to list the available devices:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

and this is what I get

[name: "/cpu:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 14922788031522107450
]

and a whole lot of warnings like this

2017-06-29 17:32:45.401429: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.

Edit 2

Tried running

pip3 install --upgrade tensorflow-gpu

and I get

Requirement already up-to-date: tensorflow-gpu in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages
Requirement already up-to-date: markdown==2.2.0 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: html5lib==0.9999999 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: werkzeug>=0.11.10 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: wheel>=0.26 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: bleach==1.5.0 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: six>=1.10.0 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: protobuf>=3.2.0 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: backports.weakref==1.0rc1 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: numpy>=1.11.0 in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from tensorflow-gpu)
Requirement already up-to-date: setuptools in c:\users\xxx\appdata\local\programs\python\python35\lib\site-packages (from protobuf>=3.2.0->tensorflow-gpu)

Solved: Check comments for solution. Thanks to all who helped!

I am new to this, so any help is greatly appreciated! Thank you.


Solution

  • To check which devices are available to TensorFlow, run the following and see whether your GPU is listed:

    from tensorflow.python.client import device_lib
    print(device_lib.list_local_devices())
    

    More info

    There are also C++ logs, controlled by the TF_CPP_MIN_VLOG_LEVEL environment variable, e.g.:

    import os
    os.environ["TF_CPP_MIN_VLOG_LEVEL"] = "2"
    

    Set before TensorFlow is imported, this should cause them to be printed when import tensorflow as tf runs.

    You should see logs like these if you use GPU-enabled TensorFlow with proper access to the GPU:

    successfully opened CUDA library libcublas.so.*.* locally
    successfully opened CUDA library libcudnn.so.*.*  locally
    successfully opened CUDA library libcufft.so.*.*  locally
    

    On the other hand, if there are no CUDA libraries in the system / container, you will see:

    Could not find cuda drivers on your machine, GPU will not be used.
    

    And where the CUDA libraries are installed but no GPU is physically available, TF will import cleanly and fail only later, when you run device_lib.list_local_devices(), with this error:

    failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
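
The checks above can be condensed into two small helpers. This is only a rough sketch, not part of TensorFlow: gpu_device_names and diagnose_cuda_log are hypothetical names, and the matched strings are simply the fragments of the log messages quoted above, whose exact wording may differ between TensorFlow versions. The device list is stubbed with plain objects so the snippet runs even without TensorFlow installed.

```python
from types import SimpleNamespace

# Fragments of the log messages quoted in the answer above. Assumption: the
# wording can change between TensorFlow versions, so treat them as examples.
CUDA_LIBS_OPENED = "successfully opened CUDA library"
NO_CUDA_DRIVERS = "Could not find cuda drivers"
NO_CUDA_DEVICE = "CUDA_ERROR_NO_DEVICE"


def gpu_device_names(devices):
    """Return the names of all entries whose device_type is "GPU".

    `devices` is expected to look like the result of
    device_lib.list_local_devices(): objects with `name` and `device_type`.
    """
    return [d.name for d in devices if d.device_type == "GPU"]


def diagnose_cuda_log(log_text):
    """Map captured log text onto the three situations described above."""
    if NO_CUDA_DEVICE in log_text:
        return "CUDA installed, but no GPU detected"
    if NO_CUDA_DRIVERS in log_text:
        return "CUDA libraries not found"
    if CUDA_LIBS_OPENED in log_text:
        return "CUDA libraries loaded"
    return "unknown"


if __name__ == "__main__":
    # Stand-in for the CPU-only device listing shown in the question.
    cpu_only = [SimpleNamespace(name="/cpu:0", device_type="CPU")]
    print(gpu_device_names(cpu_only))
    print(diagnose_cuda_log(
        "failed call to cuInit: CUDA_ERROR_NO_DEVICE: "
        "no CUDA-capable device is detected"
    ))
```

With TensorFlow installed, you would pass the real list instead, i.e. gpu_device_names(device_lib.list_local_devices()). An empty result corresponds to the CPU-only situation in the question.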