python, jupyter-notebook, cuda, tensorflow2.0, numba

Freeing and Reusing GPU in Tensorflow


I would like to free and reuse the GPU while using TensorFlow in a Jupyter notebook.

I imagine a workflow like this:

  1. Make a TF calculation.
  2. Free the GPU
  3. Wait a while
  4. Step 1. again.

This is the code I use right now. Steps 1 to 3 work, but step 4 does not:

import time

import tensorflow as tf
from numba import cuda 


def free_gpu():
    device = cuda.get_current_device()
    cuda.close()

def test_calc():
    a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])   
    b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

    # Run on the GPU
    c = tf.matmul(a, b)

test_calc()
free_gpu()
time.sleep(10)
test_calc()

If I run this code in a Jupyter notebook, my kernel just dies. Is there an alternative to cuda.close() that frees the GPU without breaking TF?


Solution

  • Yes, building on what @talonmies said: do not bring numba into this at all. Its CUDA context management is essentially incompatible with the TensorFlow runtime.

    Here is a solution that completely frees the GPU: launch the TF computation in a separate process, return any result you care about through a queue, and let the process exit. TensorFlow is known not to release GPU memory until the process that allocated it ends.

    from multiprocessing import Process, Queue
    import tensorflow as tf
    
    def test_calc(q):
        a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
        b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    
        # Run on the GPU
        c = tf.matmul(a, b)
        q.put(c.numpy())
    
    q = Queue()
    p = Process(target=test_calc, args=(q,))
    p.start()
    result = q.get()  # read the result before join() to avoid a deadlock on a full queue
    p.join()
    

    Edit: to make this a little less invasive, it can be written as a decorator.

    from multiprocessing import Process, Queue
    import tensorflow as tf
    
    def _queue_results(func, q, *args, **kwargs):
        result = func(*args, **kwargs)
        q.put(result)
    
    def free_gpu(func):
    
        def wrapper(*args, **kwargs):
            q = Queue(maxsize=1)
            p = Process(target=_queue_results,
                        args=(func, q, *args),
                        kwargs=kwargs)
            p.start()
            result = q.get()  # read before join() to avoid a deadlock on a full queue
            p.join()
            q.close()
            p.close()
            return result
    
        return wrapper
    
    @free_gpu
    def test_calc():
        a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
        b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    
        # Run on the GPU
        c = tf.matmul(a, b)
        return c.numpy()  # NOTE: return plain data; tf.Tensor objects cannot cross the process boundary and keep the GPU from freeing
    
    result = test_calc()
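    Since the decorator itself is TF-agnostic, its plumbing can be checked without a GPU at all. The sketch below (a hypothetical add function, no TensorFlow involved) works with the default fork start method on Linux:

```python
from multiprocessing import Process, Queue

def _queue_results(func, q, *args, **kwargs):
    q.put(func(*args, **kwargs))

def free_gpu(func):
    def wrapper(*args, **kwargs):
        q = Queue(maxsize=1)
        p = Process(target=_queue_results,
                    args=(func, q, *args),
                    kwargs=kwargs)
        p.start()
        result = q.get()  # read before join() to avoid a deadlock on a full queue
        p.join()
        return result
    return wrapper

@free_gpu
def add(x, y):
    # Stand-in for a GPU computation; returns plain picklable data.
    return x + y

print(add(2, 3))
```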