Tags: python-3.x, keras, gpu, feature-extraction, tesla

GPU performance during feature extraction (Tesla K80)


I am using the following code to extract features from about 4,000 images spread across 30 classes.

    for i, label in enumerate(train_labels):
        cur_path = train_path + "/" + label
        count = 1
        for image_path in glob.glob(cur_path + "/*.jpg"):
            img = image.load_img(image_path, target_size=image_size)
            x = image.img_to_array(img)
            x = np.expand_dims(x, axis=0)
            x = preprocess_input(x)
            feature = model.predict(x)
            flat = feature.flatten()
            features.append(flat)
            labels.append(label)
            print("[INFO] processed - " + str(count))
            count += 1
        print("[INFO] completed label - " + label)

However, my full dataset is much larger, up to 80,000 images. This code works in Keras (2.1.2) for the 4,000 images, but watching GPU memory I can see it takes up almost all of the 5 GB of video RAM on my Tesla K80. Could I improve performance by changing the batch_size, or is this approach simply too heavy for my GPU and should I rewrite it?

Thanks!


Solution

  • There are two possible solutions.

    1) I'm assuming you're storing your images in a NumPy array. This is very memory-intensive. Instead, store them in a plain Python list and convert to a NumPy array only when the application demands it. In my case this reduced memory consumption by a factor of 10. If you're already storing them as a list, the second solution might solve your problem.

    2) Store the results in chunks, and use a generator when feeding them into another model.

    chunk_of_features = []
    chunk_of_labels = []
    chunk_id = 0      # gives each chunk file a unique name
    n_in_chunk = 0    # images accumulated in the current chunk
    for label in train_labels:
        cur_path = train_path + "/" + label
        count = 1
        for image_path in glob.glob(cur_path + "/*.jpg"):
            n_in_chunk += 1
            img = image.load_img(image_path, target_size=image_size)
            x = image.img_to_array(img)
            x = np.expand_dims(x, axis=0)
            x = preprocess_input(x)
            feature = model.predict(x)
            flat = feature.flatten()
            chunk_of_features.append(flat)
            chunk_of_labels.append(label)
            if n_in_chunk == 4000:
                # use separate files for features and labels, so the
                # second dump does not overwrite the first
                with open('features_chunk_' + str(chunk_id), 'wb') as output_file:
                    pickle.dump(chunk_of_features, output_file)
                with open('labels_chunk_' + str(chunk_id), 'wb') as output_file:
                    pickle.dump(chunk_of_labels, output_file)
                chunk_of_features = []
                chunk_of_labels = []
                chunk_id += 1
                n_in_chunk = 0

            print("[INFO] processed - " + str(count))
            count += 1
        print("[INFO] completed label - " + label)

    # flush the last partial chunk, otherwise it is lost
    if chunk_of_features:
        with open('features_chunk_' + str(chunk_id), 'wb') as output_file:
            pickle.dump(chunk_of_features, output_file)
        with open('labels_chunk_' + str(chunk_id), 'wb') as output_file:
            pickle.dump(chunk_of_labels, output_file)
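The generator mentioned in solution 2 could look something like this. This is a minimal sketch, assuming the chunks were pickled to paired feature/label files (the file naming is up to you; `chunk_generator` is a made-up helper name, not anything from Keras):

```python
import pickle

def chunk_generator(feature_paths, label_paths):
    """Yield one (features, labels) chunk at a time, so only a
    single chunk is ever held in memory."""
    for f_path, l_path in zip(feature_paths, label_paths):
        with open(f_path, 'rb') as f:
            features = pickle.load(f)
        with open(l_path, 'rb') as f:
            labels = pickle.load(f)
        yield features, labels
```

Downstream code can then iterate over the 80,000 features chunk by chunk instead of materializing them all at once.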
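On the batch_size question from the post: calling model.predict on one image at a time leaves the GPU mostly idle, because each call pays kernel-launch and transfer overhead for a single sample. A hedged sketch of batching the forward pass (`predict_in_batches` is a hypothetical helper, and `batch_size=32` is just an arbitrary starting point to tune against your VRAM):

```python
import numpy as np

def predict_in_batches(model, preprocessed, batch_size=32):
    """Run model.predict on stacks of images instead of one at a time.

    `preprocessed` is a list of arrays each shaped (1, H, W, C),
    i.e. the per-image output of preprocess_input in the loop above.
    """
    flat_features = []
    for start in range(0, len(preprocessed), batch_size):
        # stack (1, H, W, C) arrays into a single (B, H, W, C) batch
        batch = np.vstack(preprocessed[start:start + batch_size])
        preds = model.predict(batch)
        flat_features.extend(p.flatten() for p in preds)
    return flat_features
```

Note that batching speeds up the GPU side but does not by itself reduce host-side memory; combining it with the chunk-and-pickle approach above addresses both.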