python machine-learning image-processing numpy-ndarray feature-descriptor

How to remove specific data in image processing


I have image data that I am using to train a machine learning model with SIFT features, but for some images SIFT returns no descriptors at all. Because of this, my result after training and testing only reaches 56%, which is not what I expected. To resolve this, I decided to remove the images that have no descriptors. However, I can only remove them from the images_descriptor array, which holds the descriptors. The problem is that I do not know which images were removed, so I cannot remove their corresponding 'target' entries.
My data has a shape: (15000, 64, 64, 3)

My code so far:

X = data['data']
y = data['targets']

#Extract image descriptor using sift from X(Which is the data of your images)
images_descriptor = extract_sift_feature(X)

index_list = []
filter_images_descriptor = []
for i in range(len(images_descriptor)):
    if images_descriptor[i] is not None:
        filter_images_descriptor.append(images_descriptor[i])
        
    if images_descriptor[i] is None:
        index = np.where(images_descriptor == images_descriptor[i]) 
        index_list.append(index[0])     
filter_images_descriptor = np.array(filter_images_descriptor)

I am trying to use np.where to collect the indices in images_descriptor of the images that have no descriptors, so that I can delete the same positions from y. But the result I get is an empty array: array([], dtype=int64).


Solution

  • The problem is that some images in the data produce no features, so the solution is to drop every sample that has no descriptors and delete its target at the same time:

    import cv2
    import numpy as np

    def extract_sift_feature(X, y):
        """Return SIFT descriptors and targets, skipping images with no descriptors."""
        filter_images_descriptor = []
        NoneType_index_list = []
        sift = cv2.SIFT_create()

        for i in range(len(X)):
            _kp, des = sift.detectAndCompute(X[i], None)
            # des is None when the image has 0 feature descriptors
            if des is None:
                NoneType_index_list.append(i)
            else:
                filter_images_descriptor.append(des)

        # Delete the targets of the images that had no descriptors.
        # axis=0 removes whole rows, so y is not flattened if it is not 1-D.
        new_y = np.delete(y, NoneType_index_list, axis=0)
        # Returned as a list: each image has a different number of descriptors,
        # so np.array() on it fails (or needs dtype=object) in recent NumPy.
        return filter_images_descriptor, new_y
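
The same bookkeeping can be tried out without OpenCV. The snippet below is only a minimal sketch: the descriptors list and the targets are made-up stand-ins for the SIFT output, and it assumes that images_descriptor in the question is a plain Python list (the question does not show what extract_sift_feature returns). Under that assumption, comparing the list with None gives one single False instead of an element-wise result, which would explain why np.where had nothing to report. A boolean mask built with `is not None` avoids the comparison entirely.

```python
import numpy as np

# Hypothetical stand-in for SIFT output: one entry per image, either a
# (n_keypoints, 128) array or None when no keypoints were found.
descriptors = [
    np.ones((3, 128), dtype=np.float32),
    None,
    np.ones((5, 128), dtype=np.float32),
    None,
]
y = np.array([0, 1, 2, 3])  # made-up targets, one per image

# Comparing a plain list with None is not element-wise: it is one bool.
list_vs_none = (descriptors == None)  # noqa: E711

# Boolean mask: True where the image produced descriptors.
keep = np.array([d is not None for d in descriptors])

# Positions of the images that have to be dropped.
removed_indices = np.flatnonzero(~keep)

# Filter descriptors and targets with the same mask so they stay aligned.
kept_descriptors = [d for d in descriptors if d is not None]
kept_y = y[keep]
```

If the images themselves are needed later, indexing with the same mask (`X[keep]` for a NumPy array X) keeps them aligned with the remaining targets.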