python machine-learning scikit-learn

How to fix "Found array with dim 4" error when using ML algorithms to classify images


I have a simple ML classification problem. I have 8 folders, each representing one class. I first load the images from these folders, assign labels, and then save the result as a CSV file (code below).

import os
import numpy as np
from PIL import Image

def load_images_from_folder(root_folder):
    image_paths = []
    images = []
    labels = []
    # Each subfolder name is used as the class label
    for label in os.listdir(root_folder):
        label_path = os.path.join(root_folder, label)
        if os.path.isdir(label_path):
            for filename in os.listdir(label_path):
                img_path = os.path.join(label_path, filename)
                if os.path.isfile(img_path) and filename.endswith(".jpg"):
                    img = Image.open(img_path)
                    img = img.resize((128, 128))
                    img_array = np.array(img)
                    image_paths.append(img_path)
                    images.append(img_array)
                    labels.append(label)
    return image_paths, images, labels

if __name__ == "__main__":
    root_folder_path = "./Datasets_1"
    image_paths, images, labels = load_images_from_folder(root_folder_path)

I then put the image paths and labels into a DataFrame, save it to CSV, and load it back

import pandas as pd

data = {"Images": image_paths, "Labels": labels}
df = pd.DataFrame(data)
df.to_csv("original_data.csv", index=False)
csv_file = "original_data.csv"
df = pd.read_csv(csv_file)

I also add a new column 'Encoded_Labels' to the DataFrame with the encoded labels and convert that column to integers

df['Encoded_Labels'] = encoded_labels
df['Encoded_Labels'] = df['Encoded_Labels'].astype(int)

Finally, I split the dataset into training and testing sets and preprocess the images for training

from sklearn.model_selection import train_test_split

train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)

def load_and_preprocess_images(file_paths, target_size=(128, 128)):
    images = []
    for file_path in file_paths:
        img = Image.open(file_path)
        img = img.resize(target_size)
        img_array = np.array(img) / 255.0  # Normalize pixel values
        images.append(img_array)
    return np.array(images)

X_train = load_and_preprocess_images(train_df['Images'].values)
y_train = train_df['Encoded_Labels'].values
X_test = load_and_preprocess_images(test_df['Images'].values)
y_test = test_df['Encoded_Labels'].values

And the output shape of X_train is

(20624, 128, 128, 3)

Up to this point I have no problem, and I can use the data with DL models without issues. But the error appears when I try to use ML models such as KNN, SVM, DT, etc. For example, with the code below

from sklearn.svm import SVC
svc = SVC(kernel='linear',gamma='auto')
svc.fit(X_train, y_train)

or

from sklearn import metrics
from sklearn.neighbors import KNeighborsClassifier

knn_clf = KNeighborsClassifier()
knn_clf.fit(X_train, y_train)
y_pred = knn_clf.predict(X_test)
accuracy = metrics.accuracy_score(y_test, y_pred)
print("Accuracy of KNN Classifier : %.2f" % (accuracy*100))

I got this error

ValueError: Found array with dim 4. SVC expected <= 2.

How can I fix this error?


Solution

  • In the example using sklearn.svm.SVC.fit(), the input is expected to be of shape (n_samples, n_features) (thus being 2-dimensional).

    In your case, each sample would be an image. To make your code technically work, you would thus need to flatten your X_train input and make each "raw" pixel value a feature,

    X_train_flat = X_train.reshape(X_train.shape[0], -1)
    

    which, in your example, would produce a (20624, 49152)-shaped array (as 128·128·3=49152), where each row is a flattened version of the corresponding image.
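As a minimal sketch of the flattening step, the snippet below uses small random arrays as a stand-in for the real images (the 16x16 size, sample counts, and alternating labels are assumptions chosen only to keep it fast). Note that X_test has to be flattened in the same way before calling predict().

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in data with the same layout as in the question:
# (n_samples, height, width, channels). Sizes here are illustrative only.
X_train = rng.random((40, 16, 16, 3))
y_train = np.arange(40) % 2          # two dummy classes, alternating
X_test = rng.random((10, 16, 16, 3))

# Collapse every image into one row of raw pixel values
X_train_flat = X_train.reshape(X_train.shape[0], -1)
X_test_flat = X_test.reshape(X_test.shape[0], -1)

svc = SVC(kernel="linear")
svc.fit(X_train_flat, y_train)       # 2-D input, so no ValueError
y_pred = svc.predict(X_test_flat)

print(X_train_flat.shape, y_pred.shape)  # (40, 768) (10,)
```

The same reshape applies unchanged to KNeighborsClassifier and other scikit-learn estimators that expect (n_samples, n_features) input.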

    What is often done instead of using the "raw" pixels as an input to SVM and similar classifiers, however, is using a set of "features" extracted from the images, to reduce the dimensionality of the data (i.e., in your example, using a (20624, d)-shaped array instead, where d<49152). This could be HOG features, for example, or the result of any other dimensionality reduction technique (it could even be the output of a neural network) – you might want to also have a look at this related question and its answers.
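One way to sketch the dimensionality-reduction idea with scikit-learn alone is to put PCA in front of the classifier. PCA is used here only as one example of a reduction technique, not as a recommendation over HOG or learned features; the random stand-in data and the value d = 20 are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Illustrative stand-in for the image arrays: (n_samples, height, width, channels)
X_train = rng.random((60, 16, 16, 3))
y_train = np.arange(60) % 3          # three dummy classes
X_test = rng.random((12, 16, 16, 3))

# Flatten first: PCA, like SVC, expects (n_samples, n_features)
X_train_flat = X_train.reshape(len(X_train), -1)
X_test_flat = X_test.reshape(len(X_test), -1)

d = 20  # assumed target dimensionality; tune for real data
model = make_pipeline(PCA(n_components=d), SVC(kernel="linear"))
model.fit(X_train_flat, y_train)

# Inspect the reduced representation that the SVC was trained on
X_train_reduced = model.named_steps["pca"].transform(X_train_flat)
y_pred = model.predict(X_test_flat)

print(X_train_reduced.shape, y_pred.shape)  # (60, 20) (12,)
```

Wrapping both steps in a pipeline means the projection is fitted on the training data only and then reused for the test data.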