
OpenCV SVM training dataset


Let's say I have a dataset of about 350 positive images and more than 400 negative images. They aren't all the same size, and they are all larger than 640x320.

  1. What should I do to create a better dataset? Do I need the images to be smaller? If yes, why?

  2. Should I apply some normalization to the dataset? What should it be (contrast, noise reduction)?

  3. Can I create a bigger dataset using the existing one? If yes, how?


Solution

    1. The optimal image size is one at which you can still easily classify the object yourself.
    2. Yes, classifiers work better after normalization, and there are several options. The most popular is to center the dataset (subtract the mean) and scale the values to a fixed range such as [-1, 1]. Another popular option is similar, but scales to unit standard deviation instead of a fixed range (preferable in most cases).
    3. Yes, you can create a bigger dataset from the existing one by adding distortions and noise to its images.
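
As a rough illustration of points 2 and 3, here is a minimal NumPy sketch. The function names `standardize` and `augment`, the noise level, the brightness factor, and the random stand-in images are all assumptions made for this example, not part of OpenCV. It assumes the images have already been resized to one common size (for instance with `cv2.resize`) and converted to grayscale, and it uses flattened raw pixels as features purely to keep the example short.

```python
import numpy as np

def standardize(features):
    """Center each feature column and scale it to unit standard deviation."""
    features = np.asarray(features, dtype=np.float32)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    return (features - mean) / std, mean, std

def augment(image, rng, noise_sigma=5.0):
    """Return three distorted copies of a uint8 image: mirrored, noisy, brighter.

    noise_sigma and the 1.2 brightness factor are illustrative values only.
    """
    as_float = image.astype(np.float32)
    flipped = image[:, ::-1]  # horizontal mirror
    noise = rng.normal(0.0, noise_sigma, image.shape)
    noisy = np.clip(as_float + noise, 0, 255).astype(np.uint8)
    brighter = np.clip(as_float * 1.2, 0, 255).astype(np.uint8)
    return [flipped, noisy, brighter]

# Stand-in data: four random 32x64 grayscale "images" instead of real samples.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(32, 64), dtype=np.uint8) for _ in range(4)]

# Point 3: enlarge the dataset with distorted copies of every image.
augmented = [copy for img in images for copy in augment(img, rng)]
dataset = images + augmented

# Point 2: flatten each image into a feature row, then standardize the columns.
X = np.stack([img.reshape(-1) for img in dataset])
X_std, mean, std = standardize(X)
```

The `mean` and `std` returned from the training set should be reused unchanged when normalizing test images, so that both sets go through the same transformation. Mirroring is only a sensible distortion if a mirrored object still belongs to the same class.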