I've been trying to implement a neural network in Matlab that is capable of recognizing images based on their features. I am attempting to use the Bag of features/words approach to obtain a discrete vector of features that I can then feed into my neural network.
I have been using this example as a guide - http://in.mathworks.com/help/vision/examples/image-category-classification-using-bag-of-features.html
One line in the code (featureVector = encode(bag, img);) counts the word occurrences in an image. Could I use this "featurevector" matrix to train my neural network? And would I have to encode every single image in my training set?
Yes, that's certainly possible. In the example, the training dataset is a set of images, and you find a common vocabulary of 500 "words" / features that adequately describes all of them. Calling featureVector = encode(bag, img); then determines how strongly each word is represented in the input image img. Specifically, if you look at the code in that example section, they plot a bar graph where the horizontal axis is the word index and the vertical axis shows how much each word / feature in the vocabulary is used to represent that image.
This is the bar graph that gets produced (taken from the link):
[Figure: bar graph of word index versus word occurrence for one image; source: mathworks.com]
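To make concrete what that histogram holds, here is a toy sketch in plain MATLAB. It is not the toolbox's implementation: the word indices are made up, the vocabulary is shrunk to 5 words for readability, and the L2 normalization is an assumption about what encode does by default.

```matlab
% Toy illustration only: these indices are made up, not real encode output.
vocabSize = 5;                   % the example uses 500 words; 5 keeps this readable
wordIdx   = [1 3 3 5 3 1 2 3];   % hypothetical visual-word index for each descriptor

% Count how many descriptors were assigned to each word (1-by-vocabSize histogram).
counts = accumarray(wordIdx(:), 1, [vocabSize 1]).';

% L2-normalize so images with different numbers of descriptors stay comparable.
% Assumption: this mirrors the default normalization used by encode.
featureVector = counts / norm(counts);

disp(counts);
disp(featureVector);
```

Passing featureVector to bar would give a plot of the same kind as the one in the example, just with 5 bins instead of 500.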
Therefore, similar images should be described with similar features / words and so you could certainly use this as input into your neural network.
However, before you train your neural network, as you suspected, you must represent every image you wish to train on with this feature vector. If you intend to use MATLAB's Neural Network Toolbox, you must make sure that each column is an input sample and each row is a feature. For a single image, featureVector is actually a 1 x N vector, where N is the total number of features. To do this more efficiently, simply create an imageSet of all of the images you want to transform (http://www.mathworks.com/help/vision/ref/imageset-class.html), then use one call to encode to create the desired feature matrix:
imgFolder = '...'; %// Specify image folder here
imgSet = imageSet(imgFolder); %// Create image set
featureMatrix = encode(bag,imgSet).'; %// Encode the images - Make sure you transpose
On its own, encode returns an M x N matrix, where M is the total number of input images you have and N is the total number of features. To respect the Neural Network Toolbox, you must transpose this matrix (the .' in the code above) because each column needs to be an input sample, not each row. After the transpose, featureMatrix is N x M.
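As a rough sketch of the training step, assuming the Neural Network Toolbox is available: the random features matrix below is only a stand-in for the output of encode(bag, imgSet), and the class labels, the three categories, and the hidden layer size of 10 are all hypothetical choices for illustration.

```matlab
rng(0);                              % reproducible toy data

numImages  = 30;                     % M: stand-in for the number of training images
numWords   = 20;                     % N: stand-in for the vocabulary size
numClasses = 3;                      % hypothetical number of categories

% Stand-in for encode(bag, imgSet), which returns one row per image (M-by-N).
features = rand(numImages, numWords);
% Hypothetical class index for each image (1-by-M).
labels = repmat(1:numClasses, 1, numImages / numClasses);

X = features.';                      % N-by-M: one column per image
T = full(ind2vec(labels));           % numClasses-by-M one-hot targets

net = patternnet(10);                % 10 hidden units, an arbitrary choice
net.trainParam.showWindow = false;   % suppress the training GUI
net = train(net, X, T);

predicted = vec2ind(net(X));         % 1-by-M predicted class indices
```

With real data, features would come from encode and labels from whichever category each image belongs to. The random features here carry no class information, so the predictions are meaningless; the point is only the layout of X and T.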