I need to classify small images in 4 different categories, +1 "background" for false detection.
While training the loss quickly drop to 0.7, but stay there even after 800k steps. In the end, the frozen graph seems to classify most images with the background label.
I'm probably missing something, I'll detail the steps I used below, and any feedback is welcomed. I'm new to tf-slim, so it can be an obvious mistake, maybe too little samples ? I'm not looking for top accuracy, just something working for prototyping.
Source materials can be found there : https://www.dropbox.com/s/k55xoygdzb2efag/TilesDataset.zip?dl=0
I used tensorflow-gpu 1.15.3 on windows 10.
I created the dataset using :
python ./createTfRecords.py --tfrecord_filename=tilesV2_40 --dataset_dir=.\tilesV2\Tiles_40
I added a dataset provider in models-master\research\slim\datasets based on the flowers provider.
I modified the mobilnet_v2.py in models-master\research\slim\nets\mobilenet, changed num_classes=5 and mobilenet.default_image_size = 40
I trained the net with : python ./models-master/research/slim/train_image_classifier.py --model_name "mobilenet_v2" --learning_rate 0.045 --preprocessing_name "inception_v2" --label_smoothing 0.1 --moving_average_decay 0.9999 --batch_size 96 --learning_rate_decay_factor 0.98 --num_epochs_per_decay 2.5 --train_dir ./weight --dataset_name Tiles_40 --dataset_dir .\tilesV2\Tiles_40
When I try this python .\models-master\research\slim\eval_image_classifier.py --alsologtostderr --checkpoint_path ./weight/model.ckpt-XXX --dataset_dir ./tilesV2/Tiles_40 --dataset_name Tiles_40 --dataset_split_name validation --model_name mobilenet_v2
I get eval/Recall_5[1]eval/Accuracy[1]
I then export the graph with python .\models-master\research\slim\export_inference_graph.py --alsologtostderr --model_name mobilenet_v2 --image_size 40 --output_file .\export\output.pb --dataset_name Tiles_40
And freeze it with freeze_graph --input_graph .\export\output.pb --input_checkpoint .\weight\model.ckpt-XXX --input_binary true --output_graph .\export\frozen.pb --output_node_names MobilenetV2/Predictions/Reshape_1
I then try the net with images from the dataset with python .\label_image.py --graph .\export\frozen.pb --labels .\tilesV2\Tiles_40\labels.txt --image .\tilesV2\Tiles_40\photos\lac\1_1.png --input_layer input --output_layer MobilenetV2/Predictions/Reshape_1
. This is where I get wrong classifications.,
like 0:background 0.92839915 2:lac 0.020171663 1:house 0.019106707 3:road 0.01677236 4:start 0.0155500565
for a "lac" image of the dataset
I tried changing the depth_multiplier, the learning rate, learning on a cpu, removing --preprocessing_name "inception_v2"
from the learning command. I don't have any idea left...
Change your learning rate, maybe start from the usual choice of 3e-5.