Tags: caffe, pycaffe, matcaffe

How to combine the probability (soft) output of different networks and get the hard output?


I have trained three different models separately in Caffe, and for semantic segmentation each of them gives me the probability of belonging to each class. I want to produce an output based on the 3 probabilities I am getting (for example, the argmax of the three probabilities). Inference itself can be done with each net's model and deploy.prototxt files, and then, based on the final soft output, the hard output gives the final segmentation. My questions are:

  1. How can I get the ensemble output of these networks?
  2. How can I do end-to-end training of the ensemble of three networks? Are there any resources that could help?
  3. How can I get the final segmentation from the final probability (e.g., the argmax of the three probabilities), which is a soft output?

My questions may sound very basic, and my apologies for that. I am still trying to learn step by step. I really appreciate your help.


Solution

  • There are two ways (at least that I know of) that you could use to solve (1):

    1. One is to use the pycaffe interface: instantiate the three networks, forward an input image through each of them, fetch the outputs, and perform any operation you desire to combine all three probabilities (see the sketch after this list). This is especially useful if you intend to combine them using more complex logic.

    2. The alternative (far less elegant) is to use caffe test and process all your inputs separately through each network, saving the probabilities into files. Then combine the probabilities from the files later.
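
    Here is a minimal sketch of option 1 with pycaffe: it loads the three deploy nets, forwards the same image through each, averages the soft outputs, and takes the argmax to obtain the hard segmentation. The file names and the blob names 'data' and 'prob' are placeholders for whatever your deploy files actually define.

    ```python
    import numpy as np
    import caffe

    caffe.set_mode_cpu()  # or caffe.set_mode_gpu()

    # One caffe.Net per trained model, loaded in TEST phase.
    nets = [
        caffe.Net('net1_deploy.prototxt', 'net1.caffemodel', caffe.TEST),
        caffe.Net('net2_deploy.prototxt', 'net2.caffemodel', caffe.TEST),
        caffe.Net('net3_deploy.prototxt', 'net3.caffemodel', caffe.TEST),
    ]

    def ensemble_segment(image):
        """image: preprocessed array of shape (channels, height, width)."""
        probs = []
        for net in nets:
            net.blobs['data'].reshape(1, *image.shape)
            net.blobs['data'].data[...] = image
            net.forward()
            probs.append(net.blobs['prob'].data[0].copy())  # (classes, H, W)
        soft = np.mean(probs, axis=0)   # combined soft output (simple average)
        hard = np.argmax(soft, axis=0)  # per-pixel class ids (hard output)
        return soft, hard
    ```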

    Regarding your second question, I have never trained more than two weight-sharing CNNs (siamese networks). From what I understand, your networks don't share weights, only the architecture. If you want to train all three end-to-end, please take a look at this tutorial made for siamese networks. The authors define both paths/branches in a single prototxt, connect each branch's layers to the input Data layer, and join them at the end with a loss layer.

    In your case you would define the three branches (one for each of your networks), connect them to the input data layer(s) (check whether each branch processes the same input or different inputs, for example the same image pre-processed differently), and unite them with a loss, similarly to the tutorial.
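
    As a rough, hedged sketch of that structure, here is how the three branches could be wired into one trainable network using pycaffe's NetSpec (which simply generates the kind of prototxt the siamese tutorial writes by hand). The stand-in branch layers, the LMDB source path, the number of classes, and the fusion choice (Concat followed by a 1x1 convolution feeding a single SoftmaxWithLoss) are my assumptions; you would substitute the real layers of your three networks for each branch body.

    ```python
    import caffe
    from caffe import layers as L, params as P

    def three_branch_net(lmdb_path, num_classes, batch_size=1):
        n = caffe.NetSpec()
        n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB,
                                 source=lmdb_path, ntop=2)
        scores = []
        for i in (1, 2, 3):
            # Stand-in for one full branch; replace with the real layers of network i.
            conv = L.Convolution(n.data, num_output=64, kernel_size=3, pad=1)
            relu = L.ReLU(conv, in_place=True)
            score = L.Convolution(relu, num_output=num_classes, kernel_size=1)
            setattr(n, 'branch%d_score' % i, score)
            scores.append(score)
        # Fuse the three per-class score maps, let a 1x1 convolution learn the mix,
        # and train everything against a single loss.
        n.concat = L.Concat(*scores)
        n.fused = L.Convolution(n.concat, num_output=num_classes, kernel_size=1)
        n.loss = L.SoftmaxWithLoss(n.fused, n.label)
        return n.to_proto()

    with open('ensemble_train.prototxt', 'w') as f:
        f.write(str(three_branch_net('path/to/train_lmdb', num_classes=21)))
    ```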

    Now, for the last question, it seems Caffe has an ArgMax layer that may be what you are looking for. If you are familiar with Python, you could also use a Python layer, which allows you to define with great flexibility how to combine the output probabilities.
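
    If you go the Python-layer route, here is a hedged sketch of what such a layer could look like: it takes the three probability blobs as bottoms, averages them, and emits the per-pixel argmax as a label map. The class name, the averaging rule, and the (N, classes, H, W) blob layout are assumptions on my part; in the prototxt you would reference it with type: "Python" and a python_param block naming the module and layer class.

    ```python
    import numpy as np
    import caffe

    class EnsembleArgMaxLayer(caffe.Layer):
        """Bottoms: three probability blobs of shape (N, classes, H, W).
        Top: one blob of shape (N, 1, H, W) holding per-pixel class ids."""

        def setup(self, bottom, top):
            if len(bottom) != 3:
                raise Exception('Expected the probability blobs of three networks.')

        def reshape(self, bottom, top):
            n, _, h, w = bottom[0].data.shape
            top[0].reshape(n, 1, h, w)

        def forward(self, bottom, top):
            soft = (bottom[0].data + bottom[1].data + bottom[2].data) / 3.0
            top[0].data[...] = np.argmax(soft, axis=1)[:, np.newaxis, :, :]

        def backward(self, top, propagate_down, bottom):
            pass  # argmax has no useful gradient; use this layer at test time only
    ```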