I implemented a CBIR system using SIFT combined with other feature-based algorithms (with OpenCV and Python 3), and now I have to evaluate how the combinations (e.g. SIFT/SURF, ORB/BRISK...) perform.
I found that I can use Precision = |TP| / (|TP| + |FP|) and Recall = |TP| / (|TP| + |FN|). I know that TP are the relevant documents that are returned, FN are the relevant documents that are not returned, and FP are the documents that are returned but are not relevant.
I calculate my matches with the brute-force matcher (BF), and I presume that:
matches = bf.knnMatch(descriptor1, descriptor2, k=2)
are my TP + FP.
How can I calculate my FN, i.e. the matches that are relevant but not returned?
Note that I'm just formulating a hypothesis, so please correct me if I'm wrong.
I would like some help with the concrete implementation, i.e. where these quantities can be found in a concrete case of image matching.
Alternatively, can you suggest how to evaluate a CBIR system based on feature detection and description?
I finally found an answer to my question, maybe it can help someone!
There is a difference between precision and recall as calculated in the information retrieval context and in the classification context.
For information retrieval:
precision = |relevant documents ∩ retrieved documents| / |retrieved documents|
recall = |relevant documents ∩ retrieved documents| / |relevant documents|
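As a minimal sketch of the information retrieval definitions, reading the numerator as the set of documents that are both relevant and retrieved: the function name `retrieval_precision_recall` and the image IDs below are hypothetical, and it assumes you already have, for each query, the list of returned image IDs and a ground-truth list of relevant image IDs.

```python
def retrieval_precision_recall(retrieved, relevant):
    """Precision and recall for one query, using the retrieval definitions.

    retrieved: IDs returned by the CBIR system for the query (assumed input)
    relevant:  ground-truth IDs that should be returned (assumed input)
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # documents that are both relevant and retrieved

    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result and ground truth, for illustration only.
retrieved = ["img_03", "img_07", "img_11", "img_20"]
relevant = ["img_03", "img_11", "img_15"]

precision, recall = retrieval_precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four returned images are relevant, and two of the three relevant images were returned. No count of false negatives per descriptor match is needed, only the ground-truth set of relevant images per query.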
In the classification context, the metrics are defined from the confusion matrix:
precision = TP / (TP + FP)
recall = TP / (TP + FN)
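For comparison, the classification version can be sketched as below. This assumes every item has both a true binary label and a predicted binary label, which is exactly what a confusion matrix requires; the function name `confusion_counts` and the label lists are made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN from two equal-length lists of 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
cls_precision = tp / (tp + fp) if (tp + fp) else 0.0
cls_recall = tp / (tp + fn) if (tp + fn) else 0.0
print(tp, fp, fn, tn, cls_precision, cls_recall)
```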
In my case, for example, it was not possible to use the confusion matrix.