matlab image-processing cbir precision-recall

Calculate precision and recall on WANG database


I have built a CBIR system in MATLAB that uses Euclidean distance as its similarity measure. For each query image, I retrieve the top 20 images.

I have used the WANG dataset for testing my system.
It contains 10 classes (African people, buses, roses, etc.), each containing 100 images (1000 images in total).

My Method:
1. I build the feature vector from the Correlogram, the Co-occurrence Matrix (CCM), and the Difference Between Pixel Scan Pattern (DBPSP), which contribute 64, 196, and 28 dimensions respectively (64+196+28=288 in total).
2. Each of the 1000 database images has its vector constructed beforehand.
3. When a query image comes in, I construct its vector too (288 dimensions again).
4. I use Euclidean distance for similarity and sort the database image vectors in ascending order of their Euclidean distance to the query, so the closest images come first.
5. The top 20 results are shown.
6. Each of those 20 results is either a TP or an FP.
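
The retrieval steps above can be sketched roughly as follows. This is only a minimal illustration, not the actual system: the feature matrix is filled with random placeholder values instead of real Correlogram/CCM/DBPSP features, the variable names (`dbFeatures`, `dbLabels`, `queryIdx`, and so on) are hypothetical, and the query is assumed to be one of the database images, so it retrieves itself at distance 0.

```matlab
% Placeholder setup: random stand-ins for the real 288-D feature vectors.
numImages      = 1000;
imagesPerClass = 100;
dim            = 288;
topK           = 20;

dbFeatures = rand(numImages, dim);                 % one row per database image
dbLabels   = ceil((1:numImages)' / imagesPerClass); % class 1..10, 100 images each

% Hypothetical query: an image that is already in the database.
queryIdx     = 1;
queryFeature = dbFeatures(queryIdx, :);
queryLabel   = dbLabels(queryIdx);

% Euclidean distance from the query to every database image.
diffs = bsxfun(@minus, dbFeatures, queryFeature);
dists = sqrt(sum(diffs .^ 2, 2));

% Smallest distance first, so the most similar images lead the ranking.
[~, order] = sort(dists, 'ascend');
topIdx = order(1:topK);

% A retrieved image is a TP if it shares the query's class, otherwise an FP.
TP = sum(dbLabels(topIdx) == queryLabel);
FP = topK - TP;

fprintf('TP = %d, FP = %d\n', TP, FP);
```

Because the query is in the database here, it always counts as one TP; whether to exclude it is a design choice that should be applied consistently to every query.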

For a single query image I can easily calculate Precision and Recall and plot PR-curve using this link.

How can I do the same for a whole class?

My Approach: For each image belonging to class A, find the top 20 images and the corresponding TP (true positives) and FP (false positives).

              TP    FP

    Image1    17    3
    Image2    15    5
    ...
    ...
    Image100  10    10
    Total     1500  500

Precision of class A = 1500 / 2000 = 0.75 (is this right?)
Recall of class A: stuck.
PR curve: stuck. Some links said I need a classifier for that and some said I don't. I am really confused.


Solution

  • As you noted, you can calculate precision as follows.

    P = TP ./ ( TP + FP );
    

    However, you need either FN or the total number of relevant images (TP + FN) to calculate recall. As discussed in chat, you need to find a way to determine your FN and FP data. Then you can use the following formula to calculate recall.

    R = TP ./ ( TP + FN );
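
As a worked example, the two formulas can be applied to the class A totals from the question's table. This is a sketch under one stated assumption that is not in the original answer: every query has 100 relevant images, namely its whole WANG class including the query itself. If the query is excluded from its own results, the count would be 99 instead. The names `numQueries` and `relevantPerQuery` are illustrative.

```matlab
% Totals for class A from the question's table (100 queries, top 20 each).
TP = 1500;
FP = 500;

% Assumption: each query has 100 relevant images (its whole WANG class).
numQueries       = 100;
relevantPerQuery = 100;

% Relevant images that were NOT among the retrieved top 20, summed over queries.
FN = numQueries * relevantPerQuery - TP;

P = TP ./ ( TP + FP );   % 1500 / 2000  = 0.75
R = TP ./ ( TP + FN );   % 1500 / 10000 = 0.15

fprintf('Precision = %.2f, Recall = %.2f\n', P, R);
```

With only 20 results per query and 100 relevant images, recall cannot exceed 20/100 = 0.2 under this assumption. Repeating the same computation at every cutoff k, rather than only at k = 20, gives one (recall, precision) point per k, which is one common way to trace a PR curve from a ranked list without a separate classifier.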
    

    If you have the confusion matrix or the underlying data, you can use my custom confusionmat2f1.m to calculate precision, recall, and F1 score. This assumes that the confusion matrix is formatted the way MATLAB's confusionmat defines it (rows are true classes, columns are predicted classes). Each line is explained inline. Please let me know if you want more clarification.

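    function [F,P,R] = confusionmat2f1( C )
        %% confusionmat2f1( C )
        %
        % Inputs
        % C - Confusion Matrix (rows = true class, columns = predicted class);
        %     a stack of matrices along the 3rd dimension is summed
        %
        % Outputs
        % F - F1 score column vector
        % P - Precision column vector
        % R - Recall column vector
        
        %% 
        
        % Collapse any stack of confusion matrices into a single matrix
        M = sum( C, 3 );
        
        % Calculate Precision: correct predictions / all predictions of each class
        P = diag(M) ./ sum(M,1)';
        
        % Calculate Recall: correct predictions / all true members of each class
        R = diag(M) ./ sum(M,2);
    
        % Calculate F1 Score as the harmonic mean of precision and recall
        F = 2 * ( P .* R ) ./ ( P + R );
    end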
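
To see what these formulas produce, here is a small self-contained sketch that applies the same row and column sums to a made-up 3-class confusion matrix. The counts are purely hypothetical and are not results from the WANG experiment; the F1 score is written out as the usual harmonic mean of precision and recall.

```matlab
% Hypothetical confusion matrix: rows = true class, columns = predicted class.
C = [9 1 0;
     2 6 2;
     1 3 6];

M = sum( C, 3 );                    % no-op here, C is a single 2-D matrix

P = diag(M) ./ sum(M,1)';           % column sums 12, 10, 8 -> [0.75; 0.60; 0.75]
R = diag(M) ./ sum(M,2);            % row sums 10, 10, 10   -> [0.90; 0.60; 0.60]
F = 2 * ( P .* R ) ./ ( P + R );    % harmonic mean of P and R, per class

disp([P R F]);
```

Each row of the displayed matrix holds the precision, recall, and F1 score of one class.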