image-processing similarity ssim

SSIM (Structural similarity index measure) performance


I have a reference image A and two target images, B and C. To a human observer, A and B belong to the same class, while A and C belong to different classes. I measured the SSIM as follows:

result1 = SSIM(A , B) = 4.71027%;  
result2 = SSIM(A , C) =  7.95047%; 

I used the SSIM code from OpenCV.

I also tried comparing normalized LBP histograms of the entire image by calculating the KL divergence between the two histograms, but the results were worse.

Is there a way to measure the similarity without training?

Image A: [image]

Image B: [image]

Image C: [image]

EDIT:

Following @Cris Luengo's suggestion, here are the results for two LBP variants, circular and variance-based. It seems the choice of method (feature descriptor) is critical (result = 0 means identical):

result1 = LBP_CIRCULAR_HIST_KL(A , B) =  0.66;
result2 = LBP_CIRCULAR_HIST_KL(A , C) =  0.64;

result1 = LBP_VAR_HIST_KL(A , B) = 0.49;
result2 = LBP_VAR_HIST_KL(A , C) = 3.74;
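The histogram comparison described above can be sketched roughly as follows. This is a minimal illustration, not the exact code behind the numbers: the function names `normalize`, `kl_divergence`, and `symmetric_kl` are hypothetical, the smoothing constant `eps` is an assumption, and the LBP step that produces the histograms is not shown.

```python
import math


def normalize(hist):
    """Scale non-negative bin counts so that they sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    return [h / total for h in hist]


def kl_divergence(p, q, eps=1e-10):
    """Return KL(p || q) for two histograms with the same number of bins.

    eps (an assumed smoothing constant) is added to every bin before
    normalizing, so empty bins cause neither log(0) nor division by zero.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    p = normalize([x + eps for x in p])
    q = normalize([x + eps for x in q])
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def symmetric_kl(p, q, eps=1e-10):
    """Average of KL(p || q) and KL(q || p), which does not depend on order."""
    return 0.5 * (kl_divergence(p, q, eps) + kl_divergence(q, p, eps))
```

A value of 0 means the two histograms are identical. Plain KL divergence is not symmetric, so `kl_divergence(a, b)` and `kl_divergence(b, a)` generally differ; the symmetric variant avoids having to pick which image is the reference.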

Solution

  • As the comments suggest, SSIM will not work if the two images are not pixel-aligned. There are many ways to measure the similarity between two unaligned images. One of the most popular nowadays is CLIP, whose text encoder is also used by generative models such as Stable Diffusion to condition image generation.

    I suggest you look at this repo, which explains how to install CLIP for Python and how to extract features and similarities. The example there is for image-text similarity, but you can compute image-image similarity by doing something like:

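    import torch
    import torch.nn.functional as F
    import clip
    from PIL import Image

    # Use the GPU when available; CLIP also runs on the CPU, just more slowly.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)

    # preprocess resizes and normalizes the image; unsqueeze(0) adds the batch dimension.
    image1 = preprocess(Image.open("Image1.png")).unsqueeze(0).to(device)
    image2 = preprocess(Image.open("Image2.png")).unsqueeze(0).to(device)

    # Gradients are not needed for inference.
    with torch.no_grad():
        image1_features = model.encode_image(image1)
        image2_features = model.encode_image(image2)

        # Cosine similarity along the feature dimension; values near 1 mean very similar.
        sim = F.cosine_similarity(image1_features, image2_features)

    print("Cosine similarity:", sim.item())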
    

    Note that this can be quite slow, depending on how many images you have and what kind of task you want to run (brute-force retrieval over a large collection might not be feasible).
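One way to reduce the cost mentioned above is to encode each image only once, store the L2-normalized feature vector, and then compare vectors with a plain dot product, which equals the cosine similarity for unit-length vectors. The sketch below illustrates that idea in pure Python. The helper names (`l2_normalize`, `build_index`, `rank_by_similarity`) are hypothetical, and the feature vectors are assumed to have been extracted beforehand, for example with `model.encode_image` and converted to lists of floats. In practice you would more likely keep them as tensors and use a matrix multiply.

```python
import math


def l2_normalize(vec):
    """Return vec scaled to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def build_index(embeddings):
    """Normalize every stored embedding once, up front.

    embeddings maps an image name to its precomputed feature vector.
    """
    return {name: l2_normalize(vec) for name, vec in embeddings.items()}


def rank_by_similarity(query, index, top_k=5):
    """Return up to top_k (name, cosine similarity) pairs, best match first."""
    q = l2_normalize(query)
    scores = [
        (name, sum(a * b for a, b in zip(q, vec)))
        for name, vec in index.items()
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_k]
```

With this layout the expensive network forward pass runs once per image, and each query only costs one dot product per stored vector. For very large collections even that linear scan may be too slow, which is the brute-force limit the note above refers to.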