python operating-system combinations hamming-distance phash

Get combinations of elements from different folders without combining elements from the same folder (Python)


I am a beginner in Python and searched hard for an answer here before asking this question. I have several designs, each with a few photos, and I want to compare their Hamming distances. However, I don't want to compare images of the same design, which are located in the same folder. I do the comparison with a library called ImageHash. After comparing the different combinations of images, I want to keep the ones with the highest Hamming distance. Let me explain what I want with a simple example:

In folder table there are three images: table_1.jpg, table_2.jpg, table_3.jpg. In folder chair there are two images: chair_1.jpg, chair_2.jpg.

What I want to get is the file paths of the files (which I can do) so that I can later use the Image.open() and imagehash.phash functions. The combinations should look like this:

(table_1.jpg, chair_1.jpg), (table_1.jpg, chair_2.jpg), (table_2.jpg, chair_1.jpg), (table_2.jpg, chair_2.jpg), (table_3.jpg, chair_1.jpg), (table_3.jpg, chair_2.jpg)

I guess I then have to split on "_" and use groupby and itemgetter.


Solution

  • You need itertools.product to compute the tuples you want:

    from itertools import product
    
    table = ['table_1.jpg', 'table_2.jpg', 'table_3.jpg']
    chair = ['chair_1.jpg', 'chair_2.jpg']
    
    print(list(product(table, chair)))
    # [('table_1.jpg', 'chair_1.jpg'), ('table_1.jpg', 'chair_2.jpg'), ('table_2.jpg', 'chair_1.jpg'), ('table_2.jpg', 'chair_2.jpg'), ('table_3.jpg', 'chair_1.jpg'), ('table_3.jpg', 'chair_2.jpg')]
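
If there are more than two folders, one possible extension (a sketch, not part of the original answer) is to take every pair of folders with combinations and apply product to each pair. The `folders` dictionary, the `lamp` folder, and the `cross_folder_pairs` helper are hypothetical names introduced only for illustration.

```python
from itertools import combinations, product

def cross_folder_pairs(folders):
    """Yield (file_a, file_b) pairs whose files come from different folders.

    `folders` maps a folder name to the list of files inside it.
    """
    # Every unordered pair of distinct folders...
    for (_, files_a), (_, files_b) in combinations(folders.items(), 2):
        # ...contributes every pairing of one file from each folder.
        yield from product(files_a, files_b)

folders = {
    'table': ['table_1.jpg', 'table_2.jpg', 'table_3.jpg'],
    'chair': ['chair_1.jpg', 'chair_2.jpg'],
    'lamp': ['lamp_1.jpg'],  # hypothetical third folder
}

pairs = list(cross_folder_pairs(folders))
print(len(pairs))  # 3*2 + 3*1 + 2*1 = 11
```

Files from the same folder never meet, because product only ever receives lists from two different folders.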
    

    If the filenames are all in the same list, you can use combinations and keep only the pairs whose elements don't share the same prefix:

    from itertools import combinations
    filenames = ['table_1.jpg', 'table_2.jpg', 'table_3.jpg', 'chair_1.jpg', 'chair_2.jpg']
    
    # print is a function in Python 3, so the list needs parentheses around it
    print([(x, y) for x, y in combinations(filenames, 2) if x.split('_')[0] != y.split('_')[0]])
    # [('table_1.jpg', 'chair_1.jpg'), ('table_1.jpg', 'chair_2.jpg'), ('table_2.jpg', 'chair_1.jpg'), ('table_2.jpg', 'chair_2.jpg'), ('table_3.jpg', 'chair_1.jpg'), ('table_3.jpg', 'chair_2.jpg')]
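
To keep the pair with the highest Hamming distance, as the question asks, the filtered pairs can be passed to max with a distance function as the key. This is only a sketch of one reading of that requirement: `hamming` and `best_pair` are hypothetical helpers, and the integer hashes are made-up stand-ins so the example runs without any images. With the ImageHash library, the hashes would instead come from imagehash.phash(Image.open(path)), and subtracting two such hashes gives their Hamming distance.

```python
from itertools import combinations

def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count('1')

def best_pair(hashes):
    """Return the cross-design pair of filenames with the largest distance.

    `hashes` maps a filename such as 'table_1.jpg' to its hash value.
    """
    candidates = [
        (x, y) for x, y in combinations(hashes, 2)
        if x.split('_')[0] != y.split('_')[0]  # skip pairs from the same design
    ]
    return max(candidates, key=lambda pair: hamming(hashes[pair[0]], hashes[pair[1]]))

# Made-up 8-bit hashes, for illustration only.
hashes = {
    'table_1.jpg': 0b00001111,
    'table_2.jpg': 0b00001110,
    'chair_1.jpg': 0b11110000,
    'chair_2.jpg': 0b00111100,
}

print(best_pair(hashes))
# ('table_1.jpg', 'chair_1.jpg')
```

If the top few pairs are wanted rather than a single one, sorting `candidates` with the same key and reverse=True would be a natural variation.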