I am a beginner in Python and tried hard to find an answer here before asking this question. I have different designs, each with a couple of photos, and I want to compare their Hamming distances. But I don't want to compare images of the same design, which are located in the same folder. I make the comparison with a library called imagehash. After comparing the different combinations of images, I want to keep the ones with the highest Hamming distance. Let me explain what I want with a simple example:
In folder table there are three images: table_1.jpg, table_2.jpg, table_3.jpg. In folder chair there are two images: chair_1.jpg, chair_2.jpg.
What I want to get is the file paths of the files (which I can do) so that I can later use the Image.open() and imagehash.phash functions. The combinations should look like this:
(table_1.jpg, chair_1.jpg), (table_1.jpg, chair_2.jpg), (table_2.jpg, chair_1.jpg), (table_2.jpg, chair_2.jpg), (table_3.jpg, chair_1.jpg), (table_3.jpg, chair_2.jpg)
Then I have to split on "_" and use groupby and itemgetter, I guess.
You need itertools.product to calculate the tuples you want:
from itertools import product
table = ['table_1.jpg', 'table_2.jpg', 'table_3.jpg']
chair = ['chair_1.jpg', 'chair_2.jpg']
print(list(product(table, chair)))
# [('table_1.jpg', 'chair_1.jpg'), ('table_1.jpg', 'chair_2.jpg'), ('table_2.jpg', 'chair_1.jpg'), ('table_2.jpg', 'chair_2.jpg'), ('table_3.jpg', 'chair_1.jpg'), ('table_3.jpg', 'chair_2.jpg')]
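With more than two folders the same idea should extend naturally: pick every pair of folders with combinations, then take the product of their file lists. Here is a minimal sketch, assuming the files are already grouped into a dict keyed by folder name. The `designs` dict, the extra `lamp` folder and the `cross_design_pairs` name are made up for illustration and are not from the question.

```python
from itertools import combinations, product

def cross_design_pairs(designs):
    """Yield (file_a, file_b) pairs whose files come from different folders."""
    # combinations() picks each unordered pair of folders exactly once
    for (_, files_a), (_, files_b) in combinations(designs.items(), 2):
        # product() pairs every file of one folder with every file of the other
        yield from product(files_a, files_b)

# Hypothetical grouping: folder name -> files in that folder
designs = {
    'table': ['table_1.jpg', 'table_2.jpg', 'table_3.jpg'],
    'chair': ['chair_1.jpg', 'chair_2.jpg'],
    'lamp': ['lamp_1.jpg'],
}
pairs = list(cross_design_pairs(designs))
print(len(pairs))
# 11
```

Because the grouping comes from the folder and not from the file name, this does not depend on how the files are named.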
If the filenames are all in the same list, you can use combinations and check that the elements don't have the same beginning:
from itertools import combinations
filenames = ['table_1.jpg', 'table_2.jpg', 'table_3.jpg', 'chair_1.jpg', 'chair_2.jpg']
print([(x, y) for x, y in combinations(filenames, 2) if x.split('_')[0] != y.split('_')[0]])
# [('table_1.jpg', 'chair_1.jpg'), ('table_1.jpg', 'chair_2.jpg'), ('table_2.jpg', 'chair_1.jpg'), ('table_2.jpg', 'chair_2.jpg'), ('table_3.jpg', 'chair_1.jpg'), ('table_3.jpg', 'chair_2.jpg')]
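To keep the pairs with the highest Hamming distance afterwards, you could score each pair and sort the result. This is only a rough sketch, assuming the imagehash and Pillow packages are installed and that each element of a pair is a path that Image.open() can read. Subtracting two imagehash hashes gives their Hamming distance. The helper names `phash_distance` and `rank_pairs` are made up for illustration.

```python
def phash_distance(path_a, path_b):
    """Hamming distance between the perceptual hashes of two image files."""
    # Imported here so rank_pairs() can be used with another distance function
    from PIL import Image
    import imagehash
    hash_a = imagehash.phash(Image.open(path_a))
    hash_b = imagehash.phash(Image.open(path_b))
    # Subtracting two ImageHash objects returns their Hamming distance
    return hash_a - hash_b

def rank_pairs(pairs, distance=phash_distance):
    """Return (distance, file_a, file_b) tuples, largest distance first."""
    scored = [(distance(a, b), a, b) for a, b in pairs]
    return sorted(scored, key=lambda item: item[0], reverse=True)
```

For example, `rank_pairs(pairs)[:5]` would give the five most different pairs. If an image appears in many pairs, it is hashed once per pair here, so caching the hashes in a dict may be worth it for large folders.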