I am creating a list of images randomly sampled from a large master list. I want to replace any item in 'image_list1' with a new, randomly selected image if a given range of its characters matches that of another item already in the list.
Example:
'AF05_AC.png'
'AF05_AO.png' <- replace since characters [0:5] are equal to image above
I'm not sure how to implement this, since I don't want to replace the item with a specific value. I want to keep sampling until n=20, with no two items in the list sharing that range of characters.
import random

# Read the master list, one file name per line
with open('Faces/negFaces.txt') as f:
    negFaces = f.read().splitlines()
n = 20
image_list1 = random.sample(negFaces, n)
Is there a particular reason you want to sample uniformly from the whole list? If not, you can group the images into buckets keyed on the substring and take at most one image from each bucket. For example, 'AF05_AC.png' and 'AF05_AO.png' share characters [0:5], so both would go into the bucket 'AF05_'. You could then randomly select a bucket and randomly select a string from that bucket. This does introduce a bias based on the number of images in each bucket: an image in a small bucket is more likely to be picked than an image in a large one.
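The bucket idea could look roughly like the sketch below. The function name `sample_one_per_bucket` and the constant `PREFIX_LEN` are made up for illustration, and the sketch assumes the first five characters are what must differ, as in the question's example.

```python
import random
from collections import defaultdict

PREFIX_LEN = 5  # assumption: characters [0:5] define a bucket, e.g. 'AF05_'


def sample_one_per_bucket(names, n, rng=random):
    """Return n names, no two of which share their first PREFIX_LEN characters."""
    # Group every file name under its prefix.
    buckets = defaultdict(list)
    for name in names:
        buckets[name[:PREFIX_LEN]].append(name)

    if n > len(buckets):
        raise ValueError(
            f"only {len(buckets)} distinct prefixes available, cannot pick {n}"
        )

    # Pick n distinct buckets, then one name from each chosen bucket.
    chosen_prefixes = rng.sample(sorted(buckets), n)
    return [rng.choice(buckets[prefix]) for prefix in chosen_prefixes]
```

With the question's variables this would be called as `image_list1 = sample_one_per_bucket(negFaces, 20)`. Because every bucket is equally likely regardless of its size, this is where the bias mentioned above comes from.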
If you really want it to be perfectly random, then I can't think of anything other than brute forcing it: keep drawing and rejecting until your condition is satisfied. Depending on the substring and the images, this could take a very long time, and it will loop forever if the master list contains fewer than n distinct substrings, so check for that first. I'm generally bad at probability, though, so there might be a way to accomplish this with perfect randomness.
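A minimal sketch of the brute-force approach, with a guard against the infinite loop, might look like this. The name `sample_unique_prefix` and the `max_attempts` cap are assumptions added for illustration. Each draw is uniform over the master list, and a candidate is simply discarded when its prefix is already taken. I make no claim that the resulting set is uniformly distributed over all valid sets.

```python
import random


def sample_unique_prefix(names, n, prefix_len=5, max_attempts=10_000, rng=random):
    """Draw names at random, rejecting any whose prefix is already in use."""
    # Fail early if the condition can never be satisfied.
    distinct_prefixes = {name[:prefix_len] for name in names}
    if n > len(distinct_prefixes):
        raise ValueError(
            f"only {len(distinct_prefixes)} distinct prefixes, cannot pick {n}"
        )

    chosen = []
    seen = set()
    attempts = 0
    while len(chosen) < n:
        if attempts >= max_attempts:
            raise RuntimeError(f"gave up after {max_attempts} draws")
        attempts += 1

        candidate = rng.choice(names)
        prefix = candidate[:prefix_len]
        if prefix in seen:
            continue  # clash with an earlier pick: reject and draw again
        seen.add(prefix)
        chosen.append(candidate)
    return chosen
```

With the question's variables: `image_list1 = sample_unique_prefix(negFaces, 20)`.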