I have an image from which I would like to filter out some colors: if a given pixel's color is in my set of colors, I would like to replace it with a white pixel. With an image called original of shape (height, width, 3) and a color like ar = np.array([117, 30, 41]), the following works fine:
mask = (original == ar).all(axis=2)
original[mask] = [255, 255, 255]
The trouble is that the set of colors I want to exclude is quite big (~37000). Using the previous code in a loop (for ar in colors) still works, but it is quite slow. Is there a faster way to do it?
For now, my set of colors is in a numpy array of shape (37000, 3). I'm sure that all of these colors are present in my image, and I'm also sure that they are unique.
A simple way to solve this would be a lookup table. For 8-bit RGB, a lookup table with one boolean for every possible color would only cost 256 * 256 * 256 * 1 bytes = 16 MiB, and would let you determine whether a color is in your list of disallowed colors in constant time per pixel.
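As a minimal sketch of the idea for a couple of colors: the table is indexed by (R, G, B), so a membership test is a single array access. The name disallowed and the second color in it are made up for illustration; the first color is the ar from the question.

```python
import numpy as np

# Hypothetical disallowed colors; the first one is `ar` from the question.
disallowed = np.array([[117, 30, 41], [255, 0, 0]], dtype=np.uint8)

# One boolean per possible 8-bit RGB color: 256**3 bytes = 16 MiB.
lookup = np.zeros((256, 256, 256), dtype=bool)
lookup[disallowed[:, 0], disallowed[:, 1], disallowed[:, 2]] = True

print(lookup.nbytes / 2**20)  # table size in MiB
print(lookup[117, 30, 41])    # a disallowed color
print(lookup[0, 0, 0])        # a color that is not in the list
```

Building the table costs one pass over the color list, after which the cost per pixel no longer depends on how many colors are disallowed.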
Here is an example. This code generates an image with multiple colors. It filters out some of those colors using two approaches. The first approach is the one you describe in the question. The second approach is the lookup table.
import numpy as np
# Only used for generating image. Skip this if you already have an image.
image_colors = np.array([
    (100, 100, 100),
    (200, 200, 200),
    (255, 255, 0),
    (255, 0, 0),
])
image_colors_to_remove = [
    (255, 255, 0),
    (255, 0, 0),
]
# Generate image
resolution = (800, 600)
np.random.seed(42)
image = np.random.randint(0, len(image_colors), size=resolution)
image = np.array(image_colors)[image].astype(np.uint8)
# image = np.random.randint(0, 256, size=(*resolution, 3))
# Slow approach
def remove_colors_with_for(image, image_colors_to_remove):
    image = image.copy()
    for c in image_colors_to_remove:
        # True where all three channels match the color c
        mask = (image == c).all(axis=2)
        image[mask] = [255, 255, 255]
    return image
# Fast approach
def remove_colors_with_lookup(image, image_colors_to_remove):
    image = image.copy()
    # One boolean per possible 8-bit RGB color (16 MiB)
    colors_remove_lookup = np.zeros((256, 256, 256), dtype=bool)
    # Shape (n, 3) -> (3, n), so each row indexes one axis of the table
    image_colors_to_remove = np.array(image_colors_to_remove).T
    colors_remove_lookup[tuple(image_colors_to_remove)] = True
    # Shape (height, width, 3) -> (3, height, width): one index array per channel
    image_channel_first = image.transpose(2, 0, 1)
    mask = colors_remove_lookup[tuple(image_channel_first)]
    image[mask] = [255, 255, 255]
    return image
new_image = remove_colors_with_for(image, image_colors_to_remove)
new_image2 = remove_colors_with_lookup(image, image_colors_to_remove)
print("Same as for loop?", np.all(new_image2 == new_image))
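If the 16 MiB table is a concern, a possible variant (not the approach above, just a sketch) is to pack each color into a single 24-bit integer and let np.isin do the membership test. The helper names pack_rgb and remove_colors_with_isin and the tiny demo image are made up for illustration, and this assumes 8-bit channel values. I have not benchmarked it against the lookup table.

```python
import numpy as np

def pack_rgb(colors):
    """Pack (..., 3) arrays of 8-bit RGB values into single 24-bit integers."""
    colors = np.asarray(colors, dtype=np.uint32)
    return (colors[..., 0] << 16) | (colors[..., 1] << 8) | colors[..., 2]

def remove_colors_with_isin(image, colors_to_remove):
    image = image.copy()
    # Boolean mask of shape (height, width): True where the pixel is disallowed
    mask = np.isin(pack_rgb(image), pack_rgb(colors_to_remove))
    image[mask] = [255, 255, 255]
    return image

# Tiny hypothetical 2x2 image, just to show the call
demo = np.array([[[255, 0, 0], [100, 100, 100]],
                 [[255, 255, 0], [200, 200, 200]]], dtype=np.uint8)
result = remove_colors_with_isin(demo, [(255, 255, 0), (255, 0, 0)])
print(result[:, :, 0])
```

The trade-off is that np.isin has to sort or hash the packed values on every call instead of doing one table access per pixel, so the lookup table is likely to stay the faster option when memory is not an issue.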