For a fun project I want to analyze a few images, especially which colors (hues) are more visible than others. As I want to take the "visibility" of the colors into account, just counting the hue of pixels is not enough (e.g. perfect black would count as red, since its hue is 0°). I came up with a formula which is IMO good enough for my project.
Currently I do the following:
The formula is color_visibility = sqrt(saturation * value). So a full red (RGB=255,0,0; HSV=0,1,1) would result in 1, while e.g. a light red (RGB=255,128,128; HSV=0,0.5,1) would result in about 0.71.
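The formula can be checked directly for the two example colors (a minimal sketch; the helper name visibility is just a label for the formula above, with saturation and value already normalized to [0, 1]):

```python
from math import sqrt

def visibility(saturation, value):
    # color_visibility = sqrt(saturation * value)
    return sqrt(saturation * value)

print(visibility(1.0, 1.0))  # full red:  1.0
print(visibility(0.5, 1.0))  # light red: ~0.707
```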
Here is the (full working) code I use:
import urllib.request

import cv2
import numpy as np

url = 'https://upload.wikimedia.org/wikipedia/commons/thumb/0/02/Leuchtturm_in_Westerheversand_crop.jpg/299px-Leuchtturm_in_Westerheversand_crop.jpg'

# Download and decode the image (OpenCV loads it as BGR)
image = np.asarray(bytearray(urllib.request.urlopen(url).read()), dtype="uint8")
image = cv2.imdecode(image, cv2.IMREAD_COLOR)

# Convert to HSV and flatten to one (h, s, v) row per pixel
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
pixels = hsv.reshape((hsv.shape[0] * hsv.shape[1], 3))

# Accumulate the visibility per hue (OpenCV stores hue as 0-179 for uint8)
d = {}
for h, s, v in pixels:
    d[h] = d.get(h, 0.) + (s/255. * v/255.) ** 0.5
As you might guess, the code gets really slow as the image has more pixels.
My question is: how can I compute my formula without the dict and for-loop? Maybe directly with NumPy?
The magic you are looking for is in np.bincount, as it translates the loopy version pretty directly, using the h values as the bin indices -
H,S,V = pixels.T
d_arr = np.bincount(H, ((S/255.0) * (V/255.0))**0.5 )
Note that the resultant array may contain zero-valued entries for hues that never occur in the image.
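To illustrate the equivalence, here is a small self-contained sketch that compares the dict/loop approach against np.bincount on a synthetic pixel array (the random (h, s, v) data is an assumption standing in for a real image, with hue restricted to OpenCV's 0-179 uint8 range):

```python
import numpy as np

# Synthetic stand-in for hsv.reshape(-1, 3): N rows of (h, s, v) uint8 values
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(1000, 3), dtype=np.uint8)
pixels[:, 0] = rng.integers(0, 180, size=1000)  # hue range as in OpenCV

# Loop version from the question
d = {}
for h, s, v in pixels:
    d[h] = d.get(h, 0.) + (s/255. * v/255.) ** 0.5

# Vectorized version: h as bin index, per-pixel visibility as weights
H, S, V = pixels.T
d_arr = np.bincount(H, ((S/255.0) * (V/255.0)) ** 0.5)

# Both agree on every hue that actually occurs
for h, total in d.items():
    assert abs(d_arr[h] - total) < 1e-9
```

The second argument to np.bincount is the weights array, so each pixel contributes its visibility score to its hue's bin instead of a plain count of 1.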