I want to train some models to work with grayscale images, which e.g. is useful for microscope applications (Source). Therefore I want to train my model on graysale imagenet, using the pytorch grayscale conversion (torchvision.transforms.Grayscale), to convert the RGB imagenet to a grayscale imagenet. Internally pytorch rotates the color space from RGB to YPbPr as follows:
Y' is the grayscale channel then, so that Pb and Pr can be neglected after transformation. Actually pytorch even only calculates
grayscale = (0.2989 * r + 0.587 * g + 0.114 * b)
To normalize the image data, I need to know grayscale-imagenet's mean pixel value, as well as the standard deviation. Is it possible to calculate those?
I had success in calculating the mean pixel intensity using
meanGrayscale = 0.2989 * r.mean() + 0.587 * g.mean() + 0.114 * b.mean()
Transforming an image and then calculating the grayscale mean, gives the same result as first calculating the RGB means and then transforming those to a grayscale mean.
However, I am clueless when it comes to calculating the variance or standard deviation now. Does somebody have any idea, or knows some good literature on the topic? Is this even possible?
I found a publication "Jianxin Gong - Clarifying the Standard Deviational Ellipse" ... There he does it in 2 dimensions (as far as I understand). I just could not figure out yet how to do it in 3D.
Okay, I wasn't able to calculate the standard deviation as planned, but did it using the code below. The grayscale imagenet's train dataset mean and standard deviation are (round it as much as you like):
Mean: 0.44531356896770125
Standard Deviation: 0.2692461874154524
import multiprocessing
import os
def calcSTD(d):
meanValue = 0.44531356896770125
squaredError = 0
numberOfPixels = 0
for f in os.listdir("/home/imagenet/ILSVRC/Data/CLS-LOC/train/"+str(d)+"/"):
if f.endswith(".JPEG"):
image = imread("/home/imagenet/ILSVRC/Data/CLS-LOC/train/"+str(d)+"/"+str(f))
###Transform to gray if not already gray anyways
if np.array(image).ndim == 3:
matrix = np.array(image)
blue = matrix[:,:,0]/255
green = matrix[:,:,1]/255
red = matrix[:,:,2]/255
gray = (0.2989 * red + 0.587 * green + 0.114 * blue)
else:
gray = np.array(image)/255
###----------------------------------------------------
for line in gray:
for pixel in line:
squaredError += (pixel-meanValue)**2
numberOfPixels += 1
return (squaredError, numberOfPixels)
a_pool = multiprocessing.Pool()
folders = []
[folders.append(f.name) for f in os.scandir("/home/imagenet/ILSVRC/Data/CLS-LOC/train") if f.is_dir()]
resultStD = a_pool.map(calcSTD, folders)
StD = (sum([intensity[0] for intensity in resultStD])/sum([pixels[1] for pixels in resultStD]))**0.5
print(StD)
During the process some errors like this occured:
/opt/conda/lib/python3.7/site-packages/PIL/TiffImagePlugin.py:771: UserWarning: Possibly corrupt EXIF data. Expecting to read 8 bytes but only got 4. Skipping tag 41486 "Possibly corrupt EXIF data. "
The repective images from the 2019 version of ImageNet were skipped.