
Convert RGB to class / single integer value


I have a NumPy array from an RGB image with shape (64, 64, 3), and I need to convert each distinct RGB combination to one class, represented by an integer value, so that in the end I have a (64, 64) array of integer values (0-N). Each value stands for one specific RGB combination from the picture, and every RGB combination gets exactly one value. In short: every color is one class, and every pixel has one matching class value (0-N) :)

Obviously it's not a big problem: I could just go through each pixel and check its RGB values. If they are not yet in an "already discovered RGB" tempList, I add them and assign them a new integer value representing the class; otherwise I look the RGB values up in the tempList and use the integer value already stored there, or something like that.

But to be honest, I need to do this for a lot of images, and I am trying to get better at Python. So I want to know if someone has a more efficient way to do this. I looked through the libraries and forums and couldn't find a good approach.


Solution

  • You can pack three 8-bit integers into one 32-bit integer and easily recover the three integers again. The idea is to use bitwise operations so that each group of 8 bits holds one of the RGB channels. This way you already know the range in advance: there are 256**3 = 16777216 possible values, from 0 to N = 16777215.

    The following code can do this:

    def rgbtoint32(rgb):
        color = 0
        for c in rgb[::-1]:  # reversed, so rgb[0] ends up in the lowest 8 bits
            color = (color<<8) + c
            # Do not forget the parentheses:
            # color << 8 + c is parsed as color << (8 + c)
        return color
    
    def int32torgb(color):
        rgb = []
        for i in range(3):
            rgb.append(color&0xff)
            color = color >> 8
        return rgb
    
    rgb = [32,253,200]
    color = rgbtoint32(rgb)
    rgb_c = int32torgb(color)
    
    print(rgb)
    print(color)
    print(rgb_c)
    

    This gives:

    [32, 253, 200]
    13172000
    [32, 253, 200]
    

    Update: Using NumPy's "view", as suggested by "Mad Physicist", the same conversion can be done efficiently for a whole array:

    import numpy as np
    
    rgb = np.array([[[32,253,200], [210,25,42]]],dtype = np.uint8)
    size_d = list(rgb.shape)
    size_d[2] = -1
    
    # Converting to a 2D uint32 array: pad each pixel with a zero fourth byte,
    # then reinterpret the 4 bytes as one integer (values shown assume little-endian)
    colorint32 = np.dstack((rgb,np.zeros(rgb.shape[:2], 'uint8'))).view('uint32').squeeze(-1)
    
    # Converting back from the uint32 array to RGB space
    rgb_c = colorint32.view('uint8').reshape(size_d)[:,:,:3]
    
    # Print results
    print(rgb)
    print(colorint32)
    print(rgb_c)
    

    Which gives

    [[[ 32 253 200]
      [210  25  42]]]
    [[13172000  2759122]]
    [[[ 32 253 200]
      [210  25  42]]]
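
The packed integers above are unique per color, but they are not consecutive class values 0-N as asked for in the question. One possible way to get there, sketched here under the assumption that the image is an (H, W, 3) uint8 array, is to pack the channels with shifts (same layout as rgbtoint32) and then let np.unique number the distinct values. The names rgb_to_class_map, class_map and palette are made up for this illustration and are not part of the original answer.

```python
import numpy as np

def rgb_to_class_map(image):
    """Map each distinct RGB color of an (H, W, 3) uint8 image to a class index."""
    # Widen to uint32 first so the shifts cannot overflow the 8-bit channels.
    pixels = image.astype(np.uint32)
    # Pack as R | G << 8 | B << 16, the same layout rgbtoint32 produces.
    packed = pixels[..., 0] | (pixels[..., 1] << 8) | (pixels[..., 2] << 16)
    # np.unique returns the sorted distinct codes and, for every pixel, the
    # position of its code in that sorted list; that position is the class.
    codes, inverse = np.unique(packed.ravel(), return_inverse=True)
    class_map = inverse.reshape(image.shape[:2])
    # Unpack the distinct codes so that palette[k] is the color of class k.
    palette = np.stack(
        [codes & 0xFF, (codes >> 8) & 0xFF, (codes >> 16) & 0xFF], axis=-1
    ).astype(np.uint8)
    return class_map, palette

image = np.array([[[32, 253, 200], [210, 25, 42]],
                  [[210, 25, 42], [32, 253, 200]]], dtype=np.uint8)
class_map, palette = rgb_to_class_map(image)

print(class_map)           # one class index per pixel
print(palette)             # one RGB row per class
print(np.array_equal(palette[class_map], image))  # round trip check
```

Note that in this sketch the classes are numbered in sorted order of the packed value, not in order of first appearance, and the numbering is computed per image. If the same color has to get the same class in every image, the set of distinct codes would probably need to be collected over all images first.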