Tags: python, conv-neural-network, bioinformatics, semantic-segmentation, imagedatagenerator

Inputs to U-Net (CNN) type structures


I've been working through my thesis (brain semantic segmentation & survival prediction, with a splash of genomics). Tackling the imaging part, I've followed the literature and understood that one of the few decent ways to segment a brain is with U-Nets. I saw both 2D and 3D implementations of these, with various ways of building the datasets. Since this is my thesis I didn't want to outright copy someone's work, so I got to doing things on my own. I'm stuck on a particular part where I cannot get my input to connect to the network.

To my understanding, the network needs to take a 2D image (H, W), a channel dimension for the number of images you're trying to pass together, and another channel for the number of classes you're trying to segment.

In this case, I've taken the BraTS datasets from '18, '19 and '20. From the initial dataset I unpack the NIfTI files and perform a two-step preprocessing with NLM filtering and N4BiasFieldCorrection, then I save the images as 2D slices across the Z axis (this translates into each modality (flair, t1, t1c, t2) getting its own folder containing 155 .png images). For the masks I just encode the 4 classes into [0, 1, 2, 3] and also save them as 2D .pngs across the Z axis.
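For reference, a minimal sketch of that preprocessing step, assuming SimpleITK for the N4 bias-field correction and scikit-image for the NLM filtering (paths and filter parameters here are illustrative, not my exact values):

import os
import numpy as np
import SimpleITK as sitk
from skimage.io import imsave
from skimage.restoration import denoise_nl_means, estimate_sigma

def preprocess_volume(nifti_path, out_dir):
    # Load one modality volume and denoise it with non-local means.
    img = sitk.ReadImage(nifti_path, sitk.sitkFloat32)
    vol = sitk.GetArrayFromImage(img)  # (Z, H, W)
    sigma = np.mean(estimate_sigma(vol))
    vol = denoise_nl_means(vol, h=1.15 * sigma, sigma=sigma, fast_mode=True)

    # N4 bias-field correction on the denoised volume.
    den = sitk.GetImageFromArray(vol.astype(np.float32))
    den.CopyInformation(img)
    head_mask = sitk.OtsuThreshold(den, 0, 1, 200)  # rough foreground mask
    corrected = sitk.GetArrayFromImage(sitk.N4BiasFieldCorrection(den, head_mask))

    # Save every axial (Z) slice as an 8-bit png.
    for z in range(corrected.shape[0]):
        slc = corrected[z]
        slc = (255 * (slc - slc.min()) / (slc.max() - slc.min() + 1e-8)).astype(np.uint8)
        imsave(os.path.join(out_dir, f"{z:03d}.png"), slc)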

I use the following code to create my custom generator.

import numpy as np
from skimage.io import imread
from keras.utils import to_categorical


def load_img(file_list):
    # Read each slice and min-max normalise it to [0, 1].
    images = []
    for file_name in file_list:
        x = imread(file_name)
        norm_img = (x - np.min(x)) / (np.max(x) - np.min(x))
        images.append(norm_img)
    return np.array(images)


def load_mask(file_list):
    # Read each mask slice and one-hot encode its 4 classes.
    masks = []
    for file_name in file_list:
        mask = imread(file_name)
        enc_mask = to_categorical(mask, num_classes=4)
        masks.append(enc_mask)
    return np.array(masks)


def imageLoader(img_list, mask_list, batch_size):
    # Endless generator that yields (images, masks) batches for Keras.
    L = len(img_list)

    while True:

        batch_start = 0
        batch_end = batch_size

        while batch_start < L:
            limit = min(batch_end, L)

            X = load_img(img_list[batch_start:limit])
            Y = load_mask(mask_list[batch_start:limit])

            yield (X, Y)  # tuple of (batch_images, batch_masks)

            batch_start += batch_size
            batch_end += batch_size
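
For context, I feed this generator to training roughly like so (with model the compiled U-Net; the file lists and batch size are placeholders):

train_gen = imageLoader(train_img_files, train_mask_files, batch_size=8)
model.fit(train_gen,
          steps_per_epoch=len(train_img_files) // 8,
          epochs=100)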

There is a problem with the 'to_categorical' step, and I think it crashes whenever it gets to a mask slice that doesn't contain all 4 classes.
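
A quick sanity check along these lines (mask_files being a placeholder for the list of mask paths) should also reveal whether any slice contains label values outside [0, 3], which would break the one-hot encoding:

import numpy as np
from skimage.io import imread

for f in mask_files:
    labels = np.unique(imread(f))
    if not set(labels).issubset({0, 1, 2, 3}):
        print(f, labels)  # slices with unexpected label values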

The U-Net architecture I used is a slightly modified version of https://github.com/jordan-colman/DR-Unet104/blob/main/Dr_Unet104_model.py The modification I made is to change its output to give me the multichannel semantic mask I'm after:

outputs = Conv2D(num_classes, (1, 1), name='output_layer', activation='softmax')(X)
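
For completeness, a model with this softmax head would typically be compiled along these lines (the optimizer and metric are common defaults, not necessarily what the repo uses):

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])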

My idea for the segmentation task is to train four of these U-Nets, one for each modality (flair, t1, t1c, t2), then freeze their weights and connect them in an ensemble, roughly as sketched below.
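
A sketch of that ensemble idea (flair_model, t1_model, t1c_model and t2_model are hypothetical placeholders for the four trained single-modality networks):

from keras.layers import Input, Average
from keras.models import Model

modality_models = [flair_model, t1_model, t1c_model, t2_model]  # hypothetical names
inputs = [Input(shape=(240, 240, 1)) for _ in modality_models]

for m in modality_models:
    m.trainable = False  # freeze the single-modality weights

# Average the four softmax maps into one ensemble prediction.
outputs = Average()([m(x) for m, x in zip(modality_models, inputs)])
ensemble = Model(inputs=inputs, outputs=outputs)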

However, when I try to train a single network, I get the following error:

Input 0 of layer "conv2d_106" is incompatible with the layer: expected min_ndim=4, found ndim=3. Full shape received: (None, None, None)

Call arguments received by layer "model_5" (type Functional):
  • inputs=tf.Tensor(shape=(None, None, None), dtype=uint8)
  • training=True
  • mask=None

I understand that it's asking me to reshape my inputs to match the layer's expected input, but I'm unsure how to proceed. I've been trying to expand the dimensions of the 2D image input with tensorflow.expand_dims(), roughly as sketched below, with no luck. Any pointers to solutions or reading materials would be appreciated.
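
This is the kind of thing I tried (X being one image batch yielded by the generator):

import tensorflow as tf

X = tf.expand_dims(X, axis=-1)  # hoping to turn (batch, H, W) into (batch, H, W, 1)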


Solution

  • I did some much-needed reading & work on this and got it working:

    1. fixed the categorical issue by doing the encoding by hand with a small script
    2. fixed an unexplained Python crash coming from TensorFlow because it couldn't find zlibstat (PyCharm DIDN'T show the error, it just returned a really long exit code that meant nothing; always run your scripts from a terminal, kids, learn from my mistakes)
    3. fixed another TensorFlow error, caused by my GPU having only 4 GB of VRAM, which is nowhere near enough to run models like these, by requesting access to a better GPU, because apparently CUDA won't use the shared memory.
    4. fixed the problem with the inputs by properly expanding the axes of the input. In the end each input sample was a quad-modality [flair, t1, t1ce, t2] 2D stacked numpy array, as I saw 'Dr. Sreenivas Bhattiprolu' do in https://www.youtube.com/watch?v=oB35sV1npVI . The network input ended up as (H, W, X1) and the target mask as (H, W, X2), where, from left to right, H: height, W: width, X1: number of different modalities in the numpy stack (aka image channels), X2: number of different classes annotated on the masks. For my case this meant (240, 240, 4) images paired with (240, 240, 4) one-hot masks (see the sketch after this list).
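
    A minimal sketch of building one such sample (flair, t1, t1ce and t2 stand for matching 2D (240, 240) slice arrays, mask_slice for the matching label slice):

    import numpy as np
    from keras.utils import to_categorical

    x = np.stack([flair, t1, t1ce, t2], axis=-1)   # (240, 240, 4) image channels
    y = to_categorical(mask_slice, num_classes=4)  # (240, 240, 4) one-hot classes
    x = np.expand_dims(x, axis=0)                  # (1, 240, 240, 4) with batch axis
    y = np.expand_dims(y, axis=0)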

    On a side note, I found out that to train a model with 74.6M parameters you need a serious amount of data, so I implore you to look into Keras data augmentation techniques; see the sketch below. I'm fairly sure my model overfitted in the 100 epochs I let it run.
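
    One common pattern with ImageDataGenerator is to apply identical random transforms to images and masks by sharing a seed (X_train and Y_train stand for the stacked image and one-hot mask arrays):

    from keras.preprocessing.image import ImageDataGenerator

    aug_args = dict(rotation_range=10,
                    width_shift_range=0.05,
                    height_shift_range=0.05,
                    horizontal_flip=True)
    image_datagen = ImageDataGenerator(**aug_args)
    mask_datagen = ImageDataGenerator(**aug_args)

    seed = 42  # the shared seed keeps image and mask transforms in sync
    image_gen = image_datagen.flow(X_train, batch_size=8, seed=seed)
    mask_gen = mask_datagen.flow(Y_train, batch_size=8, seed=seed)
    train_gen = zip(image_gen, mask_gen)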

    I'll come back at some point after the publication and post a link to the Git repo I'll upload, in case anyone else finds themselves in a pickle.