python arrays numpy simpleitk nifti

Converting a list into a NumPy array loses two of its three axes, but only with one dataset


I have Python code that reads NIfTI images using the SimpleITK library. It converts each image into a NumPy array and then extends a list with that array's slices.

I have 20 FLAIR.nii.gz files. Each of them has 48 slices.

Once I have all 48 slices from all 20 patients, I convert the list into a NumPy array.

I do it this way because I'm a newbie with Python and I don't know any other way to do it.

The code is:

import glob
import os
import SimpleITK as sitk
import numpy as np

flair_dataset = []

# Loop over each patient directory.
# data_path is a list of all the patient directory names; file_path is their parent directory.
for i in data_path:

    img_path = os.path.join(file_path, i, 'pre')
    mask_path = os.path.join(file_path, i)

    for name in glob.glob(img_path+'/FLAIR*'):
        # Reads images using SimpleITK.
        brain_image = sitk.ReadImage(name)
        # Get a numpy array from a SimpleITK Image.
        brain_array = sitk.GetArrayFromImage(brain_image)

        flair_dataset.extend(brain_array)

        if debug:
            print('brain_image size: ', brain_image.GetSize())
            print('brain_array Shape: ', brain_array.shape)
            print('flair_dataset length:', len(flair_dataset))

print('flair_dataset length: ', len(flair_dataset))
print('flair_dataset[1] type: ', type(flair_dataset[1]))
print('flair_dataset[1] shape: ', flair_dataset[1].shape)

flair_array = np.array(flair_dataset)
print('flair_array.shape: ', flair_array.shape)
print('flair_array.dtype: ', flair_array.dtype)

This code generates this output (all FLAIR.nii.gz files have the same shape):

data_path =  ['68', '55', '50', '61', '63', '52', '51', '60', '67', '58', '59', '53', '69', '64', '56', '65', '54', '62', '66', '57']
patient_data_path =  68
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 48
Mask list length:  48

patient_data_path =  55
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 96
Mask list length:  96

patient_data_path =  50
brain_image size:  (256, 232, 48)
brain_array Shape:  (48, 232, 256)
flair_dataset length: 144
WMH image Size:  (256, 232, 48)
WMH array Shape:  (48, 232, 256)
Mask list length:  144

patient_data_path =  61
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 192
Mask list length:  192

patient_data_path =  63
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 240
Mask list length:  240

patient_data_path =  52
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 288
Mask list length:  288

patient_data_path =  51
brain_image size:  (256, 232, 48)
brain_array Shape:  (48, 232, 256)
flair_dataset length: 336
WMH image Size:  (256, 232, 48)
WMH array Shape:  (48, 232, 256)
Mask list length:  336

patient_data_path =  60
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 384
Mask list length:  384

patient_data_path =  67
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 432
Mask list length:  432

patient_data_path =  58
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 480
Mask list length:  480

patient_data_path =  59
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 528
Mask list length:  528

patient_data_path =  53
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 576
Mask list length:  576

patient_data_path =  69
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 624
Mask list length:  624

patient_data_path =  64
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 672
Mask list length:  672

patient_data_path =  56
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 720
Mask list length:  720

patient_data_path =  65
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 768
Mask list length:  768

patient_data_path =  54
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 816
Mask list length:  816

patient_data_path =  62
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 864
Mask list length:  864

patient_data_path =  66
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 912
Mask list length:  912

patient_data_path =  57
brain_image size:  (232, 256, 48)
brain_array Shape:  (48, 256, 232)
flair_dataset length: 960
Mask list length:  960

The final output from the code is:

flair_dataset length:  960
mask_dataset length:  960
flair_dataset[1] type:  <class 'numpy.ndarray'>
flair_dataset[1] shape:  (256, 232)
flair_array.shape:  (960,)
flair_array.dtype:  object

My problem:

I don't understand why flair_array has the shape (960,) and the dtype object.

I have run the same code, without changing anything, on a second dataset and it works perfectly. That dataset also has 20 patients, with 48 slices in each FLAIR.nii.gz file.

Its output:

data_path =  ['39', '31', '2', '23', '35', '29', '17', '49', '27', '8', '33', '4', '19', '41', '37', '11', '25', '6', '0', '21']

patient_data_path =  39
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 48
Mask list length:  48

patient_data_path =  31
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 96
Mask list length:  96

patient_data_path =  2
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 144
Mask list length:  144

patient_data_path =  23
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 192
Mask list length:  192

patient_data_path =  35
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 240
Mask list length:  240

patient_data_path =  29
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 288
Mask list length:  288

patient_data_path =  17
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 336
Mask list length:  336

patient_data_path =  49
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 384
Mask list length:  384

patient_data_path =  27
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 432
Mask list length:  432

patient_data_path =  8
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 480
Mask list length:  480

patient_data_path =  33
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 528
Mask list length:  528

patient_data_path =  4
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 576
Mask list length:  576

patient_data_path =  19
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 624
Mask list length:  624

patient_data_path =  41
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 672
Mask list length:  672

patient_data_path =  37
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 720
Mask list length:  720

patient_data_path =  11
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 768
Mask list length:  768

patient_data_path =  25
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 816
Mask list length:  816

patient_data_path =  6
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 864
Mask list length:  864

patient_data_path =  0
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 912
Mask list length:  912

patient_data_path =  21
brain_image size:  (240, 240, 48)
brain_array Shape:  (48, 240, 240)
flair_dataset length: 960
Mask list length:  960

This is the final output for this dataset:

flair_dataset length:  960
mask_dataset length:  960
flair_dataset[1] type:  <class 'numpy.ndarray'>
flair_dataset[1] shape:  (240, 240)
flair_array.shape:  (960, 240, 240)
flair_array.dtype:  float32

With this second dataset, flair_array has dtype float32.

Why is the shape of the first flair_array (960,)?

UPDATE:
In both datasets, brain_array.dtype is always float32.


Solution

  • In one case

    flair_array.shape:  (960,)
    flair_array.dtype:  object
    

    in the other

    flair_array.shape:  (960, 240, 240)
    flair_array.dtype:  float32
    

    You make these with:

    flair_array = np.array(flair_dataset)
    

    If all the elements of flair_dataset have the same shape, np.array can create a multidimensional array from them.

    But if one or more of the arrays in the list differ in shape, it has to give up on the multidimensional goal and instead makes an object dtype array, which is very much like a list: it contains references to the original arrays.

    In the original list most of the elements are

    brain_image size:  (232, 256, 48)
    brain_array Shape:  (48, 256, 232)
    

    but I also see some

    brain_image size:  (256, 232, 48)
    brain_array Shape:  (48, 232, 256)
    

    In the second set all are

    brain_image size:  (240, 240, 48)
    brain_array Shape:  (48, 240, 240)
    

    When people ask about an (n,) shape where they expected (n, m, p), I suspect an object dtype caused by a mix of element shapes. That's why I asked about the dtype.
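A minimal sketch of how the mismatch could be found before converting the list. The data here is a hypothetical stand-in (a few zero-filled slices with the two shapes seen in the first dataset), not the real images, and the names shape_counts, common_shape and odd_indices are made up for illustration. It uses np.stack rather than np.array for the uniform case, because np.stack raises an error on mixed shapes instead of quietly returning an object array. Note also that recent NumPy releases (1.24 and later, as far as I know) raise a ValueError for np.array on such ragged input unless dtype=object is passed.

```python
from collections import Counter

import numpy as np

# Hypothetical stand-ins for the slices collected in flair_dataset:
# most are (256, 232) and a few are (232, 256), as in the first dataset.
flair_dataset = [np.zeros((256, 232), dtype=np.float32) for _ in range(4)]
flair_dataset += [np.zeros((232, 256), dtype=np.float32) for _ in range(2)]

# Count how many slices have each shape.
shape_counts = Counter(a.shape for a in flair_dataset)
print(shape_counts)

if len(shape_counts) == 1:
    # Every slice has the same shape, so a regular 3-D array is possible.
    flair_array = np.stack(flair_dataset)
    print(flair_array.shape, flair_array.dtype)
else:
    # Mixed shapes: report which list positions differ from the most common shape.
    common_shape, _ = shape_counts.most_common(1)[0]
    odd_indices = [i for i, a in enumerate(flair_dataset) if a.shape != common_shape]
    print('most common shape:', common_shape)
    print('indices with another shape:', odd_indices)
```

With the real list, the indices reported in the else branch point back to the patients whose images have the other orientation (here, the ones logged as (48, 232, 256)). How to bring them to a common shape depends on the data and is not covered by this sketch.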