I'm trying to build a data generator that loads DICOM images in batches for use in model.fit with TensorFlow:
dataset = tf.data.Dataset.from_tensor_slices((file_names, annotations))
dataset = dataset.map(find_path, num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.map(read_dicom, num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.map(augment_images, num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.map(resize_images, num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.map(normalize_bbox, num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.batch(64, num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.prefetch(buffer_size=tf.data.AUTOTUNE)
The first function (find_path) converts a file name to a path.
The second function (read_dicom) tries to load DICOM images; it's something like this:
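from pydicom import dcmread
from pydicom.pixel_data_handlers.util import apply_voi_lut
...
def read_dicom(path, annotation):
    raw_dicom = dcmread(path)
    # Apply the VOI LUT / windowing stored in the DICOM header to the raw pixels
    image = apply_voi_lut(raw_dicom.pixel_array, raw_dicom)
    return image, annotation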
I get an error at dcmread(path), which says:
TypeError: dcmread: Expected a file path or a file-like, but got SymbolicTensor
I don't completely understand the situation, but from what I know, the reason is that model.fit runs in graph mode and converts every function into a graph. This turns every variable passed into those functions into a SymbolicTensor, so the paths are SymbolicTensor objects and can't be used with Pydicom or any other library.
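The tracing behaviour described above can be observed directly: tf.data.Dataset.map traces its function into a graph, so the function only ever sees placeholder tensors. This is a minimal sketch, assuming TensorFlow 2.x; the helper name inspect_path and the file name example.dcm are hypothetical and used only for illustration.

```python
import tensorflow as tf

observed = []  # records what the mapped function sees while it is being traced


def inspect_path(path):
    # This body runs at trace time, when `path` is a graph placeholder
    # rather than a concrete Python string.
    observed.append((type(path).__name__, tf.executing_eagerly()))
    return path


# "example.dcm" is a hypothetical file name; the file is never opened.
dataset = tf.data.Dataset.from_tensor_slices(["example.dcm"]).map(inspect_path)

for type_name, eager in observed:
    print(type_name, "eager:", eager)
```

The printed type name depends on the TensorFlow version, but eager execution is reported as off inside the mapped function, which is why dcmread never receives a real path.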
I tried multiple methods to fix this problem, but they either don't work or aren't suitable for the project I'm doing:
Using tf.py_function to prevent TensorFlow from converting read_dicom into a graph.
This method works, but it is not usable for me because I want to use TensorFlow's threading to load multiple images simultaneously. If I use tf.py_function, it runs Python code, which is subject to the GIL and prevents threads from running at the same time.
Using the TensorFlow IO library to load the image:
import tensorflow as tf
import tensorflow_io as tfio
...
def read_dicom(path, annotation):
    raw_image = tf.io.read_file(path)
    image = tfio.image.decode_dicom_image(raw_image)
    return image, annotation
This method doesn't work, because tfio.image.decode_dicom_image in the TensorFlow IO library does not work. You can check out the Colab notebook that TensorFlow provides in its DICOM tutorial; it doesn't work there either!
https://www.tensorflow.org/io/tutorials/dicom https://colab.research.google.com/github/tensorflow/io/blob/master/docs/tutorials/dicom.ipynb
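For reference, the tf.py_function approach from the first attempt can be sketched as follows. This is only an illustration under stated assumptions: _load_pixels is a hypothetical stand-in that fabricates a small array instead of calling dcmread, the 4x4 shape is arbitrary, and the file names are never opened.

```python
import numpy as np
import tensorflow as tf


def _load_pixels(path):
    # Inside tf.py_function the argument is an eager tensor, so .numpy() works.
    name = path.numpy().decode("utf-8")
    # Hypothetical stand-in: real code would call dcmread(name) here.
    return np.full((4, 4), len(name), dtype=np.float32)


def read_with_py_function(path, annotation):
    # Wrap the Python loader so it runs eagerly instead of being traced.
    image = tf.py_function(_load_pixels, inp=[path], Tout=tf.float32)
    image.set_shape([4, 4])  # py_function loses static shape information
    return image, annotation


ds = tf.data.Dataset.from_tensor_slices((["a.dcm", "bb.dcm"], [0, 1]))
ds = ds.map(read_with_py_function, num_parallel_calls=tf.data.AUTOTUNE)
```

The wrapped function still executes as ordinary Python, so the GIL concern raised above applies to it.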
I should note that I want to use TensorFlow's built-in multithreading.
Do you have any idea how I can fix this problem?
Actually, tfio.image.decode_dicom_image works. You only need to ensure that tensorflow-io is compatible with your tensorflow version. See the markdown from here.
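To compare against a compatibility table, it helps to know which versions are actually installed. A minimal sketch using only the standard library; the helper name installed_version is hypothetical, and it assumes the packages were installed under the distribution names tensorflow and tensorflow-io (variants such as tensorflow-cpu would need their own name).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for name in ("tensorflow", "tensorflow-io"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```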
Test run:
import numpy as np
import tensorflow as tf
import tensorflow_io as tfio
import matplotlib.pyplot as plt

file_name = np.array(['dicom_00000001_000.dcm'])
annotation = np.array([1])
dataset = tf.data.Dataset.from_tensor_slices((file_name, annotation))

def read_dicom(path, annotation):
    # Read the raw bytes, then decode them with TensorFlow ops (no Python I/O)
    raw_image = tf.io.read_file(path)
    image = tfio.image.decode_dicom_image(raw_image)
    return image, annotation

dataset = dataset.map(read_dicom)
for x, y in dataset:
    plt.imshow(x.numpy().squeeze(), cmap='gray')
    plt.show()