Tags: tensorflow, tensorflow-datasets, run-length-encoding

Decoding RLE (run-length encoding) masks with TensorFlow Datasets


I have been experimenting with Datasets but I cannot figure out how to efficiently create RLE-masks. FYI, I am using data from the Airbus Ship Detection Challenge in Kaggle: https://www.kaggle.com/c/airbus-ship-detection/data

I know my RLE-decoding function (borrowed from one of the Kaggle kernels) works:

import numpy as np

def rle_decode(mask_rle, shape=(768, 768)):
    '''
    mask_rle: run-length as string formatted (start length)
    shape: (height, width) of array to return
    Returns numpy array, 1 - mask, 0 - background
    '''
    if not isinstance(mask_rle, str):
        # No RLE string for this image: return an all-zero mask
        img = np.zeros(shape[0]*shape[1], dtype=np.uint8)
        return img.reshape(shape).T

    s = mask_rle.split()
    starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
    starts -= 1
    ends = starts + lengths
    img = np.zeros(shape[0]*shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, ends):
        img[lo:hi] = 1
    return img.reshape(shape).T

...but it does not seem to play nicely with the pipeline:

list_ds = tf.data.Dataset.list_files(train_paths_abs)
ds = list_ds.map(parse_img)

With the following parse function, everything works fine:

def parse_img(file_path,new_size=[128,128]):    
    img_content = tf.io.read_file(file_path)
    img = tf.image.decode_jpeg(img_content)
    img = tf.image.convert_image_dtype(img, tf.float32)    
    img = tf.image.resize(img,new_size)
    return img

But things go rogue if I include the mask:

def parse_img(file_path,new_size=[128,128]):
    
    # Image
    img_content = tf.io.read_file(file_path)
    img = tf.image.decode_jpeg(img_content)
    img = tf.image.convert_image_dtype(img, tf.float32)    
    img = tf.image.resize(img,new_size)
    
    # Mask
    file_id = tf.strings.split(file_path,'/')[-1]
    objects = [rle_decode(m) for m in df2[df.ImageId==file_id]]
    mask = np.sum(objects,axis=0)
    mask = np.expand_dims(mask,3)   # Add a channel axis, needed for the resize step
    mask = tf.image.convert_image_dtype(mask, tf.int8)
    mask = tf.clip_by_value(mask,0,1)
    mask = tf.image.resize(mask,new_size)
    mask = tf.squeeze(mask)     # squeeze back
    mask = tf.image.convert_image_dtype(mask, tf.int8)
    
    return img, mask

Although my parse_img function works fine on its own (I have checked it on a sample; it takes 271 µs ± 67.9 µs per run), the list_ds.map step takes forever (>5 minutes) before hanging. I can't figure out what's wrong and it is driving me crazy! Any ideas?


Solution

  • You can rewrite the function rle_decode using TensorFlow ops like this (here I do not do the final transposition, to keep it more general, but you can do it later):

    import tensorflow as tf
    
    def rle_decode_tf(mask_rle, shape):
        shape = tf.convert_to_tensor(shape, tf.int64)
        size = tf.math.reduce_prod(shape)
        # Split string
        s = tf.strings.split(mask_rle)
        s = tf.strings.to_number(s, tf.int64)
        # Get starts and lengths
        starts = s[::2] - 1
        lens = s[1::2]
        # Make ones to be scattered
        total_ones = tf.reduce_sum(lens)
        ones = tf.ones([total_ones], tf.uint8)
        # Make scattering indices
        r = tf.range(total_ones)
        lens_cum = tf.math.cumsum(lens)
        s = tf.searchsorted(lens_cum, r, 'right')
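        # s[i] is the run the i-th one belongs to; shifting r by that run's
        # start minus the number of ones in all previous runs gives each
        # one's position in the flattened mask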
        idx = r + tf.gather(starts - tf.pad(lens_cum[:-1], [(1, 0)]), s)
        # Scatter ones into flattened mask
        mask_flat = tf.scatter_nd(tf.expand_dims(idx, 1), ones, [size])
        # Reshape into mask
        return tf.reshape(mask_flat, shape)
    

    A small test (TensorFlow 2.0):

    mask_rle = '1 2 4 3 9 4 15 5'
    shape = [4, 6]
    # Original NumPy function
    print(rle_decode(mask_rle, shape))
    # [[1 0 0 1]
    #  [1 0 0 0]
    #  [0 1 1 0]
    #  [1 1 1 0]
    #  [1 1 1 0]
    #  [1 1 1 0]]
    # TensorFlow function (transposing is done out of the function)
    tf.print(tf.transpose(rle_decode_tf(mask_rle, shape)))
    # [[1 0 0 1]
    #  [1 0 0 0]
    #  [0 1 1 0]
    #  [1 1 1 0]
    #  [1 1 1 0]
    #  [1 1 1 0]]
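
    If you want to use this inside the tf.data pipeline from the question, the pandas lookup is what breaks the traced map function, so it has to be replaced with TF ops as well. Below is a minimal sketch (not part of the answer above) using a tf.lookup.StaticHashTable built from the competition CSV. It reuses the names df, ImageId, EncodedPixels and train_paths_abs from the question, and assumes the per-ship RLE strings of each image have been joined into one space-separated string beforehand (ship masks do not overlap, so decoding the concatenation is equivalent); images without ships map to the empty string, which is left unhandled here.

    import tensorflow as tf

    # Hypothetical preprocessing: one space-separated RLE string per image
    rles = (df.fillna({'EncodedPixels': ''})
              .groupby('ImageId')['EncodedPixels']
              .apply(' '.join))

    # Static lookup table ImageId -> RLE string, usable inside Dataset.map
    table = tf.lookup.StaticHashTable(
        tf.lookup.KeyValueTensorInitializer(
            keys=tf.constant(rles.index.to_list()),
            values=tf.constant(rles.to_list())),
        default_value='')

    def parse_img(file_path, new_size=[128, 128]):
        img = tf.image.decode_jpeg(tf.io.read_file(file_path))
        img = tf.image.convert_image_dtype(img, tf.float32)
        img = tf.image.resize(img, new_size)
        # Look up the RLE string for this file and decode it with TF ops only
        file_id = tf.strings.split(file_path, '/')[-1]
        mask = rle_decode_tf(table.lookup(file_id), shape=[768, 768])
        mask = tf.transpose(mask)   # match the orientation of the NumPy version
        # Nearest-neighbour resizing keeps the mask binary
        mask = tf.image.resize(mask[..., tf.newaxis], new_size, method='nearest')
        return img, mask

    list_ds = tf.data.Dataset.list_files(train_paths_abs)
    ds = list_ds.map(parse_img)

    Because everything in parse_img is now a TensorFlow op, the function can be traced into a graph by Dataset.map instead of falling back to eager pandas/NumPy calls, which is what made the original version hang.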