image-processing, keras, deep-learning, annotations, retinanet

COWC Dataset annotation


I'm new to deep learning. I'm currently working on a project to detect cars in aerial imagery using the RetinaNet model, and I plan to use the COWC dataset for it. My doubt is about the annotation step: for now I am using the labelImg annotation tool to annotate cars in the aerial images. Since labelImg generates annotations in XML format, I convert them into the format required by the RetinaNet model, shown below.

(imagename) (bounding_box_coordinates) (class_name)
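
For example, one converted annotation line would look something like this (the file name and numbers are just placeholders):

cars/image_001.png,837,346,981,456,car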

Is there any other way to make annotation easier for the COWC dataset?

Thanks in advance:)


Solution

  • The COWC dataset already comes with annotations: each car is labeled with a single point, and the annotations for a scene are stored in a PNG file. Here's how I find the annotation locations in that PNG file.

    import numpy as np
    from PIL import Image
    
    # Load the annotation PNG for one scene and view it as a NumPy array
    annotation_path = 'cowc/datasets/ground_truth_sets/Toronto_ISPRS/03553_Annotated_Cars.png'
    im = Image.open(annotation_path)
    data = np.asarray(im)
    
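    As a quick sanity check, the annotation image here loads with an alpha channel, so the array should have four channels:

    # The annotation PNG is RGBA, so this should print (height, width, 4)
    print(data.shape, data.dtype)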

    The COWC dataset marks cars with a red dot and negative examples with a blue dot. The annotation PNG also has an alpha channel, so at every annotated pixel both a colour channel and the alpha channel are nonzero; indexing the full RGBA array would therefore give duplicate index values for each dot. We don't need the alpha channel, so slice the array down to the RGB channels before looking for nonzero entries.

    # Drop the alpha channel so each dot is nonzero in exactly one colour channel
    data = data[:,:,0:3]
    # After slicing, rgba_ind is really the colour-channel index (0 = red, 2 = blue)
    y_ind, x_ind, rgba_ind = data.nonzero()
    

    You now have an index of all the annotated points in the file: y_ind corresponds to the height dimension and x_ind to the width. This means that at the first (x, y) position we should see an array that looks like [255, 0, 0]. This is what I get when I look up the first (x, y) position from the index:

    >>> data[y_ind[0], x_ind[0]]
    array([255,   0,   0], dtype=uint8)
    
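    Note that nonzero() also picks up the blue dots that mark negative examples. If you only want the car annotations, one option (my own addition, not something the dataset provides) is to keep only the points whose nonzero channel is the red one:

    # rgba_ind holds the colour-channel index of each nonzero pixel:
    # 0 = red (car), 2 = blue (negative example)
    car_mask = rgba_ind == 0
    y_ind, x_ind = y_ind[car_mask], x_ind[car_mask]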

    Here the author decides to create a bounding box centered on the point provided in the dataset, extending 20 pixels in each direction (40 pixels on a side). To create a single bounding box for the first annotation in this image you can try this:

    # define bbox given x, y and ensure bbox is within image bounds
    def get_bbox(x, y, x_max, y_max):
        x1 = max(0, x - 20)     # returns zero if x-20 is negative
        x2 = min(x_max, x + 20) # returns x_max if x+20 is greater than x_max
        y1 = max(0, y - 20)
        y2 = min(y_max, y + 20)
        return x1, y1, x2, y2
    
    x1, y1, x2, y2 = get_bbox(x_ind[0], y_ind[0], im.width, im.height) 
    
    
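    If you'd rather not call get_bbox point by point, the same clipping can be done for every annotation at once with NumPy; this is just a sketch using the x_ind and y_ind arrays from above:

    # Vectorized equivalent: arrays of corner coordinates, clipped to the image bounds
    half = 20
    x1s = np.clip(x_ind - half, 0, im.width)
    x2s = np.clip(x_ind + half, 0, im.width)
    y1s = np.clip(y_ind - half, 0, im.height)
    y2s = np.clip(y_ind + half, 0, im.height)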

    You'll have to loop through all the x, y values to make all the bounding boxes for the image. Here's a quick-and-dirty way to create a CSV file for a single image:

    img_path = 'cowc/datasets/ground_truth_sets/Toronto_ISPRS/03553.png'
    # Write one (image, x1, y1, x2, y2, class) line per annotated car
    with open('anno.csv', 'w') as f:
        for x, y in zip(x_ind, y_ind):
            x1, y1, x2, y2 = get_bbox(x, y, im.width, im.height)
            line = f'{img_path},{x1},{y1},{x2},{y2},car\n'
            f.write(line)
    
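    If you feed this to the keras-retinanet CSV generator, it also expects a small class-mapping CSV; as far as I remember this is just one class_name,id pair per line, so for a single class it could be written like this:

    # classes.csv mapping each class name to an id (here only "car")
    with open('classes.csv', 'w') as f:
        f.write('car,0\n')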

    I plan on breaking the huge images up into much smaller ones, which will change the values of the bounding boxes. I hope you find this helpful and that it's a good place to start; a rough sketch of that remapping is below.
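
    As a rough sketch of what that remapping could look like (the tile size and the rule for boxes that straddle a tile edge are my own assumptions, not part of the dataset):

    TILE = 1000  # assumed tile size in pixels

    def box_in_tile(box, tx, ty, tile=TILE):
        """Shift a full-image box (x1, y1, x2, y2) into the coordinates of the tile
        whose top-left corner is (tx, ty); return None if it falls outside the tile."""
        x1, y1, x2, y2 = box
        x1, x2 = x1 - tx, x2 - tx            # shift into tile coordinates
        y1, y2 = y1 - ty, y2 - ty
        x1, x2 = max(0, x1), min(tile, x2)   # clip to the tile bounds
        y1, y2 = max(0, y1), min(tile, y2)
        if x2 <= x1 or y2 <= y1:
            return None                      # box lies completely outside this tile
        return x1, y1, x2, y2

    # The matching image tile can be cropped with PIL, e.g.
    # Image.open(img_path).crop((tx, ty, tx + TILE, ty + TILE))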