Serialize/deserialize image mask in FiftyOne App

I'd like to use the FiftyOne application for evaluating instance segmentation predictions. I used the the downloadable quickstart dataset for start and I created my own samples.json file for the ground truth and prediction data, which can be loaded by Dataset.from_dir. The json file in the tutorial contains only bounding boxes, but no masks. In the documentation I found only masks that are set to None, but I would need them.

{
  "samples": [
    {
      "filepath": "some_image.png",
      "tags": [
        "validation"
      ],
      "metadata": null,
      "uniqueness": 0.998,
      "ground_truth": {
        "_cls": "Detections",
        "detections": [
          {
            "_id": {
              "$oid": "efd81c917c9a49dc9f8aca3f"
            },
            "_cls": "Detection",
            "label": "1",
            "bounding_box": [
              0.62609375,
              0.27044921875,
              0.030078124999999956,
              0.009550781250000001
            ],
            "mask": ???,
          }
        ]
      },
      "predictions": {
        "_cls": "Detections",
        "detections": [
          {
            "_id": {
              "$oid": "6db44c71c1414b8989c92255"
            },
            "_cls": "Detection",
            "label": "1",
            "bounding_box": [
              0.3303889036178589,
              0.4432219862937927,
              0.07914596796035767,
              0.02226179838180542
            ],
            "mask": ???,
            "confidence": 0.999962329864502
          },
        ]
      }
    }
  ]
}

The problem I have is that how should I create the mask property for the segmentation mask? It is a numpy array and not serializable by default. The Dataset.from_dir uses the core.utils.deserialize_numpy_array function for loading the dataset, so I tried to use serialize_numpy_array with no success for saving the dataset.

So what would be the best way to write the mask into the json file that is deserializable? Thanks!

Solution

The Dataset.from_dir() syntax is generally used for well-defined dataset types like a dataset stored on disk in the COCODetectionDataset format.

In your case, when loading a custom dataset that does not directly correspond to one of the related dataset types, then the recommended approach is to write a simple Python loop and construct your dataset one sample at a time.

In this case, you do just need to load your mask as a numpy array and store it in the mask attribute of your detection object.

import glob
import fiftyone as fo

images_patt = "/path/to/images/*"

# Ex: your custom label format
annotations = {
    "/path/to/images/000001.jpg": [
        {"bbox": ..., "label": ..., "mask": ...},
        ...
    ],
    ...
}

# Create samples for your data
samples = []
for filepath in glob.glob(images_patt):
    sample = fo.Sample(filepath=filepath)

    # Convert detections to FiftyOne format
    detections = []
    for objects in annotations[filepath]:
        label = obj["label"]

        # Bounding box coordinates should be relative values
        # in [0, 1] in the following format:
        # [top-left-x, top-left-y, width, height]
        bounding_box = obj["bbox"]

        # Boolean or 0/1 Numpy array 
        mask = obj["mask"]
        
        detection = fo.Detection(
            label=label,
            bounding_box=bounding_box,
            mask=mask,
        )

        detections.append(detection)

    # Store detections in a field name of your choice
    sample["ground_truth"] = fo.Detections(detections=detections)

    samples.append(sample)

# Create dataset
dataset = fo.Dataset("my-detection-dataset")
dataset.add_samples(samples)

Predictions could either be loaded at the same time as the ground truth, or you can always iterate over your dataset in the future and add predictions at a later time:

import fiftyone as fo

# Ex: your custom predictions format
predictions = {
    "/path/to/images/000001.jpg": [
        {"bbox": ..., "label": ..., "mask": ..., "score": ...},
        ...
    ],
    ...
}

# Add predictions to your samples
for sample in dataset:
    filepath = sample.filepath

    # Convert predictions to FiftyOne format
    detections = []
    for obj in predictions[filepath]:
        label = obj["label"]
        confidence = obj["score"]

        # Bounding box coordinates should be relative values
        # in [0, 1] in the following format:
        # [top-left-x, top-left-y, width, height]
        bounding_box = obj["bbox"]
        
        # Boolean or 0/1 Numpy array 
        mask = obj["mask"]

        detection = fo.Detection(
            label=label,
            bounding_box=bounding_box,
            confidence=confidence,
        )

        detection["mask"] = mask

        detections.append(detection)

    # Store detections in a field name of your choice
    sample["predictions"] = fo.Detections(detections=detections)

    sample.save()

Note that the exact structure of how to parse your custom label format will depend on how you are storing the data. This is just one example that stores labels in a dictionary keyed by media filepaths. You may first need to parse and convert your labels, for example into the expected bounding box format.