json object-detection fiftyone

Serialize/deserialize image mask in FiftyOne App


I'd like to use the FiftyOne App to evaluate instance segmentation predictions. I started from the downloadable quickstart dataset and created my own samples.json file for the ground truth and prediction data, which can be loaded by Dataset.from_dir. The JSON file in the tutorial contains only bounding boxes, but no masks. In the documentation I only found masks that are set to None, but I need actual masks.

{
  "samples": [
    {
      "filepath": "some_image.png",
      "tags": [
        "validation"
      ],
      "metadata": null,
      "uniqueness": 0.998,
      "ground_truth": {
        "_cls": "Detections",
        "detections": [
          {
            "_id": {
              "$oid": "efd81c917c9a49dc9f8aca3f"
            },
            "_cls": "Detection",
            "label": "1",
            "bounding_box": [
              0.62609375,
              0.27044921875,
              0.030078124999999956,
              0.009550781250000001
            ],
            "mask": ???
          }
        ]
      },
      "predictions": {
        "_cls": "Detections",
        "detections": [
          {
            "_id": {
              "$oid": "6db44c71c1414b8989c92255"
            },
            "_cls": "Detection",
            "label": "1",
            "bounding_box": [
              0.3303889036178589,
              0.4432219862937927,
              0.07914596796035767,
              0.02226179838180542
            ],
            "mask": ???,
            "confidence": 0.999962329864502
          }
        ]
      }
    }
  ]
}

My problem is how to create the mask property for the segmentation mask. It is a numpy array and is not JSON-serializable by default. Dataset.from_dir uses the core.utils.deserialize_numpy_array function when loading the dataset, so I tried saving the masks with serialize_numpy_array, without success.

So what is the best way to write the mask into the JSON file so that it can be deserialized? Thanks!


Solution

  • The Dataset.from_dir() syntax is generally used for well-defined dataset types, such as a dataset stored on disk in the COCODetectionDataset format.

    When loading a custom dataset that does not directly correspond to one of the supported dataset types, the recommended approach is to write a simple Python loop and construct your dataset one sample at a time.

    In this case, you just need to load your mask as a numpy array and store it in the mask attribute of your detection object.

    import glob
    import fiftyone as fo
    
    images_patt = "/path/to/images/*"
    
    # Ex: your custom label format
    annotations = {
        "/path/to/images/000001.jpg": [
            {"bbox": ..., "label": ..., "mask": ...},
            ...
        ],
        ...
    }
    
    # Create samples for your data
    samples = []
    for filepath in glob.glob(images_patt):
        sample = fo.Sample(filepath=filepath)
    
        # Convert detections to FiftyOne format
        detections = []
        for obj in annotations[filepath]:
            label = obj["label"]
    
            # Bounding box coordinates should be relative values
            # in [0, 1] in the following format:
            # [top-left-x, top-left-y, width, height]
            bounding_box = obj["bbox"]
    
            # Boolean or 0/1 Numpy array 
            mask = obj["mask"]
            
            detection = fo.Detection(
                label=label,
                bounding_box=bounding_box,
                mask=mask,
            )
    
            detections.append(detection)
    
        # Store detections in a field name of your choice
        sample["ground_truth"] = fo.Detections(detections=detections)
    
        samples.append(sample)
    
    # Create dataset
    dataset = fo.Dataset("my-detection-dataset")
    dataset.add_samples(samples)
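
One detail worth checking when filling in the mask above: FiftyOne interprets the mask of a Detection as covering only the detection's bounding box, not the whole image. If your masks are stored at full image resolution, a minimal sketch of the cropping step could look like the following. The helper name crop_mask_to_box and the toy 10x10 image are hypothetical and only for illustration; the rounding and clamping are one reasonable choice, not something prescribed by the library.

```python
import numpy as np


def crop_mask_to_box(full_mask, bounding_box):
    """Crop a full-image mask to a relative [x, y, width, height] box.

    `full_mask` is a 2D array with the same height and width as the image;
    `bounding_box` uses relative coordinates in [0, 1].
    (Hypothetical helper, not part of FiftyOne.)
    """
    full_mask = np.asarray(full_mask).astype(bool)
    img_h, img_w = full_mask.shape
    x, y, w, h = bounding_box

    # Convert the relative box to pixel indices, clamped to the image
    x1 = max(int(round(x * img_w)), 0)
    y1 = max(int(round(y * img_h)), 0)
    x2 = min(int(round((x + w) * img_w)), img_w)
    y2 = min(int(round((y + h) * img_h)), img_h)

    return full_mask[y1:y2, x1:x2]


# Toy 10x10 image with a 2x3 object whose top-left pixel is (x=4, y=2)
full_mask = np.zeros((10, 10), dtype=np.uint8)
full_mask[2:4, 4:7] = 1

# Boolean array with the same shape as the box in pixels
mask = crop_mask_to_box(full_mask, [0.4, 0.2, 0.3, 0.2])
```

The resulting array can then be passed as the mask argument of fo.Detection together with the same bounding box.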
    

    Predictions can either be loaded at the same time as the ground truth, or you can iterate over your dataset later and add them:

    import fiftyone as fo
    
    # Ex: your custom predictions format
    predictions = {
        "/path/to/images/000001.jpg": [
            {"bbox": ..., "label": ..., "mask": ..., "score": ...},
            ...
        ],
        ...
    }
    
    # Add predictions to your samples
    for sample in dataset:
        filepath = sample.filepath
    
        # Convert predictions to FiftyOne format
        detections = []
        for obj in predictions[filepath]:
            label = obj["label"]
            confidence = obj["score"]
    
            # Bounding box coordinates should be relative values
            # in [0, 1] in the following format:
            # [top-left-x, top-left-y, width, height]
            bounding_box = obj["bbox"]
            
            # Boolean or 0/1 Numpy array 
            mask = obj["mask"]
    
            detection = fo.Detection(
                label=label,
                bounding_box=bounding_box,
                confidence=confidence,
            )
    
            detection["mask"] = mask
    
            detections.append(detection)
    
        # Store detections in a field name of your choice
        sample["predictions"] = fo.Detections(detections=detections)
    
        sample.save()
    

    Note that exactly how you parse your custom label format depends on how you store the data. This is just one example, which stores labels in a dictionary keyed by media filepaths. You may first need to parse and convert your labels, for example into the expected bounding box format.
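
As an illustration of that last conversion step, here is a minimal sketch that assumes your labels store boxes as absolute pixel corners (x1, y1, x2, y2). That input format and the helper name to_relative_xywh are assumptions for the example, not something taken from your data or from FiftyOne.

```python
def to_relative_xywh(x1, y1, x2, y2, img_width, img_height):
    """Convert absolute pixel corners to relative [x, y, width, height].

    The output follows the format described in the comments above:
    [top-left-x, top-left-y, width, height], all in [0, 1].
    (Hypothetical helper, not part of FiftyOne.)
    """
    return [
        x1 / img_width,
        y1 / img_height,
        (x2 - x1) / img_width,
        (y2 - y1) / img_height,
    ]


# A 320x240 pixel box inside a 640x480 image
box = to_relative_xywh(160, 120, 480, 360, img_width=640, img_height=480)
```

If your format already stores the top-left corner plus width and height in pixels, only the division by the image size is needed.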