I'd like to use the FiftyOne application to evaluate instance segmentation predictions. I started from the downloadable quickstart dataset and created my own samples.json file for the ground truth and prediction data, which can be loaded with Dataset.from_dir. The JSON file in the tutorial contains only bounding boxes, but no masks. In the documentation I only found masks that are set to None, but I need actual masks.
{
    "samples": [
        {
            "filepath": "some_image.png",
            "tags": [
                "validation"
            ],
            "metadata": null,
            "uniqueness": 0.998,
            "ground_truth": {
                "_cls": "Detections",
                "detections": [
                    {
                        "_id": {
                            "$oid": "efd81c917c9a49dc9f8aca3f"
                        },
                        "_cls": "Detection",
                        "label": "1",
                        "bounding_box": [
                            0.62609375,
                            0.27044921875,
                            0.030078124999999956,
                            0.009550781250000001
                        ],
                        "mask": ???,
                    }
                ]
            },
            "predictions": {
                "_cls": "Detections",
                "detections": [
                    {
                        "_id": {
                            "$oid": "6db44c71c1414b8989c92255"
                        },
                        "_cls": "Detection",
                        "label": "1",
                        "bounding_box": [
                            0.3303889036178589,
                            0.4432219862937927,
                            0.07914596796035767,
                            0.02226179838180542
                        ],
                        "mask": ???,
                        "confidence": 0.999962329864502
                    },
                ]
            }
        }
    ]
}
My problem is how to create the mask property for the segmentation mask. It is a numpy array and is not JSON-serializable by default. Dataset.from_dir uses the core.utils.deserialize_numpy_array function when loading the dataset, so I tried to use serialize_numpy_array when saving the dataset, but without success.
What would be the best way to write the mask into the JSON file so that it can be deserialized? Thanks!
The Dataset.from_dir() syntax is generally used for well-defined dataset types, such as a dataset stored on disk in the COCODetectionDataset format.
In your case, where you are loading a custom dataset that does not directly correspond to one of these dataset types, the recommended approach is to write a simple Python loop and construct your dataset one sample at a time.
You just need to load your mask as a numpy array and store it in the mask attribute of your detection object.
import glob

import fiftyone as fo

images_patt = "/path/to/images/*"

# Ex: your custom label format
annotations = {
    "/path/to/images/000001.jpg": [
        {"bbox": ..., "label": ..., "mask": ...},
        ...
    ],
    ...
}

# Create samples for your data
samples = []
for filepath in glob.glob(images_patt):
    sample = fo.Sample(filepath=filepath)

    # Convert detections to FiftyOne format
    detections = []
    for obj in annotations[filepath]:
        label = obj["label"]

        # Bounding box coordinates should be relative values
        # in [0, 1] in the following format:
        # [top-left-x, top-left-y, width, height]
        bounding_box = obj["bbox"]

        # Boolean or 0/1 Numpy array
        mask = obj["mask"]

        detection = fo.Detection(
            label=label,
            bounding_box=bounding_box,
            mask=mask,
        )
        detections.append(detection)

    # Store detections in a field name of your choice
    sample["ground_truth"] = fo.Detections(detections=detections)
    samples.append(sample)

# Create dataset
dataset = fo.Dataset("my-detection-dataset")
dataset.add_samples(samples)
Predictions can either be loaded at the same time as the ground truth, or you can iterate over your dataset later and add them then:
import fiftyone as fo

# Ex: your custom predictions format
predictions = {
    "/path/to/images/000001.jpg": [
        {"bbox": ..., "label": ..., "mask": ..., "score": ...},
        ...
    ],
    ...
}

# Add predictions to your samples
for sample in dataset:
    filepath = sample.filepath

    # Convert predictions to FiftyOne format
    detections = []
    for obj in predictions[filepath]:
        label = obj["label"]
        confidence = obj["score"]

        # Bounding box coordinates should be relative values
        # in [0, 1] in the following format:
        # [top-left-x, top-left-y, width, height]
        bounding_box = obj["bbox"]

        # Boolean or 0/1 Numpy array
        mask = obj["mask"]

        detection = fo.Detection(
            label=label,
            bounding_box=bounding_box,
            confidence=confidence,
        )
        detection["mask"] = mask
        detections.append(detection)

    # Store detections in a field name of your choice
    sample["predictions"] = fo.Detections(detections=detections)
    sample.save()
Note that exactly how you parse your custom label format depends on how you are storing the data. This is just one example that stores labels in a dictionary keyed by media filepaths. You may first need to parse and convert your labels, for example into the expected bounding box format.
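As an illustration of that conversion step, here is a minimal sketch that assumes your masks are stored as full-image binary arrays, one per object. As far as I know, FiftyOne expects the mask of a Detection to cover only the region inside its bounding box rather than the whole image, so the sketch crops the mask to its tight box and computes the relative [top-left-x, top-left-y, width, height] coordinates at the same time. The helper name full_mask_to_detection_fields is hypothetical and not part of FiftyOne; it only uses numpy.

```python
import numpy as np


def full_mask_to_detection_fields(full_mask):
    """Hypothetical helper (not a FiftyOne API).

    Takes a full-image binary mask of shape (height, width) and returns
    a relative bounding box plus the mask cropped to that box.
    """
    full_mask = np.asarray(full_mask).astype(bool)
    img_h, img_w = full_mask.shape

    # Row and column indices of all foreground pixels
    ys, xs = np.nonzero(full_mask)
    if ys.size == 0:
        raise ValueError("mask contains no foreground pixels")

    # Tight pixel bounds: inclusive minimum, exclusive maximum
    x0, x1 = int(xs.min()), int(xs.max()) + 1
    y0, y1 = int(ys.min()), int(ys.max()) + 1

    # Relative [top-left-x, top-left-y, width, height] in [0, 1]
    bounding_box = [
        x0 / img_w,
        y0 / img_h,
        (x1 - x0) / img_w,
        (y1 - y0) / img_h,
    ]

    # Mask restricted to the bounding box region
    cropped_mask = full_mask[y0:y1, x0:x1]
    return bounding_box, cropped_mask


# Example: a 100x200 image with one rectangular object
full_mask = np.zeros((100, 200), dtype=np.uint8)
full_mask[20:40, 50:90] = 1

bounding_box, mask = full_mask_to_detection_fields(full_mask)
```

The resulting bounding_box and mask could then be stored under the "bbox" and "mask" keys of the label dictionaries used in the loops above. If your model already outputs box-cropped masks, this step is unnecessary.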