I have a custom-trained detectron2 model for instance segmentation that I load and use for inference as follows:
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file(config_path)
cfg.MODEL.WEIGHTS = weights_path
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7
cfg.MODEL.DEVICE = "cpu"
predictor = DefaultPredictor(cfg)
outputs = predictor(im)
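For reference, this is roughly how I inspect the baseline result (standard detectron2 Instances fields, nothing custom):

instances = outputs["instances"].to("cpu")
print(len(instances))               # 1 instance expected for my test image
print(instances.pred_boxes)         # Boxes object, one row per instance
print(instances.scores)             # confidence scores
print(instances.pred_masks.shape)   # (N, H, W) boolean masks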
I have followed this tutorial from OpenVINO in order to convert it to an OV model:
import openvino as ov
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.modeling import build_model

model = build_model(cfg)
DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS)
model.eval()
# convert_detectron2_model is the helper defined in the OpenVINO tutorial
ov_model = convert_detectron2_model(model, im)
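# (In between, the converted model is serialized to disk and read back below,
#  presumably via something like ov.save_model(ov_model, "../model/model.xml");
#  the exact path is my assumption.)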
core = ov.Core()
ov_model = core.read_model("../model/model.xml")
compiled_model = ov.compile_model(ov_model)
results = compiled_model(sample_input[0]["image"])
However, I don't obtain the expected result from the compiled OV model, i.e. I should normally obtain one instance in the outputs. This is the case when using the classic detectron2 model, but when using the compiled OV model, no instance is detected:
Instances(num_instances=0, image_height=3024, image_width=4032, fields=[pred_boxes: Boxes(tensor([], size=(0, 4))), scores: [], pred_classes: [], pred_masks: tensor([], size=(0, 3024, 4032), dtype=torch.bool)])
Here is the information about the OV model:
<Model: 'Model4'
inputs[
<ConstOutput: names[args] shape[?,?,?] type: u8>
]
outputs[
<ConstOutput: names[tensor, 2193, 2195, 2190, 2188, 2172, 2167] shape[..100,4] type: f32>,
<ConstOutput: names[] shape[..100] type: i64>,
<ConstOutput: names[] shape[?,1,28,28] type: f32>,
<ConstOutput: names[] shape[..100] type: f32>,
<ConstOutput: names[image_size] shape[2] type: i64>
]>
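(That listing comes from just printing the model; the same port information can also be read programmatically through the OpenVINO runtime API, e.g.:)

# Print each output port's names, partial shape and element type
for port in ov_model.outputs:
    print(port.get_names(), port.get_partial_shape(), port.get_element_type())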
UPDATE: As I really need to convert the model to another format in order to use model serving tools, I also tried to convert it to TorchScript. However, here again I get an empty output.
This time I followed the approach described in this issue:
import torch
from detectron2.export import TracingAdapter

def inference_func(model, image):
    inputs = [{"image": image}]
    return model.inference(inputs, do_postprocess=False)[0]

wrapper = TracingAdapter(model, sample_input[0]["image"], inference_func)
wrapper.eval()
traced_script_module = torch.jit.trace(wrapper, (sample_input[0]["image"],))
traced_script_module.save("torchscript.pt")
The conversion then completes after a LOT of warnings, and here is a snippet of the resulting model:
RecursiveScriptModule(
  original_name=TracingAdapter
  (model): RecursiveScriptModule(
    original_name=GeneralizedRCNN
    (backbone): RecursiveScriptModule(
      original_name=FPN
      (fpn_lateral2): RecursiveScriptModule(original_name=Conv2d)
      (fpn_output2): RecursiveScriptModule(original_name=Conv2d)
      (fpn_lateral3): RecursiveScriptModule(original_name=Conv2d)
      (fpn_output3): RecursiveScriptModule(original_name=Conv2d)
      (fpn_lateral4): RecursiveScriptModule(original_name=Conv2d)
      (fpn_output4): RecursiveScriptModule(original_name=Conv2d)
      (fpn_lateral5): RecursiveScriptModule(original_name=Conv2d)
      (fpn_output5): RecursiveScriptModule(original_name=Conv2d)
      (top_block): RecursiveScriptModule(original_name=LastLevelMaxPool)
      (bottom_up): RecursiveScriptModule(
        original_name=ResNet
        (stem): RecursiveScriptModule(
          original_name=BasicStem
          (conv1): RecursiveScriptModule(
            original_name=Conv2d
            (norm): RecursiveScriptModule(original_name=FrozenBatchNorm2d)
          )
  ...
I then run inference:
torchscript_model = torch.jit.load("torchscript.pt")
outputs = torchscript_model(sample_input[0]["image"])
And the resulting outputs are empty:
(tensor([], size=(0, 4), grad_fn=<ViewBackward0>),
tensor([], dtype=torch.int64),
tensor([], size=(0, 1, 28, 28), grad_fn=<SplitWithSizesBackward0>),
tensor([], grad_fn=<IndexBackward0>),
tensor([3024, 4032]))
I have also tried to export the model directly using the detectron2 functions documented here, but again the resulting model's inference has the same issue.
python ../detectron2-main/tools/deploy/export_model.py --config-file ../detectron2-main/configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --output torchscript_model --format torchscript --sample-image ../data/test.JPG --export-method tracing MODEL.DEVICE cpu MODEL.WEIGHTS ../model/custom_trained_model.pth
I'll take any ideas!
Follow the "Prepare input data" steps in the tutorial to apply the same preprocessing if you did not already apply it to your defined "im":
import torch
import detectron2.data.transforms as T
from detectron2.data import detection_utils

image_file = "example_image.jpg"

def get_sample_inputs(image_path, cfg):
    # get a sample data
    original_image = detection_utils.read_image(image_path, format=cfg.INPUT.FORMAT)
    # Do same preprocessing as DefaultPredictor
    aug = T.ResizeShortestEdge([cfg.INPUT.MIN_SIZE_TEST, cfg.INPUT.MIN_SIZE_TEST], cfg.INPUT.MAX_SIZE_TEST)
    height, width = original_image.shape[:2]
    image = aug.get_transform(original_image).apply_image(original_image)
    image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))
    inputs = {"image": image, "height": height, "width": width}
    # Sample ready
    sample_inputs = [inputs]
    return sample_inputs
sample_input = get_sample_inputs(image_file, cfg)
ov_model = convert_detectron2_model(model, sample_input)
core = ov.Core()
ov_model = core.read_model("../model/model.xml")
compiled_model = ov.compile_model(ov_model)
results = compiled_model(sample_input[0]["image"])
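If you then want to compare the raw OpenVINO outputs with the original detectron2 prediction, you can wrap them back into an Instances object and rescale to the original resolution the same way DefaultPredictor does. A minimal sketch, assuming the output order shown in your model info above (boxes, classes, masks, scores, image_size):

import torch
from detectron2.modeling.postprocessing import detector_postprocess
from detectron2.structures import Boxes, Instances

boxes, classes, masks, scores, image_size = [torch.as_tensor(results[i]) for i in range(5)]

# Rebuild a detectron2 Instances object from the raw OpenVINO tensors
instances = Instances(tuple(image_size.tolist()))
instances.pred_boxes = Boxes(boxes)
instances.pred_classes = classes
instances.pred_masks = masks          # (N, 1, 28, 28) ROI masks
instances.scores = scores

# Rescale boxes/masks to the original image size, as DefaultPredictor does
instances = detector_postprocess(instances, sample_input[0]["height"], sample_input[0]["width"])
print(instances)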