computer-visionpytorchweb-deploymentobject-detectionfaster-rcnn

How to prune a Detectron2 model?


I'm a teacher who is studying computer vision for months. I was very excited when I was able to train my first object detection model using Detectron2's Faster R-CNN model. And it works like a charm! Super cool!

But the problem is that, in order to increase the accuracy, I used the largest model in the model zoo.

Now I want to deploy this as something people can use to ease their job. But, the model is so large that it takes ~10 seconds to infer a single image on my CPU which is Intel i7-8750h.

Therefore, it's really difficult to deploy this model even on a regular cloud server. I need to use either GPU servers or latest model CPU servers which are really expensive and I'm not sure if I can even compensate for server expenses for months.

I need to make it smaller and faster for deployment.

So, yesterday I found that there's something like pruning the model!! I was very excited (since I'm not a computer or data scientists, don't blame me (((: )

I read official pruning documentation of PyTorch, but it's really difficult for me to understand.

I found global pruning is of the easiest one to do. enter image description here

But the problem is, I have no idea what parameters should I write to prune.

Like I said, I used Faster R-CNN X-101 model. I have it as "model_final.pth". And it uses Base RCNN FPN.yaml and its meta architecture is "GeneralizedRCNN".

It seems like an easy configuration to do. But like I said, since it's not my field it's very hard for a person like me.

I'd be more than happy if you could help me on this step by step.

I'm leaving my cfg.yaml which I used training the model and I saved it using "dump" method in Detectron2 config class just in case. Here's the Drive link.

Thank you very much in advance.


Solution

  • So I guess, you are trying to optimize inference time and achieving satisfactory accuracy. Without knowing details about your object types, training size, image size, it will be hard to provide suggestions. However, as you know, ML project development is an iterative process, you can have a look at the following page and check inference and accuracy.

    https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md#coco-object-detection-baselines

    I would suggest, you try R50-FPN backbone and see how your accuracy comes. Then, you will get a better understanding of what to do next.