python profiling opencv zynq

Profiling C++ OpenCV functions being called from Python code


I am using OpenCV 4.0.0 for image processing through the Python bindings in the cv2 module. I have profiled my code with cProfile, which tells me that (obviously) the OpenCV functions I call directly take up the most time, but it cannot see any deeper because those functions call into C++ code in a compiled library. I would like to profile the OpenCV code itself to determine which functions account for the majority of the execution time.
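To illustrate the limitation, here is a minimal sketch of the kind of cProfile run described above. It does not use OpenCV: zlib.compress stands in for a cv2 call purely because it also runs in compiled code, and process_frame and profile_report are hypothetical names chosen for this example.

```python
import cProfile
import io
import pstats
import zlib


def process_frame(data: bytes) -> bytes:
    # Stand-in for a cv2 call (an assumption for illustration):
    # zlib.compress executes entirely inside a compiled C library.
    return zlib.compress(data, 9)


def profile_report(n_frames: int = 20) -> str:
    """Profile repeated calls to process_frame and return the text report."""
    data = bytes(range(256)) * 4096  # about 1 MiB of dummy "image" data
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(n_frames):
        process_frame(data)
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(10)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_report())
```

The report lists process_frame and a single built-in entry for zlib.compress, with nothing beneath it. A cv2 call shows up the same way, which is why a native profiler is needed to look inside the library.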

I have tried the built-in OpenCV profiling described here, but I get a warning:

[ WARN:0] Trace: Total skipped events: 2117

and no OpenCVTrace.txt is produced. I have tried the "yep" module on PyPI, which wraps google-perftools, but I get buggy behavior as described here, and the proposed fix does not work for me. I have also tried ltrace and latrace, but both appear to be broken. I am not sure what to try next, or whether this is even possible.

For some background, this code is for my senior design project at college. I am implementing facial detection/recognition with OpenCV running on the ARM processor of a Zynq-7000 SoC and then accelerating the bottlenecks using the FPGA fabric. That, of course, depends on being able to identify the hotspots by profiling.


Solution

  • I have had success with the perf tool, which shows which functions, including those inside the compiled OpenCV library, take the most time. On my Pynq board specifically, the executable is found in /usr/lib/linux-tools-4.15.0-20, which is not in PATH by default. I also used FlameGraph for excellent visualization of the call graph.
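A rough sketch of how the perf workflow above might be scripted from Python. Only the perf directory comes from the post. The script name detect_faces.py, the helper function names, and the output file names are hypothetical, and the FlameGraph step assumes the stackcollapse-perf.pl and flamegraph.pl scripts from the FlameGraph repository are on PATH. Readable function names in the output also depend on the OpenCV libraries having symbols available.

```python
import os
import shlex

# Directory reported in the post for the Pynq board; other systems will differ.
PERF_DIR = "/usr/lib/linux-tools-4.15.0-20"


def env_with_perf(perf_dir: str = PERF_DIR) -> dict:
    """Return a copy of the environment with perf_dir prepended to PATH."""
    env = dict(os.environ)
    env["PATH"] = perf_dir + os.pathsep + env.get("PATH", "")
    return env


def perf_record_cmd(script: str, data_file: str = "perf.data",
                    python: str = "python3") -> list:
    """Build a perf command that samples a Python script with call stacks (-g)."""
    return ["perf", "record", "-g", "-o", data_file, "--", python, script]


def perf_report_cmd(data_file: str = "perf.data") -> list:
    """Build a perf command that prints the per-function summary as text."""
    return ["perf", "report", "-i", data_file, "--stdio"]


def flamegraph_pipeline(data_file: str = "perf.data",
                        svg_file: str = "flame.svg") -> str:
    """Build the shell pipeline that turns perf samples into a flame graph."""
    return (
        f"perf script -i {shlex.quote(data_file)} "
        f"| stackcollapse-perf.pl | flamegraph.pl > {shlex.quote(svg_file)}"
    )


if __name__ == "__main__":
    # detect_faces.py is a hypothetical script name used for illustration.
    print(shlex.join(perf_record_cmd("detect_faces.py")))
    print(shlex.join(perf_report_cmd()))
    print(flamegraph_pipeline())
```

To actually run a step, pass the command list to subprocess.run together with env=env_with_perf(), so that the perf binary is found without changing the system-wide PATH.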