The code looks like this:
import numpy as np
import onnxruntime

onnx_input = np.random.normal(size=[1, 3, 224, 224]).astype(np.float32)
ort_sess = onnxruntime.InferenceSession('model.onnx')
ort_inputs = {ort_sess.get_inputs()[0].name: onnx_input}
ort_outs = ort_sess.run(None, ort_inputs)
I can get the network output from ort_outs, but how can I get the inference time of each layer of the model?
I can get the model graph info by
import onnx
model = onnx.load("model.onnx")
print(onnx.helper.printable_graph(model.graph))
or get the total inference time by
import time
start = time.perf_counter()  # perf_counter is a monotonic clock, better for timing than time.time
ort_outs = ort_sess.run(None, ort_inputs)
end = time.perf_counter()
print(end - start)
but I don't know how to get the inference time per layer of the neural network. Thanks!
Please see https://onnxruntime.ai/docs/performance/tune-performance/profiling-tools.html for details on enabling profiling of individual nodes.
Note that the overall inference time will not be meaningful while per-node profiling is enabled, because writing out the profiling data adds overhead to the run.