The code looks like this:
import numpy as np
import onnxruntime

onnx_input = np.random.normal(size=[1, 3, 224, 224]).astype(np.float32)
ort_sess = onnxruntime.InferenceSession('model.onnx')
ort_inputs = {ort_sess.get_inputs()[0].name: onnx_input}
ort_outs = ort_sess.run(None, ort_inputs)
I can get the network output from ort_outs, but how can I get the inference time of each layer of the model?
I can get the model graph info by
import onnx
model = onnx.load("model.onnx")
print(onnx.helper.printable_graph(model.graph))
or get the total inference time by
import time
start = time.perf_counter()  # perf_counter is a monotonic clock, better for timing than time.time
ort_outs = ort_sess.run(None, ort_inputs)
end = time.perf_counter()
print(end - start)
but I don't know how to get the inference time per layer of the neural network. Thanks!
Please see https://onnxruntime.ai/docs/performance/tune-performance/profiling-tools.html for details on enabling profiling of individual nodes.
Note that the overall inference time will not be meaningful while per-node profiling is enabled, because writing out the profiling data adds overhead to the run.