My code looks like this:
import numpy as np
import onnxruntime
onnx_input = np.random.normal(size=[1, 3, 224, 224]).astype(np.float32)
ort_sess = onnxruntime.InferenceSession('model.onnx')
ort_inputs = {ort_sess.get_inputs()[0].name: onnx_input}
ort_outs = ort_sess.run(None, ort_inputs)
I can get the network output from ort_outs, but how can I get the inference time of each layer of the model?
I can get the model graph info by
import onnx
model = onnx.load("model.onnx")
or get the total inference time by
import time
start = time.time()
ort_outs = ort_sess.run(None, ort_inputs)
end = time.time()
print(end - start)
but I don't know how to get the inference time per layer of the neural network. Thanks!
Please see for details on enabling profiling of individual nodes.
Note that the overall inference time is not meaningful while per-node profiling is enabled, because writing out the profiling data adds overhead to every run.
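For illustration, here is a minimal sketch of how per-node profiling could be switched on and the result summarized. It uses `SessionOptions.enable_profiling` and `InferenceSession.end_profiling()`, which returns the path of the JSON profile file. The helper names `profile_model` and `per_node_times` are made up for this example. The parsing also rests on an assumption: that the profile is a JSON list of events in which per-node entries have `"cat": "Node"`, a `"name"` ending in `_kernel_time`, and a `"dur"` in microseconds. That matches the files I have seen, but the layout may differ between ONNX Runtime versions, so check your own profile file first.

```python
import json
from collections import defaultdict


def profile_model(model_path, inputs):
    """Run one inference with profiling enabled and return the profile file path."""
    # Imported here so the parsing helper below can be used without onnxruntime.
    import onnxruntime

    opts = onnxruntime.SessionOptions()
    opts.enable_profiling = True  # record per-node timing events
    sess = onnxruntime.InferenceSession(model_path, sess_options=opts)
    sess.run(None, inputs)
    # end_profiling() stops profiling and returns the name of the JSON file.
    return sess.end_profiling()


def per_node_times(profile_path):
    """Sum kernel time in microseconds per node from a profile JSON file.

    Assumes the event layout described above (cat == "Node",
    name ending in "_kernel_time", dur in microseconds).
    """
    with open(profile_path) as f:
        events = json.load(f)

    suffix = "_kernel_time"
    totals = defaultdict(int)
    for ev in events:
        name = ev.get("name", "")
        if ev.get("cat") == "Node" and name.endswith(suffix):
            totals[name[: -len(suffix)]] += ev.get("dur", 0)
    return dict(totals)
```

With the variables from the question, usage would be something like `times = per_node_times(profile_model('model.onnx', ort_inputs))`, followed by sorting `times.items()` by value to find the slowest nodes. The first run often includes one-time setup cost, so it may be worth calling `sess.run` several times before ending profiling and averaging per node.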