I am trying to use a TensorFlow model, trained in Python, with WinML. I successfully converted the protobuf to ONNX. The following performance results were obtained:
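For reference, the conversion was done with tf2onnx; a minimal sketch of the Python API route is below. The model path `model.pb`, the tensor names `input:0`/`output:0`, and the opset are placeholders, not the values of my actual graph:

```python
# Sketch of a frozen-GraphDef -> ONNX conversion with tf2onnx.
# "model.pb", "input:0", "output:0", and opset 11 are assumptions;
# adjust them to match the real graph.
import tensorflow as tf
import tf2onnx

graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile("model.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

# Convert the frozen graph to an ONNX ModelProto.
model_proto, _ = tf2onnx.convert.from_graph_def(
    graph_def,
    input_names=["input:0"],
    output_names=["output:0"],
    opset=11,
)
with open("model.onnx", "wb") as f:
    f.write(model_proto.SerializeToString())
```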
Inference on the CPU takes around 86 seconds.
In the performance tools, WinML does not seem to use the GPU as effectively as other runtimes. WinML appears to use DirectML as its backend (we observe the DML prefix in the Nvidia GPU profiler). Is it possible to use a CUDA inference engine with WinML? Has anyone observed similar results, with WinML being abnormally slow on the GPU?
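One way to narrow down whether the slowdown sits in WinML itself or in DirectML is to time the same ONNX file with ONNX Runtime directly, comparing its CPU and DirectML execution providers. A rough sketch, assuming an onnxruntime-directml install and a made-up input shape:

```python
# Timing sketch: run the same model through ONNX Runtime's CPU and
# DirectML providers. DmlExecutionProvider requires the
# onnxruntime-directml package; the input shape below is an assumption.
import time
import numpy as np
import onnxruntime as ort

def time_provider(provider):
    sess = ort.InferenceSession("model.onnx", providers=[provider])
    inp = sess.get_inputs()[0]
    data = np.random.rand(1, 224, 224, 3).astype(np.float32)  # assumed shape
    sess.run(None, {inp.name: data})          # warm-up run (graph setup, uploads)
    start = time.perf_counter()
    sess.run(None, {inp.name: data})
    return time.perf_counter() - start

for provider in ["CPUExecutionProvider", "DmlExecutionProvider"]:
    print(provider, f"{time_provider(provider):.3f}s")
```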
I got some answers about this WinML performance. My network uses LeakyRelu, which DirectML only supports from Windows 10 version 2004 onward. On earlier Windows versions, this prevents the use of DirectML metacommands, hence the poor performance. With the new Windows version, I get good performance with WinML.
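To check whether a model is affected, you can list the ops in the exported graph with the onnx package; a small sketch (`model.onnx` is a placeholder path):

```python
# List the op types in the exported ONNX graph and flag LeakyRelu,
# the op that blocked DirectML metacommands on pre-2004 Windows builds.
import onnx
from collections import Counter

model = onnx.load("model.onnx")
op_counts = Counter(node.op_type for node in model.graph.node)
print(op_counts)
print("uses LeakyRelu:", "LeakyRelu" in op_counts)
```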