I've seen some benchmarks comparing TensorFlow and PyTorch. TensorFlow is supposedly faster, but in those benchmarks it is not much faster and is sometimes even slower.
Is there any benchmark that specifically tests static graphs against dynamic graphs and demonstrates that static graphs are much faster than dynamic graphs?
To be more precise, the speed benefit comes from "deferred execution with graph rewriting."
It's typically associated with explicit-graph frameworks (Theano/TF), but with enough engineering you could add it to execution models like numpy/PyTorch, which don't have an explicit graph. See Bohrium for an example of hacking numpy to do rewriting.
Note that the presence of this feature makes the framework less friendly for prototyping, so if you added it to PyTorch, you'd get the same problems people complain about in TensorFlow.
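To make the distinction concrete, here is a minimal sketch (my own illustration, not from any particular framework's docs) of the two execution models in TensorFlow 2: an eager call runs each op immediately, while `tf.function` first traces the Python function into a graph, which the runtime can then rewrite (constant folding, op fusion, etc.) before anything executes.

```python
# Minimal sketch of eager vs. deferred (graph) execution in TF 2; illustrative only.
import tensorflow as tf

def f(a, b):
    return tf.math.tanh(a * b + 1.0)

# Eager: every op runs as soon as it is called, one kernel launch at a time.
print(f(tf.constant(2.0), tf.constant(3.0)))

# Deferred: tracing builds the whole graph first; the runtime sees all the ops
# up front and can rewrite them before executing anything.
graph_f = tf.function(f)
concrete = graph_f.get_concrete_function(
    tf.TensorSpec([], tf.float32), tf.TensorSpec([], tf.float32))
print([op.name for op in concrete.graph.get_operations()])
```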
As for performance, here's a toy benchmark in TensorFlow showing a 5x speed-up when you turn on graph rewriting.
I crafted the example to be bottlenecked by memory bandwidth, so it's a no-brainer that graph rewriting (cwise fusion, i.e. fusing elementwise ops into one kernel) gives a significant speed boost there. For a production LSTM model, Google reported a 1.8x speed-up when turning on graph optimizations (through XLA).
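For reference, here is a sketch of that kind of benchmark (my own reconstruction under assumed sizes and iteration counts, not the original code): a chain of cheap elementwise ops is memory-bandwidth bound, so fusing them into one kernel with XLA via `tf.function(jit_compile=True)` avoids materializing every intermediate tensor. The exact speed-up you measure will depend on hardware and tensor size.

```python
# Sketch of a memory-bandwidth-bound cwise benchmark: eager vs. graph + XLA fusion.
import time
import tensorflow as tf

x = tf.random.normal([8_000_000])

def cwise_chain(t):
    # Several cheap elementwise ops; unfused, each one writes a full-size intermediate.
    for _ in range(20):
        t = tf.math.sigmoid(t) * 1.01 + 0.01
    return t

eager_fn = cwise_chain
fused_fn = tf.function(cwise_chain, jit_compile=True)  # graph mode + XLA cwise fusion

def bench(fn, label, iters=50):
    fn(x)  # warm-up (includes tracing/compilation for the fused version)
    start = time.perf_counter()
    for _ in range(iters):
        out = fn(x)
    _ = out.numpy()  # force execution to finish before stopping the clock
    print(f"{label}: {(time.perf_counter() - start) / iters * 1e3:.2f} ms/iter")

bench(eager_fn, "eager (no fusion)")
bench(fused_fn, "tf.function + XLA (cwise fusion)")
```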