I've been learning about PoseNet in order to use it in my health-related research work.
I was impressed by how MobileNet keeps accuracy high while reducing the CPU (or GPU/NPU) load by adjusting only a few parameters, and that is where my questions came from.
I noticed that the official MobileNet papers introduce two multipliers: alpha and rho. I'll skip the explanation of both parameters.
I'd like to know the values of alpha and rho in the MobileNet used by the newest PoseNet model. I'm also wondering whether there is a guideline for tuning parameters (especially alpha and rho), and how both values are set and validated before training the model.
For example, if the selected value of alpha is 0.5, I'd like to know why that value is better than 0.75 or 0.25.
My questions are:
The model at https://www.tensorflow.org/lite/models/pose_estimation/overview uses alpha=1.0. Alpha multiplies the number of input/output channels of each convolution, and with alpha=1.0 the first convolution layer has 32 channels. There are also PoseNets with other backbones, which you can easily try from the TF.js example: https://github.com/tensorflow/tfjs-models/tree/master/posenet
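To make the effect of alpha concrete, here is a minimal Python sketch of how channel counts shrink as alpha decreases. The helper name `scaled_channels` and the `min_channels=8` floor are my own assumptions for illustration (the floor mirrors a common implementation guard, not something stated above), and the base list only covers the first few MobileNet v1 layers.

```python
# Output channels of the first few MobileNet v1 layers at alpha=1.0.
BASE_CHANNELS = [32, 64, 128, 128, 256, 256, 512]


def scaled_channels(alpha, base=BASE_CHANNELS, min_channels=8):
    """Scale each layer's channel count by alpha (hypothetical helper).

    min_channels is an assumed lower bound so very small alphas
    never produce a layer with too few channels.
    """
    return [max(int(c * alpha), min_channels) for c in base]


if __name__ == "__main__":
    for alpha in (1.0, 0.75, 0.5, 0.25):
        print(alpha, scaled_channels(alpha))
```

With alpha=1.0 the first entry stays at 32, matching the first convolution layer mentioned above; smaller alphas give a proportionally thinner network.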
The rho value is somewhat more theoretical; the original paper says:
In practice we implicitly set ρ by setting the input resolution.
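As a rough illustration of why both values matter, the sketch below evaluates the cost expression for one depthwise separable layer from the MobileNet v1 paper, D_K·D_K·αM·ρD_F·ρD_F + αM·αN·ρD_F·ρD_F. The function names and the example layer sizes are my own, and the 224 base resolution is an assumption taken from the paper's ImageNet setup; PoseNet models may use other input sizes.

```python
def separable_conv_cost(dk, m, n, df, alpha=1.0, rho=1.0):
    """Mult-adds of one depthwise separable layer.

    dk: kernel size, m: input channels, n: output channels,
    df: feature map size at alpha=1.0 and rho=1.0.
    """
    m_s, n_s, df_s = alpha * m, alpha * n, rho * df
    depthwise = dk * dk * m_s * df_s * df_s
    pointwise = m_s * n_s * df_s * df_s
    return depthwise + pointwise


def rho_from_resolution(input_res, base_res=224):
    """Implied rho for a given input resolution (base_res is an assumption)."""
    return input_res / base_res


if __name__ == "__main__":
    # Hypothetical layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map.
    base = separable_conv_cost(3, 512, 512, 14)
    for alpha in (1.0, 0.75, 0.5, 0.25):
        for res in (224, 192, 160, 128):
            rho = rho_from_resolution(res)
            cost = separable_conv_cost(3, 512, 512, 14, alpha, rho)
            print(f"alpha={alpha} res={res} relative cost={cost / base:.3f}")
```

The cost falls roughly with alpha squared and exactly with rho squared, so picking a value such as 0.5 over 0.75 or 0.25 is a trade-off between accuracy and compute that is usually settled empirically for the target device.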