Is there a simple way or library to implement linear algebra efficiently (at the maximum possible speed) on a dual-core ARM Cortex-A9 using Xilinx SDK?
I am using a Zybo Z7 development board with a dual-core ARM processor, and I want to implement a simple neural network, with one convolution layer followed by a dense one, in Xilinx SDK. Specifically, I want to transfer a Python NumPy-based model to the ARM. I have read some manuals for ARM and the SIMD library, but I don't want to dive that deep.
The easy way for me would be to use a library that does the multiplication, dot product, convolution, etc. by itself (and fast), like NumPy in Python, so that I can avoid writing plain for loops. An example would be nice!
Thanks for your time.
You can try the Eigen library, which TensorFlow uses to implement its matrix calculations, or you can even try TensorFlow Lite, which has already been tested on the ARM Cortex-M series of processors.