assembly inline-assembly arm64 neon

ARM Assembly Vector addition


I have to implement vector addition in a C++ program using inline ARM assembly.

I've written this code:

#include <iostream>
#include <stdio.h>
#include <arm_neon.h>

using namespace std;

int main(){
float v1[4] = {1.0f, 2.1f, -3.1f, 2.5f};
float v2[4] = {2.0f, 1.0f, 1.1f, -2.5f};
float result[4] = { };

asm(
"ldr q31, [%[vec1]]\n"
"ldr q30, [%[vec2]]\n"
"FADD v31.4S, v31.4S, v30.4S\n"
"str q31, [%[r]]\n"
:[r]"=r"(result): [vec1]"r"(&v1), [vec2]"r"(&v2)
);

for (float i: result) cout << " " << i;
cout << "\n";
}

but the result is something like: -3.33452e+38 9.18341e-41 -2.23081e+25 9.18341e-41

I am really new to assembly. Where are the problems in my code, and how do I fix them? Thank you.


Solution

  • Let me prefix this with a big caveat: getting GCC inline asm right is hard, especially for a beginner. The default advice is don't use it. There are some more general resources at https://stackoverflow.com/tags/inline-assembly/info, but if at all possible, I would start by writing standalone assembly functions (in their own .s file). If you start with inline assembly, you basically put yourself in the position of having to learn assembly language simultaneously with advanced (and poorly documented) compiler design.

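    That said, your code is pretty good for a beginner, as it only has three bugs in its six lines of code. The bugs are the following:

    First, result is declared as an output operand with the "=r" constraint. An "=r" output means the asm writes a value into a register chosen by the compiler; that register does not hold the address of result on entry, so the str stores to an unpredictable location. The asm only reads the address of result, so that address has to be passed as an input operand.

    Second, the asm modifies q30 and q31 without telling the compiler. Any register the asm overwrites that is not an operand must be listed as a clobber, or the compiler is free to assume its contents survive.

    Third, the asm reads v1 and v2 and writes result through pointers, which the compiler cannot see from the operand list. A "memory" clobber tells it that memory is read and written, so the arrays are actually in memory before the asm runs and result is reloaded afterwards.

    So a fixed version would look like: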

    asm("ldr q31, [%[vec1]]\n"
        "ldr q30, [%[vec2]]\n"
        "FADD v31.4S, v31.4S, v30.4S\n"
        "str q31, [%[r]]\n"
        : // no outputs
        : [r]"r"(result), [vec1]"r"(&v1), [vec2]"r"(&v2)
        : "q30", "q31", "memory");