Let's say I have a 3-element array of 64-bit data:
src     DCQ     0x0200200AD00236DD
        DCQ     0x00003401AAC4D097
        DCQ     0x0001FC219AC931BE
Assuming I know the address of "src" (held in a register named srcAdr), I can load the low 32 bits of the element at a given index into a register named srcLo with:
    LDR srcLo, [srcAdr, index, LSL #3]
To get the high 32 bits of the same element, I know I can do:
    ADD srcAdrHi, srcAdr, #4
    LDR srcHi, [srcAdrHi, index, LSL #3]
The question is: is there a more elegant way to do this? Say, for example, in one instruction?
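Following up on my comment: I do not think you can avoid an extra instruction here if, for whatever reason, you must treat the data as a uint64_t array and access it by index. The scaled-register addressing mode [Rn, Rm, LSL #3] has no room for an additional immediate offset, so the +4 for the high word has to be applied in a separate instruction.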
For a 'C' function:
    int foo(unsigned long long *srcT, int index) {
        unsigned int temp = 0;
        temp = (unsigned int)(srcT[index]);
        temp += (unsigned int)(srcT[index] >> 32);
        return temp;
    }
The compiler (ARM gcc 8.2 -O3 -mcpu=arm7tdmi) produced:
    foo:
        add r3, r0, r1, lsl #3
        ldr r3, [r3, #4]
        ldr r0, [r0, r1, lsl #3]
        add r0, r0, r3
        bx lr
As you can see, it also produced an extra instruction (add) to access the 'high half'.
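The address arithmetic in that listing can be modelled in C. This is only a rough sketch: the helper names load_word, low_half and high_half are made up for illustration, and it assumes a little-endian target (which is what the generated code above implies, since the high word is read at offset +4).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read one 32-bit word at a byte offset from base.
 * memcpy is used so the read does not depend on pointer-cast aliasing. */
static uint32_t load_word(const void *base, size_t byte_offset)
{
    uint32_t w;
    memcpy(&w, (const unsigned char *)base + byte_offset, sizeof w);
    return w;
}

/* Low half: base + (index << 3). This matches the single
 * ldr r0, [r0, r1, lsl #3] in the listing. */
uint32_t low_half(const uint64_t *src, size_t index)
{
    return load_word(src, index << 3);
}

/* High half: (base + (index << 3)) + 4. The shifted index and the
 * constant 4 are two separate offsets, which is why the listing needs
 * an add before ldr r3, [r3, #4]. Assumes little-endian byte order. */
uint32_t high_half(const uint64_t *src, size_t index)
{
    return load_word(src, (index << 3) + 4);
}
```

With the three-element array from the question, low_half(src, 0) would give 0xD00236DD and high_half(src, 0) would give 0x0200200A on a little-endian machine.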
The exact sequence of instructions will, of course, depend on what operations are performed on the array. If you walk through it in a loop, you would most likely get ldm + add Rx,#8, etc.
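To illustrate the loop case, here is a minimal sketch of the same computation done over the whole array with a moving pointer instead of an index. The function name sum_halves is made up for this example, and whether a given compiler actually emits ldm plus a pointer increment for it depends on the compiler version and flags, so treat it as the access pattern only, not as a guaranteed code shape.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum the low and high 32-bit halves of every element, modulo 2^32.
 * The pointer advances by one 64-bit element (8 bytes) per iteration,
 * so no per-element index scaling is needed. */
uint32_t sum_halves(const uint64_t *src, size_t count)
{
    uint32_t total = 0;
    const uint64_t *end = src + count;

    for (const uint64_t *p = src; p != end; ++p) {
        uint64_t v = *p;               /* one 64-bit read per element */
        total += (uint32_t)v;          /* low half */
        total += (uint32_t)(v >> 32);  /* high half */
    }
    return total;
}
```

Because both halves come from a single element read, the +4 offset that costs an extra instruction in the indexed version no longer appears explicitly in the source.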