Do I need to split up a very large number to move it into a 32-bit register in ARM64?

I am trying to write a program in ARMv8 AArch64 (ARM64 if you prefer) and I want to move a very large number into a 32-bit register. From what I know, the correct format is

mov w20, 36383899

However, I saw somewhere that this is incorrect as "the instructions are 32 bits which means that there are some bits taken up by the instruction as well as the target register address". Basically, what they said was that after the bits taken up for the instruction and target register address, there may not be enough space for the number you are trying to move in.

They are saying that the correct form should be

mov w19, 555          //high 16 bits of 36383899 (0x022B)
lsl w19, w19, 16      //logical shift left with shift count 16
mov w20, 11419        //low 16 bits of 36383899 (0x2C9B)
orr w19, w19, w20     //w19 is now 36383899

Now, the program I am writing loops from -50,000,000 to 50,000,000, and I believe that since the values I am working with are within the range of -2,147,483,648 to 2,147,483,647 (which is the range of a signed 32-bit register), I should be fine.

However, since it is unfeasible to single step through 100,000,000 loops, I want to ask if I do need to break it up or am I fine just moving the values in as usual?

If it turns out that I need to break it up, is there a methodology to splitting the numbers?

Solution

It's true that not every 32-bit number can be moved into a register in a single instruction. As you say, that would be fundamentally impossible because instructions are 32 bits.

However, the exact criteria for which numbers work are complicated. Every 16-bit number works (either zero- or sign-extended), but some other ones do as well, such as a 16-bit value placed in the upper halfword, or certain repeating bit patterns (the "bitmask immediates" accepted by the logical instructions). Rather than try to work it out for yourself, you might as well just try to assemble mov w20, 36383899. If there is a way to encode it in one instruction, your assembler will figure it out; otherwise you will get an error and can proceed to the next step.
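
If you would like to check a value before handing it to the assembler, the rules can be sketched in a few lines of Python. This is only a rough model based on my reading of the three MOV aliases (MOVZ, MOVN, and ORR with a bitmask immediate) for 32-bit registers; the function names are made up for illustration, and the assembler remains the final authority.

```python
MASK32 = 0xFFFFFFFF  # everything is reduced to 32 bits, as in a W register


def fits_movz(v):
    """MOVZ: a 16-bit value in one halfword, zeros in the other."""
    return (v & 0xFFFF0000) == 0 or (v & 0x0000FFFF) == 0


def fits_movn(v):
    """MOVN: the bitwise inverse of something MOVZ could load."""
    return fits_movz(~v & MASK32)


def fits_bitmask(v):
    """Bitmask immediate: a 2/4/8/16/32-bit element, repeated to fill the
    register, where the element is a rotated contiguous run of ones.
    All-zeros and all-ones are not encodable this way."""
    if v in (0, MASK32):
        return False
    for size in (2, 4, 8, 16, 32):
        mask = (1 << size) - 1
        elem = v & mask
        # The element must repeat across the whole 32-bit value.
        if any(((v >> i) & mask) != elem for i in range(0, 32, size)):
            continue
        # Some rotation of the element must look like 0...01...1.
        for r in range(size):
            rot = ((elem >> r) | (elem << (size - r))) & mask
            if (rot & (rot + 1)) == 0:
                return True
    return False


def fits_one_mov(value):
    """True if this model thinks `mov wN, value` needs one instruction."""
    v = value & MASK32  # negative numbers become their two's complement
    return fits_movz(v) or fits_movn(v) or fits_bitmask(v)


if __name__ == "__main__":
    for n in (0xBEEF, 36383899, -50_000_000, 0x00FF00FF):
        print(n, fits_one_mov(n))
```

Under this model neither 36383899 nor -50,000,000 fits in a single instruction, so both would need the two-instruction form.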

If it can't be done in one instruction, then you will need more than one. However, the sequence proposed in your question is inefficient: it takes four instructions and a second register. The preferred way is to use the movk instruction, which loads an arbitrary 16-bit value into a specified halfword of the register, leaving the other bits unchanged. So to load 0xdeadbeef, you could do

mov  w20, 0xbeef
movk w20, 0xdead, lsl 16

In your case, rather than computing the high and low bits by hand, you could let your assembler do it:

mov  w20, (36383899 & 0xffff)
movk w20, (36383899 >> 16), lsl 16
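
As for a methodology for splitting by hand: mask off the low 16 bits for the mov, and shift right by 16 for the movk. The Python sketch below does that arithmetic and prints the resulting pair of instructions. The helper names and the default register are just illustrative choices, and it assumes the value is meant for a 32-bit W register (negative values are taken as two's complement).

```python
def split_halfwords(value):
    """Return (low, high) 16-bit halves of a 32-bit value."""
    v = value & 0xFFFFFFFF  # wrap negatives to 32-bit two's complement
    return v & 0xFFFF, v >> 16


def mov_movk(value, reg="w20"):
    """Build the mov/movk sequence as a list of assembly lines."""
    low, high = split_halfwords(value)
    lines = [f"mov  {reg}, {low:#x}"]
    if high:  # movk is only needed when the upper halfword is nonzero
        lines.append(f"movk {reg}, {high:#x}, lsl 16")
    return lines


if __name__ == "__main__":
    for n in (36383899, -50_000_000):
        print(f"// {n}")
        print("\n".join(mov_movk(n)))
```

For 36383899 the halves come out as 0x2c9b (11419) and 0x22b (555), and indeed 555 * 65536 + 11419 = 36383899.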