stack, compiler-construction, virtual-machine, endianness

Register based VM - byte order - push/load


I need to write a 32-bit simulator for a processor (or a register-based VM). For simplicity, the RAM is a byte array, which I cast to (int32_t*), (int16_t*) and (int8_t*) when I need to load words, half-words or bytes.

(I know that this forces me to align my variables according to their type.)

Now I want to push parameters onto the stack when I call a function, and load them within the function when needed.

The source would look something like:

int8_t x = 1;
int8_t y = 2;
foo(x, y);

And the assembler shall look something like:

    PUSH R1
    PUSH R2
    CALL _FUNC_FOO

_FUNC_FOO:
    ; prologue
    PUSH LR
    PUSH FP
    MOV FP SP

    ; loading parameters
    LOAD_B R1 FP #8
    LOAD_B R2 FP #12

    ; code
    ; ...

    ; epilogue
    MOV SP FP
    POP FP
    POP LR
    RETURN

My problem is that this code works fine for 32 bit words, and even for 16 and 8 bit words, but only on machines with little endian architecture.

The 8/16 bit value will be pushed in RAM like 01-00-00-00 and 02-00-00-00. The LOAD instruction will load the first byte(s) of this 32 bit word, which happens to contain the correct value.

If I ported the VM to a machine using big endian architecture, the LOAD instruction would load the wrong byte(s), because the first byte(s) of the 32 bit word would then be zero:

00-00-00-01
~~~~~

How should I solve this issue?

I would strongly prefer a solution that is independent of the processor architecture.

Thanks a lot!


Solution

  • "this code works fine for 32 bit words on machines with little endian architecture".

    I suspect it's more that it works on machines where the native endianness matches the simulated endianness. That's the root of your problem. You're using a native 32-bit load instead of a simulated 32-bit load. That means you're simulating neither endianness nor alignment restrictions.

    Note that alignment can go wrong in two ways: either your simulation has stricter alignment (you fail to simulate alignment exceptions) or the native environment has stricter alignment (your simulation breaks on valid load/stores).

    The solution is simple: you need to correctly simulate loads and stores. How to do this will depend on your programming language.
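As a rough illustration in C, simulated loads and stores can be built from single-byte accesses and shifts, so the result no longer depends on the host's byte order or alignment rules. This is only a sketch under stated assumptions: the simulated machine is taken to be little-endian (matching the 01-00-00-00 layout in the question), and `RAM_SIZE`, `ram`, and every `vm_*` function name are hypothetical, not part of the original simulator.

```c
#include <assert.h>
#include <stdint.h>

#define RAM_SIZE 1024u            /* hypothetical size of the simulated RAM */

static uint8_t ram[RAM_SIZE];     /* simulated memory, only ever touched bytewise */

/* Nonzero if addr is a multiple of size (size must be 1, 2 or 4).
   A simulator could raise its own alignment fault when this fails. */
static int vm_is_aligned(uint32_t addr, uint32_t size)
{
    return (addr & (size - 1u)) == 0u;
}

static uint8_t vm_load8(uint32_t addr)
{
    assert(addr < RAM_SIZE);
    return ram[addr];
}

/* Assumed simulated byte order: little-endian, lowest address = low byte. */
static uint16_t vm_load16(uint32_t addr)
{
    assert(addr < RAM_SIZE - 1u);
    return (uint16_t)(ram[addr] | (ram[addr + 1u] << 8));
}

static uint32_t vm_load32(uint32_t addr)
{
    assert(addr < RAM_SIZE - 3u);
    return (uint32_t)ram[addr]
         | ((uint32_t)ram[addr + 1u] << 8)
         | ((uint32_t)ram[addr + 2u] << 16)
         | ((uint32_t)ram[addr + 3u] << 24);
}

static void vm_store8(uint32_t addr, uint8_t value)
{
    assert(addr < RAM_SIZE);
    ram[addr] = value;
}

static void vm_store16(uint32_t addr, uint16_t value)
{
    assert(addr < RAM_SIZE - 1u);
    ram[addr]      = (uint8_t)(value & 0xFFu);
    ram[addr + 1u] = (uint8_t)(value >> 8);
}

static void vm_store32(uint32_t addr, uint32_t value)
{
    assert(addr < RAM_SIZE - 3u);
    ram[addr]      = (uint8_t)(value & 0xFFu);
    ram[addr + 1u] = (uint8_t)((value >> 8) & 0xFFu);
    ram[addr + 2u] = (uint8_t)((value >> 16) & 0xFFu);
    ram[addr + 3u] = (uint8_t)((value >> 24) & 0xFFu);
}
```

With helpers like these, a PUSH of a register holding 1 would go through `vm_store32`, and a later LOAD_B at the same address through `vm_load8`, which reads the low byte on any host. Whether the simulator should also trap on unaligned addresses is a design decision for the simulated architecture; `vm_is_aligned` only shows where such a check could hook in.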