c, x86-64, interrupt, cortex-m, thumb

Can an x86_64 and/or armv7-m mov instruction be interrupted mid-operation?


I am wondering whether I need to use an atomic type, volatile, or nothing special for an interrupt counter:

uint32_t uptime = 0;

// interrupt each 1 ms
ISR()
{
    // this is the only location which writes to uptime
    ++uptime;
}

void some_func()
{
    uint32_t now = uptime;
}

I would think that volatile should be enough to guarantee error-free operation and consistency (an incrementing value until overflow).

But it occurred to me that a mov instruction might be interrupted mid-operation while moving/setting individual bits. Is that possible on x86_64 and/or armv7-m?

For example, the mov instruction would begin to execute and set 16 bits, then be pre-empted; the ISR would run, increasing uptime by one (and maybe changing all bits); and then the mov instruction would continue. I cannot find any material that assures me this cannot happen.

Would this also be the same on armv7-m?

Would using sig_atomic_t be the correct solution to always have an error-free and consistent result or would it be "overkill"?

For example, the ARMv7-M architecture specifies:

In ARMv7-M, the single-copy atomic processor accesses are:
• All byte accesses.
• All halfword accesses to halfword-aligned locations.
• All word accesses to word-aligned locations.

Would an assert with &uptime % 8 == 0 be sufficient to guarantee this?


Solution

  • You have to read the documentation for each separate core and/or chip. x86 is a completely separate thing from ARM, and within both families any one implementation may differ from any other; each can be, and you should expect it to be, a completely new design. That is not always the case, but from time to time it is.

    Things to watch out for, as noted in the comments:

    typedef unsigned int uint32_t;
    
    uint32_t uptime = 0;
    
    void ISR ( void )
    {
        ++uptime;
    }
    void some_func ( void )
    {
        uint32_t now = uptime;
    }
    

    On my machine with the tool I am using today:

    Disassembly of section .text:
    
    00000000 <ISR>:
       0:   e59f200c    ldr r2, [pc, #12]   ; 14 <ISR+0x14>
       4:   e5923000    ldr r3, [r2]
       8:   e2833001    add r3, r3, #1
       c:   e5823000    str r3, [r2]
      10:   e12fff1e    bx  lr
      14:   00000000    andeq   r0, r0, r0
    
    00000018 <some_func>:
      18:   e12fff1e    bx  lr
    
    Disassembly of section .bss:
    
    00000000 <uptime>:
       0:   00000000    andeq   r0, r0, r0
    

    This could vary, but if you find a tool on one machine one day that builds a problem, then you can assume it is a problem. So far we are actually okay. Because the value read in some_func is never used, the read is dead code and is optimized out.

    typedef unsigned int uint32_t;
    
    uint32_t uptime = 0;
    
    void ISR ( void )
    {
        ++uptime;
    }
    uint32_t some_func ( void )
    {
        uint32_t now = uptime;
        return(now);
    }
    

    Fixed: the value is now returned, so the read is kept:

    00000000 <ISR>:
       0:   e59f200c    ldr r2, [pc, #12]   ; 14 <ISR+0x14>
       4:   e5923000    ldr r3, [r2]
       8:   e2833001    add r3, r3, #1
       c:   e5823000    str r3, [r2]
      10:   e12fff1e    bx  lr
      14:   00000000    andeq   r0, r0, r0
    
    00000018 <some_func>:
      18:   e59f3004    ldr r3, [pc, #4]    ; 24 <some_func+0xc>
      1c:   e5930000    ldr r0, [r3]
      20:   e12fff1e    bx  lr
      24:   00000000    andeq   r0, r0, r0
    

    Because cores like MIPS and ARM tend to raise data aborts by default on unaligned accesses, we might assume the tool will not generate an unaligned address for such a clean definition. Packed structs are another story: there you told the compiler to generate an unaligned access, and it will. If you want to feel safe, remember that a "word" in ARM is 32 bits, so you can assert that the address of the variable ANDed with 3 is zero.

    On x86, one would also assume that a clean definition like that results in an aligned variable, but x86 does not have the data fault issue by default, and as a result compilers are a bit more free. I am focusing on ARM, as I think that is your question.
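    The alignment check described above can be sketched as follows. This is only an illustration: is_word_aligned and check_uptime_alignment are hypothetical names, and the sketch assumes that converting a pointer to uintptr_t yields the plain byte address, which is the case on typical ARM and x86 toolchains. Note that a test against 8 is stricter than needed for a 32-bit word; a multiple of 4 is what "word-aligned" asks for.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Shared counter. A plain file-scope uint32_t is normally placed on a
       4-byte boundary by the toolchain, but that is an assumption worth checking. */
    volatile uint32_t uptime = 0;

    /* Hypothetical helper: returns 1 when the address is a multiple of 4,
       i.e. when (address AND 3) is zero. */
    int is_word_aligned(const volatile void *p)
    {
        return ((uintptr_t)p & 3u) == 0u;
    }

    /* Hypothetical start-up check; call it once before enabling interrupts. */
    void check_uptime_alignment(void)
    {
        assert(is_word_aligned(&uptime));
    }
    ```

    The assert only reports the placement in this particular build; it does not force the linker to align anything.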

    Now if I do this:

    typedef unsigned int uint32_t;
    
    uint32_t uptime = 0;
    
    void ISR ( void )
    {
        if(uptime)
        {
            uptime=uptime+1;
        }
        else
        {
            uptime=uptime+5;
        }
    }
    uint32_t some_func ( void )
    {
        uint32_t now = uptime;
        return(now);
    }
    
    00000000 <ISR>:
       0:   e59f2014    ldr r2, [pc, #20]   ; 1c <ISR+0x1c>
       4:   e5923000    ldr r3, [r2]
       8:   e3530000    cmp r3, #0
       c:   03a03005    moveq   r3, #5
      10:   12833001    addne   r3, r3, #1
      14:   e5823000    str r3, [r2]
      18:   e12fff1e    bx  lr
      1c:   00000000    andeq   r0, r0, r0
    

    And adding volatile:

    00000000 <ISR>:
       0:   e59f3018    ldr r3, [pc, #24]   ; 20 <ISR+0x20>
       4:   e5932000    ldr r2, [r3]
       8:   e3520000    cmp r2, #0
       c:   e5932000    ldr r2, [r3]
      10:   12822001    addne   r2, r2, #1
      14:   02822005    addeq   r2, r2, #5
      18:   e5832000    str r2, [r3]
      1c:   e12fff1e    bx  lr
      20:   00000000    andeq   r0, r0, r0
    

    The two reads in the source result in two reads in the output. Now there is a problem here if the read-modify-write can get interrupted, but we assume that since this is an ISR it can't? If you were to read a 7, add 1, and then write an 8, and you were interrupted after the read by something that also modifies uptime, that modification has a limited life: it happens, say a 5 is written, and then this ISR writes an 8 on top of it.

    If a read-modify-write were in the interruptible code, then the ISR could get in there and it probably wouldn't work the way you wanted. That is two readers and two writers; you want one party responsible for writing a shared resource and the others read-only. Otherwise you need a lot more work that is not built into the language.

    Note that on an ARM machine:
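    To make the lost update concrete, here is a small host-side simulation of the 7, 5, 8 sequence. It is only a sketch: other_writer is a hypothetical stand-in for the second writer, and the "interrupt" is an ordinary function call forced to land between the read and the write, so the interleaving is deterministic.

    ```c
    #include <stdint.h>

    volatile uint32_t uptime = 0;

    /* Hypothetical second writer: stores 5 into uptime. */
    void other_writer(void)
    {
        uptime = 5;
    }

    /* An increment split into its three steps, with the second writer
       running between the read and the write. Returns the final value. */
    uint32_t interrupted_increment(void)
    {
        uint32_t tmp;

        uptime = 7;        /* starting value for the example */
        tmp = uptime;      /* read:   tmp is 7 */
        other_writer();    /* "interrupt": uptime is briefly 5 */
        tmp = tmp + 1;     /* modify: tmp is 8 */
        uptime = tmp;      /* write:  the 5 is overwritten by 8 */
        return uptime;
    }
    ```

    The 5 written by the second writer never becomes visible to later readers, which is why a single writer per variable is the simpler design.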

    typedef int __sig_atomic_t;
    ...
    typedef __sig_atomic_t sig_atomic_t;
    

    so

    typedef unsigned int uint32_t;
    typedef int sig_atomic_t;
    volatile sig_atomic_t uptime = 0;
    void ISR ( void )
    {
        if(uptime)
        {
            uptime=uptime+1;
        }
        else
        {
            uptime=uptime+5;
        }
    }
    uint32_t some_func ( void )
    {
        uint32_t now = uptime;
        return(now);
    }
    

    That isn't going to change the result, at least not on that system with that define. You need to examine other C libraries and/or sandbox headers to see what they define. Also, if you are not careful (this happens often), the wrong headers are used: the x86_64 headers are used to build ARM programs with the cross compiler. I have seen gcc and llvm make host vs target mistakes.

    Going back to a concern which, based on your comments, you appear to already understand:

    typedef unsigned int uint32_t;
    uint32_t uptime = 0;
    void ISR ( void )
    {
        if(uptime)
        {
            uptime=uptime+1;
        }
        else
        {
            uptime=uptime+5;
        }
    }
    void some_func ( void )
    {
        while(uptime&1) continue;
    }
    

    This was pointed out in the comments. Even though you have one writer and one reader, the compiler generates:

    00000020 <some_func>:
      20:   e59f3018    ldr r3, [pc, #24]   ; 40 <some_func+0x20>
      24:   e5933000    ldr r3, [r3]
      28:   e2033001    and r3, r3, #1
      2c:   e3530000    cmp r3, #0
      30:   012fff1e    bxeq    lr
      34:   e3530000    cmp r3, #0
      38:   012fff1e    bxeq    lr
      3c:   eafffffa    b   2c <some_func+0xc>
      40:   00000000    andeq   r0, r0, r0
    

    It never goes back to read the variable from memory, and unless someone corrupts the register in an event handler, this can be an infinite loop.

    Make uptime volatile:

    00000024 <some_func>:
      24:   e59f200c    ldr r2, [pc, #12]   ; 38 <some_func+0x14>
      28:   e5923000    ldr r3, [r2]
      2c:   e3130001    tst r3, #1
      30:   012fff1e    bxeq    lr
      34:   eafffffb    b   28 <some_func+0x4>
      38:   00000000    andeq   r0, r0, r0
    

    Now the reader does a read every time.

    The same issue shows up here, not in a loop and without volatile:

    00000020 <some_func>:
      20:   e59f302c    ldr r3, [pc, #44]   ; 54 <some_func+0x34>
      24:   e5930000    ldr r0, [r3]
      28:   e3500005    cmp r0, #5
      2c:   0a000004    beq 44 <some_func+0x24>
      30:   e3500004    cmp r0, #4
      34:   0a000004    beq 4c <some_func+0x2c>
      38:   e3500001    cmp r0, #1
      3c:   03a00006    moveq   r0, #6
      40:   e12fff1e    bx  lr
      44:   e3a00003    mov r0, #3
      48:   e12fff1e    bx  lr
      4c:   e3a00007    mov r0, #7
      50:   e12fff1e    bx  lr
      54:   00000000    andeq   r0, r0, r0
    

    uptime can have changed between the tests, but the value is read only once and reused for every comparison. volatile fixes this: each access in the source becomes a read.

    So volatile is not the universal solution. Having the variable used for one-way communication is ideal; if you need to communicate the other way, use a separate variable: one writer and one or more readers per variable.

    You have done the right thing and consulted the documentation for your chip/core.

    So if the variable is aligned (in this case a 32-bit word) AND the compiler chooses the right instruction, then the interrupt won't interrupt the transaction. If it is an LDM/STM, though, you should read the documentation (push and pop are also LDM/STM pseudo-instructions); in some cores/architectures those can be interrupted and restarted, and as a result we are warned about those situations in the ARM documentation.

    Short answer: add volatile, make it so there is only one writer per variable, and keep the variable aligned. (Also read the docs each time you change chips/cores, and periodically disassemble to check that the compiler is doing what you asked it to do.) It doesn't matter whether it is the same core type (another cortex-m3) from the same vendor or a different vendor, or some completely different core/chip (avr, msp430, pic, x86, mips, etc.): start from zero, get the docs and read them, and check the compiler output.
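    With volatile, every mention of uptime in the source is its own read, so a chain of tests may see different values. If what you want is for all tests to be made against one consistent value, a common approach is to copy the volatile into a local once and test the local. The sketch below assumes that; the source for the listing above is not shown, so classify is a hypothetical reconstruction whose constants are simply taken from the disassembly.

    ```c
    #include <stdint.h>

    volatile uint32_t uptime = 0;

    /* Hypothetical reconstruction: constants taken from the disassembly,
       the real source is not shown. */
    uint32_t classify(uint32_t value)
    {
        if (value == 5) return 3;
        if (value == 4) return 7;
        if (value == 1) return 6;
        return value;
    }

    uint32_t some_func(void)
    {
        uint32_t now = uptime;   /* exactly one volatile read */
        return classify(now);    /* every test sees the same snapshot */
    }
    ```

    Whether you want one snapshot or a fresh read per test depends on the code; the point is that with volatile the choice is yours and is visible in the source.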
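    One way to arrange "one writer per variable" when information has to flow in both directions is sketched below. The names ticks_produced, ticks_consumed and take_pending_ticks are hypothetical, and the sketch assumes that aligned 32-bit loads and stores are single-copy atomic on the target, as discussed for ARMv7-M.

    ```c
    #include <stdint.h>

    /* Written only by the ISR, read by the main code. */
    volatile uint32_t ticks_produced = 0;
    /* Written only by the main code; the ISR never touches it. */
    volatile uint32_t ticks_consumed = 0;

    /* Hypothetical 1 ms timer interrupt handler. */
    void ISR(void)
    {
        ticks_produced = ticks_produced + 1;
    }

    /* Main-loop side: returns how many ticks arrived since the last call
       and records them as handled. */
    uint32_t take_pending_ticks(void)
    {
        uint32_t produced = ticks_produced;             /* single snapshot */
        uint32_t pending  = produced - ticks_consumed;  /* unsigned, wraps modulo 2^32 */

        ticks_consumed = produced;
        return pending;
    }
    ```

    Neither side ever performs a read-modify-write on a variable the other side writes, so there is no update to lose.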
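    Put together, the short answer might look like the sketch below. It is not a definitive implementation: it assumes a C11 compiler (for _Alignas), a target where aligned 32-bit accesses are single-copy atomic, and that ISR is the only writer. uptime_now and elapsed_since are hypothetical helper names.

    ```c
    #include <stdint.h>

    /* Explicitly word-aligned, volatile, and written only by the ISR. */
    _Alignas(4) volatile uint32_t uptime = 0;

    /* Hypothetical 1 ms timer interrupt handler: the single writer. */
    void ISR(void)
    {
        uptime = uptime + 1;
    }

    /* Reader: one aligned 32-bit load. */
    uint32_t uptime_now(void)
    {
        return uptime;
    }

    /* Milliseconds since 'start'; unsigned subtraction stays correct
       across a single wrap of the counter. */
    uint32_t elapsed_since(uint32_t start)
    {
        return uptime_now() - start;
    }
    ```

    It is still worth disassembling the result for your toolchain to confirm that the read and the write each compile to a single LDR or STR.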