C++ decrementing an element of a single-byte (volatile) array is not atomic! WHY? (Also: how do I force atomicity in Atmel AVR MCUs/Arduino)

I just lost days, literally, ~25 hours of work, due to trying to debug my code over something simple that I didn't know while making a fireshooting hexacopter BattleBot (see it here and on my personal website here).

It turns out decrementing an element of a single-byte array in C++, on an AVR ATmega328 8-bit microcontroller (Arduino) is not an atomic operation, and requires atomic access guards (namely, turning off interrupts). Why is this??? Also, what are all of the C techniques to ensure atomic access to variables on an Atmel AVR microcontroller?

Here's a dumbed down version of what I did:

// Global variables:
const uint8_t NUM_INPUT_PORTS = 3;
volatile uint8_t numElementsInBuf[NUM_INPUT_PORTS];

ISR(PCINT0_vect) // External pin change interrupt service routine on input port 0
{
  // Do stuff here
  for (uint8_t i=0; i<NUM_INPUT_PORTS; i++)
    numElementsInBuf[i]++;
}

void loop()
{
  for (uint8_t i=0; i<NUM_INPUT_PORTS; i++)
  {
    // Do stuff here
    numElementsInBuf[i]--; // <-- THIS CAUSES ERRORS!!!!! THE COUNTER GETS CORRUPTED.
  }
}

Here's the version of loop that's fine:

void loop()
{
  for (uint8_t i=0; i<NUM_INPUT_PORTS; i++)
  {
    // Do stuff here
    noInterrupts(); // Globally disable interrupts
    numElementsInBuf[i]--; // Now it's OK...30 hours of debugging....
    interrupts(); // Globally re-enable interrupts
  }
}

Notice the "atomic access guards", i.e., disabling interrupts before decrementing, then re-enabling them after.

Since I was dealing with a single byte here, I didn't know I'd need atomic access guards. Why do I need them for this case? Is this typical behavior? I know I'd need them if this was an array of 2-byte values, but why for 1-byte values???? Normally for 1-byte values atomic access guards are not required here...

Read the "Atomic access" section here: http://www.gammon.com.au/interrupts. This is a great source.

Related (answer for STM32 MCUs):

So we know that reading from or writing to any single-byte variable on AVR 8-bit MCUs is an atomic operation, but what about STM32 32-bit MCUs? Which variables have automatic atomic reads and writes on STM32? The answer is here: Which variable types/sizes are atomic on STM32 microcontrollers?.

Solution

Update 10 May 2023: the problem in the question was related to the first ring buffer implementation I ever wrote, 7 years ago in 2016. I finally wrote a really good ring buffer implementation that is lock-free when used on any system which supports C11 or C++11 atomic types. It is the best implementation I've ever written, and also the best I've ever seen. It solves a lot of the problems of other implementations. Full details are at the top of the file. It runs in both C and C++. You can see the full implementation here: containers_ring_buffer_FIFO_GREAT.c in my eRCaGuy_hello_world repo.

OK, the question "Why is incrementing/decrementing a single-byte variable NOT atomic?" is answered very well by Ishamael here, and by Michael Burr here.

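Essentially, on an 8-bit AVR mcu, 8-bit reads are atomic, and 8-bit writes are atomic, and that's it! Increment and decrement are never atomic, because each one is a read-modify-write sequence (load the byte into a register, change it, store it back) and an interrupt can land in the middle of it. Multi-byte reads and writes are not atomic on this architecture either!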
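To make the lost update concrete, here is a minimal host-runnable sketch (not AVR code) that models the decrement as three separate steps. The names `sim_counter`, `simulated_isr`, and `decrement_with_interrupt` are hypothetical, and the "interrupt" is simply a function call placed between the load and the store, which is where a real pin-change interrupt could fire.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for one element of the shared counter array (hypothetical name).
volatile uint8_t sim_counter = 0;

// Stand-in for the ISR body: it increments the shared counter.
void simulated_isr()
{
    sim_counter = static_cast<uint8_t>(sim_counter + 1);
}

// Models `counter--` as the load / modify / store steps of a load-store CPU.
// interrupt_fires: an interrupt request arrives right after the load.
// guarded: interrupts are "disabled" around the decrement, so the pending
//          ISR only runs after the store has completed.
uint8_t decrement_with_interrupt(uint8_t start, bool interrupt_fires, bool guarded)
{
    sim_counter = start;

    uint8_t reg = sim_counter;              // 1. load into a register
    if (interrupt_fires && !guarded)
    {
        simulated_isr();                    // ISR increments memory, not `reg`
    }
    reg = static_cast<uint8_t>(reg - 1);    // 2. modify the stale register copy
    sim_counter = reg;                      // 3. store: overwrites the ISR's work

    if (interrupt_fires && guarded)
    {
        simulated_isr();                    // deferred ISR runs after the store
    }
    return sim_counter;
}
```

Starting from 5, one decrement plus one ISR increment should leave 5. In this model, the unguarded call `decrement_with_interrupt(5, true, false)` leaves 4, because the ISR's increment is overwritten by the store, while the guarded call `decrement_with_interrupt(5, true, true)` leaves 5.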

Now that I have my answer, namely that `--` decrement and `++` increment operations are never atomic, even when done on byte values (see the answers above and Nick Gammon's link here), I'd like to make sure the follow-up question, how do I force atomicity on Atmel AVR microcontrollers, is also answered, so that this question becomes a good resource.

Here are all techniques I am aware of to force atomicity in Atmel AVR microcontrollers, such as Arduino:

Option 1 (the preferred method):

uint8_t SREG_bak = SREG; // save global interrupt state
noInterrupts();          // disable interrupts (for Arduino only; this is 
                         // an alias of AVR's "cli()")
// your atomic variable-access code goes here
SREG = SREG_bak;         // restore interrupt state

Option 2 (the less-safe, not recommended method, since it can cause you to inadvertently enable nested interrupts if you accidentally use this approach in a code block or library which gets called inside an ISR):

Macros offered by Arduino in Arduino.h at "arduino-1.8.13/hardware/arduino/avr/cores/arduino/Arduino.h", for instance:
```
noInterrupts();  // disable interrupts (Arduino only; this is an alias to 
                 // AVR's "cli()")
// your atomic variable-access code goes here
interrupts();    // enable interrupts (Arduino only; this is an alias to 
                 // AVR's "sei()")
```
Alternative option 2:

AVRlibc macros which map directly to the AVR `cli` and `sei` assembly instructions. These macros are defined in interrupt.h at "arduino-1.8.13/hardware/tools/avr/avr/include/avr/interrupt.h", for instance:
```
cli();  // clear (disable) the interrupts flag; `noInterrupts()` is simply 
        // a macro to this macro
// your atomic variable-access code goes here
sei();  // set (enable) the interrupts flag; `interrupts()` is simply a 
        // macro to this macro
```
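The difference between Option 1 and Option 2 only shows up when the guard runs while interrupts are already disabled, for example inside an ISR. Here is a minimal host-runnable sketch of that case. It does not touch real hardware: `sim_sreg`, `sim_cli`, `sim_sei`, and the two guard functions are hypothetical stand-ins for `SREG`, `cli()`, `sei()`, and the two styles above, with bit 7 playing the role of the global interrupt enable flag.

```cpp
#include <cassert>
#include <cstdint>

// Simulated status register (hypothetical); bit 7 models the global
// interrupt enable flag.
static uint8_t sim_sreg = 0x80;
constexpr uint8_t SIM_I_FLAG = 0x80;

void sim_cli() { sim_sreg = static_cast<uint8_t>(sim_sreg & ~SIM_I_FLAG); }
void sim_sei() { sim_sreg = static_cast<uint8_t>(sim_sreg | SIM_I_FLAG); }
bool sim_interrupts_enabled() { return (sim_sreg & SIM_I_FLAG) != 0; }

// Option 2 style: disable, then unconditionally enable.
void guard_with_enable()
{
    sim_cli();
    // critical section would go here
    sim_sei();
}

// Option 1 style: save the state, disable, then restore what was saved.
void guard_with_restore()
{
    uint8_t saved = sim_sreg;
    sim_cli();
    // critical section would go here
    sim_sreg = saved;
}

// Runs `guard` from a context where interrupts start out enabled or disabled
// and reports whether they are enabled afterwards.
bool state_after(void (*guard)(), bool enabled_before)
{
    if (enabled_before) { sim_sei(); } else { sim_cli(); }
    guard();
    return sim_interrupts_enabled();
}
```

In this model, `state_after(guard_with_enable, false)` returns `true`: the guard turned interrupts on even though its caller had them off. `state_after(guard_with_restore, false)` returns `false`, which is the behavior you want inside an ISR.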
Option 3 [BEST] (essentially the same as option 1; just using a macro held in an avr-libc library instead, and with variable scope applied within the braces of course)

Super fancy macros offered by AVRlibc in atomic.h at "arduino-1.8.13/hardware/tools/avr/avr/include/util/atomic.h", for example.
```
#include <util/atomic.h> // (place at the top of your code)

ATOMIC_BLOCK(ATOMIC_RESTORESTATE)
{
    // your atomic variable-access code goes here
}
```
These macros rely on the gcc extension __cleanup__ attribute (see here: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html, and search the page for "cleanup"), which "runs a function when the variable goes out of scope". Essentially, this allows you to create object or variable destructors (a C++-like concept) in C.
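Here is a minimal host-runnable sketch of the `__cleanup__` mechanism on its own, assuming a gcc- or clang-compatible compiler. It is not the avr-libc source: `demo_sreg` is a hypothetical plain variable standing in for `SREG`, and `restore_demo_sreg` and `run_guarded_section` are illustrative names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the status register; 0x80 = "interrupts enabled".
static uint8_t demo_sreg = 0x80;

// Cleanup handler: the compiler calls it with a pointer to the variable that
// is going out of scope.
static void restore_demo_sreg(const uint8_t *saved)
{
    demo_sreg = *saved;
}

// Clears the simulated register inside a scope and returns the value seen
// inside that scope. The saved value is restored automatically at the
// closing brace.
uint8_t run_guarded_section()
{
    uint8_t seen_inside;
    {
        uint8_t saved __attribute__((__cleanup__(restore_demo_sreg))) = demo_sreg;
        demo_sreg = 0x00;           // models disabling interrupts
        seen_inside = demo_sreg;    // "critical section" work
    } // restore_demo_sreg(&saved) runs here
    return seen_inside;
}
```

After `run_guarded_section()` returns, `demo_sreg` is back to whatever it held before the call, with no explicit restore statement in the function body.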

See:
1. The official AVRlibc documentation on the ATOMIC_BLOCK() macro: http://www.nongnu.org/avr-libc/user-manual/group__util__atomic.html.
2. gcc cleanup attribute documentation: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html
3. My very thorough answer where I go into this a lot: Which Arduinos support ATOMIC_BLOCK?. I cover:
  1. Which Arduinos support the ATOMIC_BLOCK macros?
  2. How are the ATOMIC_BLOCK macros implemented in C with the gcc compiler, and where can I see their source code?
  3. How could you implement the ATOMIC_BLOCK functionality in Arduino in C++ (as opposed to avrlibc's gcc C version)? - including writing a version functionally similar to C++'s std::lock_guard object.
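As a rough idea of what a `std::lock_guard`-style version could look like, here is a minimal host-runnable sketch using a constructor and destructor. It is an illustration under stated assumptions, not the code from the linked answer: `InterruptGuard` is a hypothetical class name and `raii_sreg` is a plain variable standing in for `SREG`. On a real AVR the constructor would presumably read `SREG` and call `cli()` instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the status register; bit 7 = "interrupts enabled".
static uint8_t raii_sreg = 0x80;

// Saves the simulated register on construction, clears the enable bit, and
// restores the saved value on destruction.
class InterruptGuard
{
public:
    InterruptGuard() : saved_(raii_sreg)
    {
        raii_sreg = static_cast<uint8_t>(raii_sreg & 0x7F);
    }
    ~InterruptGuard() { raii_sreg = saved_; }

    // Non-copyable, like std::lock_guard.
    InterruptGuard(const InterruptGuard &) = delete;
    InterruptGuard &operator=(const InterruptGuard &) = delete;

private:
    uint8_t saved_;
};

// Nests two guards and reports whether the enable bit is still clear after
// the inner guard has been destroyed.
bool inner_release_keeps_interrupts_off()
{
    InterruptGuard outer;
    {
        InterruptGuard inner;
    } // inner restores the state it saw, which is "disabled"
    return (raii_sreg & 0x80) == 0;
}
```

Because each guard restores the state it found rather than enabling unconditionally, nesting is safe in this model: the enable bit only comes back when the outermost guard goes out of scope.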

Why not just use the `atomic_*` types offered by C11 and C++11 or later?

You may be aware of the atomic types in C and C++ as of their 2011 versions or later. In both languages, you have aliases to them like atomic_bool and atomic_uint_fast32_t.

In C, atomic_uint_fast32_t is an alias to _Atomic uint_fast32_t. You must include the <stdatomic.h> header file to use them.
1. See the cppreference community wiki documentation on this for C here: https://en.cppreference.com/w/c/thread#Atomic_operations

In C++, atomic_uint_fast32_t is an alias to std::atomic<std::uint_fast32_t>. You must include the <atomic> header file to use them.
1. See the cppreference community wiki documentation on this for C++ here: https://en.cppreference.com/w/cpp/atomic/atomic
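For comparison, this is roughly what a shared counter looks like on a toolchain that does ship the C++11 `<atomic>` header, such as a desktop compiler. It is a minimal sketch, and the names `num_elements`, `producer_adds_one`, and `consumer_takes_one` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Shared element count; every operation on it is a single atomic step.
std::atomic<uint8_t> num_elements{0};

// Producer side: atomic read-modify-write increment.
void producer_adds_one()
{
    num_elements.fetch_add(1);
}

// Consumer side: atomic read-modify-write decrement.
// fetch_sub returns the value held *before* the subtraction.
uint8_t consumer_takes_one()
{
    return num_elements.fetch_sub(1);
}
```

With these types, the increment and decrement themselves are indivisible, so no separate guard is needed around them.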

However, these types are not available on 8-bit Atmel/Microchip ATmega328 mcus! See my comments below this answer.

I just checked. In Arduino 1.8.13, when I do `#include <stdatomic.h>` and then `atomic_uint_fast32_t i = 0;`, I get: `error: 'atomic_uint_fast32_t' does not name a type; did you mean 'uint_fast32_t'?` This is for the ATmega328 mcu. Arduino was building with C++ using avr-g++. So, the 8-bit AVR gcc/g++ toolchain does not yet support atomic types. It's probably because AVRlibc isn't well supported or kept up to date as the language standards progress, especially since, I believe, it is maintained on a voluntary basis and targets lowly 8-bit microcontrollers in the days of modern 32-bit microcontrollers ruling the world.

See also the comment discussion about this under my answer and @Michael Burr's answer.

Full example usage: how to efficiently, atomically, read shared `volatile` variables

So, instead, we must enforce atomicity using atomic access guards as described above. In our case on 8-bit AVR mcus, that means turning off interrupts to prevent being interrupted, then restoring the interrupt state when done. The best way to do this is usually to quickly atomically copy out your variable of interest, then use your copy in calculations which take more time. Here's the gist of it:

#include <util/atomic.h>

// shared variable shared between your ISR and main loop; you must *manually*
// enforce atomicity on 8-bit AVR mcus!
volatile uint32_t shared_variable;

ISR(PCINT0_vect)
{
    // interrupts are already off here, inside ISRs, by default

    // do stuff to get a new value for the shared variable

    // update the shared volatile variable
    shared_variable = 789;
}

// process data from the ISR
void process_data_from_isr()
{
    // our goal is to quickly atomically copy out volatile data then restore
    // interrupts as soon as possible
    uint32_t shared_variable_copy;
    ATOMIC_BLOCK(ATOMIC_RESTORESTATE)
    {
        // your atomic variable-access code goes here
        //
        // KEEP THIS SECTION AS SHORT AS POSSIBLE, TO MINIMIZE THE TIME YOU'VE
        // DISABLED INTERRUPTS!

        shared_variable_copy = shared_variable;
    }

    // Use the **copy** in any calculations, so that interrupts can be back ON
    // during this time!
    do_long_calculations(shared_variable_copy);
}

void loop()
{
    process_data_from_isr();
}

int main()
{
    setup();

    // infinite main loop
    for (;;)
    {
        loop(); 
    }

    return 0;
}

Related:

1. [My Q&A] Which variable types/sizes are atomic on STM32 microcontrollers?
2. https://stm32f4-discovery.net/2015/06/how-to-properly-enabledisable-interrupts-in-arm-cortex-m/
3. ***** [My answer] Which Arduinos support ATOMIC_BLOCK? [and how can I duplicate this concept in C with __attribute__((__cleanup__(func_to_call_when_x_exits_scope))) and in C++ with class constructors and destructors?]
4. For how to do this in STM32 microcontrollers instead, see my answer here: What are the various ways to disable and re-enable interrupts in STM32 microcontrollers in order to implement atomic access guards?
