I have a piece of code with doubly linked lists (inspired by the linux lists) that works without compiler optimizations. When I activate compiler optimization, it fails.
My program adds an element to the list, then it loops over the list while it is not empty.
In the loop body the first element is fetched, then an assert checks that the element is in the list, then the element is removed.
This assert fails with compiler optimizations (-O2
). With a compiler barrier, there is no error.
I assume that in the error case the compiler does not reload the elements of the list in every while (!list_empty(&li))
loop iteration.
When I look at the compiled code of the main function, I see that there is no conditional code and __assert_fail
is called at the end.
But why? I always thought that in single-threaded code it is safe to program without barriers.
This is my Code:
#include <stdlib.h>
#include <stddef.h>
#include <stdio.h>
#include <assert.h>
struct list_elem {
struct list_elem *next;
struct list_elem *prev;
};
struct list {
struct list_elem *next;
struct list_elem *prev;
};
static inline void init_list_head(struct list *list)
{
list->next = (struct list_elem *)list;
list->prev = (struct list_elem *)list;
}
static inline void init_list_elem(struct list_elem *elem)
{
elem->next = NULL;
elem->prev = NULL;
}
static inline void *list_entry(struct list_elem *elem, size_t offset)
{
return (char *)(elem) - offset;
}
static inline int list_empty(struct list *list)
{
return list->next == (struct list_elem *)list;
}
static inline struct list_elem *first_list_elem(struct list *list)
{
return list->next;
}
static inline int elem_not_in_list(struct list_elem *elem)
{
return elem->next == NULL;
}
static inline void __list_add(struct list_elem *elem, struct list_elem *prev, struct list_elem *next)
{
next->prev = elem;
elem->next = next;
elem->prev = prev;
prev->next = elem;
}
static inline void list_add(struct list_elem *elem, struct list *list)
{
__list_add(elem, (struct list_elem *)list, list->next);
}
static inline void list_del(struct list_elem *elem)
{
elem->prev->next = elem->next;
elem->next->prev = elem->prev;
elem->next = NULL;
elem->prev = NULL;
}
struct record {
int value;
struct list_elem link;
};
struct record r1;
struct list li;
int main(int argc, char *argv[])
{
init_list_head(&li);
init_list_elem(&r1.link);
r1.value = 1;
list_add(&r1.link, &li);
while (!list_empty(&li)) {
struct record *r = list_entry(first_list_elem(&li), offsetof(struct record, link));
assert(!elem_not_in_list(&r->link));
list_del(&r->link);
r1.value += argc;
// compiler barrier
//asm volatile("": : :"memory");
}
return r1.value;
}
Compiler:
gcc version 11.2.0 (GCC)
Target: arm-linux-musleabihf
Compile command:
gcc -O2 -o testbarrier testbarrier.c
Compiled code:
00010290 <main>:
10290: e59f1030 ldr r1, [pc, #48] ; 102c8 <main+0x38>
10294: e3a0c000 mov ip, #0
10298: e2800001 add r0, r0, #1
1029c: e92d4010 push {r4, lr}
102a0: e3a02057 mov r2, #87 ; 0x57
102a4: e5810008 str r0, [r1, #8]
102a8: e5811000 str r1, [r1]
102ac: e5811004 str r1, [r1, #4]
102b0: e581c00c str ip, [r1, #12]
102b4: e581c010 str ip, [r1, #16]
102b8: e59f300c ldr r3, [pc, #12] ; 102cc <main+0x3c>
102bc: e59f100c ldr r1, [pc, #12] ; 102d0 <main+0x40>
102c0: e59f000c ldr r0, [pc, #12] ; 102d4 <main+0x44>
102c4: ebffffe8 bl 1026c <__assert_fail@plt>
102c8: 0002103c .word 0x0002103c
102cc: 000104dc .word 0x000104dc
102d0: 000104b0 .word 0x000104b0
102d4: 000104c0 .word 0x000104c0
The code has undefined behavior: you cast list
as a (struct list_elem *)
and access its fields via a different type. This is a violation of the strict aliasing rule. The compiler may generate code that appears to work as expected with no optimisations, but when you enable optimisations with -O2
, the compiler makes a deeper analysis of the code and assumes you do not violate this rule. As a consequence, the code generated may not work as expected anymore.
To avoid this undefined behavior, either
tell the compiler that you intent to break (or do not fully understand) the strict aliasing rule with -fno-strict-aliasing
. Your program still has undefined behavior, but the compiler will assume you might break this particular rule.
or modify the code according to n. m. will see y'all on Reddit's comment to avoid casts and type punning.