C23 introduced the attributes [[reproducible]] and [[unsequenced]].
The motivating problem is that the compiler has no insight into a function when its definition is not available in the current translation unit. This prevents almost all optimizations around calls to that function (unless link-time optimization, LTO, is used).
Consider the following example:
// note: this is redundant, because [[unsequenced]] implies [[reproducible]]
int square(int x) [[reproducible]] [[unsequenced]];
int arr[] = {
    square(2),
    square(2),
    square(3),
    square(3),
};
Even though the compiler has no definition of square, it is allowed to perform two optimizations:
Thanks to [[reproducible]], square(2) yields the same result when called twice in a row, and the compiler can decide to call square(2) only once.
Thanks to [[unsequenced]], calls to square can be made in any order, and the compiler could even decide to evaluate square(2) just once at program startup. It can also decide to evaluate square(3) before square(2), if this is somehow more efficient.
Such optimizations can also be made by defining functions inline in headers and letting the compiler infer these properties on its own. However, for complicated functions, making everything inline isn't feasible because of the added compilation time.
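The declaration from the example can be turned into a compilable sketch. The macro ATTR_UNSEQUENCED and the functions sum_of_squares and demo_square are hypothetical names introduced here for illustration; the macro expands to nothing on compilers that do not report support for the attribute, so treat this as one possible way to stay portable rather than an established idiom.

```c
#include <assert.h>

/* Hypothetical helper macro (not from the standard): expands to the C23
   attribute only when the compiler reports support for it. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ > 201710L && defined(__has_c_attribute)
#  if __has_c_attribute(unsequenced)
#    define ATTR_UNSEQUENCED [[unsequenced]]
#  endif
#endif
#ifndef ATTR_UNSEQUENCED
#  define ATTR_UNSEQUENCED /* unsupported: expands to nothing */
#endif

/* Declaration as it might appear in a header; callers see no function body. */
int square(int x) ATTR_UNSEQUENCED;

/* The compiler may evaluate square(2) and square(3) once each, in any order. */
int sum_of_squares(void)
{
    int arr[] = { square(2), square(2), square(3), square(3) };
    return arr[0] + arr[1] + arr[2] + arr[3];
}

/* Definition; in a real project this would sit in another translation unit. */
int square(int x) ATTR_UNSEQUENCED
{
    return x * x;
}

/* Usage: the result is the same whether or not any calls were merged. */
void demo_square(void)
{
    assert(sum_of_squares() == 26);
}
```

Because square really does have the asserted properties, the program behaves identically with or without the attribute; only the set of permitted optimizations changes.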
For a more rigorous explanation, see the C23 standard working draft N3096 §6.7.12.7 Standard attributes for function types.
[[reproducible]]
This attribute asserts that a function is a reproducible function, which means that it is effectless and idempotent.
Effectless restricts what state a function can modify. If any non-local state is modified, this can only happen through pointers passed to it. For example, a void to_upper_case(char *str) function is effectless if it only modifies local variables and the contents of str. (Intuitively, the function has no observable side effects other than through its pointer parameters.)
Idempotent means that calling the function multiple times has the same effect as calling it once. For example, we can call to_upper_case(s); to_upper_case(s); and it would have the same effect as calling it just once.
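A minimal sketch of such a to_upper_case function follows. The body is an assumption (the original only names the signature): it handles ASCII letters only and touches nothing except its local state and the contents of str. ATTR_REPRODUCIBLE and demo_idempotence are hypothetical names used for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper macro (not from the standard): expands to the C23
   attribute only when the compiler reports support for it. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ > 201710L && defined(__has_c_attribute)
#  if __has_c_attribute(reproducible)
#    define ATTR_REPRODUCIBLE [[reproducible]]
#  endif
#endif
#ifndef ATTR_REPRODUCIBLE
#  define ATTR_REPRODUCIBLE /* unsupported: expands to nothing */
#endif

/* Effectless: the only non-local stores go through the pointer parameter.
   Assumes 'a'..'z' are contiguous in the execution character set (as in ASCII). */
void to_upper_case(char *str) ATTR_REPRODUCIBLE
{
    for (; *str != '\0'; ++str) {
        if (*str >= 'a' && *str <= 'z') {
            *str = (char)(*str - 'a' + 'A');
        }
    }
}

/* Idempotent: a second call leaves the string as the first call left it. */
void demo_idempotence(void)
{
    char once[] = "Hello, c23";
    char twice[] = "Hello, c23";

    to_upper_case(once);
    to_upper_case(twice);
    to_upper_case(twice);

    assert(strcmp(once, "HELLO, C23") == 0);
    assert(strcmp(once, twice) == 0);
}
```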
[[unsequenced]]
This attribute asserts that a function is an unsequenced function, which means that it is stateless, effectless, idempotent, and independent. Effectless and idempotent are as described for [[reproducible]].
Stateless means that static or thread_local local variables must be const and must not be volatile.
Independent means that all calls of the function will see the same values for global variables, won't change global state, and won't change any state through pointer parameters. to_upper_case is not independent, but a function like strlen can be.
Intuitively, calls to an unsequenced function can be sequenced arbitrarily, and even executed in parallel, between changes to its observed state (see also footnote 196 in the standard):
char *str = /* ... */; // A
strlen(str);
global = 123;
strlen(str);
strcpy(str, /* ... */); // B
In this example, there can be one, two, or any number of calls to strlen between points A and B. These can happen sequentially or in parallel. No matter what, the outcome must be the same for an unsequenced function. The mutation of global is not allowed to change the result of strlen.
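To make the stateless and independent requirements more concrete, here is a small contrast. Both functions are hypothetical illustrations, not taken from the standard: count_spaces only reads through its pointer parameter, so by the same reasoning applied to strlen above it is a candidate for [[unsequenced]], whereas next_id keeps a mutable static counter and therefore must not be annotated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper macro (not from the standard): expands to the C23
   attribute only when the compiler reports support for it. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ > 201710L && defined(__has_c_attribute)
#  if __has_c_attribute(unsequenced)
#    define ATTR_UNSEQUENCED [[unsequenced]]
#  endif
#endif
#ifndef ATTR_UNSEQUENCED
#  define ATTR_UNSEQUENCED /* unsupported: expands to nothing */
#endif

/* Keeps no static state, reads no globals, and writes nothing non-local. */
size_t count_spaces(const char *str) ATTR_UNSEQUENCED
{
    size_t n = 0;
    for (; *str != '\0'; ++str) {
        if (*str == ' ') {
            ++n;
        }
    }
    return n;
}

/* Not stateless: the non-const static counter changes on every call,
   so this function is deliberately left without any attribute. */
unsigned next_id(void)
{
    static unsigned counter = 0;
    return ++counter;
}

void demo_unsequenced(void)
{
    const char *text = "a b c";

    /* Any number of calls between changes to the string give the same result. */
    assert(count_spaces(text) == 2);
    assert(count_spaces(text) == 2);

    /* Consecutive calls to next_id differ, so merging them would be wrong. */
    unsigned first = next_id();
    unsigned second = next_id();
    assert(second == first + 1);
}
```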
The GCC attributes pure and const are the inspiration for these standard attributes, and behave similarly. See N2956 5.8 Some differences with GCC const and pure for a comparison. In short:
pure is more relaxed than [[reproducible]]
const is stricter than [[unsequenced]]
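For readers who have not used the GCC attributes, the following sketch shows how they are spelled. It only illustrates the syntax and the usual intent of pure and const; it does not demonstrate the differences listed above. ATTR_PURE, ATTR_CONST, sum, cube, and demo_gcc_attributes are hypothetical names, and the macros expand to nothing on compilers that do not define __GNUC__.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper macros: use the GNU attribute syntax where available. */
#if defined(__GNUC__)
#  define ATTR_PURE  __attribute__((pure))
#  define ATTR_CONST __attribute__((const))
#else
#  define ATTR_PURE
#  define ATTR_CONST
#endif

/* pure: may read memory (here through values) but has no side effects. */
ATTR_PURE int sum(const int *values, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

/* const: the result depends only on the argument values, not on any memory. */
ATTR_CONST int cube(int x)
{
    return x * x * x;
}

void demo_gcc_attributes(void)
{
    const int values[] = { 1, 2, 3 };

    assert(sum(values, 3) == 6);
    assert(cube(3) == 27);
}
```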
These attributes are meant for advanced users who want to take advantage of compiler optimizations.
In general, you have to be quite careful when applying them. If you apply them to a function that doesn't have the asserted properties and that function is called, the behavior is undefined. Compilers are encouraged to detect such misuse of these attributes, but this isn't required.
Some examples from the standard library:
printf is obviously neither
strlen and memcmp can be [[unsequenced]] (see also: Can strlen be [[unsequenced]]?)
memcpy can be [[reproducible]]
memmove can't be either, because it isn't idempotent for overlapping memory regions
fabs can be [[unsequenced]]
sqrt can't be either, because it modifies the floating-point environment and may set errno