c++ gcc compiler-optimization strict-aliasing restrict-qualifier

Why are the results of the optimization on aliasing different for char* and std::string&?


void f1(int* count, char* str) {
  for (int i = 0; i < *count; ++i) str[i] = 0;
}

void f2(int* count, char8_t* str) {
  for (int i = 0; i < *count; ++i) str[i] = 0;
}

void f3(int* count, char* str) {
  int n = *count;
  for (int i = 0; i < n; ++i) str[i] = 0;
}

void f4(int* __restrict__ count, char* str) { // GCC extension; clang also supports it
  for (int i = 0; i < *count; ++i) str[i] = 0;
}

According to this article, the compiler (almost) replaces f2() with a call to memset(), whereas the machine code it generates for f1() is almost identical to the source loop above. The reason is that in f2() the compiler can assume that str does not point into the int object that count points to (strict aliasing rule; char8_t is not one of the types allowed to alias everything), but it cannot make that assumption in f1() (C++ allows a char* to alias an object of any type).

Such aliasing problems can be avoided by dereferencing count in advance, as in f3(), or by using __restrict__, as in f4().

https://godbolt.org/z/fKTjcnW5f
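To make the aliasing concern concrete, here is a minimal sketch in which str really does point at the object holding *count, so f1() and f3() produce different results. Overlay, last_pad_byte_after and check_aliasing_matters are hypothetical names introduced only for this illustration, and the checks assume an int without padding bits, as on mainstream platforms.

```cpp
#include <cassert>
#include <cstring>

// f1() and f3() exactly as in the question.
void f1(int* count, char* str) {
  for (int i = 0; i < *count; ++i) str[i] = 0;
}

void f3(int* count, char* str) {
  int n = *count;
  for (int i = 0; i < n; ++i) str[i] = 0;
}

// Hypothetical type: an int followed by filler bytes, viewed as raw chars.
struct Overlay {
  int count;
  char pad[12];
};
static_assert(sizeof(Overlay) < 128, "the count must fit in a single byte");

// Calls f with str pointing at the very object that holds *count and
// returns the last filler byte afterwards.
char last_pad_byte_after(void (*f)(int*, char*)) {
  Overlay o;
  std::memset(&o, 0x7F, sizeof o);
  o.count = static_cast<int>(sizeof o);
  f(&o.count, reinterpret_cast<char*>(&o));
  return o.pad[sizeof o.pad - 1];
}

// Assumes an int without padding bits, as on mainstream platforms.
void check_aliasing_matters() {
  assert(last_pad_byte_after(f1) == 0x7F);  // loop stopped once *count became 0
  assert(last_pad_byte_after(f3) == 0);     // all sizeof(Overlay) bytes cleared
}
```

In f1() the store that zeroes the only non-zero byte of count ends the loop early, so the filler bytes survive. A memset() over the original count would clear them, which is why the compiler may not make that substitution unless it can rule out the overlap.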

The following functions are the std::string/std::u8string versions of the above functions:

void f5(int* count, std::string& str) {
  for (int i = 0; i < *count; ++i) str[i] = 0;
}

void f6(int* count, std::u8string& str) {
  for (int i = 0; i < *count; ++i) str[i] = 0;
}

void f7(int* count, std::string& str) {
  int n = *count;
  for (int i = 0; i < n; ++i) str[i] = 0;
}

void f8(int* __restrict__ count, std::string& str) {
  for (int i = 0; i < *count; ++i) str[i] = 0;
}

void f9(int* __restrict__ count, std::string& __restrict__ str) {
  for (int i = 0; i < *count; ++i) str[i] = 0;
}

https://godbolt.org/z/nsPdfhzoj

My questions are:

  1. f5() and f6() give the same results as f1() and f2(), respectively. However, f7() and f8() do not give the same results as f3() and f4() (memset() is not used). Why?
  2. The compiler replaces f9() with a call to memset() (which does not happen with f8()). Why?

Tested with GCC 12.1 on x86_64, -std=c++20 -O3.


Solution

  • I created a simplified demo for the string case:

    #include <cstddef>

    class String {
        char* data_;  // a store through a char lvalue may alias this member
    public:
        char& operator[](std::size_t i) { return data_[i]; }
    };
    
    void f(int n, String& s) {
        for (int i = 0; i < n; i++) s[i] = 0;
    }
    

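    The problem here is that the compiler cannot rule out that a write to data_[i] changes the value of data_ itself: the store goes through a char lvalue, which may alias any object, including the pointer member of s. With a __restrict__-qualified s parameter, you tell the compiler that this cannot happen.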

    Live demo: https://godbolt.org/z/jjn9d3Mxe

    This is not necessary when the pointer itself is passed as a parameter: it is a by-value copy, typically held in a register, and its address is never taken, so it cannot alias the pointed-to data. However, if the pointer is a global variable, the same problem occurs.

    Live demo: https://godbolt.org/z/Y3nWvn6rW
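Following the same reasoning, one way to sidestep the problem without __restrict__ is to copy the pointer into a local variable before the loop, just as f3() does with the count. The sketch below applies this to a global pointer and to the std::string case; g_data, clear_global and f7_hoisted are hypothetical names, and the asserts only spell out preconditions assumed here. I would expect GCC to turn both loops into memset() calls at -O3, but check the generated code for your compiler and flags.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical global pointer, standing in for the one in the linked demo.
char* g_data = nullptr;

// Global-pointer case: copy the pointer into a local before the loop.
void clear_global(int n) {
    assert(n <= 0 || g_data != nullptr);  // precondition assumed by this sketch
    char* p = g_data;  // the address of p is never taken, so no store can alias it
    for (int i = 0; i < n; ++i) p[i] = 0;
}

// std::string case: f7() from the question with the data pointer hoisted too.
void f7_hoisted(int* count, std::string& str) {
    const int n = *count;
    assert(n <= 0 || static_cast<std::size_t>(n) <= str.size());
    char* p = str.data();  // non-const data() requires C++17 or later
    for (int i = 0; i < n; ++i) p[i] = 0;
}
```

The stores still go through a char lvalue, but the only things they could plausibly overwrite (the global pointer, the string's internal pointer, *count) are no longer read inside the loop.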