Tags: c++, visual-c++, reverse-iterator

reverse_iterator weird behavior with 2D arrays


I have a 2D array. It's perfectly okay to iterate the rows in forward order, but when I do it in reverse, it doesn't work. I cannot figure out why.

I'm using MSVC v143 and the C++20 standard.

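#include <algorithm>   // std::for_each
#include <iterator>    // std::begin, std::end, std::rbegin, std::rend
#include <fmt/core.h>  // fmt::print

int main() {
    // Fill the array with 0..11 in row-major order.
    int arr[3][4];
    for (int counter = 0, i = 0; i != 3; ++i) {
        for (int j = 0; j != 4; ++j) {
            arr[i][j] = counter++;
        }
    }

    // Forward iteration over the rows: prints the expected values.
    std::for_each(std::begin(arr), std::end(arr), [](auto const& row) {
        for (auto const& i: row) {
            fmt::print("{} ", i);
        }
        fmt::print("\n");
    });

    // Reverse iteration over the rows: prints garbage with MSVC.
    std::for_each(std::rbegin(arr), std::rend(arr), [](auto const& row) {
        for (auto const& i: row) {
            fmt::print("{} ", i);
        }
        fmt::print("\n");
    });
}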

The output for the first for_each is fine:

0 1 2 3
4 5 6 7
8 9 10 11

Yet the second one is garbage:

-424412040 251 -858993460 -858993460
-424412056 251 -858993460 -858993460
-424412072 251 -858993460 -858993460

Printing the address of each row only adds to the confusion. In the reverse loop, every row reports the same address, and it lies outside the array:

<Row addr=0xfbe6b3fc58/>
0 1 2 3
<Row addr=0xfbe6b3fc68/>
4 5 6 7
<Row addr=0xfbe6b3fc78/>
8 9 10 11
<Row addr=0xfbe6b3fb98/>
-424412040 251 -858993460 -858993460
<Row addr=0xfbe6b3fb98/>
-424412056 251 -858993460 -858993460
<Row addr=0xfbe6b3fb98/>
-424412072 251 -858993460 -858993460

What is happening here?


Solution

  • This is very likely a code generation bug in MSVC related to pointers to arrays, which is what the row iterators of a multidimensional array are. The std::reverse_iterator::operator*() called inside std::for_each essentially performs *--p, where p is a pointer to int[4] that initially points one past the last row of the array. When the decrement and the dereference happen in a single expression, MSVC loads the address of the local variable p instead of the value of the decremented p, so the resulting reference refers to p itself rather than to the previous row.

    You can observe the problem better in the following standalone example (https://godbolt.org/z/x9q5M74Md):

    #include <iostream>
    
    using Int4 = int[4]; // To avoid the awkward pointer-to-array syntax
    int arr[3][4] = {};
    
    Int4 & test1()
    {
      Int4 * p = arr;
      Int4 * pP1 = p + 1;
    
      // Works correctly
      --pP1;
      Int4 & deref = *pP1;
      return deref;
    }
    
    Int4 & test2()
    {
      Int4 * p = arr;
      Int4 * pP1 = p + 1;
    
      // msvc incorrectly stores the address of the local variable pP1 (i.e. &pP1) in deref
      Int4 & deref = *--pP1;
      return deref;
    }
    
    
    int main()
    {
      std::cout << "arr   = 0x" << &arr[0][0] << std::endl;
      std::cout << "test1 = 0x" << &test1() << std::endl; // Works
      std::cout << "test2 = 0x" << &test2() << std::endl; // Bad
    }
    

    In this example, &test1() correctly prints the address of the first element of arr. But &test2() actually prints the address of the local variable test2::pP1, i.e. it prints &test2::pP1. MSVC even warns that test2() returns the address of the local variable pP1 (C4172). clang and gcc work fine. Versions before MSVC v19.23 also compile the code correctly.

    Looking at the assembly output, clang and gcc emit the same code for test1() and test2(). MSVC, however, emits:

    ; test1()
    mov     rax, QWORD PTR pP1$[rsp]
    mov     QWORD PTR deref$[rsp], rax
    
    ; test2()
    lea     rax, QWORD PTR pP1$[rsp]
    mov     QWORD PTR deref$[rsp], rax
    

    Notice the lea instead of the mov instruction: test2() loads the address of pP1 itself rather than the pointer value stored in pP1.

    MSVC seems to get confused by pointers to arrays, which is exactly what iterating over the rows of a multidimensional array involves.