c++c++17pointer-arithmeticxor-linkedlist

Are XOR linked lists still allowed in C++17?


XOR linked lists use pointer arithmetic in a way that looks suspicious to me given the changes in semantics introduced in C++17 (and discussed e.g. in Is a pointer with the right address and type still always a valid pointer since C++17?). Do they cause undefined behavior now? If so, can they be saved with launder?

EDIT:

The Wikipedia article contains just a short note about converting between pointers and integers. I tacitly assumed (and now am making it explicitly stated) that the pointers are first converted to and integer type of sufficient size to fit them, then XORing is done on the integers. The properties of XOR listed under Theory of operation thus guarantee that only integers obtained once from pointers will be converted back to them. The actual mapping from pointers to integers can be, per the standard, an arbitrary injection. I don't rely on any assumption beyond that.

Does the standard allow to use them and access the still existing objects? Before C++17? Since C++17?


Solution

  • It's implementation-defined, and still valid in C++17, at least for GCC. You cannot perform an xor operation between two pointers directly; you would have to go through reinterpret_cast<std::intptr_t>. The effect of this conversion (and back) is implementation-defined.

    Implementation-defined means that the compiler must document what happens. What GCC provides is:

    A cast from pointer to integer discards [...], otherwise the bits are unchanged.

    A cast from integer to pointer discards [...], otherwise the bits are unchanged.

    When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined.

    See https://gcc.gnu.org/onlinedocs/gcc/Arrays-and-pointers-implementation.html

    From this description, we can conclude that:

    1. the user ensures that the object at the address stays the same between storing pointers in an XOR list and retrieving them
      • if the pointed-to object changes during this process, casting the integer back to a pointer is undefined behavior
    2. converting one past the end pointers, null pointers, and invalid pointers back and forth isn't explained here, but "preserves the value" according to the C++ standard

    Implications for an XOR-list

    Generally, this should make XOR-lists implementable, as long as we reproduce the same pointers that we stored, and don't "rug pull" nodes while there are XORed pointers to them.

    std::intptr_t ptr;
    
    // STORE:
    // - bit-cast both operands and XOR them
    // - store result in ptr
    ptr = reinterpret_cast<std::intptr_t>(prev) ^ reinterpret_cast<std::intptr_t>(next);
    
    // LOAD:
    // - XOR stored ptr and bit-cast to node*
    node* next = reinterpret_cast<node*>(ptr ^ reinterpret_cast<std::intptr_t>(prev));
    // valid dereference, because at the address 'next', we still store the same object
    *next;
    

    As stated in the documentation, next "must reference the same object as the original pointer", so we can assume that next is now a pointer to an object, if such a pointer was originally stored in ptr.

    However, it would be UB if we stored the XORed next pointer, began the lifetime of a new object where next points to, and then un-XORed the address and converted back to a pointer type.