Tags: c, pointers, c11, c17

Casting pointer-to-intptr_t to pointer-to-pointer


A pointer can be safely cast to intptr_t (or uintptr_t) and back (on platforms where those types exist), but what about casting a pointer-to-intptr_t into pointer-to-pointer and using that to modify the underlying intptr_t? What I’d like to do is something like the following:

#include <stdint.h>
#include <stdio.h>

void foo(char **p) {
    *p = "foo";
}

int main() {
    uintptr_t uip;
    foo((char **) &uip);
    printf("it's... %s\n", (char *) uip);
    return 0;
}

I’ve verified that this has the desired effect in Clang (it prints “it's... foo”, and the generated LLVM IR indicates that it’s not a coincidence), and currently that’s sort of enough for me, but I’d like to know where it stands in relation to the C standard if someone can make that out.

My context, in case you’re interested, is that I’m using libunwind to get an address that should be modified from a register, and it’s supplied to me as a unw_word_t, which is typedef-ed as uintptr_t in libunwind.h, and I would like to process that address uniformly with pointers that I obtain directly.


Solution

  • This is neither safe nor well-defined. C actually only guarantees that the (u)intptr_t can be converted to/from void*.

    C23 7.22.1.4:

    The following type designates a signed integer type, other than a bit-precise integer type, with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:

    intptr_t

    Now as it happens, void* can in turn also be converted to/from any other object pointer type. In order to fulfill that requirement in practice, it is very likely that the C implementation uses the same pointer format for all object pointers. That is why it is quite safe in practice to cast between (u)intptr_t and any object pointer.
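
    As a minimal sketch of the conversion path described above, the helper below sends a char* through void* into uintptr_t and back again. The name round_trip is hypothetical and only serves the illustration:

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Hypothetical helper: round-trips an object pointer through uintptr_t,
       going via void * in both directions, which is the conversion path
       the standard describes for (u)intptr_t. */
    static char *round_trip(char *p) {
        uintptr_t bits = (uintptr_t)(void *)p;  /* pointer -> void * -> integer */
        char *back = (char *)(void *)bits;      /* integer -> void * -> pointer */
        assert(back == p);                      /* expected to compare equal */
        return back;
    }
    ```

    Note that this converts the pointer value itself. It never accesses the uintptr_t object through an lvalue of pointer type, which is the part the question's code gets wrong.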

    But it isn't required by the standard - any two object pointer types may use different representations internally, except void* and character pointers, which are required to have the same representation and alignment. Some other rules about pointer representation apply as well (C23 6.2.5):

    A pointer to void shall have the same representation and alignment requirements as a pointer to a character type. Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All pointers to structure types shall have the same representation and alignment requirements as each other. All pointers to union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements.

    C also guarantees nothing about converting one pointer type to another and then dereferencing the result to access the pointed-at object through the different type. In most cases this would be undefined behavior because of strict aliasing and possibly also because of alignment.

    Furthermore, (u)intptr_t is large enough to hold a pointer, but that also means that it can in theory be larger than a specific pointer, so incorrectly writing to it using char** might in theory not set all value bytes. Some pointers may also have trap representations like forbidden addresses.


    Specifically in your example, uip has the declared and effective type uintptr_t. You then access its memory location as if it held a char*, and that's a strict aliasing violation - undefined behavior.
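
    One way to avoid the aliasing violation, sketched under the assumption that foo has to keep its char ** signature, is to let foo write into a real char * object and then convert the value to and from uintptr_t. The wrapper name update_word is hypothetical:

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Same callee as in the question. */
    static void foo(char **p) {
        *p = "foo";
    }

    /* Hypothetical wrapper: foo only ever touches a genuine char * object;
       the uintptr_t is read and written as a uintptr_t. */
    static void update_word(uintptr_t *word) {
        assert(word != NULL);
        /* Value conversion; the result is implementation-defined unless the
           integer originally came from a valid pointer. */
        char *tmp = (char *)(void *)*word;
        foo(&tmp);                        /* accesses a char * as a char * */
        *word = (uintptr_t)(void *)tmp;   /* convert the new pointer value back */
    }
    ```

    With this shape, the main from the question would call update_word(&uip) instead of foo((char **) &uip), at the cost of one temporary.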

    Please note that the special rule allowing us to inspect any object with a char lvalue access using a de-referenced char*, does not somehow magically apply "recursively" to char**. Just as the special rule that any object pointer is implicitly assignable to/from void* does not magically apply to void** as well.
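
    To illustrate what the character-type rule does cover, here is a small sketch that inspects the bytes of a uintptr_t through an unsigned char lvalue. The function name count_nonzero_bytes is hypothetical:

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical helper: walks the object representation of a uintptr_t
       one byte at a time. Reading any object through an unsigned char
       lvalue is allowed by the character-type exception. */
    static size_t count_nonzero_bytes(const uintptr_t *obj) {
        assert(obj != NULL);
        const unsigned char *bytes = (const unsigned char *)obj;
        size_t count = 0;
        for (size_t i = 0; i < sizeof *obj; i++) {
            if (bytes[i] != 0) {
                count++;
            }
        }
        return count;
    }
    ```

    The exception applies to the unsigned char accesses inside the loop. It says nothing about accessing the same object through a char * lvalue, which is what writing through a char ** does.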

    gcc and clang have very little in the way of diagnostics for strict aliasing bugs, -Wstrict-aliasing being a broken option in general. -fsanitize=undefined doesn't seem to find this kind of bug either.