c posix dlsym c-standard-library

Man page workaround for dlsym() still error prone?


I was reading the man page to dlopen(), and I stumbled on this block of code:

cosine = (double (*)(double)) dlsym(handle, "cos");

           /* According to the ISO C standard, casting between function
              pointers and 'void *', as done above, produces undefined results.
              POSIX.1-2001 and POSIX.1-2008 accepted this state of affairs and
              proposed the following workaround:

                  *(void **) (&cosine) = dlsym(handle, "cos");

              This (clumsy) cast conforms with the ISO C standard and will
              avoid any compiler warnings.

              The 2013 Technical Corrigendum 1 to POSIX.1-2008 improved matters
              by requiring that conforming implementations support casting
              'void *' to a function pointer.  Nevertheless, some compilers
              (e.g., gcc with the '-pedantic' option) may complain about the
              cast used in this program. */

I know that casting a function pointer to a void pointer and vice versa is undefined behavior. And that the standard's reasoning for making it undefined behavior is because of architectural differences where a function pointer may not be the same size as data pointer or in some cases a function pointer actually being represented with two values (so I've heard at least).

I understand how the workaround avoids undefined behavior: casting the address of cosine to void ** is really just casting a data pointer (one that points to a pointer to a function) to void **, which is perfectly valid, and it is then perfectly valid to dereference the void ** and assign to it the void * that dlsym() returns.

However, wouldn't this code be just as error prone as casting directly between the function pointer and the void pointer when the architectural quirks mentioned above are present? If so, shouldn't the standard also specify that this workaround is undefined behavior? That leads to a further question: can a non-error-prone dlsym() be implemented at all?


Solution

  • *(void **) (&cosine) = dlsym(handle, "cos");
    
    This (clumsy) cast conforms with the ISO C standard and will
    avoid any compiler warnings.
    

    No, it does not “conform.” Specifically, it is not strictly conforming code and its behavior is not defined by the C standard.

    Inferring from previous code, cosine is defined to be double (*)(double), a pointer to a function taking a double and returning a double. The above code writes to it using an lvalue of type void *. This violates the aliasing rules in C 2018 6.5 7. That paragraph says which combinations of effective type and type used for access are defined by the C standard, and accessing a pointer to a function with a void * is not among them.

    Further, the C standard does not require double (*)(double) and void * to have the same size or the same representation, so writing the bytes could, in theory, completely mess up the pointer. (However, such implementations are rarer than compilers whose optimizations exploit the aliasing rules and thereby break the program, at least from the perspective of an unwary programmer.)

    One fix would be to create a dlsymf routine that returns a pointer to any function type, as the C standard defines the behavior of converting between pointers to different function types (as long as an appropriate type is used for the actual call).
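A minimal sketch of what such a dlsymf might look like from the caller's side. This is an illustration under assumptions, not an existing API: the signature of dlsymf, the generic_fn typedef, and the call_abs helper are all hypothetical. A user-written wrapper like this one still relies internally on the void * to function pointer conversion that POSIX requires, so it only confines that conversion to one place; a real dlsymf would have to be supplied by the implementation. The sketch looks up abs in the running program via dlopen(NULL, ...) purely to avoid hard-coding a library file name, and older systems may need to link with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical generic function pointer type. The C standard defines
   converting a function pointer to another function pointer type and
   back, so this type can carry a pointer to any function. */
typedef void (*generic_fn)(void);

/* Hypothetical wrapper (not part of POSIX): returns a function pointer
   instead of a void *. The cast below is the single place that relies
   on POSIX defining the void * to function pointer conversion. */
static generic_fn dlsymf(void *handle, const char *name)
{
    return (generic_fn) dlsym(handle, name);
}

/* Looks up abs() in the running program's symbol scope and calls it. */
static int call_abs(int x)
{
    /* A NULL file name asks for a handle to the main program. */
    void *handle = dlopen(NULL, RTLD_NOW);
    assert(handle != NULL);

    generic_fn f = dlsymf(handle, "abs");
    assert(f != NULL);

    /* Convert back to the function's actual type before calling it;
       calling through the wrong type would be undefined. */
    int (*abs_fn)(int) = (int (*)(int)) f;
    int result = abs_fn(x);

    dlclose(handle);
    return result;
}
```

With this shape, callers never touch a void * that stands for a function: call_abs(-5) obtains a generic_fn, converts it to int (*)(int), and only then makes the call.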

    I know that casting a function pointer to a void pointer and vice versa is undefined behavior. And that the standard's reasoning for making it undefined behavior is because of architectural differences where a function pointer may not be the same size as data pointer or in some cases a function pointer actually being represented with two values (so I've heard at least)

    That is not the reason. The fact that two types have different sizes is not an impediment to converting between them, as evidenced by the fact that we may easily convert int to char, long long int, or double, which commonly have sizes different from int. A conversion is allowed to perform computations to produce its result. (A conversion is effectively an operation that takes a value in one type and produces, to the extent feasible, the same value in another type. It is not merely taking the operand bytes to represent a value in a new type.) The standard also requires conversions between void * and any pointer-to-object type to work, provided alignment requirements are satisfied, but those pointers can also have different sizes.
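The difference between converting a value and merely reusing its bytes can be sketched as follows. The function names are made up for illustration, and the byte-copy half assumes that int and float have the same size and that float uses an IEEE 754 style format, as on most current platforms.

```c
#include <assert.h>
#include <string.h>

/* The byte-copy below only makes sense if the two types are the same size. */
_Static_assert(sizeof(float) == sizeof(int),
               "sketch assumes int and float have the same size");

/* Conversion: the result is a double with the same value as i,
   even though double usually has a different size from int. */
static double convert_int_to_double(int i)
{
    return (double) i;
}

/* Reinterpretation: the bytes of i are copied unchanged into a float.
   No computation is performed, so the value is generally not preserved. */
static float reinterpret_int_as_float(int i)
{
    float f;
    memcpy(&f, &i, sizeof f);
    return f;
}

/* Contrasts the two operations. */
static void demo(void)
{
    assert(convert_int_to_double(3) == 3.0);      /* value preserved */
    assert(reinterpret_int_as_float(1) != 1.0f);  /* bytes preserved, value not */
}
```

The `*(void **) (&cosine) = ...` workaround behaves like the second function, while a cast behaves like the first, which is why a size difference would hurt the workaround but not a conversion.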

    I do not know what the actual reason was. Perhaps performing the conversion was seen as onerous for some C implementations with separate data and instruction spaces and unusual addressing schemes, and little benefit was perceived in requiring them to do so. But it was not due to the sizes of the pointers.

    In any case, the solution is straightforward: Define the behavior of converting a void *, particularly a void * returned by dlsym, to a pointer to a function type. Effectively, any C implementation supporting POSIX must do this, so that dlsym works. The fact that the C standard says this is undefined behavior does not mean we must leave it undefined. The standard’s meaning for “undefined behavior” is only that the standard does not impose any requirements. It does not require us to keep it undefined; we can add our own definition of what it does, and then the behavior will be defined for our C implementation.

    In fact, C 2018 J.5.7 1 notes this as a common extension:

    A pointer to an object or to void may be cast to a pointer to a function, allowing data to be invoked as a function (6.5.4).