Is the following valid in C23?
char** L = (char*[]){ "String", 0 };
char* s = * (void**) (void*) L;
The use case is that this sequence of conversions and dereference would allow me to use existing tools where I think I would otherwise need to write new tools.
If void* and char* were compatible types, it seems this would be OK. It also seems like in practice this would actually work OK. But I don't think it is standard compliant because void* and char* are not compatible types. Looking for confirmation or preferably correction on this conclusion.
EDIT 1
Based on the selected answer, it looks to me like the above is undefined but the following is defined, or at least arguably so:
typedef union type_cv { char* c; void* v; } type_cv;
char** L = (char*[]){ "String", 0 };
char* s = ( (type_cv*) (void*) L ) -> c;
I am using a void* as an intermediate conversion because the standard says practically nothing about what you get when you convert an object pointer directly to another object pointer, so this seems a little more defensible, although it still relies on some vague language.
EDIT 2
In the comments I have seen how to better articulate this question: Can I
Or do I have to:
EDIT 3
There are a few comments on the intermediate cast to void*. The motivation for this is that the standard, as far as I can tell, says nothing at all about what you get when converting a pointer to type T into a pointer to type Q, unless T and Q are a specific case that is discussed in the standard (such as when Q is a qualified version of T), except that you can convert back and compare equal when alignment requirements match. See 6.3.2.3.7.
So, you could be compliant if you set up an implementation that runs pointers through arbitrary functions when going from one type to another, so long as the functions for conversions going in opposite directions are inverses of each other. This would be a pathological implementation, but I am tasked with being fully standard compliant, so the fact that I don't have a standard guarantee is a problem.
The situation is not perfect even with an intermediate void*, but q=(type_Q*)(void*)(type_T*)t guarantees (void*)t==(void*)q (see 6.3.2.3.1 and 6.3.2.3.7, which together seem to require this result), which is not guaranteed by q=(type_Q*)t as far as I can tell.
All that said, I am 99.9% sure no real implementation would cause this to be relevant, but the 0.1% chance that some optimization assumption gets corrupted and causes an error if I don't stick to the standard guarantees as closely as possible, given the vagueness in the standard, is worth a void* cast.
EDIT 4
Following is an example that shows where this idea can matter.
I have a suite of tools where a typical format is type_s* F( void** X, size_t X_l, long(*f)(void*) ), which takes an array of void* objects given by X of length X_l and produces an object of type type_s, returning a pointer to it.
I am adding a new tool to another tool set, where to follow the paradigm there I need to use the format type_s* G( char** S, size_t S_l ) for the new function I am writing. Within this tool, I want to call F(S,S_l,g) where g can be whatever I want. If the other tool can take S, treat it as type void**, and send its void* elements to g, then g can convert them back to char* and do its thing.
This is an intersection between two different sets of tools, where it is not desirable to change the formatting conventions on either tool set, or to break from convention in the case of the tool I am currently working on. Like the poster of the selected answer pointed out in a comment, it would be best to have a consistent data type, but I am in a position where that is a major change that involves either breaking from an established convention or changing the convention on an entire set of tools just to facilitate the few cases where the tool sets overlap. So it would be ideal if I could just read some char* elements as void* elements and use what I already have.
The interchangeability of char * and void * is ill-defined by the C standard. It is stated in a footnote, but footnotes are not normative parts of the standard, and the footnote only mentions certain interchanges.
In some circumstances, char * may be implicitly converted to void * or vice-versa, such as in an assignment. However, * (void **) (void *) L does not convert a char * to a void *; it attempts to use a char * in memory as if it were a void *. When an object is defined with one type in C and accessed with a different type, that is generally a problem in C. The rule about that in C 2024 6.5 says:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— the signed or unsigned type compatible with the underlying type of the effective type of the object,
— the signed or unsigned type compatible with a qualified version of the underlying type of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
In (void **) (void *) L, L is a pointer to the first element of the array (the compound literal was automatically converted to a pointer to its first element when L was initialized), so L points to L[0], which is a char *. Then (void **) (void *) L gives us a void **. Pointer conversions are not fully defined by the C standard, but let’s say this gives us a void ** that also points to L[0], just with a different type. Then * (void **) (void *) L accesses L[0] with the type void *. (It forms a pointer of type void ** and then uses * to access the object, so the pointed-to type, void *, is used to interpret the memory of the object. Because of the *, * (void **) (void *) L is an lvalue expression with type void *.)
Since L[0] was defined with type char *, that is its effective type. For an effective type of char *, void * is not any of the types listed above. It is not compatible with char *, is not a qualified version of it, and so on. Therefore, * (void **) (void *) L violates this rule. When a “shall” rule of the C standard is violated, the behavior of the program is not defined by the C standard (clause 4).
Getting back to that footnote, C 2024 6.2.5 says:
… A pointer to void shall have the same representation and alignment requirements as a pointer to a character type.41)…
and footnote 41 says:
The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
So, when we interpret the bytes of a char * as a void *, we should get the same value. That is what having the same representation means: The meaning ascribed to the bytes is the same. You could reinterpret the bytes of L[0] in a defined way by copying them to a void *:
char **L = (char *[]){ "String", 0 };
void *T;
memcpy(&T, &L[0], sizeof T);
char *s = T;
However, your code does not do that. It attempts to read the char * directly using type void *. The footnote does not cover that: If you passed a char * to a function with variable arguments (... in its declaration), and the function read it as a void * using the standard va_arg macro, that would be an interchange covered by the footnote. However, merely making the interchange in an expression is not a use as an argument to a function, a return value from a function, or a member of a union. So it is not covered by the footnote.
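To make the variable-argument case concrete, here is a minimal sketch of the kind of interchange the footnote describes: char * values are passed through ... and read back with va_arg as void *. The helper name last_pointer and its count parameter are hypothetical, invented only for this illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: returns the last of `count` pointer arguments,
   reading every one of them as void *. Callers may pass char * values;
   the va_arg specification permits reading a pointer to a character
   type as a pointer to void. */
static void *last_pointer(int count, ...)
{
    assert(count >= 0);

    va_list ap;
    void *p = NULL;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        p = va_arg(ap, void *);   /* argument was passed as char * */
    va_end(ap);

    return p;                     /* NULL when count is 0 */
}
```

Calling last_pointer(1, s) with a char *s should return a void * that compares equal to s. This differs from the expression in the question: the pointer value travels as a function argument, and no char * object is accessed through an lvalue of type void *.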
In conclusion, the behavior of a program using the code in the question is not defined by the C standard because:
It violates the aliasing rule in C 2024 6.5 quoted above.
It is not saved by the footnote.
Footnotes are not normative anyway.
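For the F and G use case described in the question, one way to stay within defined behavior is to convert each char * value into a temporary void * array and pass that array to F. The following is only a sketch under assumptions: the body of F, the layout of type_s, and the callback g are hypothetical stand-ins, since the question gives only the signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the question's type_s. */
typedef struct type_s { long total; } type_s;

/* Hypothetical stand-in for the existing tool F: sums f over X. */
static type_s *F(void **X, size_t X_l, long (*f)(void *))
{
    type_s *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    r->total = 0;
    for (size_t i = 0; i < X_l; i++)
        r->total += f(X[i]);
    return r;
}

/* Hypothetical callback: converts the void * back to char *.
   Assumes p is a non-null pointer to a string. */
static long g(void *p)
{
    char *s = p;
    return (long)strlen(s);
}

/* New tool G: each S[i] is read as char * and its value is converted
   to void *, so no char * object is accessed with type void *. */
static type_s *G(char **S, size_t S_l)
{
    assert(S != NULL);

    void **X = malloc((S_l ? S_l : 1) * sizeof *X);
    if (X == NULL)
        return NULL;
    for (size_t i = 0; i < S_l; i++)
        X[i] = S[i];              /* defined char * to void * conversion */

    type_s *r = F(X, S_l, g);
    free(X);
    return r;
}
```

The cost is one allocation and one pass over the array per call. In exchange, both tool sets keep their conventions, and every object is accessed only through an lvalue of its own type.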