c++ language-lawyer

Where in their documents do implementations state they won't reorder black-box functions?


Consider this example:

extern void black_box_foo();
extern void black_box_bar();
int main(){
   black_box_foo();  // #1
   black_box_bar();  // #2
}

#1 and #2 call functions whose definitions are opaque to the implementation; for example, the definitions are in a static library.

[intro.abstract] p1 says:

In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.

Observable behavior is specified by this exhaustive list ([intro.abstract] p8):

The following specify the observable behavior of the program:

  • Accesses through volatile glvalues are evaluated strictly according to the rules of the abstract machine.
  • Data is delivered to the host environment to be written into files (See also: ISO/IEC 9899:2024, 7.23.3).
  • The input and output dynamics of interactive devices shall take place in such a fashion that prompting output is actually delivered before a program waits for input. What constitutes an interactive device is implementation-defined.

Taken together, these rules are known as the "as-if" rule. Under this rule, implementations can reorder anything, provided that the reordering doesn't change the observable behavior.
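To make the list concrete, here is a minimal sketch of the distinction it draws. The names `device_reg`, `scratch` and `write_sequence` are hypothetical and chosen only for illustration; `device_reg` stands in for something like a memory-mapped register.

```cpp
// Hypothetical stand-in for a memory-mapped register. Accesses through a
// volatile glvalue are observable behavior ([intro.abstract] p8).
volatile int device_reg = 0;

// Ordinary object: individual stores to it are not observable behavior.
int scratch = 0;

int write_sequence() {
    scratch = 10;    // may be dropped: it is overwritten before any read
    device_reg = 1;  // must happen, and must happen before the write of 2
    scratch = 20;    // may be moved relative to the volatile writes
    device_reg = 2;
    return scratch + device_reg;  // the read of device_reg is a volatile access too
}
```

Under the as-if rule an implementation may merge or move the two stores to `scratch`, but the two volatile writes have to occur in the order written. The question below is about calls whose bodies might contain such accesses without the compiler being able to see them.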

Without seeing the definitions of the functions, we cannot know whether they produce any of the observable behavior specified in the above list. AFAIK, almost all major implementations take the conservative approach for black-box functions: preserve the program order.

There are two questions:

  1. Can conforming implementations eliminate the function calls if these functions are written in another language, for example assembly, and do something that is not observable behavior as C++ defines it? If not, where are the relevant rules in the implementations' documents?

  2. Can conforming implementations reorder #1 and #2 when their definitions are opaque? If not, where are the relevant rules in the implementations' documents?


Solution

  • According to [dcl.link]/10...

    Linkage from C++ to entities defined in other languages and to entities defined in C++ from other languages is implementation-defined and language-dependent. Only where the object layout strategies of two language implementations are similar enough can such linkage be achieved.

    Therefore, if the black box functions are not written in C++, then from the point of view of the C++ standard it is implementation-defined what happens when a C++ implementation is used to translate a C++ program that calls such functions (and so the behaviour of such a program is also implementation-defined). It is valid for the implementation to translate such calls into no-ops only if the implementation documents this behaviour; of course, this is purely hypothetical, since no real implementation would make a choice that is so hostile to users. It is also theoretically possible for the implementation to define the implementation-defined semantics of calling black_box_foo as "do whatever black_box_foo did in its original language, but later", in which case its side effects would be reordered after those of black_box_bar. This, too, would be a highly unusual and user-hostile implementation choice, and is therefore not relevant in practice.

    But where do implementations actually document the effect of calling functions that are defined in languages other than C++ and C? I don't know the answer to that; Clang and GCC developers might either argue that "the source code is the documentation" or that the platform ABI documentation indirectly answers this question. The reality is that implementations do not really document most of what they're required to; in many cases the rule that something is "implementation-defined" is merely a strong suggestion that it be predictable (unlike operations that are described by the standard as having an "unspecified" result).

    In practice, calls to opaque foreign functions will be treated, as much as possible, as if the callees were written in C++, because that is the behaviour that users expect, and implementations are unlikely to document anything except for the deviations.

    For opaque functions that are written in C++ (and were simply translated earlier by the "same" implementation), the implementation can reorder calls if doing so doesn't change the observable behaviour, but arguably the adjective "opaque" implies that the implementation cannot know whether there is any observable behaviour. Since real-world implementations are themselves computer programs, which do exactly what they were programmed to do, there are only two likely behaviours for a real implementation: it either always treats opaque function calls as reorderable, or never does. The first implementation strategy would of course violate the standard as soon as the implementation is used on a program in which the opaque functions do have observable behaviour.
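One place where GCC and Clang do document something related is their `const` and `pure` function attributes, which let the programmer promise that a call has no side effects so that calls may be merged or dropped. This does not say where the conservative default is written down, but it suggests that relaxing it is opt-in. The sketch below assumes GCC or Clang (the `[[gnu::const]]` spelling is a GNU extension), and `black_box_square`, `ordered_calls`, `square_twice`, `call_log` and `call_count` are hypothetical names introduced for illustration. The definitions are placed in the same file only so that the example links and runs; in the scenario discussed above they would be invisible to the compiler.

```cpp
// Hypothetical stand-ins for separately compiled functions.
void black_box_foo();
void black_box_bar();

// GNU extension (assumes GCC or Clang): promises that the result depends only
// on the argument and that the call has no side effects.
[[gnu::const]] int black_box_square(int x);

static int call_log[2];
static int call_count = 0;

// No promise was made about foo and bar, so the compiler has to assume that
// each call may have side effects and keeps them in program order.
int ordered_calls() {
    call_count = 0;
    black_box_foo();  // #1
    black_box_bar();  // #2
    return call_log[0] * 10 + call_log[1];
}

// With the const attribute, the compiler may call black_box_square once and
// reuse the result; the value returned is the same either way.
int square_twice(int x) {
    return black_box_square(x) + black_box_square(x);
}

// Definitions, visible here only to make the sketch self-contained.
void black_box_foo() { call_log[call_count++] = 1; }
void black_box_bar() { call_log[call_count++] = 2; }
int black_box_square(int x) { return x * x; }
```

Calling `ordered_calls()` records foo before bar, and `square_twice(3)` yields the same value whether the compiler emits one call or two. If an attribute like `const` is attached to a function that does have side effects, the promise is broken and the compiler may legitimately drop or merge those effects.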