c++ gcc language-lawyer c++20 c++-modules

Are forward declarations of entities to be imported from modules legal?


A module (mod.cppm):

export module TestModule;

export struct Foo
{
    char c;
};

is imported into main.cpp, where the import declaration is preceded by a forward declaration of the imported entity:

#include <iostream>

struct Foo;

import TestModule;

int main()
{
    Foo f{'a'};
    std::cout << '\n' << f.c << '\n';
    return 0;
}

which doesn't compile with GCC 14.2.0, but compiles with both MSVC 2022 and Clang 20.1.5.

The GCC error message is "reference to 'Foo' is ambiguous", with notes listing the candidates 'struct Foo@TestModule' and 'struct Foo'. When the forward declaration struct Foo; is commented out of main.cpp, GCC compiles the code without problems as well.

I know that GCC support for modules is limited, but I couldn't find a reference to this issue on their page https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Modules.html. Still, GCC is often stricter in following the standard than Clang or MSVC, so it is possible that I'm missing something in the standard, not being a proper language lawyer.

So, who is right here: GCC, or Clang and MSVC?

(Note: a semi-practical reason for this question is the ability to use module-only libraries with Qt. It is possible with forward declarations of all the entities to be imported in the headers of QObject-derived classes, while the actual imports go in the corresponding .cpp file. It works with Clang and MSVC, but not with GCC.)


Solution

  • Let's work backwards from GCC's error message. It claims that the "reference to Foo is ambiguous".

    [basic.lookup.general]/1 actually does include a provision for "ambiguous" name lookup:

    Otherwise, if the declarations found by name lookup do not all denote the same entity, they are ambiguous and the program is ill-formed.

    OK: do both of the Foo declarations (the one from the module and from the current TU) denote the same entity? If they do, then it's pretty clear that this shouldn't be ambiguous.
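
    The "same entity" rule can be seen without modules. The sketch below is only an illustration inside a single translation unit, so it does not reproduce GCC's module behaviour; the names lib, alias, other and first_char are hypothetical and not from the question. Unqualified lookup of Foo finds two declarations through two using-directives, but both denote lib::Foo, so the lookup is not ambiguous.

    ```cpp
    #include <cassert>
    #include <type_traits>

    // Hypothetical namespaces, used only to give lookup two paths to one entity.
    namespace lib {
        struct Foo;                 // forward declaration of lib::Foo
    }
    namespace alias {
        using lib::Foo;             // second declaration naming the same entity
    }
    namespace other {
        struct Foo { char c; };     // a different entity that happens to share the name
    }

    namespace lib {
        struct Foo { char c; };     // definition of the entity declared above
    }

    using namespace lib;
    using namespace alias;

    // Lookup finds lib::Foo twice (directly and via alias); same entity, so no ambiguity.
    static_assert(std::is_same<Foo, lib::Foo>::value,
                  "unqualified Foo names lib::Foo");

    // Adding `using namespace other;` here would make the unqualified name Foo
    // ambiguous, because other::Foo is a distinct entity.

    // Hypothetical helper: constructs a Foo through the unqualified name.
    char first_char() {
        Foo f{'a'};
        return f.c;
    }

    // Hypothetical self-check mirroring the question's main().
    void demo() {
        assert(first_char() == 'a');
    }
    ```

    Whether the two Foo declarations in the question are in the same position as lib::Foo and alias::Foo here is exactly what the remaining steps have to establish.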

    So, how do two declarations denote the same entity? That's governed by [basic.link]/8, as follows:

    Two declarations of entities declare the same entity if, considering declarations of unnamed types to introduce their names for linkage purposes, if any ([dcl.typedef], [dcl.enum]), they correspond ([basic.scope.scope]), have the same target scope that is not a function or template parameter scope, neither is a name-independent declaration, and either

    • they appear in the same translation unit, or
    • they both declare names with module linkage and are attached to the same module, or
    • they both declare names with external linkage.

    They've got the same names, and they're in the same scope (global namespace scope; modules don't have "scopes"). So for them to name the same entity, one of the bulleted conditions must hold. #1 clearly doesn't: the declarations appear in different translation units. Module linkage only applies to entities that aren't exported (module linkage is basically private stuff for that module), so #2 is out. That only leaves #3 as an option.

    So, what exactly is the linkage for these declarations?

    [basic.link]/4 gives a very long list of rules for linkage that I'm not going to copy here. The forward declaration doesn't fit any of the bulleted rules for linkage, so it gets external linkage by default. However... so does the module declaration. It's exported, so it cannot get module linkage. So that leaves external linkage.

    They have the same linkage, so #3 above is triggered: these declarations denote the same entity. So there's no ambiguity and therefore GCC is wrong.

    Kind of.

    GCC is actually not wrong to forbid the code. Why?

    Because of [basic.link]/10 :

    If two declarations of an entity are attached to different modules, the program is ill-formed; no diagnostic is required if neither is reachable from the other.

    Well, these are two declarations of an entity, as explained above. And they are attached to different modules ("not in a module" means the global module). And if you look at the reachability rules, you'll see that neither declaration is reachable from the other, so no diagnostic is required.

    But just because a diagnostic is not required doesn't mean a compiler cannot provide one. The literal text of GCC's error is incorrect, but the standard doesn't care about the text of the diagnostic. Giving a diagnostic is perfectly valid for a C++ implementation.

    So all of the compilers are right.