Skip to content

C++ import std still puts things in global namespace? #90101

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
rnikander opened this issue Apr 25, 2024 · 33 comments
Closed

C++ import std still puts things in global namespace? #90101

rnikander opened this issue Apr 25, 2024 · 33 comments
Assignees
Labels
clang:modules C++20 modules and Clang Header Modules libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. worksforme Resolved as "works for me"

Comments

@rnikander
Copy link

This code fails to compile. Shouldn't it work?

import std;
namespace ns {
    double fog(double d) { return d; }
}
int fog;   // Okay - no conflict with ns::fog
int log;   // Fails - conflicts with std::log

I notice this because I'm always trying to declare these file-local log objects (writing messages to a file, not the mathematical function) and I hit this conflict.

@EugeneZelenko EugeneZelenko added clang:modules C++20 modules and Clang Header Modules and removed new issue labels Apr 25, 2024
@llvmbot
Copy link
Member

llvmbot commented Apr 25, 2024

@llvm/issue-subscribers-clang-modules

Author: None (rnikander)

This code fails to compile. Shouldn't it work? ``` import std; namespace ns { double fog(double d) { return d; } } int fog; // Okay - no conflict with ns::fog int log; // Fails - conflicts with std::log ```

I notice this because I'm always trying to declare these file-local log objects (writing messages to a file, not the mathematical function) and I hit this conflict.

@rnikander
Copy link
Author

I'm not finding the docs (if there are any) on how to prepare the std.pcm. Maybe I did it wrong and it needs a flag I didn't pass. I precompiled std.cppm with the same compiler flags I used for my own modules.

@cor3ntin cor3ntin added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Apr 25, 2024
@mordante mordante self-assigned this Apr 25, 2024
@mordante
Copy link
Member

It fails for me too. Can you post your compiler output; I'd like to make sure we look at the same failure.

@rnikander
Copy link
Author

Here:

main.cc:6:5: error: redefinition of 'log' as different kind of symbol
    6 | int log;
      |     ^
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.4.sdk/usr/include/math.h:374:15: note: previous definition is here
  374 | extern double log(double);
      |               ^
1 error generated.

By the way, MS's compiler is doing the same thing to me.

It seems to me that the right thing to do here would be to have log hidden in std:: when you import std, but exposed if you import std.compat. That seems to be the point of std.compat. ?

@ChuanqiXu9
Copy link
Member

This is slightly complex to explain. But my conclusion is, the behavior of the compiler to the reproducer is "correct" and I don't know how libc++ can fix this. Let's talk about the compiler first and the library second.

For this issue, the compiler are saying "you're defining two different '::log', this violates the ODR rule". This is correct. The reproducer defines two different entities with the same name in the same program (the std library is part of that program). Here the ::log in std module is defined in the global module fragment and it should always be there since std::log simply reuses that.

This may be annoying but image this in the header's era, if we include <cmath> in some header and we define ::log in some other TU without including <cmath>, then we would be in ODR violation silently. So I see this as an improvement with modules.

For libc++, it is hard to fix. The fundamental reason is that, the implementation of <cmath> depends on <math.h>. Then std module depends on <math.h>. The only technical solution may be, implement std::log actually in <cmath> and make <math.h> using std::log in global namespace. Then the std module can get rid of <math.h>. However, I feel this is not feasible for libc++.

@ChuanqiXu9
Copy link
Member

Also if you're interested, you can take a look at #90154 for a similar case

@philnik777
Copy link
Contributor

philnik777 commented Apr 26, 2024

If we wanted to fix this, I think we could relatively easily. We're already defining our own set of the math functions, which we could include directly and avoid the libcs math.h altogether. Then we'd change using ::log to using std::__math::log. I'm just not sure we really want to do that, since it's trivial to regress on this.

@cor3ntin
Copy link
Contributor

I am not sure the standard as specified is implementable because of the fact aliases don't introduce new names?

https://eel.is/c++draft/support.c.headers#other-1.sentence-2

It is unspecified whether these names are first declared or defined within namespace scope ([basic.scope.namespace]) of the namespace std and are then injected into the global namespace scope by explicit using-declarations ([namespace.udecl])

https://eel.is/c++draft/library#std.modules-2

The named module std exports declarations in namespace std that are provided by the importable C++ library headers (Table 24 or the subset provided by a freestanding implementation) and the C++ headers for C library facilities (Table 25). It additionally exports declarations in the global namespace for the storage allocation and deallocation functions that are provided by .
The named module std.compat exports the same declarations as the named module std, and additionally exports declarations in the global namespace corresponding to the declarations in namespace std that are provided by the C++ headers for C library facilities (Table 25), except the explicitly excluded declarations described in [support.c.headers.other].

The intent behind the std.compat split is that import std would not add names to the global namespace, but can we achieve that reliably?

@philnik777 solution would probably work but would not that cause ABI issues?
(alternatively, if maths functions are defined in std and aliased in c headers, I think it would work too)

@philnik777
Copy link
Contributor

@cor3ntin As I understand it, if you import std; you can't use ::log, but there is a symbol clash (I don't have things set up so I can't check). Not sure what ABI issues you foresee that don't exist already?

@ChuanqiXu9
Copy link
Member

@cor3ntin As I understand it, if you import std; you can't use ::log, but there is a symbol clash (I don't have things set up so I can't check). Not sure what ABI issues you foresee that don't exist already?

I guess what @cor3ntin says is, since that solution may change the implementation of cmath and math.h, the symbol of the log function in libc++.so will be different, IIUC.

@cor3ntin
Copy link
Contributor

@cor3ntin As I understand it, if you import std; you can't use ::log, but there is a symbol clash (I don't have things set up so I can't check).

The intent was not to pollute the global namespace at all. Presumably (in part) so that users could then put their own names there. The other motivation if i recall was to avoid ADL issues (which we achieve).
I don't recall implementation challenges to have been discussed in WG21. It might be worth to clarify.

Not sure what ABI issues you foresee that don't exist already?

You suggested putting the definition in eg std::__math::log. Wouldn't that affect mangling? I guess you have way to work around that?

@mordante
Copy link
Member

I'm a bit surprised this is considered "as intended". I thought modules were intended to encapsulate dependencies and not leak them to those who import modules. Should the module's ::log not be attached to the std module and not be available to code that imports the std module?

I don't think moving things to std::__math::log is not a real solution; every C library vendor has to agree.

@ChuanqiXu9
Copy link
Member

I'm a bit surprised this is considered "as intended". I thought modules were intended to encapsulate dependencies and not leak them to those who import modules.

Yes, but that is for declarations in module purview. For example,

export module a;
int a_impl = 43;
export int get_a() { return a_impl; }

// some.cc
import a;
double a_impl = 44.0; // should be fine

In the above example, the definition of a_impl in module 'a' (maybe it will be clear to say a_impl@a to note that this is not the same thing with a_impl) shouldn't affect the definition of a_impl in some.cc since they are different entities. (Although clang doesn't implement this well..)

But it is a different story of std modules in libc++ now since they actually a wrapper of headers.

Should the module's ::log not be attached to the std module and not be available to code that imports the std module?

From the current implementation of libc++, the ::log must be part of std module even if we ignore the implementation details in clang. Since from the spec's wording, the ::log here is considered as reachable. So it must be in the BMI of std module. Then the compiler can see, oh there is a ::log and here is another ::log, they should refer to the same entity but their definition is different, so the compiler need to give diagnostics.

I don't think moving things to std::__math::log is not a real solution; every C library vendor has to agree.

@ChuanqiXu9
Copy link
Member

I am wondering if we can fix it by:

// cmath
// #include <math.h>

namespace std {
double log(double);
}
// cmath.cc
#include <cmath>
#include <math.h>

double std::log(double) {
     return ::log(double);
}

Then point is, we don't need to touch <math.h> and avoid the header <cmath> to dependent on <math.h>. We introduced a new symbol std::log but didn't touch the previous one, which may not be ABI breaking.

Do you think if this is feasible? Or otherwise we can only say this may not be implementable and raise some complaints in WG21.

@philnik777
Copy link
Contributor

I'm pretty sure you've misunderstood what I suggested. We already have inline std::__math versions of all the <cmath> functions. They simply call the __builtin_ versions, which means we don't need any declaration in the global namespace. The only thing we'd have to do is change using std::whatever to using std::__math::whatever in the std module and not include <cmath>. No need for the libc to do anything or any problems with ABI.

@cor3ntin
Copy link
Contributor

@philnik777 Oh! excellent. Then it might be a good (or at least conforming) way forward

@mordante
Copy link
Member

mordante commented May 1, 2024

I'm a bit surprised this is considered "as intended". I thought modules were intended to encapsulate dependencies and not leak them to those who import modules.

Yes, but that is for declarations in module purview. For example,

export module a;
int a_impl = 43;
export int get_a() { return a_impl; }

// some.cc
import a;
double a_impl = 44.0; // should be fine

In the above example, the definition of a_impl in module 'a' (maybe it will be clear to say a_impl@a to note that this is not the same thing with a_impl) shouldn't affect the definition of a_impl in some.cc since they are different entities. (Although clang doesn't implement this well..)

But it is a different story of std modules in libc++ now since they actually a wrapper of headers.

Should the module's ::log not be attached to the std module and not be available to code that imports the std module?

From the current implementation of libc++, the ::log must be part of std module even if we ignore the implementation details in clang. Since from the spec's wording, the ::log here is considered as reachable. So it must be in the BMI of std module. Then the compiler can see, oh there is a ::log and here is another ::log, they should refer to the same entity but their definition is different, so the compiler need to give diagnostics.

Thanks that makes it clear why this happens. I'm still not convinced the std::__math is the way forward. This would fix ::log, but I expect similar issues with other named declarations in the C headers.

I want to investigate further what's possible.

@mordante
Copy link
Member

mordante commented May 9, 2024

I did some additional tests, including adding a layer of indirection. That however did not work.
@ChuanqiXu9 IIRC a while ago you mention discarding declarations was not implemented in Clang. Is that still an open task?

@ChuanqiXu9
Copy link
Member

ChuanqiXu9 commented May 10, 2024

I did some additional tests, including adding a layer of indirection. That however did not work. @ChuanqiXu9 IIRC a while ago you mention discarding declarations was not implemented in Clang. Is that still an open task?

We may have that by -fexperimental-modules-reduced-bmi (https://clang.llvm.org/docs/StandardCPlusPlusModules.html#reduced-bmi). But I don't think we can discard things like ::log since they are techniquely reachable and shouldn't be discarded. What is your design?

@iains
Copy link
Contributor

iains commented May 10, 2024

I did some additional tests, including adding a layer of indirection. That however did not work. @ChuanqiXu9 IIRC a while ago you mention discarding declarations was not implemented in Clang. Is that still an open task?

We may have that by -fexperimental-modules-reduced-bmi (https://clang.llvm.org/docs/StandardCPlusPlusModules.html#reduced-bmi). But I don't think we can discard things like ::log since they are techniquely reachable and shouldn't be discarded.

If my memory is correct, GMF decls can be elided if they are unused so whether ::log is reachable might not be the concern?

@ChuanqiXu9
Copy link
Member

If my memory is correct, GMF decls can be elided if they are unused so whether ::log is reachable might not be the concern?

But if we're using std::log in the module purview and std::log using the ::log, the ::log should be considered as used.

@iains
Copy link
Contributor

iains commented May 11, 2024

If my memory is correct, GMF decls can be elided if they are unused so whether ::log is reachable might not be the concern?

But if we're using std::log in the module purview and std::log using the ::log, the ::log should be considered as used.

yes, indeed - in that case it cannot be elided.

@mordante
Copy link
Member

I did some additional tests, including adding a layer of indirection. That however did not work. @ChuanqiXu9 IIRC a while ago you mention discarding declarations was not implemented in Clang. Is that still an open task?

We may have that by -fexperimental-modules-reduced-bmi (https://clang.llvm.org/docs/StandardCPlusPlusModules.html#reduced-bmi). But I don't think we can discard things like ::log since they are techniquely reachable and shouldn't be discarded. What is your design?

I had no real design. I just wanted to experiment with a separate library while reading the Standard to better understand what the real issue was. I just noticed some parts that were not reachable also caused compilation errors. These issues were indeed fixed by -fexperimental-modules-reduced-bmi.

So I agree the current design works as intended and I don't see anything libc++ or clang can do to fix this.
So I'll close this bug report.

@mordante mordante closed this as not planned Won't fix, can't repro, duplicate, stale May 14, 2024
@EugeneZelenko EugeneZelenko added the worksforme Resolved as "works for me" label May 14, 2024
@ChuanqiXu9
Copy link
Member

I just noticed some parts that were not reachable also caused compilation errors. These issues were indeed fixed by -fexperimental-modules-reduced-bmi.

But the reported case ::log should be reachable IIRC, right?

@mordante
Copy link
Member

I just noticed some parts that were not reachable also caused compilation errors. These issues were indeed fixed by -fexperimental-modules-reduced-bmi.

But the reported case ::log should be reachable IIRC, right?

yes as far as I can see it should be. Therefore I closed the report.

I was just very surprised by this behaviour, so I wanted to look at bit closer at the details and the exact wording in the Standard.

@cor3ntin
Copy link
Contributor

I am very unclear that this is the case. Let me ask LWG

@ChuanqiXu9
Copy link
Member

ChuanqiXu9 commented May 17, 2024

I just noticed some parts that were not reachable also caused compilation errors. These issues were indeed fixed by -fexperimental-modules-reduced-bmi.

But the reported case ::log should be reachable IIRC, right?

yes as far as I can see it should be. Therefore I closed the report.

I am not sure if I read correctly. I mean, if we enabled -fexperimental-modules-reduced-bmi, will the compiler accept the reproducer? If yes, I'll be surprised.


I just tried myself. With -fexperimental-modules-reduced-bmi, the reproducer still can't be accepted. And the behavior looks fine from the compiler side, since ::log is reachable then we shouldn't elide it.

In this case, I think we shouldn't close the issue. Especially we're still not very clear what's the standard behavior from the spec. (Although personally I think we should accept the reproducer.)

@mordante
Copy link
Member

Since our implementation is conforming (which is reaffirmed on the reflector) I feel we can keep this closed. Discarding unreachable declarations does not fix the issue. It requires changing the Standard.

(I still feel it would be great if this worked.)

@rnikander
Copy link
Author

Since our implementation is conforming (which is reaffirmed on the reflector) ...

Can you explain this a bit more? I've been using C++ a lot lately, but I don't understand it as deeply as you all do. I don't fully understand all the terms used in these posts. To me it feels like I have a thing with a "public" interface – the std module with the std:: namespace and a std::log function in that namespace. Then I go to import all that, and something that should be a private implementation detail pollutes and conflicts with my code. The private detail is the fact that std::log is implemented by a function named ::log.

To me, that should be hidden. I shouldn't have to think about it.

Is that accurate, or would you describe it differently?

@ChuanqiXu9
Copy link
Member

Since our implementation is conforming (which is reaffirmed on the reflector) I feel we can keep this closed. Discarding unreachable declarations does not fix the issue. It requires changing the Standard.

(I still feel it would be great if this worked.)

Can you elaborate on that? For example, I didn't see relevant discussion in EWG's reflector. Also it will be helpful to share the text about why it is conforming and why we need to change the Standard if we want to do that.

@jyknight
Copy link
Member

To me, that should be hidden. I shouldn't have to think about it.

That might be nice, but it's not the case.

Modules cannot prevent all conflicts -- in order to prevent conflicts, implementations must (still) use unique names for their global functions (generally achieved by putting everything into a unique namespace). Perhaps in an ideal world, the C++ standard library would also only depend on names in the "std" namespace, and not global names -- leaving the global names available for users. But, in the world we have, the C++ library is built upon a foundation laid by the C library -- and the C library defines a lot of names in the global namespace (such as "log").

So, even if the code in the initial example were to successfully compile, it would be broken at link-time/runtime, because your global variable symbol "log" would conflict with the C library's global function symbol "log". Nothing good can come of that.

@mordante
Copy link
Member

Since our implementation is conforming (which is reaffirmed on the reflector) I feel we can keep this closed. Discarding unreachable declarations does not fix the issue. It requires changing the Standard.
(I still feel it would be great if this worked.)

Can you elaborate on that? For example, I didn't see relevant discussion in EWG's reflector. Also it will be helpful to share the text about why it is conforming and why we need to change the Standard if we want to do that.

The discussion was on the LWG reflector.

The relevant part in the Standard is [extern.names]/4

Each function signature from the C standard library declared with external linkage is reserved to the implementation for use as a function signature with both extern "C" and extern "C++" linkage,159 or as a name of namespace scope in the global namespace.

Can you explain this a bit more? I've been using C++ a lot lately, but I don't understand it as deeply as you all do. I don't fully understand all the terms used in these posts.

So basically the program below is not conforming:

int log = 0; 
int main(){ return log; }

I expect this to work properly in practice, but it's not conforming. Which GCC is happy to tell you. I hope that makes it clear to you @rnikander .

@ChuanqiXu9
Copy link
Member

Oh, fine enough.

So we can see modules help clang to implement this check : )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:modules C++20 modules and Clang Header Modules libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. worksforme Resolved as "works for me"
Projects
None yet
Development

No branches or pull requests

9 participants