@matts1
Contributor

@matts1 matts1 commented Mar 11, 2025

When building libcxx as a module, it fails to build because it's missing various definitions.

Consider the following code for module A:

#include <module B that includes __config_site>
#include <__config>
#include <module C that includes __config>

With the module map file:

module std {
   module A { header "a.h" }
   module B { header "b.h" }
   module C { header "c.h" }
}

Macro visibility rules state that "[macros] are visible if they are from the current submodule or translation unit, or if they were exported from a submodule that has been imported."

  • Module A has visibility of all macros exported by modules B and C (because they were exported from an imported submodule)
  • Modules B and C have visibility of all macros exported by module A (because they are from the current submodule)

  1. When we #include the first line, module B successfully includes __config_site.

Module B exports: __config_site contents, __config_site header guard

  2. Module A now includes <__config> directly.

Because it can see module B's exports, it stops at the header guard for __config_site and does not include __config_site. This is not an issue, however, because it has visibility of everything exported from __config_site thanks to module B.

Module A exports: __config contents, __config header guard, __config's includes' contents and header guards (except __config_site)

  3. Module C now includes __config. Based on the above visibility rules, it can see everything exported from module A.

Because of that, we see the header guard for __config and do not import it at all. This is mostly OK, as we can see the __config that module A exported. However, if we ever try to access __config_site, it will fail, as A did not export __config_site.
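
The three steps above can be mirrored in a toy model. The following C++ sketch is only an illustration of the reasoning in this description: the `Submodule` type, the hand-written `sees` edges, and the macro names (`CONFIG_SITE_GUARD`, `CONFIG_SITE_MACRO`, `CONFIG_GUARD`, `CONFIG_MACRO`) are all made up for the example and do not reflect how Clang tracks macro visibility internally.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy model (hypothetical): a submodule exports some macros and can see the
// exports of the submodules listed in `sees`.
struct Submodule {
    std::set<std::string> exported_macros;
    std::set<std::string> sees;
};

using ModuleGraph = std::map<std::string, Submodule>;

// A macro is visible in `current` if `current` exports it itself, or if a
// submodule that `current` can see exports it. Visibility is not transitive.
bool is_visible(const ModuleGraph& graph, const std::string& current,
                const std::string& macro) {
    const Submodule& self = graph.at(current);
    if (self.exported_macros.count(macro) != 0) return true;
    for (const std::string& other : self.sees) {
        if (graph.at(other).exported_macros.count(macro) != 0) return true;
    }
    return false;
}

// Encodes the walkthrough: B owns the __config_site macros, A owns the
// __config macros (but not __config_site), and C only sees A.
ModuleGraph build_walkthrough() {
    ModuleGraph graph;
    graph["B"].exported_macros = {"CONFIG_SITE_GUARD", "CONFIG_SITE_MACRO"};
    graph["A"].sees = {"B"};
    graph["A"].exported_macros = {"CONFIG_GUARD", "CONFIG_MACRO"};
    graph["C"].sees = {"A"};
    return graph;
}

// Runs the three observations from the description against the model.
void check_walkthrough() {
    const ModuleGraph graph = build_walkthrough();
    assert(is_visible(graph, "A", "CONFIG_SITE_MACRO"));   // step 2: A sees B's exports
    assert(is_visible(graph, "C", "CONFIG_GUARD"));        // step 3: C skips __config
    assert(!is_visible(graph, "C", "CONFIG_SITE_MACRO"));  // step 3: __config_site is missing
}
```

Calling `check_walkthrough()` from a `main` function exercises the model. The last assertion is the failure mode described here: C is blocked by the `__config` guard yet cannot see anything that came from `__config_site`.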

@matts1 matts1 requested a review from a team as a code owner March 11, 2025 06:04

@llvmbot added the libc++ label (libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.) Mar 11, 2025
@llvmbot
Member

llvmbot commented Mar 11, 2025

@llvm/pr-subscribers-libcxx

Author: Matt (matts1)

Changes

When building libcxx as a module, it fails to build because it's missing various definitions.

Consider the following code for module A:

#include <module B that includes __config_site>
#include <__config>
#include <module C that includes __config>

With the module map file:

module std {
   module A { header "a.h" }
   module B { header "b.h" }
   module C { header "c.h" }
}

Macro visibility rules state that "[macros] are visible if they are from the current submodule or translation unit, or if they were exported from a submodule that has been imported."

  • Module A has visibility of all macros exported by modules B and C (because they were exported from an imported submodule)
  • Module B and C have visibility over all macros exported by module A (because they are from the current submodule)

  1. When we #include the first line, module B successfully includes __config_site.

Module B exports: __config_site contents, __config_site header guard

  2. Module A now includes <__config> directly

Because it can see module B's exports, it stops at the header guard for __config_site and does not include __config_site. This is not an issue, however, because it has visibility onto everything exported from __config_site thanks to module B.

Module A exports: __config contents, __config header guard, __config's includes' content and header guard (except __config_site)

  3. Module C now includes __config. Based on the above visibility rules, it can see everything exported from module A.

Because of that, we see the header guard for __config and do not import it at all. This is mostly OK, as we can see the __config that module A exported. However, if we ever try and access __config_site, it will fail, as A did not export __config_site.


Full diff: https://github.com/llvm/llvm-project/pull/130723.diff

1 Files Affected:

  • (modified) libcxx/include/module.modulemap (+8-7)
diff --git a/libcxx/include/module.modulemap b/libcxx/include/module.modulemap
index b9964dac84acd..b719f6d69273d 100644
--- a/libcxx/include/module.modulemap
+++ b/libcxx/include/module.modulemap
@@ -1,13 +1,14 @@
 // This module contains headers related to the configuration of the library. These headers
 // are free of any dependency on the rest of libc++.
 module std_config [system] {
-  textual header "__config"
-  textual header "__configuration/abi.h"
-  textual header "__configuration/availability.h"
-  textual header "__configuration/compiler.h"
-  textual header "__configuration/language.h"
-  textual header "__configuration/platform.h"
-  textual header "version"
+  header "__config"
+  header "__configuration/abi.h"
+  header "__configuration/availability.h"
+  header "__configuration/compiler.h"
+  header "__configuration/language.h"
+  header "__configuration/platform.h"
+  header "version"
+  export *
 }
 
 module std_core [system] {

@matts1 matts1 force-pushed the push-muqtuykpvmlp branch from 19092b4 to 7be9524 Compare March 11, 2025 06:05
@mordante changed the title from "[libcxx] Fix libcxx config to be non-textual." to "[libcxx][clang-modules] Fix headers being marked as textual" Mar 11, 2025
@mordante
Member

I'm not too familiar with this part. I've adjusted the description; the previous one looked odd, with the naming and no context.

@matts1
Contributor Author

matts1 commented Mar 17, 2025

@atetubou Could you please review this, it seems like no-one is taking a look.

@matts1 matts1 force-pushed the push-muqtuykpvmlp branch from 7be9524 to 21b6a64 Compare March 17, 2025 13:51
@atetubou
Contributor

I'm still not sure what the problem is with using textual specifiers in the module. Which part of the following explanation is relevant to this change?

When building libcxx as a module, it fails to build because it's missing various definitions.

Consider the following code for module A:

#include <module B that includes config_site>
#include <config>
#include <module C that includes config>

With the module map file:

module std {
   module A { header "a.h" }
   module B { header "b.h" }
   module C { header "c.h" }
}

Macro visibility rules state that "[macros] are visible if they are from the current submodule or translation unit, or if they were exported from a submodule that has been imported."

  • Module A has visibility of all macros exported by modules B and C (because they were exported from an imported submodule)
  • Module B and C have visibility over all macros exported by module A (because they are from the current submodule)
  1. When we #include the first line, module B successfully includes config_site.

Module B exports: config_site contents, config_site header guard

  2. Module A now includes directly

Include directly what?

Because it can see module B's exports, it stops at the header guard for config_site and does not include config_site. This is not an issue, however, because it has visibility onto everything exported from config_site thanks to module B.

Module A exports: config contents, config header guard, config's includes' content and header guard (except config_site)

  3. Module C now includes config. Based on the above visibility rules, it can see everything exported from module A.

Because of that, we see the header guard for config and do not import it at all. This is mostly OK, as we can see the config that module A exported. However, if we ever try and access config_site, it will fail, as A did not export config_site.

@atetubou
Contributor

@atetubou Could you please review this, it seems like no-one is taking a look.

I actually don't have commit access to the LLVM repository, so we need to wait for a review from someone who has commit access here anyway.

@mordante
Member

@atetubou Could you please review this, it seems like no-one is taking a look.

I actually don't have commit access to the LLVM repository, so we need to wait for a review from someone who has commit access here anyway.

Not only that, but patches for libc++ should be reviewed by the libcxx-reviewers.

Member

@ldionne ldionne left a comment


Thanks for the patch. I'm trying to understand the issue you're running into (I read the PR description a few times), and I'm struggling to understand how you run into this issue (especially why we don't run into it in our current modules build). Could you provide a reproducer for this issue? How do you come across it? You mention

When building libcxx as a module

but it's not clear to me what that means. Additionally, the modules CI job is now failing with issues like:

 /.../libcxx/test/libcxx/libcpp_version.gen.py/experimental/iterator.compile.pass.cpp:8:3: error: <experimental/iterator> does not seem to define _LIBCPP_VERSION
  # |     8 | # error <experimental/iterator> does not seem to define _LIBCPP_VERSION
  # |       |   ^
  # | 1 error generated.

There are also some XPASSes in tests like std/localization/locale.categories/category.monetary/locale.money.put/locale.money.put.members/put_long_double_ru_RU.pass.cpp. After investigating and reproducing locally (by applying your patch), what's happening with these XPASSes is that the testing configuration incorrectly defines the win32-broken-utf8-wchar-ctype Lit feature, because _LIBCPP_HAS_LOCALIZATION is not detected in the compiler macros here.

All in all, it seems like the net effect of making this change is that we don't export the __config_site macros out of libc++ anymore. Perhaps there's a simple fix, but I tried a couple of things locally with no success.

I don't have a problem with this patch as a matter of principle, but I'd like to understand what it fixes, add a test for that, and make sure that it doesn't break other stuff.

CC @ian-twilightcoder for general modules expertise

@ian-twilightcoder
Contributor

None of these should be textual because they have declarations, which becomes a problem because their declarations build into all of their (transitive) includers' modules. However, we shouldn't fuse them by putting them all in the same submodule, and this will cause problems with __config_site I think, which is its own difficulty.

@matts1 matts1 force-pushed the push-muqtuykpvmlp branch from 21b6a64 to a050cf2 Compare March 20, 2025 00:53
@matts1 matts1 force-pushed the push-muqtuykpvmlp branch from a050cf2 to 28e79c6 Compare March 20, 2025 01:42
@matts1
Contributor Author

matts1 commented Mar 20, 2025

Could you provide a reproducer for this issue? How do you come across it?

I'm on the Chrome build team, and we're trying to resolve #127012. We want to build libc++ into .pcm files, but this is currently not possible. We have a build file that shows the general idea.

The reason this is not occurring in the modules CI builder is that the modules CI builder doesn't do this (I don't know what it does, but I know this is currently unsupported).

All in all, it seems like the net effect of making this change is that we don't export the __config_site macros out of libc++ anymore. Perhaps there's a simple fix, but I tried a couple of things locally with no success.

That's correct. I believe I've now fixed it though.

I don't have a problem with this patch as a matter of principle, but I'd like to understand what it fixes, add a test for that, and make sure that it doesn't break other stuff.

It is currently impossible to write a test for this, since even with this fix it still won't work (it needs several different patches, all of which need to be submitted before it starts working). I'm hoping you'll be satisfied with "it doesn't break anything existing, and @ian-twilightcoder says it shouldn't be textual, so the current definition is incorrect".

FWIW, with this patch (and a few others, eg #132125), I've successfully built the whole of Chrome.

Member

@ldionne ldionne left a comment


None of these should be textual because they have declarations, which becomes a problem because their declarations build into all of their (transitive) includers' modules.

@ian-twilightcoder Correction, they don't contain declarations -- macros only. That may not change the fact that these shouldn't be textual, but I still wanted to clarify.

I'm on the Chrome build team, and we're trying to resolve #127012. We want to build libc++ into .pcm files, but this is currently not possible. We have a build file that shows the general idea.

Thanks, that's very helpful context.

There's a few places where you folks mentioned "explicit" modules -- what do you mean by that? Do you mean just Clang's -fmodules flag, or something else? I usually hear -fmodules referred to as "just Clang Modules", and I wonder if "explicit" modules is something else I should pay attention to?

I'm hoping you'll be satisfied with "it doesn't break anything existing, and @ian-twilightcoder says it shouldn't be textual, so the current definition is incorrect".

Having green CI tells me that this is at least not a regression as far as we can tell, so yes, that's helpful. And having someone else with modules expertise chime in and confirm that we want to eliminate textual headers as much as possible is also helpful. I still do think that if this doesn't fail in our CI, then there must be a way to compile libc++'s own code without this patch. For example, you could probably switch to using public includes only from libc++'s own source files and I believe that would fix this issue. That being said, I am also happy (probably happier, even) with actually fixing our modulemap, so this LGTM assuming CI passes.

@ian-twilightcoder
Contributor

None of these should be textual because they have declarations, which becomes a problem because their declarations build into all of their (transitive) includers' modules.

@ian-twilightcoder Correction, they don't contain declarations -- macros only. That may not change the fact that these shouldn't be textual, but I still wanted to clarify.

Macros count as declarations, at least in the pcm/ast sense.

There's a few places where you folks mentioned "explicit" modules -- what do you mean by that? Do you mean just Clang's -fmodules flag, or something else? I usually hear -fmodules referred to as "just Clang Modules", and I wonder if "explicit" modules is something else I should pay attention to?

Presumably that means -fno-implicit-module-maps, described a bit at https://developer.apple.com/documentation/xcode/building-your-project-with-explicit-module-dependencies

@ian-twilightcoder
Contributor

ian-twilightcoder commented Mar 20, 2025

I still feel like this is very likely to break something if we don't modularize __config_site though. I have no idea how we can deal with LIBCXX_GENERATED_INCLUDE_TARGET_DIR != LIBCXX_GENERATED_INCLUDE_DIR to make that happen. @matts1 do you think you could look at that? I would feel a lot more comfortable signing off on this if __config_site was modular.

@matts1
Contributor Author

matts1 commented Mar 21, 2025

It's definitely not possible to do in this modulemap file (since you need a relative path). You might be able to do it by having __config_site.in generate both $DIR/__config_site and $DIR/module.modulemap, to keep the relative paths stable.

That being said, I believe it's perfectly safe since:

  • __config_site doesn't contain any non-macro declarations
  • __config_site doesn't include any other headers

Because of the above reasons, if you can ever see a header guard that blocks you from importing __config_site, you can always see all the other macros in __config_site as well.
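
As a rough illustration of that argument, here is a sketch in plain preprocessor terms. It only covers a single, non-modular translation unit; it does not model Clang's module visibility rules. The `DEMO_*` macro names are made up and stand in for whatever a generated `__config_site` would define.

```cpp
// Hypothetical stand-in for a generated __config_site: macros only, no includes.
#ifndef DEMO_CONFIG_SITE_GUARD
#define DEMO_CONFIG_SITE_GUARD
#define DEMO_HAS_LOCALIZATION 1
#endif

// A second textual inclusion of the same contents is skipped by the guard,
// so this alternative definition is never reached.
#ifndef DEMO_CONFIG_SITE_GUARD
#define DEMO_CONFIG_SITE_GUARD
#define DEMO_HAS_LOCALIZATION 0
#endif

// The guard and the configuration macro come from the same place, so seeing
// one without the other would be an error in this simplified setting.
#if defined(DEMO_CONFIG_SITE_GUARD) && !defined(DEMO_HAS_LOCALIZATION)
#error "guard is visible but the configuration macro is not"
#endif

// Exposes the macro value so that it can be checked at compile time.
constexpr int demo_has_localization() { return DEMO_HAS_LOCALIZATION; }

static_assert(demo_has_localization() == 1,
              "the definition from the first inclusion is the one in effect");
```

The sketch compiles only if the guard and the macro travel together, which is the property being claimed for `__config_site` here.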

And yes, @ian-twilightcoder is correct - when we say explicit modules, we're referring to -fno-implicit-module-maps.

Also FYI, I'm gonna be on holiday for the next 3 weeks, so will be unable to respond.

@ian-twilightcoder
Contributor

It's definitely not possible to do in this modulemap file (since you need a relative path). You might be able to do it by having __config_site.in generate both $DIR/__config_site and $DIR/module.modulemap, to keep the relative paths stable.

That being said, I believe it's perfectly safe since:

It's definitely not safe. As a non-modular include, you'll get weird absorption behavior. It might work if it was textual but then you re-introduce the exact problem you're solving here. What is LIBCXX_GENERATED_INCLUDE_TARGET_DIR actually used for? The problem is that __config_site is in an extra directory in the build directory, but later gets installed next to all of the other headers. We would need the module map to have different contents in the build directory (with header "relpath(LIBCXX_GENERATED_INCLUDE_TARGET_DIR, LIBCXX_GENERATED_INCLUDE_DIR)/__config_site") than it gets when installed (header "__config_site")
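
To make the two module map variants concrete, here is a small C++ sketch that computes the `header "..."` line from a pair of directories. It is an illustration under assumptions: the function name `config_site_header_line` and the directory paths in the usage note are hypothetical, and nothing in this thread establishes that Clang would accept the resulting relative path in a module map.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Returns the `header "..."` line for __config_site, expressed relative to
// the directory that holds module.modulemap. Both arguments are hypothetical
// inputs standing in for LIBCXX_GENERATED_INCLUDE_DIR and
// LIBCXX_GENERATED_INCLUDE_TARGET_DIR.
std::string config_site_header_line(const std::filesystem::path& modulemap_dir,
                                    const std::filesystem::path& config_site_dir) {
    namespace fs = std::filesystem;
    // lexically_relative is a purely textual computation; it does not touch
    // the file system, so the directories do not need to exist.
    const fs::path relative_dir = config_site_dir.lexically_relative(modulemap_dir);
    // lexically_normal turns "./__config_site" into "__config_site".
    const fs::path header = (relative_dir / "__config_site").lexically_normal();
    return "header \"" + header.generic_string() + "\"";
}

// Shows the installed case (same directory) and a build-directory case.
void check_header_lines() {
    assert(config_site_header_line("/demo/include/c++/v1", "/demo/include/c++/v1") ==
           "header \"__config_site\"");
    assert(config_site_header_line("/demo/include/c++/v1",
                                   "/demo/include/demo-target/c++/v1") ==
           "header \"../../demo-target/c++/v1/__config_site\"");
}
```

When both directories are the same, as in the installed layout, the result is `header "__config_site"`. When they differ, the line contains a relative path, which is the difference between the build-directory and installed module maps described above.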

@ldionne
Member

ldionne commented Mar 25, 2025

What is LIBCXX_GENERATED_INCLUDE_TARGET_DIR actually used for?

It's used to ship differently-configured libc++'s on different targets. We have:

include/c++/v1/
              vector
              string
              etc..
include/c++/v1/aarch64-linux-whatever/
              __config_site
include/c++/v1/x86_64-linux-whatever/
              __config_site

The compiler then adds -isystem <ROOT>/include/c++/v1 -isystem <ROOT>/include/c++/v1/<target> based on the --target it gets passed. We don't use it on Apple platforms (yet).
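
The following C++ sketch spells out that flag construction. It only restates the layout described above; `libcxx_include_flags` is a hypothetical helper written for this comment, not the Clang driver's actual code, and the root path in the check is made up.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds the two -isystem flags described above: the shared header directory
// first, then the per-target directory that holds __config_site.
std::vector<std::string> libcxx_include_flags(const std::string& root,
                                              const std::string& target) {
    const std::string common = root + "/include/c++/v1";
    return {"-isystem", common, "-isystem", common + "/" + target};
}

// Checks the flags for one of the target directories from the listing above.
void check_include_flags() {
    const std::vector<std::string> flags =
        libcxx_include_flags("/demo/root", "x86_64-linux-whatever");
    assert(flags.size() == 4);
    assert(flags[1] == "/demo/root/include/c++/v1");
    assert(flags[3] == "/demo/root/include/c++/v1/x86_64-linux-whatever");
}
```

Passing a different target string selects a different `__config_site` while the shared headers stay the same, which is why one installed toolchain can serve several differently configured targets.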

Note that I am not concerned with the modulemap not working from the build directory. There might be a few challenges to deploying that (I think when @arichardson last tried he found some places in LLVM that relied on the structure of our build directory), but in principle we should never use anything from our build directory, only from the installed result.

@ian-twilightcoder
Contributor

ian-twilightcoder commented Mar 25, 2025

What is LIBCXX_GENERATED_INCLUDE_TARGET_DIR actually used for?

It's used to ship differently-configured libc++'s on different targets. We have:

include/c++/v1/
              vector
              string
              etc..
include/c++/v1/aarch64-linux-whatever/
              __config_site
include/c++/v1/x86_64-linux-whatever/
              __config_site

The compiler then adds -isystem <ROOT>/include/c++/v1 -isystem <ROOT>/include/c++/v1/<target> based on the --target it gets passed. We don't use it on Apple platforms (yet).

Note that I am not concerned with the modulemap not working from the build directory. There might be a few challenges to deploying that (I think when @arichardson last tried he found some places in LLVM that relied on the structure of our build directory), but in principle we should never use anything from our build directory, only from the installed result.

Doesn't the build directory get used for running tests, at least locally? Is there a particular reason we don't just re-generate __config_site when you switch targets?

@ldionne
Member

ldionne commented Apr 4, 2025

Doesn't the build directory get used for running tests, at least locally?

Not anymore. We now perform an installation at a fake location and the test suite is configured to use that instead. This ensures that what we test is faithful with what we ship (eg. the install name of the library, its path, etc).

Is there a particular reason we don't just re-generate __config_site when you switch targets?

I'm not certain I understand. But the idea is that on some platforms, we ship a single libc++ with multiple __config_site headers. The compiler then selects which __config_site to use based on the target selected by the user. This allows a single toolchain to support multiple targets, and for these multiple targets to have differently-configured libc++s. Since the user is free to select the target they are compiling for on the fly, it wouldn't make sense to require a "rebuild" of libc++ for that to work.

@ian-twilightcoder
Contributor

Doesn't the build directory get used for running tests, at least locally?

Not anymore. We now perform an installation at a fake location and the test suite is configured to use that instead. This ensures that what we test is faithful with what we ship (eg. the install name of the library, its path, etc).

Ah, spiffy.

Is there a particular reason we don't just re-generate __config_site when you switch targets?

I'm not certain I understand. But the idea is that on some platforms, we ship a single libc++ with multiple __config_site headers. The compiler then selects which __config_site to use based on the target selected by the user. This allows a single toolchain to support multiple targets, and for these multiple targets to have differently-configured libc++s. Since the user is free to select the target they are compiling for on the fly, it wouldn't make sense to require a "rebuild" of libc++ for that to work.

Oh I see so it will be installed with multiple __config_site headers too. I think the easiest way to support that then is to have multiple top level modules for each __config_site. Unless we can easily map the target directory to a valid requires?
