@ian-twilightcoder
Contributor

The C standard behavior of assert cannot be accomplished with clang modules, either as a normal modular header, or a textual header.

As a normal modular header:
#define NDEBUG
#include <assert.h>
This pattern doesn't work: NDEBUG has to be passed on the command line to take effect, and then it will affect all asserts in the includer.

As a textual header:
#define NDEBUG
#include <modular_header_that_has_an_assert.h>
This pattern doesn't work for similar reasons: modular_header_that_has_an_assert.h captured the value of NDEBUG when its module was built and won't pick it up from the includer. -DNDEBUG can be passed when building the module, but it will similarly affect the entire module. This has the additional problem that every module will contain a declaration for assert, and those declarations can conflict with each other if the modules use different values of NDEBUG.

So really <assert.h> just doesn't work properly with clang modules. Avoid the issue by not mentioning it in the Modules documentation, and use "X macros" as the example for textual headers.

Don't use [extern_c] in the example modules; it should very rarely be used. Don't put multiple header declarations in a submodule; that has the confusing effect of "fusing" the headers. For example, <sys/errno.h> does not include <errno.h>, but if both are in the same submodule, then an #include <sys/errno.h> will mysteriously also include <errno.h>.
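
For reference, the standard C behavior described above can be sketched in a single translation unit built without modules. This is an illustrative sketch rather than code from the PR, and the function names are hypothetical. It relies on the C standard's rule that assert is redefined according to the current state of NDEBUG each time <assert.h> is included.

```c
/* Assertions enabled: the argument of assert is evaluated. */
#undef NDEBUG
#include <assert.h>
int evaluations_with_assert_enabled(void) {
    int evaluations = 0;
    /* The side effect exists only to make the evaluation observable. */
    assert(++evaluations > 0);
    return evaluations;
}

/* Assertions disabled: assert(expr) expands to ((void)0). */
#define NDEBUG
#include <assert.h>
int evaluations_with_assert_disabled(void) {
    int evaluations = 0;
    assert(++evaluations > 0); /* not evaluated while NDEBUG is defined */
    return evaluations;
}

/* Re-enable assertions for the rest of the translation unit. */
#undef NDEBUG
#include <assert.h>
```

With textual inclusion, the first function evaluates its assertion and the second does not. This per-include switching is the behavior that the description says cannot be reproduced once the headers involved are built as modules.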

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Oct 24, 2025
@llvmbot
Member

llvmbot commented Oct 24, 2025

@llvm/pr-subscribers-clang

Author: Ian Anderson (ian-twilightcoder)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/165057.diff

1 Files Affected:

  • (modified) clang/docs/Modules.rst (+5-12)
diff --git a/clang/docs/Modules.rst b/clang/docs/Modules.rst
index acbe45e0be970..e45ee9ff9eac2 100644
--- a/clang/docs/Modules.rst
+++ b/clang/docs/Modules.rst
@@ -421,13 +421,7 @@ As an example, the module map file for the C standard library might look a bit l
 
 .. parsed-literal::
 
-  module std [system] [extern_c] {
-    module assert {
-      textual header "assert.h"
-      header "bits/assert-decls.h"
-      export *
-    }
-
+  module std [system] {
     module complex {
       header "complex.h"
       export *
@@ -440,7 +434,6 @@ As an example, the module map file for the C standard library might look a bit l
 
     module errno {
       header "errno.h"
-      header "sys/errno.h"
       export *
     }
 
@@ -673,14 +666,14 @@ of checking *use-declaration*\s, and must still be a lexically-valid header
 file. In the future, we intend to pre-tokenize such headers and include the
 token sequence within the prebuilt module representation.
 
-A header with the ``exclude`` specifier is excluded from the module. It will not be included when the module is built, nor will it be considered to be part of the module, even if an ``umbrella`` header or directory would otherwise make it part of the module.
+A header with the ``exclude`` specifier is excluded from the module. It will not be included when the module is built, nor will it be considered to be part of the module, even if an ``umbrella`` directory would otherwise make it part of the module.
 
-**Example:** The C header ``assert.h`` is an excellent candidate for a textual header, because it is meant to be included multiple times (possibly with different ``NDEBUG`` settings). However, declarations within it should typically be split into a separate modular header.
+**Example:** An "X macro" header is an excellent candidate for a textual header, because it can't be compiled standalone, and by itself does not contain any declarations.
 
 .. parsed-literal::
 
-  module std [system] {
-    textual header "assert.h"
+  module MyLib [system] {
+    textual header "xmacros.h"
   }
 
 A given header shall not be referenced by more than one *header-declaration*.
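
As a rough illustration of why an "X macro" list suits a textual header, here is a minimal sketch in C. All names are hypothetical and not taken from the PR. In a real project the list would usually live in its own file (such as the "xmacros.h" named in the example module map) and be pulled in with #include after the includer defines X; it is written as a list macro here only so the sketch fits in one file.

```c
/* The list itself declares nothing; it only has meaning once the includer
   supplies a definition for X. (Hypothetical example data.) */
#define COLOR_LIST(X) \
    X(RED)            \
    X(GREEN)          \
    X(BLUE)

/* First expansion: generate the enumerators. */
#define X_ENUM(name) COLOR_##name,
enum color { COLOR_LIST(X_ENUM) COLOR_COUNT };
#undef X_ENUM

/* Second expansion: generate a parallel table of names. */
#define X_NAME(name) #name,
static const char *const color_names[] = { COLOR_LIST(X_NAME) };
#undef X_NAME

/* Look up the printable name of a color, or "unknown" if out of range. */
const char *color_name(enum color c) {
    return ((unsigned)c < (unsigned)COLOR_COUNT) ? color_names[c] : "unknown";
}
```

Because each expansion depends on how the includer defines X at that point, the list cannot be compiled on its own, which matches the reasoning in the new documentation text.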

@zygoloid
Collaborator

> As a textual header:
> #define NDEBUG
> #include <modular_header_that_has_an_assert.h>

This isn't a problem with <assert.h>. If <modular_header_that_has_an_assert.h> intends to pick up the state of the NDEBUG macro from the translation unit that includes it, then it's not a modular header, and shouldn't be declared as one.

@zygoloid
Collaborator

> This isn't a problem with <assert.h>.

Nonetheless I think X-macros are a better example to use here than <assert.h>. I'd suggest just undoing the first change (removing assert.h from the example std module) but the rest of the change LGTM.


.. parsed-literal::
module std [system] [extern_c] {
Collaborator

Why remove the extern_c here? This seems like a pretty good example of a time when it'd make sense to use this attribute.

Collaborator

I see you mentioned this in the PR description. I suppose whether this is appropriate depends on whether these headers are all pure C headers or if they themselves try to support inclusion from C++. I don't really mind whether we keep this or remove it.
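
To make the distinction concrete, a header that itself supports inclusion from C++ usually carries its own linkage guards, roughly as sketched below. The file and function names are hypothetical, and this is only an illustration of the common pattern, not a statement about any particular libc.

```c
/* mylib.h (hypothetical): a C header that handles C++ inclusion itself. */
#ifndef MYLIB_H
#define MYLIB_H

#ifdef __cplusplus
extern "C" {
#endif

/* Declared with C linkage whether the includer is C or C++. */
int mylib_add(int a, int b);

#ifdef __cplusplus
} /* extern "C" */
#endif

#endif /* MYLIB_H */

/* mylib.c (hypothetical): the definition, compiled as C. */
int mylib_add(int a, int b) {
    return a + b;
}
```

A pure C header would omit the `__cplusplus` guards and rely on the includer, or on the module map, to supply the linkage.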

Contributor Author

The problem with [extern_c] is that it wraps all of the includes in the module in extern "C", which runs you into -Wmodule-import-in-extern-c, which is promoted to an error by default. Plus you never know when an include will transitively include a C++ header like <stdint.h>, so you really don't want your includes to be extern "C". It's kind of a flawed attribute; at some point I would propose that we remove it.

Collaborator

Hm.

It's definitely flawed in that it's viral and requires strong layering between your C headers and your C++ headers, but if you have proper layering, if your C library isn't including C++ headers, and if you can add the attribute to your C library and its transitive dependencies, it can work well. Not everyone has a setup where the C library includes C++ headers rather than being layered entirely beneath the C++ library.

But you make a good point that it's got enough problems that making a blanket recommendation to use it would be unreasonable. Yeah, I'm convinced we shouldn't be using it in this example.

@ian-twilightcoder
Contributor Author

> > As a textual header:
> > #define NDEBUG
> > #include <modular_header_that_has_an_assert.h>
>
> This isn't a problem with <assert.h>. If <modular_header_that_has_an_assert.h> intends to pick up the state of the NDEBUG macro from the translation unit that includes it, then it's not a modular header, and shouldn't be declared as one.

Sure, but if you can't include <assert.h> from a modular header, then that basically means <assert.h> can't itself be a modular header. <modular_header_that_has_an_assert.h> doesn't usually start its life as a modular header; it just has an inline function with an assert in it, and doesn't expect its behavior to change and start ignoring NDEBUG when it becomes a modular header. In a single-module world, fine, <assert.h> can function the same as textual, but when you add in a second module it starts behaving differently with and without modules.
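
The kind of header described here might look like the following sketch. The function is hypothetical; only the header name comes from the discussion.

```c
/* modular_header_that_has_an_assert.h (contents are a hypothetical sketch) */
#include <assert.h>

static inline int checked_divide(int numerator, int denominator) {
    /* Without modules, whether this check is compiled in depends on the
       state of NDEBUG in the includer at the point of the #include.
       Per the discussion above, once this header is built as part of a
       module, only the NDEBUG state of the module's own build
       (for example -DNDEBUG) is seen here. */
    assert(denominator != 0);
    return numerator / denominator;
}
```

Nothing in the header's source signals that its behavior depends on how it is built, which is the crux of the concern.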

@zygoloid
Collaborator

> Sure, but if you can't include <assert.h> from a modular header, then that basically means <assert.h> can't itself be a modular header.

What do you mean by "modular header" here? The terminology I'm familiar with considers textual headers and modular headers to be mutually exclusive -- modular headers are the ones that don't (intend to) depend on the state of translation at the point at which they're entered, and textual headers are the ones that do. I agree (using that definition) that <assert.h> can't (or at least, shouldn't) be a modular header.

The purpose of textual header declarations is to associate a header with a module so that it is found and its inclusion is permitted by strict use checking rules. (There was an idea that we might also store a pretokenized form of textual headers, but that never actually happened.) We want a use LibC; to permit including assert.h, and don't want it to be a modular import, so it should be listed as a textual header.

> <modular_header_that_has_an_assert.h> doesn't usually start its life as a modular header; it just has an inline function with an assert in it, and doesn't expect its behavior to change and start ignoring NDEBUG when it becomes a modular header. In a single-module world, fine, <assert.h> can function the same as textual, but when you add in a second module it starts behaving differently with and without modules.

I'm not following something here -- I think you're suggesting that you'd see some kind of difference when <assert.h> is listed as a textual header versus when it's not listed at all (in a simple world without any use checking, I assume). What difference do you have in mind?

@ian-twilightcoder
Contributor Author

> > Sure, but if you can't include <assert.h> from a modular header, then that basically means <assert.h> can't itself be a modular header.
>
> What do you mean by "modular header" here? The terminology I'm familiar with considers textual headers and modular headers to be mutually exclusive -- modular headers are the ones that don't (intend to) depend on the state of translation at the point at which they're entered, and textual headers are the ones that do. I agree (using that definition) that <assert.h> can't (or at least, shouldn't) be a modular header.

We've (Apple) found textual headers to be very limited in practice. Basically they can't cause a declaration to exist in multiple modules. Swift will see those declarations as distinct because the module name is part of the type/function name, and as distinct things they're by definition incompatible. Sometimes it kind of works out through the type merging that clang does, but mostly it doesn't.

What even should happen if modules A and B include <assert.h>, but B defines NDEBUG before doing so? If I just import A and B and don't include <assert.h> myself, which definition of assert do I see?

> The purpose of textual header declarations is to associate a header with a module so that it is found and its inclusion is permitted by strict use checking rules. (There was an idea that we might also store a pretokenized form of textual headers, but that never actually happened.) We want a use LibC; to permit including assert.h, and don't want it to be a modular import, so it should be listed as a textual header.

I thought even strict use checking rules don't complain about headers that aren't covered by any module, it only flags module imports that aren't listed?

> > <modular_header_that_has_an_assert.h> doesn't usually start its life as a modular header; it just has an inline function with an assert in it, and doesn't expect its behavior to change and start ignoring NDEBUG when it becomes a modular header. In a single-module world, fine, <assert.h> can function the same as textual, but when you add in a second module it starts behaving differently with and without modules.
>
> I'm not following something here -- I think you're suggesting that you'd see some kind of difference when <assert.h> is listed as a textual header versus when it's not listed at all (in a simple world without any use checking, I assume). What difference do you have in mind?

I'm trying to say that <modular_header_that_has_an_assert.h> doesn't have a good way to know that using assert will be a bit weird with modules. If <assert.h> wasn't a modular header (textual or otherwise), then <modular_header_that_has_an_assert.h> could get a -Wnon-modular-include-in-module to let it know. The owner of <modular_header_that_has_an_assert.h> probably isn't aware of the subtle behavior difference, and probably never really thought about picking up NDEBUG; it's just what happens in regular C without modules. Whereas if <assert.h> is textual modular, the behavior changes but it's not obvious.

@Bigcheese
Contributor

The problem is that there's a lot of code that uses <assert.h> in a modular way, only defining NDEBUG on the command line, but there's also a lot of code that does #define NDEBUG. Only one of these ways can practicably work in a module.

Maybe it would be best if we had a way to keep <assert.h> textual, but error if the state of NDEBUG has changed in a context where that breaks things. I think we could do that without any compiler changes by defining the two versions of the assert macro in different modules that conflict with each other, and then using <assert.h> to select which one to import. This wouldn't handle the case of only indirect import via other modules, though. To handle that we would need a stronger form of config_macros that handles the transitive case, so that:

#define NDEBUG // or #undef NDEBUG
#include <modular_header_that_has_an_assert.h>

can emit an error if NDEBUG doesn't match what's in the module.

@ian-twilightcoder
Contributor Author

> The problem is that there's a lot of code that uses <assert.h> in a modular way, only defining NDEBUG on the command line, but there's also a lot of code that does #define NDEBUG. Only one of these ways can practicably work in a module.
>
> Maybe it would be best if we had a way to keep <assert.h> textual, but error if the state of NDEBUG has changed in a context where that breaks things. I think we could do that without any compiler changes by defining the two versions of the assert macro in different modules that conflict with each other, and then using <assert.h> to select which one to import. This wouldn't handle the case of only indirect import via other modules, though. To handle that we would need a stronger form of config_macros that handles the transitive case, so that:
>
> #define NDEBUG // or #undef NDEBUG
> #include <modular_header_that_has_an_assert.h>
>
> can emit an error if NDEBUG doesn't match what's in the module.

Wouldn't that just result in an error if I did this?

#include <modular_header_that_has_an_assert.h> // expects no NDEBUG - selects the assert_enabled module
#define NDEBUG
#include <assert.h> // selects the assert_disabled module, kaboom conflicting module error

@zygoloid
Collaborator

> We've (Apple) found textual headers to be very limited in practice. Basically they can't cause a declaration to exist in multiple modules. Swift will see those declarations as distinct because the module name is part of the type/function name, and as distinct things they're by definition incompatible. Sometimes it kind of works out through the type merging that clang does, but mostly it doesn't.

Conversely, we've (Google) found them to be essential for modeling dependency relationships between notional modules, where not all #included files are modular headers. And we do have situations where we have the same header in multiple modules (even cases where it's textual in one and modular in another), and it works in practice for us.

To what extent is this a Swift import problem rather than a Clang modules problem? (To be clear: I'm not suggesting that would make it any less severe, but it might allow us to better separate the concerns.) The treatment of the module name as part of the entity name sounds Swift-specific, at least.

> What even should happen if modules A and B include <assert.h>, but B defines NDEBUG before doing so? If I just import A and B and don't include <assert.h> myself, which definition of assert do I see?

If you try to use the assert macro in that state, I believe you would get an error because it doesn't have a consistent definition. If you do include <assert.h> yourself, it will #undef and re-#define the macro, and you'll get that macro instead of the imported ones.
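
The "#undef and re-#define" structure mentioned here can be sketched with a hypothetical assert-like header. The names my_assert, my_assert_fail and my_assert_failures are invented for illustration; a real <assert.h> reports the failure and calls abort() rather than counting.

```c
/* Hypothetical failure hook: counts failures instead of aborting so that
   the sketch stays easy to exercise. */
static int my_assert_failures = 0;
static void my_assert_fail(const char *expr, const char *file, int line) {
    (void)expr;
    (void)file;
    (void)line;
    ++my_assert_failures;
}

/* my_assert.h (hypothetical): deliberately has no include guard, so every
   #include discards the previous definition and re-evaluates NDEBUG. */
#undef my_assert
#ifdef NDEBUG
#define my_assert(expr) ((void)0)
#else
#define my_assert(expr) \
    ((expr) ? (void)0 : my_assert_fail(#expr, __FILE__, __LINE__))
#endif
```

Because the header unconditionally replaces the macro, a direct #include always wins over whatever definition was visible before it, which is the effect described for the imported definitions.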

> I thought even strict use checking rules don't complain about headers that aren't covered by any module, it only flags module imports that aren't listed?

That's the difference between -fmodules-decluse and -fmodules-strict-decluse -- the latter also rejects undeclared inclusions.

> I'm trying to say that <modular_header_that_has_an_assert.h> doesn't have a good way to know that using assert will be a bit weird with modules. If <assert.h> wasn't a modular header (textual or otherwise), then <modular_header_that_has_an_assert.h> could get a -Wnon-modular-include-in-module to let it know. The owner of <modular_header_that_has_an_assert.h> probably isn't aware of the subtle behavior difference, and probably never really thought about picking up NDEBUG; it's just what happens in regular C without modules. Whereas if <assert.h> is textual modular, the behavior changes but it's not obvious.

I mean, I suppose, but this would be a potential risk any time you include a textual header from a non-textual header in a module, just like it'd be a risk any time you include an entirely undeclared header from a modular header, because a textual header could likewise depend on enclosing state.

If you want a warning for this kind of issue, I wonder if really we have two different kinds of textual headers we should be distinguishing here -- those like <assert.h> that are unsafe to #include into a modular header (but fine to include from another textual header of the same kind or from a source file), and those like .def files that are largely safe to #include into a modular header. Maybe this should be expressed as an attribute on the textual header declaration?

That said, it seems that when you mark modular_header_that_has_an_assert.h as being a modular header, you are explicitly requesting that the state of NDEBUG in an includer of that header does not affect the behavior of the header, and that's the behavior you get. Being a modular header means you get compiled as-if from a clean preprocessor + AST state, so there simply isn't an NDEBUG that you could inherit from anywhere other than the command line. So the purpose of such checking would be to detect cases where your modularization accidentally did the wrong thing -- where you thought a header was modular but it wasn't. And maybe there's something more general we can do about that:

If we want some detection of cases where a modular build behaves differently from a textual inclusion build, I think we could probably build that without building too much new infrastructure, at least for local submodule visibility builds. In essence, we'd enable use of module maps but not precompiled modules, and textually enter all modular headers even if they're from a different top-level module. Then if we see a use of an identifier that has a macro definition that isn't visible, but would be visible if all modules were visible, we produce an error (and likewise for name lookup and reachability checks).

@ian-twilightcoder
Contributor Author

Maybe this is a bigger discussion than this PR. Maybe we should just leave with "it's complicated" and not talk about assert.h at all in the documentation?

Collaborator

@zygoloid zygoloid left a comment

We had a really productive discussion about this (and other modules stuff) at the dev meeting. For me the most relevant takeaways were:

  • config_macros should be transitive -- if module A has a config macro, and B imports A, then that config macro is a config macro for B too.
  • With transitive config_macro checking, it would be reasonable for an implementation to choose to make <assert.h> a modular header, and to not support in-file setting of NDEBUG.

And with that in mind, it seems like we shouldn't be taking a stance on whether <assert.h> is treated as a modular header by a particular implementation (eg, by the libc or SDK's module map).

So yeah, I agree that the best path forward is to just not mention assert.h in our documentation at all. Thanks for your patience here!

@ian-twilightcoder ian-twilightcoder merged commit 6a10d1d into llvm:main Oct 30, 2025
13 checks passed
@ian-twilightcoder ian-twilightcoder deleted the assert_module branch October 30, 2025 17:54
luciechoi pushed a commit to luciechoi/llvm-project that referenced this pull request Nov 1, 2025
DEBADRIBASAK pushed a commit to DEBADRIBASAK/llvm-project that referenced this pull request Nov 3, 2025