More closely mimic the Clang compilation #5543

ilya-biryukov · 2025-05-27T16:24:21Z

Instead of using ASTUnit and tooling APIs, directly mimic what Clang does during the compilation.

Run the Clang frontend until the translation unit parsing is done, at which point switch back to Carbon to finish the Check phase and interface with Clang through ASTContext and Sema. In lower, finish the corresponding Clang compilation phase, i.e. CodeGen.

Clang does not have the corresponding APIs and instead provides a callback-based mechanism. To map this back to Carbon APIs, we create a separate thread that gives control back to Carbon through the callbacks, and then finishes the compilation when the Carbon code is done.

Although essentially a hack, this allows to easily fit into the Carbon codebase quickly and we should have an option of refactoring Clang code in LLVM upstream if this approach proves fruitful.

Replace this paragraph with a description of what this PR is changing or
adding, and why.

Closes #ISSUE

danakj · 2025-05-27T16:27:07Z

Run the Clang frontend until the translation unit parsing is done, at which point switch back to Carbon to finish the Check face

Did you mean Check phase?

ilya-biryukov · 2025-05-27T16:27:39Z

This is very raw: tests fail, an approach to codegen needs to be updated, the documentation and PR description needs to be improved.

Still posting this to get early feedback about the feasibility of this approach, especially the dance with multiple threads.
Sharing the same CompilerInstance, AST, Sema, etc between multiple threads is not unheard of, but, like other use-cases (e.g. clangd), it's unusual in its own way. LLVM should be prepared to handle that, though. Even if there are few global variables / thread locals left, they should be easily fixable.

ilya-biryukov · 2025-06-03T15:54:45Z

Did you mean Check phase?

🤦 I did, thanks for pointing that out.

ilya-biryukov · 2025-06-17T16:52:09Z

I've picked this up again and it's shaping up now, should be close to something I'd like to land.
In particular, we now use the code generator from Clang rather than creating our own and take the llvm::Module Clang produces.

I want to do one last polish of the code before sending this out for review, but it's now roughly what I want it to be in terms of behavior. One last bit that I want to do a little later is to get rid of the use of CodeGenerator for marking which functions are used and instead go through Clang frontend interfaces. But since it's pretty independent, I am thinking of doing it as a follow-up.

ilya-biryukov · 2025-06-27T15:56:10Z

The code is now final, this is one rebase away from moving off the "draft" state.

ilya-biryukov · 2025-07-04T15:10:19Z

toolchain/check/import_cpp.cpp

+      // Flags to match LLVM IR output of default code generation options, which
+      // was used in an earlier version.
+      // TODO: remove them when sharing flags with ClangRunner.
+      "-fomit-frame-pointer", "-fno-unwind-tables",


I am really unsure if I should be adding this and it should definitely go away after the TODO about sharing with ClangRunner is finished.

Happy to take either path here and not sure which defaults Carbon uses now.

Looks like there's a little overlap between this and #5767. Sorry about that; I was hoping the parts I was touching were more separate from the parts you were touching here.

I think we can fully separate them if we make this PR do just a little bit less: instead of having InFlightClang take driver arguments and run the driver to from frontend arguments, could you make it take a shared_ptr<CompilerInvocation> instead, and build that here using clang::CompilerInvocation::CreateFromArgs, passing in the same frontend argument list we currently do?

With that done, we can deal with using the driver to map from driver arguments to frontend arguments in #5767 (including plumbing through the proper path to the clang binary so that we find the resource directory properly), and it should be straightforward to combine the two patches.

Regarding your question about keeping output the same: if you want to decouple this PR from changes to the output, one way to do that would be to overwrite the CodeGenOptions in the CompilerInvocation with a default-constructed CodeGenOptions for now (and just set the EmitVersionIdentMetadata flag to false). For things that we want full control over, changing them via CompilerInvocation seems better than trying to control them with driver arguments.

Instead of using ASTUnit and tooling APIs, directly mimic what Clang does during the compilation. As a result, we now correctly generate definitions for inline C++ functions called by Carbon and functions called recursively by them. Run the Clang frontend until the translation unit parsing is done, at which point switch back to Carbon to finish the `Check` face and interface with Clang through `ASTContext` and `Sema`. In lower, finish the corresponding Clang compilation phase (CodeGen) and consume the resulting `llvm::Module`. Clang does not have the corresponding APIs directly and instead provides a callback-based mechanism. To map this back to Carbon APIs, we create a separate thread that gives control back to Carbon through the callbacks, and then finishes the compilation when the Carbon code is done. Although essentially a hack, this allows to easily fit into the Carbon codebase quickly and we have an option of refactoring Clang code in LLVM upstream in the future to do this directly if this approach proves fruitful.

ilya-biryukov · 2025-07-04T15:13:32Z

toolchain/lower/testdata/interop/cpp/function_decl.carbon

@@ -112,7 +112,6 @@ fn MyF() {
 // CHECK:STDOUT: target triple = "x86_64-unknown-linux-gnu"
 // CHECK:STDOUT:
 // CHECK:STDOUT: define void @_CMyF.Main() !dbg !7 {
-// CHECK:STDOUT: entry:


Not sure why entry: got removed here (because of -fno-exceptions maybe?). It also doesn't really matter at this point, probably.

ilya-biryukov · 2025-07-04T15:15:05Z

This is now ready for a proper review.

chandlerc

There are several issues found with pre-commit (also flagged in the CI failures), and some other C++ style issues that I think clang-tidy checks should catch like naming conventions not matching Carbon code, and generally following the Carbon style guide.

The biggest other issue that jumps out at me on the initial pass is in the comment below -- the system for managing all of this should have some substantial comments explaining how it works and how to use it.

Last but not least, we'll want to resolve how to layer this PR and the one that just went out in the same area: #5767

chandlerc · 2025-07-08T00:31:59Z

toolchain/base/in_flight_clang.h

+#include "llvm/ADT/IntrusiveRefCntPtr.h"
+
+namespace Carbon {
+class InFlightClang {


Seems like this really should have some detailed class comments to explain how it works and what it does?

chandlerc · 2025-07-08T08:44:16Z

Looking at #5767, it goes seem to duplicate some of the better calling of clang that is here, but not all of it. Notably, moving it into a separate thread to manage stack size etc seems really nice, but also a bigger change. And there are a bunch of related fixes in 5767 too, so my suggestion would be to rebase this PR on that one if not too much trouble. Does that make sense?

chandlerc · 2025-07-08T17:24:02Z

Looking at #5767, it goes seem to duplicate some of the better calling of clang that is here, but not all of it. Notably, moving it into a separate thread to manage stack size etc seems really nice, but also a bigger change. And there are a bunch of related fixes in 5767 too, so my suggestion would be to rebase this PR on that one if not too much trouble. Does that make sense?

(Sorry, I didn't see Richard's comments on one of the threads in the files that crossed path with my looking at this -- happy for you and Richard to figure out whatever approach makes sense to you all.)

github-actions bot added the toolchain label May 27, 2025

bricknerb self-requested a review May 27, 2025 18:32

ilya-biryukov force-pushed the in_flight_clang branch from b44c5db to faa7b65 Compare June 27, 2025 15:55

ilya-biryukov force-pushed the in_flight_clang branch 3 times, most recently from 7de0bf0 to 8f31d64 Compare July 4, 2025 15:07

ilya-biryukov commented Jul 4, 2025

View reviewed changes

ilya-biryukov force-pushed the in_flight_clang branch from 8f31d64 to 1ac6fa2 Compare July 4, 2025 15:11

ilya-biryukov commented Jul 4, 2025

View reviewed changes

ilya-biryukov marked this pull request as ready for review July 4, 2025 15:14

github-actions bot requested a review from chandlerc July 4, 2025 15:15

chandlerc mentioned this pull request Jul 8, 2025

Properly set up C++ include paths and similar environment settings when parsing imported C++. #5767

Merged

chandlerc reviewed Jul 8, 2025

View reviewed changes

zygoloid mentioned this pull request Jul 12, 2025

Build the clang::CompilerInvocation in the driver. #5784

Merged

zygoloid mentioned this pull request Aug 5, 2025

Generate and use a C++ thunk to call non simple ABI C++ functions #5850

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

More closely mimic the Clang compilation #5543

More closely mimic the Clang compilation #5543

Uh oh!

ilya-biryukov commented May 27, 2025 •

edited

Loading

Uh oh!

danakj commented May 27, 2025

Uh oh!

ilya-biryukov commented May 27, 2025

Uh oh!

ilya-biryukov commented Jun 3, 2025

Uh oh!

ilya-biryukov commented Jun 17, 2025

Uh oh!

ilya-biryukov commented Jun 27, 2025

Uh oh!

ilya-biryukov Jul 4, 2025

Uh oh!

zygoloid Jul 8, 2025

Uh oh!

zygoloid Jul 8, 2025

Uh oh!

ilya-biryukov Jul 4, 2025

Uh oh!

ilya-biryukov commented Jul 4, 2025

Uh oh!

chandlerc left a comment

Uh oh!

chandlerc Jul 8, 2025

Uh oh!

chandlerc commented Jul 8, 2025

Uh oh!

chandlerc commented Jul 8, 2025

Uh oh!

Uh oh!

More closely mimic the Clang compilation #5543

Are you sure you want to change the base?

More closely mimic the Clang compilation #5543

Uh oh!

Conversation

ilya-biryukov commented May 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

danakj commented May 27, 2025

Uh oh!

ilya-biryukov commented May 27, 2025

Uh oh!

ilya-biryukov commented Jun 3, 2025

Uh oh!

ilya-biryukov commented Jun 17, 2025

Uh oh!

ilya-biryukov commented Jun 27, 2025

Uh oh!

ilya-biryukov Jul 4, 2025

Choose a reason for hiding this comment

Uh oh!

zygoloid Jul 8, 2025

Choose a reason for hiding this comment

Uh oh!

zygoloid Jul 8, 2025

Choose a reason for hiding this comment

Uh oh!

ilya-biryukov Jul 4, 2025

Choose a reason for hiding this comment

Uh oh!

ilya-biryukov commented Jul 4, 2025

Uh oh!

chandlerc left a comment

Choose a reason for hiding this comment

Uh oh!

chandlerc Jul 8, 2025

Choose a reason for hiding this comment

Uh oh!

chandlerc commented Jul 8, 2025

Uh oh!

chandlerc commented Jul 8, 2025

Uh oh!

Uh oh!

ilya-biryukov commented May 27, 2025 •

edited

Loading