Skip to content

Conversation

@glebov-andrey
Copy link

@glebov-andrey glebov-andrey commented Oct 29, 2024

Adds a serialized HasCopyConstructorForExceptionObject bit to CXXRecordDecl which is used to inform ASTReader that this class's copy constructor needs to be added to the MicrosoftCXXABI::RecordToCopyCtor LUT.
Includes a regression test for the original issue with a PCH.

UPD: The implementation has changed based on #114075 (comment):
Adds a serialized AST record which stores the MicrosoftCXXABI::RecordToCopyCtor table.
The reason the table can't be restored without it is that choosing the exception copying constructor potentially requires template instantiation, overload resolution, etc.

Fixes #53486

I'm not sure if this is the best (or even correct) way of fixing the issue, but it does appear to work in simple test cases with PCHs and modules.
The existence of this issue also raises the question of other tables which are stored in the MicrosoftCXXABI AST - do they also need to be serialized?

@github-actions
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules labels Oct 29, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 29, 2024

@llvm/pr-subscribers-clang-modules

@llvm/pr-subscribers-clang

Author: Andrey Glebov (glebov-andrey)

Changes

Adds a serialized HasCopyConstructorForExceptionObject bit to CXXRecordDecl which is used to inform ASTReader that this class's copy constructor needs to be added to the MicrosoftCXXABI::RecordToCopyCtor LUT.
Includes a regression test for the original issue with a PCH.

Fixes #53486

I'm not sure if this is the best (or even correct) way of fixing the issue, but it does appear to work in simple test cases with PCHs and modules.
The existence of this issue also raises the question of other tables which are stored in the MicrosoftCXXABI AST - do they also need to be restored in ASTReader?


Full diff: https://github.com/llvm/llvm-project/pull/114075.diff

7 Files Affected:

  • (modified) clang/include/clang/AST/CXXRecordDeclDefinitionBits.def (+5)
  • (modified) clang/include/clang/AST/DeclCXX.h (+8)
  • (modified) clang/lib/AST/DeclCXX.cpp (+4-3)
  • (modified) clang/lib/Sema/SemaExprCXX.cpp (+4)
  • (modified) clang/lib/Serialization/ASTReaderDecl.cpp (+14)
  • (added) clang/test/PCH/cxx-exception-copy-ctor-crash.cpp (+19)
  • (added) clang/test/PCH/cxx-exception-copy-ctor-crash.h (+14)
diff --git a/clang/include/clang/AST/CXXRecordDeclDefinitionBits.def b/clang/include/clang/AST/CXXRecordDeclDefinitionBits.def
index 6620840df0ced2..7560691a0e4172 100644
--- a/clang/include/clang/AST/CXXRecordDeclDefinitionBits.def
+++ b/clang/include/clang/AST/CXXRecordDeclDefinitionBits.def
@@ -249,6 +249,11 @@ FIELD(HasDeclaredCopyAssignmentWithConstParam, 1, MERGE_OR)
 /// base classes or fields have a no-return destructor
 FIELD(IsAnyDestructorNoReturn, 1, NO_MERGE)
 
+/// Microsoft CXX ABI specific:
+/// Whether the copy constructor is used by a `throw` expression.
+/// Used by ASTReader to restore the sidecar RecordToCopyCtor LUT.
+FIELD(HasCopyConstructorForExceptionObject, 1, MERGE_OR)
+
 /// Whether the record type is intangible (if any base classes or fields have
 /// type that is intangible). HLSL only.
 FIELD(IsHLSLIntangible, 1, NO_MERGE)
diff --git a/clang/include/clang/AST/DeclCXX.h b/clang/include/clang/AST/DeclCXX.h
index 2693cc0e95b4b2..f9b2ecf8af665c 100644
--- a/clang/include/clang/AST/DeclCXX.h
+++ b/clang/include/clang/AST/DeclCXX.h
@@ -1551,6 +1551,14 @@ class CXXRecordDecl : public RecordDecl {
   /// a field or in base class.
   bool isHLSLIntangible() const { return data().IsHLSLIntangible; }
 
+  bool hasCopyConstructorForExceptionObject() const {
+    return data().HasCopyConstructorForExceptionObject;
+  }
+
+  void setHasCopyConstructorForExceptionObject() {
+    data().HasCopyConstructorForExceptionObject = true;
+  }
+
   /// If the class is a local class [class.local], returns
   /// the enclosing function declaration.
   const FunctionDecl *isLocalClass() const {
diff --git a/clang/lib/AST/DeclCXX.cpp b/clang/lib/AST/DeclCXX.cpp
index db0ea62a2323eb..ff21c44bf8fba2 100644
--- a/clang/lib/AST/DeclCXX.cpp
+++ b/clang/lib/AST/DeclCXX.cpp
@@ -109,9 +109,10 @@ CXXRecordDecl::DefinitionData::DefinitionData(CXXRecordDecl *D)
       ImplicitCopyAssignmentHasConstParam(true),
       HasDeclaredCopyConstructorWithConstParam(false),
       HasDeclaredCopyAssignmentWithConstParam(false),
-      IsAnyDestructorNoReturn(false), IsHLSLIntangible(false), IsLambda(false),
-      IsParsingBaseSpecifiers(false), ComputedVisibleConversions(false),
-      HasODRHash(false), Definition(D) {}
+      IsAnyDestructorNoReturn(false),
+      HasCopyConstructorForExceptionObject(false), IsHLSLIntangible(false),
+      IsLambda(false), IsParsingBaseSpecifiers(false),
+      ComputedVisibleConversions(false), HasODRHash(false), Definition(D) {}
 
 CXXBaseSpecifier *CXXRecordDecl::DefinitionData::getBasesSlowCase() const {
   return Bases.get(Definition->getASTContext().getExternalSource());
diff --git a/clang/lib/Sema/SemaExprCXX.cpp b/clang/lib/Sema/SemaExprCXX.cpp
index 50c1b24fce6da7..586ca74a848da2 100644
--- a/clang/lib/Sema/SemaExprCXX.cpp
+++ b/clang/lib/Sema/SemaExprCXX.cpp
@@ -1082,6 +1082,10 @@ bool Sema::CheckCXXThrowOperand(SourceLocation ThrowLoc,
       // friendship or any other means).
       Context.addCopyConstructorForExceptionObject(Subobject, CD);
 
+      // Store the bit in CXXRecordDecl so that ASTReader can restore this
+      // mapping later.
+      Subobject->setHasCopyConstructorForExceptionObject();
+
       // We don't keep the instantiated default argument expressions around so
       // we must rebuild them here.
       for (unsigned I = 1, E = CD->getNumParams(); I != E; ++I) {
diff --git a/clang/lib/Serialization/ASTReaderDecl.cpp b/clang/lib/Serialization/ASTReaderDecl.cpp
index d4e392dcc6bcd0..fdb1da4026db1c 100644
--- a/clang/lib/Serialization/ASTReaderDecl.cpp
+++ b/clang/lib/Serialization/ASTReaderDecl.cpp
@@ -2316,6 +2316,20 @@ void ASTDeclReader::VisitCXXConstructorDecl(CXXConstructorDecl *D) {
   }
 
   VisitCXXMethodDecl(D);
+
+  // Microsoft CXX ABI specific:
+  // Restore the RecordToCopyCtor sidecar LUT entry so that `throw` expressions
+  // find the correct copy constructor for exceptions during codegen.
+  // There is no need to check the target info because the
+  // HasCopyConstructorForExceptionObject bit only gets set for the MS ABI.
+  if (D->isCopyConstructor()) {
+    // TODO What if this is not the same copy constructor which was chosen by
+    //      LookupCopyingConstructor() in SemaExprCXX? Is there a better way?
+    auto *R = cast<CXXRecordDecl>(D->getDeclContext());
+    if (R->hasCopyConstructorForExceptionObject()) {
+      Reader.getContext().addCopyConstructorForExceptionObject(R, D);
+    }
+  }
 }
 
 void ASTDeclReader::VisitCXXDestructorDecl(CXXDestructorDecl *D) {
diff --git a/clang/test/PCH/cxx-exception-copy-ctor-crash.cpp b/clang/test/PCH/cxx-exception-copy-ctor-crash.cpp
new file mode 100644
index 00000000000000..29bcb114f20f0a
--- /dev/null
+++ b/clang/test/PCH/cxx-exception-copy-ctor-crash.cpp
@@ -0,0 +1,19 @@
+// REQUIRES: system-windows, target={{.*-windows-msvc}}
+// RUN: %clang_cc1 -x c++ -std=c++17 -fcxx-exceptions -fexceptions -triple=%ms_abi_triple -emit-pch -building-pch-with-obj -fmodules-codegen -o %t.pch %S/cxx-exception-copy-ctor-crash.h
+// RUN: %clang_cc1 -x c++ -std=c++17 -fcxx-exceptions -fexceptions -triple=%ms_abi_triple -include-pch %t.pch -emit-obj -building-pch-with-obj -fmodules-codegen -o %t.pch.obj
+// RUN: %clang_cc1 -x c++ -std=c++17 -fcxx-exceptions -fexceptions -triple=%ms_abi_triple -include-pch %t.pch -emit-obj -o %t.obj %s
+// RUN: lld-link -subsystem:console -out:%t.exe %t.pch.obj %t.obj libucrt.lib libvcruntime.lib libcmt.lib
+// RUN: %t.exe
+
+// Regression test for https://github.com/llvm/llvm-project/issues/53486
+
+int main() {
+  try {
+    throw Exception();
+  } catch (const Exception ex) { // catch by value to trigger copy constructor
+  }
+  if (ctor_count != dtor_count) {
+    return 1;
+  }
+  return 0;
+}
diff --git a/clang/test/PCH/cxx-exception-copy-ctor-crash.h b/clang/test/PCH/cxx-exception-copy-ctor-crash.h
new file mode 100644
index 00000000000000..9645df56c786e3
--- /dev/null
+++ b/clang/test/PCH/cxx-exception-copy-ctor-crash.h
@@ -0,0 +1,14 @@
+// Header for PCH test cxx-exception-copy-ctor-crash.cpp
+
+inline int ctor_count = 0;
+inline int dtor_count = 0;
+
+struct Exception {
+  Exception() { ++ctor_count; }
+  ~Exception() { ++dtor_count; }
+  Exception(const Exception &) noexcept { ++ctor_count; }
+};
+
+inline void throw_exception() {
+  throw Exception();
+}

@glebov-andrey
Copy link
Author

Tagging @rnk because this has to do with the MS ABI

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea here as this touches both modules AND MSVC ABI. That said, it seems really odd that we'd have to store this rather than calculate it on deserialization.

Perhaps @ChuanqiXu9 can take a look as someone more familiar with our serialization/modules implementation?

Copy link
Collaborator

@rnk rnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @majnemer added the lookup table originally. I think we did things this way because, while it creates obvious serialization problems, it's the most straightforwardly correct approach if you think finding copy constructors is an ambiguous operation. It's concerning that there isn't a straightforward CXXRecordDecl::getCopyConstructorIfItExists method, and I seem to recall I've had to work around this before in other contexts like debug info.

// RUN: %clang_cc1 -x c++ -std=c++17 -fcxx-exceptions -fexceptions -triple=%ms_abi_triple -include-pch %t.pch -emit-obj -building-pch-with-obj -fmodules-codegen -o %t.pch.obj
// RUN: %clang_cc1 -x c++ -std=c++17 -fcxx-exceptions -fexceptions -triple=%ms_abi_triple -include-pch %t.pch -emit-obj -o %t.obj %s
// RUN: lld-link -subsystem:console -out:%t.exe %t.pch.obj %t.obj libucrt.lib libvcruntime.lib libcmt.lib
// RUN: %t.exe
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Execution tests are not allowed in the clang test suite. It would of course be ideal to have a place for these to live, but for now we just don't check them in.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced with LLVM IR checks

// HasCopyConstructorForExceptionObject bit only gets set for the MS ABI.
if (D->isCopyConstructor()) {
// TODO What if this is not the same copy constructor which was chosen by
// LookupCopyingConstructor() in SemaExprCXX? Is there a better way?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can rely on this assumption, and if it holds, then we don't actually need this lookup table at all. I've observed that LookupSpecialMember is cached, so once a special member (copy ctor) is looked up, it cannot change. Since we know at the point of the throw site, all relevant copy constructors are looked up, I think it should be safe to, in all cases, in MicrosoftCXXABI.cpp, when building the catchable type array, iterate the constructors and use the first constructor that returns true for isCopyConstructor.

I propose you prototype that change (replace the call to getCopyCtor... with the constructor search), run the tests, and see if that creates any fallout. If all tests involving copy constructors with default arguments pass, then I think it's safe to rely on this assumption in all cases, not just in the cases where we're doing a modules build.

And, if it turns out we're wrong and we should be storing the result of LookupCopyConstructor in a lookup table, that bug won't be hidden in a modules-only build configuration. It's better for the behavior to be consistent between regular builds and modules builds.

I propose the following experiment:

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, then this is the first thing I will try, before making any other changes to the PR.

BTW, it seems your message got cut off - it ends with "I propose the following experiment:"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think that's dangling leftover text from previous edits. I think I re-edited that fragment into the earlier "I propose" paragraph, so I don't have anything more to add.

Copy link
Author

@glebov-andrey glebov-andrey Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it worked... kind of, not really

#if 0
  const CXXConstructorDecl *CD =
      RD ? CGM.getContext().getCopyConstructorForExceptionObject(RD) : nullptr;
#else
  const CXXConstructorDecl *CD = nullptr;
  if (RD) {
    for (const CXXConstructorDecl *Ctor : RD->ctors()) {
      if (Ctor->isCopyConstructor()) {
        // assert(!Ctor->isDeleted());
        if (Ctor->isDeleted()) {
          continue; // TODO why is this allowed? Should we break or continue?
        }
        if (!Ctor->isTrivial()) {
          CD = Ctor;
        }
        break;
      }
    }
  }
#endif

Simple cases worked fine, but a couple checks in CodeGenCXX/microsoft-abi-throw.cpp failed.

  1. @"_CT??_R0?AUDeletedCopy@@@81". As far as I can tell, it is not valid to throw a non-copyable type (CWG-1863 and CWG-2775). This seems especially bad in the MS ABI where std::exception_ptr copies exceptions.
    Anyway, just removed my assertion that the copy constructor wasn't deleted, and it passed.
  2. @"_CT??_R0?AUTemplateWithDefault@@@8??$?_OH@TemplateWithDefault@@QAEXAAU0@@Z1"
    This one I'm not so sure about. The resolution for CWG-2775 doesn't appear to actually require a copy constructor, or for the source object to be considered as a const lvalue.
    If I'm understanding this correctly then this in itself looks like a defect (but that's beside the point, and probably an ABI issue).
    As it turns out, only the const-param constructor is identified as isCopyConstructor() == true. From my reading of [class.copy] this is because the constructor template is not a copy constructor but can be used for copy-initialization. Presumably this is the reason we need the lookup table based on proper overload resolution.

It seems that after all we do need to explicitly store the lookup table in the AST, or am I missing something?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm proud of the depth of the test cases we added when implementing this the first time. There may be modules bugs, but at least there are good tests, and I remembered how to leverage them, so we don't have to rediscover all the edge cases. :)

I think we don't have to worry about cases involving deleted copy ctors because they will be diagnosed at the catch site (attempting to catch an uncopyable type by value), so the catchable type entry will effectively be dead. Skipping those entries is probably the right thing to do.

I think templated copy ctors is what makes this hard. The throw expression will instantiate the appropriate template, but the natural thing to do in other parts of the AST is store the lookup result in the AST and use it later in codegen.

I guess the most straightforward thing to do here is to create a special serialization record, like VTABLE_USES, that records a record type and a relevant copy constructor.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I haven't had as much time to work on this over the last few days.
But now I think I have a working version.

I created a serialization record with CXXRecordDecl, CXXConstrctorDecl pairs in it. It is written and read with all other record like VTABLE_USES, but only loaded into the AST on demand (is this correct?).
Generally some things seem dirty, like exposing the internal map in the interface. Also I'm not sure about naming - should MS ABI be mentioned everywhere, should it be called a "copy" constructor (even though that doesn't match the standard)?

I'm proud of the depth of the test cases we added when implementing this the first time. There may be modules bugs, but at least there are good tests, and I remembered how to leverage them, so we don't have to rediscover all the edge cases. :)

Absolutely! Regarding tests, what kind of tests do we need for this feature? I assume something like CodeGenCXX/microsoft-abi-throw.cpp, but for deserialized ASTs? And there's probably something specific to AST reading/writing?

I think we don't have to worry about cases involving deleted copy ctors because they will be diagnosed at the catch site (attempting to catch an uncopyable type by value), so the catchable type entry will effectively be dead.

I'm pretty sure there will still be a problem with std::current_exception() though - I'll have to test that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed that std::current_exception causes UB/crash when the type is non-copyable. But this appears to be a matter of implementing CWG-2775.

// particular throw-site. Lookup will be performed at the catch-site to
// ensure that the copy constructor is, in fact, accessible (via
// friendship or any other means).
Context.addCopyConstructorForExceptionObject(Subobject, CD);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we've already have the information so we don't need to add another bits for CXXRecordDecl, which is hot.

And generally we would write such information in the writer side and get such information and use it in the reader side.

Copy link
Author

@glebov-andrey glebov-andrey Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation has changed, and this code has been removed
I didn't notice that this was not my code :)
The need for this table has been discussed in #114075 (comment)


// Store the bit in CXXRecordDecl so that ASTReader can restore this
// mapping later.
Subobject->setHasCopyConstructorForExceptionObject();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, when we add a new bit, we need to write it in ASTWriter and read it in ASTReader.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation has changed, and this code has been removed

@glebov-andrey glebov-andrey force-pushed the 53486-fix branch 3 times, most recently from 7c4f07d to 0c8b9cb Compare January 27, 2025 22:36
@github-actions
Copy link

github-actions bot commented Jan 27, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@glebov-andrey glebov-andrey force-pushed the 53486-fix branch 2 times, most recently from e90bab7 to c21c6e4 Compare January 30, 2025 17:59
@glebov-andrey glebov-andrey changed the title WIP: [clang] MicrosoftCXXABI: Fix exception copy constructor LUT after loading AST WIP: [clang] MicrosoftCXXABI: Serialize the exception copy constructor table in the AST Jan 30, 2025
…ing AST (llvm#53486)

- Includes a regression test for the issue
@glebov-andrey
Copy link
Author

(Ping)
I just squashed and rebased onto main.
Is there any possibility of getting this into Clang 21?

@glebov-andrey glebov-andrey changed the title WIP: [clang] MicrosoftCXXABI: Serialize the exception copy constructor table in the AST [clang] MicrosoftCXXABI: Serialize the exception copy constructor table in the AST Jun 4, 2025
@glebov-andrey
Copy link
Author

@rnk I was wondering, if you could take a look at the changes I made after our previous discussion?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Windows] clang producing crashing binary from exception not getting copy ctor called when using PCH

5 participants