Skip to content

Conversation

@dsandersllvm
Copy link
Collaborator

lldb's CPlusPlusNameParser is currently identifying keywords using it's own map implemented using clang/Basic/TokenKinds.def. However, it does not respect the language options so identifiers can incorrectly determined to be keywords when using languages in which they are not keywords.

Rather than implement the logic to enable/disable keywords in both projects it makes sense for lldb to use clang's implementation.

See #164284 for more information

…db to re-use it

lldb's CPlusPlusNameParser is currently identifying keywords using it's own
map implemented using clang/Basic/TokenKinds.def. However, it does not respect
the language options so identifiers can incorrectly determined to be keywords
when using languages in which they are not keywords.

Rather than implement the logic to enable/disable keywords in both projects it
makes sense for lldb to use clang's implementation.

See llvm#164284 for more information
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Oct 27, 2025
@llvmbot
Copy link
Member

llvmbot commented Oct 27, 2025

@llvm/pr-subscribers-clang

Author: Daniel Sanders (dsandersllvm)

Changes

lldb's CPlusPlusNameParser is currently identifying keywords using it's own map implemented using clang/Basic/TokenKinds.def. However, it does not respect the language options so identifiers can incorrectly determined to be keywords when using languages in which they are not keywords.

Rather than implement the logic to enable/disable keywords in both projects it makes sense for lldb to use clang's implementation.

See #164284 for more information


Full diff: https://github.com/llvm/llvm-project/pull/165323.diff

2 Files Affected:

  • (modified) clang/include/clang/Basic/IdentifierTable.h (+49)
  • (modified) clang/lib/Basic/IdentifierTable.cpp (+1-52)
diff --git a/clang/include/clang/Basic/IdentifierTable.h b/clang/include/clang/Basic/IdentifierTable.h
index e4044bcdfcc60..54500b95b2c06 100644
--- a/clang/include/clang/Basic/IdentifierTable.h
+++ b/clang/include/clang/Basic/IdentifierTable.h
@@ -46,6 +46,55 @@ class LangOptions;
 class MultiKeywordSelector;
 class SourceLocation;
 
+enum TokenKey : unsigned {
+  KEYC99 = 0x1,
+  KEYCXX = 0x2,
+  KEYCXX11 = 0x4,
+  KEYGNU = 0x8,
+  KEYMS = 0x10,
+  BOOLSUPPORT = 0x20,
+  KEYALTIVEC = 0x40,
+  KEYNOCXX = 0x80,
+  KEYBORLAND = 0x100,
+  KEYOPENCLC = 0x200,
+  KEYC23 = 0x400,
+  KEYNOMS18 = 0x800,
+  KEYNOOPENCL = 0x1000,
+  WCHARSUPPORT = 0x2000,
+  HALFSUPPORT = 0x4000,
+  CHAR8SUPPORT = 0x8000,
+  KEYOBJC = 0x10000,
+  KEYZVECTOR = 0x20000,
+  KEYCOROUTINES = 0x40000,
+  KEYMODULES = 0x80000,
+  KEYCXX20 = 0x100000,
+  KEYOPENCLCXX = 0x200000,
+  KEYMSCOMPAT = 0x400000,
+  KEYSYCL = 0x800000,
+  KEYCUDA = 0x1000000,
+  KEYZOS = 0x2000000,
+  KEYNOZOS = 0x4000000,
+  KEYHLSL = 0x8000000,
+  KEYFIXEDPOINT = 0x10000000,
+  KEYMAX = KEYFIXEDPOINT, // The maximum key
+  KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,
+  KEYALL = (KEYMAX | (KEYMAX - 1)) & ~KEYNOMS18 & ~KEYNOOPENCL &
+           ~KEYNOZOS // KEYNOMS18, KEYNOOPENCL, KEYNOZOS are excluded.
+};
+
+/// How a keyword is treated in the selected standard. This enum is ordered
+/// intentionally so that the value that 'wins' is the most 'permissive'.
+enum KeywordStatus {
+  KS_Unknown,   // Not yet calculated. Used when figuring out the status.
+  KS_Disabled,  // Disabled
+  KS_Future,    // Is a keyword in future standard
+  KS_Extension, // Is an extension
+  KS_Enabled,   // Enabled
+};
+
+KeywordStatus getKeywordStatus(const LangOptions &LangOpts,
+                               unsigned Flags);
+
 enum class ReservedIdentifierStatus {
   NotReserved = 0,
   StartsWithUnderscoreAtGlobalScope,
diff --git a/clang/lib/Basic/IdentifierTable.cpp b/clang/lib/Basic/IdentifierTable.cpp
index 4a2b77cd16bfc..65ed1273350c5 100644
--- a/clang/lib/Basic/IdentifierTable.cpp
+++ b/clang/lib/Basic/IdentifierTable.cpp
@@ -77,57 +77,6 @@ IdentifierTable::IdentifierTable(const LangOptions &LangOpts,
 // Language Keyword Implementation
 //===----------------------------------------------------------------------===//
 
-// Constants for TokenKinds.def
-namespace {
-
-enum TokenKey : unsigned {
-  KEYC99 = 0x1,
-  KEYCXX = 0x2,
-  KEYCXX11 = 0x4,
-  KEYGNU = 0x8,
-  KEYMS = 0x10,
-  BOOLSUPPORT = 0x20,
-  KEYALTIVEC = 0x40,
-  KEYNOCXX = 0x80,
-  KEYBORLAND = 0x100,
-  KEYOPENCLC = 0x200,
-  KEYC23 = 0x400,
-  KEYNOMS18 = 0x800,
-  KEYNOOPENCL = 0x1000,
-  WCHARSUPPORT = 0x2000,
-  HALFSUPPORT = 0x4000,
-  CHAR8SUPPORT = 0x8000,
-  KEYOBJC = 0x10000,
-  KEYZVECTOR = 0x20000,
-  KEYCOROUTINES = 0x40000,
-  KEYMODULES = 0x80000,
-  KEYCXX20 = 0x100000,
-  KEYOPENCLCXX = 0x200000,
-  KEYMSCOMPAT = 0x400000,
-  KEYSYCL = 0x800000,
-  KEYCUDA = 0x1000000,
-  KEYZOS = 0x2000000,
-  KEYNOZOS = 0x4000000,
-  KEYHLSL = 0x8000000,
-  KEYFIXEDPOINT = 0x10000000,
-  KEYMAX = KEYFIXEDPOINT, // The maximum key
-  KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,
-  KEYALL = (KEYMAX | (KEYMAX - 1)) & ~KEYNOMS18 & ~KEYNOOPENCL &
-           ~KEYNOZOS // KEYNOMS18, KEYNOOPENCL, KEYNOZOS are excluded.
-};
-
-/// How a keyword is treated in the selected standard. This enum is ordered
-/// intentionally so that the value that 'wins' is the most 'permissive'.
-enum KeywordStatus {
-  KS_Unknown,   // Not yet calculated. Used when figuring out the status.
-  KS_Disabled,  // Disabled
-  KS_Future,    // Is a keyword in future standard
-  KS_Extension, // Is an extension
-  KS_Enabled,   // Enabled
-};
-
-} // namespace
-
 // This works on a single TokenKey flag and checks the LangOpts to get the
 // KeywordStatus based exclusively on this flag, so that it can be merged in
 // getKeywordStatus. Most should be enabled/disabled, but some might imply
@@ -222,7 +171,7 @@ static KeywordStatus getKeywordStatusHelper(const LangOptions &LangOpts,
 
 /// Translates flags as specified in TokenKinds.def into keyword status
 /// in the given language standard.
-static KeywordStatus getKeywordStatus(const LangOptions &LangOpts,
+KeywordStatus clang::getKeywordStatus(const LangOptions &LangOpts,
                                       unsigned Flags) {
   // KEYALL means always enabled, so special case this one.
   if (Flags == KEYALL) return KS_Enabled;

@github-actions
Copy link

github-actions bot commented Oct 27, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@Michael137 Michael137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (modulo two minor comments). Thanks for splitting this out! Could now even be unit-testable (not that you should do that as part of this PR though). Please wait a couple of days for the Clang maintainers to approve before merging

Comment on lines 172 to 173
/// Translates flags as specified in TokenKinds.def into keyword status
/// in the given language standard.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move these docs into the header now?

@@ -77,57 +77,6 @@ IdentifierTable::IdentifierTable(const LangOptions &LangOpts,
// Language Keyword Implementation
//===----------------------------------------------------------------------===//

// Constants for TokenKinds.def
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth retaining this comment? It gives us "some" indication that TokenKinds.def and this enum are connected.,

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it used to be on the anonymous namespace for TokenKind and KeywordStatus, I've restored it on TokenKind only because KeywordStatus is only used by the code that uses TokenKinds.def not TokenKinds.def itself

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo the requests from @Michael137

This comment was originally on the namespace containing TokenKey and
KeywordStatus. I've only restored it to TokenKey because KeywordStatus
is used by the code that uses TokenKinds.def not TokenKinds.def itself.
@dsandersllvm dsandersllvm merged commit d3caae1 into llvm:main Nov 5, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants