Skip to content

Conversation

@ckandeler
Copy link
Member

Apart from fixing the linked issue, this is also necessary for supporting LSP's LocationLink feature and for finding proper insertion locations in the DefineOutline tweak. Memory consumption of the background index grows by about ~2.5%.

Closes clangd/clangd#59

@llvmbot
Copy link
Member

llvmbot commented Nov 29, 2024

@llvm/pr-subscribers-clangd

Author: Christian Kandeler (ckandeler)

Changes

Apart from fixing the linked issue, this is also necessary for supporting LSP's LocationLink feature and for finding proper insertion locations in the DefineOutline tweak. Memory consumption of the background index grows by about ~2.5%.

Closes clangd/clangd#59


Patch is 69.37 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118102.diff

32 Files Affected:

  • (modified) clang-tools-extra/clangd/CodeComplete.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/FindSymbols.cpp (+24-10)
  • (modified) clang-tools-extra/clangd/FindSymbols.h (+5-3)
  • (modified) clang-tools-extra/clangd/HeaderSourceSwitch.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/IncludeFixer.cpp (+3-3)
  • (modified) clang-tools-extra/clangd/Quality.cpp (+1-1)
  • (modified) clang-tools-extra/clangd/XRefs.cpp (+35-30)
  • (modified) clang-tools-extra/clangd/index/FileIndex.cpp (+3-3)
  • (modified) clang-tools-extra/clangd/index/Merge.cpp (+7-6)
  • (modified) clang-tools-extra/clangd/index/Ref.h (+4-7)
  • (modified) clang-tools-extra/clangd/index/Serialization.cpp (+28-8)
  • (modified) clang-tools-extra/clangd/index/StdLib.cpp (+6-6)
  • (modified) clang-tools-extra/clangd/index/Symbol.h (+4-4)
  • (modified) clang-tools-extra/clangd/index/SymbolCollector.cpp (+68-21)
  • (modified) clang-tools-extra/clangd/index/SymbolCollector.h (+2-1)
  • (modified) clang-tools-extra/clangd/index/SymbolLocation.cpp (+6-5)
  • (modified) clang-tools-extra/clangd/index/SymbolLocation.h (+62-41)
  • (modified) clang-tools-extra/clangd/index/YAMLSerialization.cpp (+25-12)
  • (modified) clang-tools-extra/clangd/index/dex/Dex.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/refactor/Rename.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/test/Inputs/symbols.test.yaml (+12-5)
  • (modified) clang-tools-extra/clangd/test/index-serialization/Inputs/sample.idx ()
  • (modified) clang-tools-extra/clangd/test/type-hierarchy-ext.test (+4-4)
  • (modified) clang-tools-extra/clangd/test/type-hierarchy.test (+4-4)
  • (modified) clang-tools-extra/clangd/unittests/BackgroundIndexTests.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp (+9-9)
  • (modified) clang-tools-extra/clangd/unittests/DexTests.cpp (+3-2)
  • (modified) clang-tools-extra/clangd/unittests/DiagnosticsTests.cpp (+7-6)
  • (modified) clang-tools-extra/clangd/unittests/FileIndexTests.cpp (+7-7)
  • (modified) clang-tools-extra/clangd/unittests/IndexTests.cpp (+23-22)
  • (modified) clang-tools-extra/clangd/unittests/SerializationTests.cpp (+19-4)
  • (modified) clang-tools-extra/clangd/unittests/SymbolCollectorTests.cpp (+7-5)
diff --git a/clang-tools-extra/clangd/CodeComplete.cpp b/clang-tools-extra/clangd/CodeComplete.cpp
index 2c2d5f0b5ac924..04e4aa2d2d1ca9 100644
--- a/clang-tools-extra/clangd/CodeComplete.cpp
+++ b/clang-tools-extra/clangd/CodeComplete.cpp
@@ -418,7 +418,7 @@ struct CodeCompletionBuilder {
     auto Inserted = [&](llvm::StringRef Header)
         -> llvm::Expected<std::pair<std::string, bool>> {
       auto ResolvedDeclaring =
-          URI::resolve(C.IndexResult->CanonicalDeclaration.FileURI, FileName);
+          URI::resolve(C.IndexResult->CanonicalDeclaration.fileURI(), FileName);
       if (!ResolvedDeclaring)
         return ResolvedDeclaring.takeError();
       auto ResolvedInserted = toHeaderFile(Header, FileName);
@@ -451,7 +451,7 @@ struct CodeCompletionBuilder {
       } else
         log("Failed to generate include insertion edits for adding header "
             "(FileURI='{0}', IncludeHeader='{1}') into {2}: {3}",
-            C.IndexResult->CanonicalDeclaration.FileURI, Inc.Header, FileName,
+            C.IndexResult->CanonicalDeclaration.fileURI(), Inc.Header, FileName,
             ToInclude.takeError());
     }
     // Prefer includes that do not need edits (i.e. already exist).
diff --git a/clang-tools-extra/clangd/FindSymbols.cpp b/clang-tools-extra/clangd/FindSymbols.cpp
index 84bcbc1f2ddd3f..646cb261309c80 100644
--- a/clang-tools-extra/clangd/FindSymbols.cpp
+++ b/clang-tools-extra/clangd/FindSymbols.cpp
@@ -54,9 +54,19 @@ bool approximateScopeMatch(llvm::StringRef Scope, llvm::StringRef Query) {
   return Query.empty();
 }
 
+Range indexToLSPRange(const SymbolPosition &SrcStart,
+                      const SymbolPosition &SrcEnd) {
+  Position Start, End;
+  Start.line = SrcStart.line();
+  Start.character = SrcStart.column();
+  End.line = SrcEnd.line();
+  End.character = SrcEnd.column();
+  return {Start, End};
+}
+
 } // namespace
 
-llvm::Expected<Location> indexToLSPLocation(const SymbolLocation &Loc,
+llvm::Expected<Location> indexToLSPLocation(const SymbolNameLocation &Loc,
                                             llvm::StringRef TUPath) {
   auto Path = URI::resolve(Loc.FileURI, TUPath);
   if (!Path)
@@ -64,17 +74,21 @@ llvm::Expected<Location> indexToLSPLocation(const SymbolLocation &Loc,
                  Path.takeError());
   Location L;
   L.uri = URIForFile::canonicalize(*Path, TUPath);
-  Position Start, End;
-  Start.line = Loc.Start.line();
-  Start.character = Loc.Start.column();
-  End.line = Loc.End.line();
-  End.character = Loc.End.column();
-  L.range = {Start, End};
+  L.range = indexToLSPRange(Loc.Start, Loc.End);
   return L;
 }
 
-llvm::Expected<Location> symbolToLocation(const Symbol &Sym,
-                                          llvm::StringRef TUPath) {
+llvm::Expected<std::pair<Location, Range>>
+indexToLSPLocation(const SymbolDeclDefLocation &Loc, StringRef TUPath) {
+  auto L = indexToLSPLocation(Loc.NameLocation, TUPath);
+  if (!L)
+    return L.takeError();
+  return std::make_pair(L.get(),
+                        indexToLSPRange(Loc.DeclDefStart, Loc.DeclDefEnd));
+}
+
+llvm::Expected<std::pair<Location, Range>>
+symbolToLocation(const Symbol &Sym, llvm::StringRef TUPath) {
   // Prefer the definition over e.g. a function declaration in a header
   return indexToLSPLocation(
       Sym.Definition ? Sym.Definition : Sym.CanonicalDeclaration, TUPath);
@@ -152,7 +166,7 @@ getWorkspaceSymbols(llvm::StringRef Query, int Limit,
     SymbolInformation Info;
     Info.name = (Sym.Name + Sym.TemplateSpecializationArgs).str();
     Info.kind = indexSymbolKindToSymbolKind(Sym.SymInfo.Kind);
-    Info.location = *Loc;
+    Info.location = Loc->first;
     Scope.consume_back("::");
     Info.containerName = Scope.str();
 
diff --git a/clang-tools-extra/clangd/FindSymbols.h b/clang-tools-extra/clangd/FindSymbols.h
index 5fb116b13d1136..29fe82d8aaa867 100644
--- a/clang-tools-extra/clangd/FindSymbols.h
+++ b/clang-tools-extra/clangd/FindSymbols.h
@@ -22,12 +22,14 @@ class ParsedAST;
 class SymbolIndex;
 
 /// Helper function for deriving an LSP Location from an index SymbolLocation.
-llvm::Expected<Location> indexToLSPLocation(const SymbolLocation &Loc,
+llvm::Expected<Location> indexToLSPLocation(const SymbolNameLocation &Loc,
                                             llvm::StringRef TUPath);
+llvm::Expected<std::pair<Location, Range>>
+indexToLSPLocation(const SymbolDeclDefLocation &Loc, llvm::StringRef TUPath);
 
 /// Helper function for deriving an LSP Location for a Symbol.
-llvm::Expected<Location> symbolToLocation(const Symbol &Sym,
-                                          llvm::StringRef TUPath);
+llvm::Expected<std::pair<Location, Range>>
+symbolToLocation(const Symbol &Sym, llvm::StringRef TUPath);
 
 /// Searches for the symbols matching \p Query. The syntax of \p Query can be
 /// the non-qualified name or fully qualified of a symbol. For example,
diff --git a/clang-tools-extra/clangd/HeaderSourceSwitch.cpp b/clang-tools-extra/clangd/HeaderSourceSwitch.cpp
index 2351858cc62972..c191e79b1962d3 100644
--- a/clang-tools-extra/clangd/HeaderSourceSwitch.cpp
+++ b/clang-tools-extra/clangd/HeaderSourceSwitch.cpp
@@ -95,9 +95,9 @@ std::optional<Path> getCorrespondingHeaderOrSource(PathRef OriginalFile,
   bool IsHeader = isHeaderFile(OriginalFile, AST.getLangOpts());
   Index->lookup(Request, [&](const Symbol &Sym) {
     if (IsHeader)
-      AwardTarget(Sym.Definition.FileURI);
+      AwardTarget(Sym.Definition.fileURI());
     else
-      AwardTarget(Sym.CanonicalDeclaration.FileURI);
+      AwardTarget(Sym.CanonicalDeclaration.fileURI());
   });
   // FIXME: our index doesn't have any interesting information (this could be
   // that the background-index is not finished), we should use the decl/def
diff --git a/clang-tools-extra/clangd/IncludeFixer.cpp b/clang-tools-extra/clangd/IncludeFixer.cpp
index fadd1105691fc0..f146bb3ba1b161 100644
--- a/clang-tools-extra/clangd/IncludeFixer.cpp
+++ b/clang-tools-extra/clangd/IncludeFixer.cpp
@@ -289,7 +289,7 @@ std::vector<Fix> IncludeFixer::fixIncompleteType(const Type &T) const {
   if (!Syms.empty()) {
     auto &Matched = *Syms.begin();
     if (!Matched.IncludeHeaders.empty() && Matched.Definition &&
-        Matched.CanonicalDeclaration.FileURI == Matched.Definition.FileURI)
+        Matched.CanonicalDeclaration.fileURI() == Matched.Definition.fileURI())
       Fixes = fixesForSymbols(Syms);
   }
   return Fixes;
@@ -299,7 +299,7 @@ std::vector<Fix> IncludeFixer::fixesForSymbols(const SymbolSlab &Syms) const {
   auto Inserted = [&](const Symbol &Sym, llvm::StringRef Header)
       -> llvm::Expected<std::pair<std::string, bool>> {
     auto ResolvedDeclaring =
-        URI::resolve(Sym.CanonicalDeclaration.FileURI, File);
+        URI::resolve(Sym.CanonicalDeclaration.fileURI(), File);
     if (!ResolvedDeclaring)
       return ResolvedDeclaring.takeError();
     auto ResolvedInserted = toHeaderFile(Header, File);
@@ -616,7 +616,7 @@ IncludeFixer::lookupCached(const SymbolID &ID) const {
   if (!Syms.empty()) {
     auto &Matched = *Syms.begin();
     if (!Matched.IncludeHeaders.empty() && Matched.Definition &&
-        Matched.CanonicalDeclaration.FileURI == Matched.Definition.FileURI)
+        Matched.CanonicalDeclaration.fileURI() == Matched.Definition.fileURI())
       Fixes = fixesForSymbols(Syms);
   }
   auto E = LookupCache.try_emplace(ID, std::move(Syms));
diff --git a/clang-tools-extra/clangd/Quality.cpp b/clang-tools-extra/clangd/Quality.cpp
index c1ab63fb22f61e..1154a426740209 100644
--- a/clang-tools-extra/clangd/Quality.cpp
+++ b/clang-tools-extra/clangd/Quality.cpp
@@ -281,7 +281,7 @@ computeScope(const NamedDecl *D) {
 }
 
 void SymbolRelevanceSignals::merge(const Symbol &IndexResult) {
-  SymbolURI = IndexResult.CanonicalDeclaration.FileURI;
+  SymbolURI = IndexResult.CanonicalDeclaration.fileURI();
   SymbolScope = IndexResult.Scope;
   IsInstanceMember |= isInstanceMember(IndexResult.SymInfo);
   if (!(IndexResult.Flags & Symbol::VisibleOutsideFile)) {
diff --git a/clang-tools-extra/clangd/XRefs.cpp b/clang-tools-extra/clangd/XRefs.cpp
index 61fa66180376cd..c815db4ffa533e 100644
--- a/clang-tools-extra/clangd/XRefs.cpp
+++ b/clang-tools-extra/clangd/XRefs.cpp
@@ -114,7 +114,7 @@ const NamedDecl *getDefinition(const NamedDecl *D) {
   return nullptr; // except cases above
 }
 
-void logIfOverflow(const SymbolLocation &Loc) {
+void logIfOverflow(const SymbolNameLocation &Loc) {
   if (Loc.Start.hasOverflow() || Loc.End.hasOverflow())
     log("Possible overflow in symbol location: {0}", Loc);
 }
@@ -123,7 +123,7 @@ void logIfOverflow(const SymbolLocation &Loc) {
 // TUPath is used to resolve the path of URI.
 // FIXME: figure out a good home for it, and share the implementation with
 // FindSymbols.
-std::optional<Location> toLSPLocation(const SymbolLocation &Loc,
+std::optional<Location> toLSPLocation(const SymbolNameLocation &Loc,
                                       llvm::StringRef TUPath) {
   if (!Loc)
     return std::nullopt;
@@ -148,8 +148,9 @@ std::optional<Location> toLSPLocation(const SymbolLocation &Loc,
   return LSPLoc;
 }
 
-SymbolLocation toIndexLocation(const Location &Loc, std::string &URIStorage) {
-  SymbolLocation SymLoc;
+SymbolNameLocation toIndexLocation(const Location &Loc,
+                                   std::string &URIStorage) {
+  SymbolNameLocation SymLoc;
   URIStorage = Loc.uri.uri();
   SymLoc.FileURI = URIStorage.c_str();
   SymLoc.Start.setLine(Loc.range.start.line);
@@ -160,17 +161,17 @@ SymbolLocation toIndexLocation(const Location &Loc, std::string &URIStorage) {
 }
 
 // Returns the preferred location between an AST location and an index location.
-SymbolLocation getPreferredLocation(const Location &ASTLoc,
-                                    const SymbolLocation &IdxLoc,
-                                    std::string &Scratch) {
+SymbolNameLocation getPreferredLocation(const Location &ASTLoc,
+                                        const SymbolNameLocation &IdxLoc,
+                                        std::string &Scratch) {
   // Also use a mock symbol for the index location so that other fields (e.g.
   // definition) are not factored into the preference.
   Symbol ASTSym, IdxSym;
   ASTSym.ID = IdxSym.ID = SymbolID("mock_symbol_id");
-  ASTSym.CanonicalDeclaration = toIndexLocation(ASTLoc, Scratch);
-  IdxSym.CanonicalDeclaration = IdxLoc;
+  ASTSym.CanonicalDeclaration.NameLocation = toIndexLocation(ASTLoc, Scratch);
+  IdxSym.CanonicalDeclaration.NameLocation = IdxLoc;
   auto Merged = mergeSymbol(ASTSym, IdxSym);
-  return Merged.CanonicalDeclaration;
+  return Merged.CanonicalDeclaration.NameLocation;
 }
 
 std::vector<std::pair<const NamedDecl *, DeclRelationSet>>
@@ -323,8 +324,8 @@ std::vector<LocatedSymbol> findImplementors(llvm::DenseSet<SymbolID> IDs,
   Req.Subjects = std::move(IDs);
   std::vector<LocatedSymbol> Results;
   Index->relations(Req, [&](const SymbolID &Subject, const Symbol &Object) {
-    auto DeclLoc =
-        indexToLSPLocation(Object.CanonicalDeclaration, MainFilePath);
+    auto DeclLoc = indexToLSPLocation(Object.CanonicalDeclaration.NameLocation,
+                                      MainFilePath);
     if (!DeclLoc) {
       elog("Find overrides: {0}", DeclLoc.takeError());
       return;
@@ -332,7 +333,8 @@ std::vector<LocatedSymbol> findImplementors(llvm::DenseSet<SymbolID> IDs,
     Results.emplace_back();
     Results.back().Name = Object.Name.str();
     Results.back().PreferredDeclaration = *DeclLoc;
-    auto DefLoc = indexToLSPLocation(Object.Definition, MainFilePath);
+    auto DefLoc =
+        indexToLSPLocation(Object.Definition.NameLocation, MainFilePath);
     if (!DefLoc) {
       elog("Failed to convert location: {0}", DefLoc.takeError());
       return;
@@ -364,23 +366,26 @@ void enhanceLocatedSymbolsFromIndex(llvm::MutableArrayRef<LocatedSymbol> Result,
     if (R.Definition) { // from AST
       // Special case: if the AST yielded a definition, then it may not be
       // the right *declaration*. Prefer the one from the index.
-      if (auto Loc = toLSPLocation(Sym.CanonicalDeclaration, MainFilePath))
+      if (auto Loc = toLSPLocation(Sym.CanonicalDeclaration.NameLocation,
+                                   MainFilePath))
         R.PreferredDeclaration = *Loc;
 
       // We might still prefer the definition from the index, e.g. for
       // generated symbols.
       if (auto Loc = toLSPLocation(
-              getPreferredLocation(*R.Definition, Sym.Definition, Scratch),
+              getPreferredLocation(*R.Definition, Sym.Definition.NameLocation,
+                                   Scratch),
               MainFilePath))
         R.Definition = *Loc;
     } else {
-      R.Definition = toLSPLocation(Sym.Definition, MainFilePath);
+      R.Definition = toLSPLocation(Sym.Definition.NameLocation, MainFilePath);
 
       // Use merge logic to choose AST or index declaration.
-      if (auto Loc = toLSPLocation(
-              getPreferredLocation(R.PreferredDeclaration,
-                                   Sym.CanonicalDeclaration, Scratch),
-              MainFilePath))
+      if (auto Loc =
+              toLSPLocation(getPreferredLocation(
+                                R.PreferredDeclaration,
+                                Sym.CanonicalDeclaration.NameLocation, Scratch),
+                            MainFilePath))
         R.PreferredDeclaration = *Loc;
     }
   });
@@ -606,7 +611,7 @@ std::vector<LocatedSymbol> locateSymbolTextually(const SpelledWord &Word,
       return;
 
     auto MaybeDeclLoc =
-        indexToLSPLocation(Sym.CanonicalDeclaration, MainFilePath);
+        indexToLSPLocation(Sym.CanonicalDeclaration.NameLocation, MainFilePath);
     if (!MaybeDeclLoc) {
       log("locateSymbolNamedTextuallyAt: {0}", MaybeDeclLoc.takeError());
       return;
@@ -616,7 +621,8 @@ std::vector<LocatedSymbol> locateSymbolTextually(const SpelledWord &Word,
     Located.Name = (Sym.Name + Sym.TemplateSpecializationArgs).str();
     Located.ID = Sym.ID;
     if (Sym.Definition) {
-      auto MaybeDefLoc = indexToLSPLocation(Sym.Definition, MainFilePath);
+      auto MaybeDefLoc =
+          indexToLSPLocation(Sym.Definition.NameLocation, MainFilePath);
       if (!MaybeDefLoc) {
         log("locateSymbolNamedTextuallyAt: {0}", MaybeDefLoc.takeError());
         return;
@@ -1495,9 +1501,10 @@ ReferencesResult findReferences(ParsedAST &AST, Position Pos, uint32_t Limit,
           Results.HasMore = true;
           return;
         }
-        const auto LSPLocDecl =
-            toLSPLocation(Object.CanonicalDeclaration, MainFilePath);
-        const auto LSPLocDef = toLSPLocation(Object.Definition, MainFilePath);
+        const auto LSPLocDecl = toLSPLocation(
+            Object.CanonicalDeclaration.NameLocation, MainFilePath);
+        const auto LSPLocDef =
+            toLSPLocation(Object.Definition.NameLocation, MainFilePath);
         if (LSPLocDecl && LSPLocDecl != LSPLocDef) {
           ReferencesResult::Reference Result;
           Result.Loc = {std::move(*LSPLocDecl), std::nullopt};
@@ -1754,11 +1761,9 @@ static std::optional<HierarchyItem> symbolToHierarchyItem(const Symbol &S,
   HierarchyItem HI;
   HI.name = std::string(S.Name);
   HI.kind = indexSymbolKindToSymbolKind(S.SymInfo.Kind);
-  HI.selectionRange = Loc->range;
-  // FIXME: Populate 'range' correctly
-  // (https://github.com/clangd/clangd/issues/59).
-  HI.range = HI.selectionRange;
-  HI.uri = Loc->uri;
+  HI.selectionRange = Loc->first.range;
+  HI.range = Loc->second;
+  HI.uri = Loc->first.uri;
 
   return HI;
 }
diff --git a/clang-tools-extra/clangd/index/FileIndex.cpp b/clang-tools-extra/clangd/index/FileIndex.cpp
index eb9562d2b6bf81..572f4b6e4e02ce 100644
--- a/clang-tools-extra/clangd/index/FileIndex.cpp
+++ b/clang-tools-extra/clangd/index/FileIndex.cpp
@@ -132,13 +132,13 @@ FileShardedIndex::FileShardedIndex(IndexFileIn Input)
   // Attribute each Symbol to both their declaration and definition locations.
   if (Index.Symbols) {
     for (const auto &S : *Index.Symbols) {
-      auto It = Shards.try_emplace(S.CanonicalDeclaration.FileURI);
+      auto It = Shards.try_emplace(S.CanonicalDeclaration.fileURI());
       It.first->getValue().Symbols.insert(&S);
       SymbolIDToFile[S.ID] = &It.first->getValue();
       // Only bother if definition file is different than declaration file.
       if (S.Definition &&
-          S.Definition.FileURI != S.CanonicalDeclaration.FileURI) {
-        auto It = Shards.try_emplace(S.Definition.FileURI);
+          S.Definition.fileURI() != S.CanonicalDeclaration.fileURI()) {
+        auto It = Shards.try_emplace(S.Definition.fileURI());
         It.first->getValue().Symbols.insert(&S);
       }
     }
diff --git a/clang-tools-extra/clangd/index/Merge.cpp b/clang-tools-extra/clangd/index/Merge.cpp
index 8221d4b1f44405..b72d7d1fa366d7 100644
--- a/clang-tools-extra/clangd/index/Merge.cpp
+++ b/clang-tools-extra/clangd/index/Merge.cpp
@@ -26,7 +26,7 @@ bool isIndexAuthoritative(const SymbolIndex::IndexedFiles &Index,
   // We expect the definition to see the canonical declaration, so it seems to
   // be enough to check only the definition if it exists.
   const char *OwningFile =
-      S.Definition ? S.Definition.FileURI : S.CanonicalDeclaration.FileURI;
+      S.Definition ? S.Definition.fileURI() : S.CanonicalDeclaration.fileURI();
   return (Index(OwningFile) & IndexContents::Symbols) != IndexContents::None;
 }
 } // namespace
@@ -189,15 +189,16 @@ void MergedIndex::relations(
 
 // Returns true if \p L is (strictly) preferred to \p R (e.g. by file paths). If
 // neither is preferred, this returns false.
-static bool prefer(const SymbolLocation &L, const SymbolLocation &R) {
+static bool prefer(const SymbolDeclDefLocation &L,
+                   const SymbolDeclDefLocation &R) {
   if (!L)
     return false;
   if (!R)
     return true;
-  auto HasCodeGenSuffix = [](const SymbolLocation &Loc) {
+  auto HasCodeGenSuffix = [](const SymbolDeclDefLocation &Loc) {
     constexpr static const char *CodegenSuffixes[] = {".proto"};
     return llvm::any_of(CodegenSuffixes, [&](llvm::StringRef Suffix) {
-      return llvm::StringRef(Loc.FileURI).ends_with(Suffix);
+      return llvm::StringRef(Loc.fileURI()).ends_with(Suffix);
     });
   };
   return HasCodeGenSuffix(L) && !HasCodeGenSuffix(R);
@@ -211,9 +212,9 @@ Symbol mergeSymbol(const Symbol &L, const Symbol &R) {
   bool PreferR = R.Definition && !L.Definition;
   // Merge include headers only if both have definitions or both have no
   // definition; otherwise, only accumulate references of common includes.
-  assert(L.Definition.FileURI && R.Definition.FileURI);
+  assert(L.Definition.fileURI() && R.Definition.fileURI());
   bool MergeIncludes =
-      bool(*L.Definition.FileURI) == bool(*R.Definition.FileURI);
+      bool(*L.Definition.fileURI()) == bool(*R.Definition.fileURI());
   Symbol S = PreferR ? R : L;        // The target symbol we're merging into.
   const Symbol &O = PreferR ? L : R; // The "other" less-preferred symbol.
 
diff --git a/clang-tools-extra/clangd/index/Ref.h b/clang-tools-extra/clangd/index/Ref.h
index 6e383e2ade3d25..ba5362f301b8d2 100644
--- a/clang-tools-extra/clangd/index/Ref.h
+++ b/clang-tools-extra/clangd/index/Ref.h
@@ -18,6 +18,7 @@
 #include <cstdint>
 #include <set>
 #include <utility>
+#include <variant>
 
 namespace clang {
 namespace clangd {
@@ -84,7 +85,7 @@ llvm::raw_ostream &operator<<(llvm::raw_ostream &, RefKind);
 /// WARNING: Location does not own the underlying data - Copies are shallow.
 struct Ref {
   /// The source location where the symbol is named.
-  SymbolLocation Location;
+  SymbolNameLocation Location;
   RefKind Kind = RefKind::Unknown;
   /// The ID of the symbol whose definition contains this reference.
   /// For example, for a reference inside a function body, this would
@@ -182,12 +183,8 @@ template <> struct DenseMapInfo<clang::clangd::RefSlab::Builder::Entry> {
         Val.Reference.Location.Start.rep(), Val.Reference.Location.End.rep());
   }
   static bool isEqual(const Entry &LHS, const Entry &RHS) {
-    return std::tie(LHS.Symbol, LHS.Reference.Location.FileURI,
-                    LHS.Reference.Kind) ==
-               std::tie(RHS.Symbol, RHS.Reference.Location.FileURI,
-                        RHS.Reference.Kind) &&
-           LHS.Reference.Location.Start == RHS.Reference.Location.Star...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Nov 29, 2024

@llvm/pr-subscribers-clang-tools-extra

Author: Christian Kandeler (ckandeler)

Changes

Apart from fixing the linked issue, this is also necessary for supporting LSP's LocationLink feature and for finding proper insertion locations in the DefineOutline tweak. Memory consumption of the background index grows by about ~2.5%.

Closes clangd/clangd#59


Patch is 69.37 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118102.diff

32 Files Affected:

  • (modified) clang-tools-extra/clangd/CodeComplete.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/FindSymbols.cpp (+24-10)
  • (modified) clang-tools-extra/clangd/FindSymbols.h (+5-3)
  • (modified) clang-tools-extra/clangd/HeaderSourceSwitch.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/IncludeFixer.cpp (+3-3)
  • (modified) clang-tools-extra/clangd/Quality.cpp (+1-1)
  • (modified) clang-tools-extra/clangd/XRefs.cpp (+35-30)
  • (modified) clang-tools-extra/clangd/index/FileIndex.cpp (+3-3)
  • (modified) clang-tools-extra/clangd/index/Merge.cpp (+7-6)
  • (modified) clang-tools-extra/clangd/index/Ref.h (+4-7)
  • (modified) clang-tools-extra/clangd/index/Serialization.cpp (+28-8)
  • (modified) clang-tools-extra/clangd/index/StdLib.cpp (+6-6)
  • (modified) clang-tools-extra/clangd/index/Symbol.h (+4-4)
  • (modified) clang-tools-extra/clangd/index/SymbolCollector.cpp (+68-21)
  • (modified) clang-tools-extra/clangd/index/SymbolCollector.h (+2-1)
  • (modified) clang-tools-extra/clangd/index/SymbolLocation.cpp (+6-5)
  • (modified) clang-tools-extra/clangd/index/SymbolLocation.h (+62-41)
  • (modified) clang-tools-extra/clangd/index/YAMLSerialization.cpp (+25-12)
  • (modified) clang-tools-extra/clangd/index/dex/Dex.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/refactor/Rename.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/test/Inputs/symbols.test.yaml (+12-5)
  • (modified) clang-tools-extra/clangd/test/index-serialization/Inputs/sample.idx ()
  • (modified) clang-tools-extra/clangd/test/type-hierarchy-ext.test (+4-4)
  • (modified) clang-tools-extra/clangd/test/type-hierarchy.test (+4-4)
  • (modified) clang-tools-extra/clangd/unittests/BackgroundIndexTests.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp (+9-9)
  • (modified) clang-tools-extra/clangd/unittests/DexTests.cpp (+3-2)
  • (modified) clang-tools-extra/clangd/unittests/DiagnosticsTests.cpp (+7-6)
  • (modified) clang-tools-extra/clangd/unittests/FileIndexTests.cpp (+7-7)
  • (modified) clang-tools-extra/clangd/unittests/IndexTests.cpp (+23-22)
  • (modified) clang-tools-extra/clangd/unittests/SerializationTests.cpp (+19-4)
  • (modified) clang-tools-extra/clangd/unittests/SymbolCollectorTests.cpp (+7-5)
diff --git a/clang-tools-extra/clangd/CodeComplete.cpp b/clang-tools-extra/clangd/CodeComplete.cpp
index 2c2d5f0b5ac924..04e4aa2d2d1ca9 100644
--- a/clang-tools-extra/clangd/CodeComplete.cpp
+++ b/clang-tools-extra/clangd/CodeComplete.cpp
@@ -418,7 +418,7 @@ struct CodeCompletionBuilder {
     auto Inserted = [&](llvm::StringRef Header)
         -> llvm::Expected<std::pair<std::string, bool>> {
       auto ResolvedDeclaring =
-          URI::resolve(C.IndexResult->CanonicalDeclaration.FileURI, FileName);
+          URI::resolve(C.IndexResult->CanonicalDeclaration.fileURI(), FileName);
       if (!ResolvedDeclaring)
         return ResolvedDeclaring.takeError();
       auto ResolvedInserted = toHeaderFile(Header, FileName);
@@ -451,7 +451,7 @@ struct CodeCompletionBuilder {
       } else
         log("Failed to generate include insertion edits for adding header "
             "(FileURI='{0}', IncludeHeader='{1}') into {2}: {3}",
-            C.IndexResult->CanonicalDeclaration.FileURI, Inc.Header, FileName,
+            C.IndexResult->CanonicalDeclaration.fileURI(), Inc.Header, FileName,
             ToInclude.takeError());
     }
     // Prefer includes that do not need edits (i.e. already exist).
diff --git a/clang-tools-extra/clangd/FindSymbols.cpp b/clang-tools-extra/clangd/FindSymbols.cpp
index 84bcbc1f2ddd3f..646cb261309c80 100644
--- a/clang-tools-extra/clangd/FindSymbols.cpp
+++ b/clang-tools-extra/clangd/FindSymbols.cpp
@@ -54,9 +54,19 @@ bool approximateScopeMatch(llvm::StringRef Scope, llvm::StringRef Query) {
   return Query.empty();
 }
 
+Range indexToLSPRange(const SymbolPosition &SrcStart,
+                      const SymbolPosition &SrcEnd) {
+  Position Start, End;
+  Start.line = SrcStart.line();
+  Start.character = SrcStart.column();
+  End.line = SrcEnd.line();
+  End.character = SrcEnd.column();
+  return {Start, End};
+}
+
 } // namespace
 
-llvm::Expected<Location> indexToLSPLocation(const SymbolLocation &Loc,
+llvm::Expected<Location> indexToLSPLocation(const SymbolNameLocation &Loc,
                                             llvm::StringRef TUPath) {
   auto Path = URI::resolve(Loc.FileURI, TUPath);
   if (!Path)
@@ -64,17 +74,21 @@ llvm::Expected<Location> indexToLSPLocation(const SymbolLocation &Loc,
                  Path.takeError());
   Location L;
   L.uri = URIForFile::canonicalize(*Path, TUPath);
-  Position Start, End;
-  Start.line = Loc.Start.line();
-  Start.character = Loc.Start.column();
-  End.line = Loc.End.line();
-  End.character = Loc.End.column();
-  L.range = {Start, End};
+  L.range = indexToLSPRange(Loc.Start, Loc.End);
   return L;
 }
 
-llvm::Expected<Location> symbolToLocation(const Symbol &Sym,
-                                          llvm::StringRef TUPath) {
+llvm::Expected<std::pair<Location, Range>>
+indexToLSPLocation(const SymbolDeclDefLocation &Loc, StringRef TUPath) {
+  auto L = indexToLSPLocation(Loc.NameLocation, TUPath);
+  if (!L)
+    return L.takeError();
+  return std::make_pair(L.get(),
+                        indexToLSPRange(Loc.DeclDefStart, Loc.DeclDefEnd));
+}
+
+llvm::Expected<std::pair<Location, Range>>
+symbolToLocation(const Symbol &Sym, llvm::StringRef TUPath) {
   // Prefer the definition over e.g. a function declaration in a header
   return indexToLSPLocation(
       Sym.Definition ? Sym.Definition : Sym.CanonicalDeclaration, TUPath);
@@ -152,7 +166,7 @@ getWorkspaceSymbols(llvm::StringRef Query, int Limit,
     SymbolInformation Info;
     Info.name = (Sym.Name + Sym.TemplateSpecializationArgs).str();
     Info.kind = indexSymbolKindToSymbolKind(Sym.SymInfo.Kind);
-    Info.location = *Loc;
+    Info.location = Loc->first;
     Scope.consume_back("::");
     Info.containerName = Scope.str();
 
diff --git a/clang-tools-extra/clangd/FindSymbols.h b/clang-tools-extra/clangd/FindSymbols.h
index 5fb116b13d1136..29fe82d8aaa867 100644
--- a/clang-tools-extra/clangd/FindSymbols.h
+++ b/clang-tools-extra/clangd/FindSymbols.h
@@ -22,12 +22,14 @@ class ParsedAST;
 class SymbolIndex;
 
 /// Helper function for deriving an LSP Location from an index SymbolLocation.
-llvm::Expected<Location> indexToLSPLocation(const SymbolLocation &Loc,
+llvm::Expected<Location> indexToLSPLocation(const SymbolNameLocation &Loc,
                                             llvm::StringRef TUPath);
+llvm::Expected<std::pair<Location, Range>>
+indexToLSPLocation(const SymbolDeclDefLocation &Loc, llvm::StringRef TUPath);
 
 /// Helper function for deriving an LSP Location for a Symbol.
-llvm::Expected<Location> symbolToLocation(const Symbol &Sym,
-                                          llvm::StringRef TUPath);
+llvm::Expected<std::pair<Location, Range>>
+symbolToLocation(const Symbol &Sym, llvm::StringRef TUPath);
 
 /// Searches for the symbols matching \p Query. The syntax of \p Query can be
 /// the non-qualified name or fully qualified of a symbol. For example,
diff --git a/clang-tools-extra/clangd/HeaderSourceSwitch.cpp b/clang-tools-extra/clangd/HeaderSourceSwitch.cpp
index 2351858cc62972..c191e79b1962d3 100644
--- a/clang-tools-extra/clangd/HeaderSourceSwitch.cpp
+++ b/clang-tools-extra/clangd/HeaderSourceSwitch.cpp
@@ -95,9 +95,9 @@ std::optional<Path> getCorrespondingHeaderOrSource(PathRef OriginalFile,
   bool IsHeader = isHeaderFile(OriginalFile, AST.getLangOpts());
   Index->lookup(Request, [&](const Symbol &Sym) {
     if (IsHeader)
-      AwardTarget(Sym.Definition.FileURI);
+      AwardTarget(Sym.Definition.fileURI());
     else
-      AwardTarget(Sym.CanonicalDeclaration.FileURI);
+      AwardTarget(Sym.CanonicalDeclaration.fileURI());
   });
   // FIXME: our index doesn't have any interesting information (this could be
   // that the background-index is not finished), we should use the decl/def
diff --git a/clang-tools-extra/clangd/IncludeFixer.cpp b/clang-tools-extra/clangd/IncludeFixer.cpp
index fadd1105691fc0..f146bb3ba1b161 100644
--- a/clang-tools-extra/clangd/IncludeFixer.cpp
+++ b/clang-tools-extra/clangd/IncludeFixer.cpp
@@ -289,7 +289,7 @@ std::vector<Fix> IncludeFixer::fixIncompleteType(const Type &T) const {
   if (!Syms.empty()) {
     auto &Matched = *Syms.begin();
     if (!Matched.IncludeHeaders.empty() && Matched.Definition &&
-        Matched.CanonicalDeclaration.FileURI == Matched.Definition.FileURI)
+        Matched.CanonicalDeclaration.fileURI() == Matched.Definition.fileURI())
       Fixes = fixesForSymbols(Syms);
   }
   return Fixes;
@@ -299,7 +299,7 @@ std::vector<Fix> IncludeFixer::fixesForSymbols(const SymbolSlab &Syms) const {
   auto Inserted = [&](const Symbol &Sym, llvm::StringRef Header)
       -> llvm::Expected<std::pair<std::string, bool>> {
     auto ResolvedDeclaring =
-        URI::resolve(Sym.CanonicalDeclaration.FileURI, File);
+        URI::resolve(Sym.CanonicalDeclaration.fileURI(), File);
     if (!ResolvedDeclaring)
       return ResolvedDeclaring.takeError();
     auto ResolvedInserted = toHeaderFile(Header, File);
@@ -616,7 +616,7 @@ IncludeFixer::lookupCached(const SymbolID &ID) const {
   if (!Syms.empty()) {
     auto &Matched = *Syms.begin();
     if (!Matched.IncludeHeaders.empty() && Matched.Definition &&
-        Matched.CanonicalDeclaration.FileURI == Matched.Definition.FileURI)
+        Matched.CanonicalDeclaration.fileURI() == Matched.Definition.fileURI())
       Fixes = fixesForSymbols(Syms);
   }
   auto E = LookupCache.try_emplace(ID, std::move(Syms));
diff --git a/clang-tools-extra/clangd/Quality.cpp b/clang-tools-extra/clangd/Quality.cpp
index c1ab63fb22f61e..1154a426740209 100644
--- a/clang-tools-extra/clangd/Quality.cpp
+++ b/clang-tools-extra/clangd/Quality.cpp
@@ -281,7 +281,7 @@ computeScope(const NamedDecl *D) {
 }
 
 void SymbolRelevanceSignals::merge(const Symbol &IndexResult) {
-  SymbolURI = IndexResult.CanonicalDeclaration.FileURI;
+  SymbolURI = IndexResult.CanonicalDeclaration.fileURI();
   SymbolScope = IndexResult.Scope;
   IsInstanceMember |= isInstanceMember(IndexResult.SymInfo);
   if (!(IndexResult.Flags & Symbol::VisibleOutsideFile)) {
diff --git a/clang-tools-extra/clangd/XRefs.cpp b/clang-tools-extra/clangd/XRefs.cpp
index 61fa66180376cd..c815db4ffa533e 100644
--- a/clang-tools-extra/clangd/XRefs.cpp
+++ b/clang-tools-extra/clangd/XRefs.cpp
@@ -114,7 +114,7 @@ const NamedDecl *getDefinition(const NamedDecl *D) {
   return nullptr; // except cases above
 }
 
-void logIfOverflow(const SymbolLocation &Loc) {
+void logIfOverflow(const SymbolNameLocation &Loc) {
   if (Loc.Start.hasOverflow() || Loc.End.hasOverflow())
     log("Possible overflow in symbol location: {0}", Loc);
 }
@@ -123,7 +123,7 @@ void logIfOverflow(const SymbolLocation &Loc) {
 // TUPath is used to resolve the path of URI.
 // FIXME: figure out a good home for it, and share the implementation with
 // FindSymbols.
-std::optional<Location> toLSPLocation(const SymbolLocation &Loc,
+std::optional<Location> toLSPLocation(const SymbolNameLocation &Loc,
                                       llvm::StringRef TUPath) {
   if (!Loc)
     return std::nullopt;
@@ -148,8 +148,9 @@ std::optional<Location> toLSPLocation(const SymbolLocation &Loc,
   return LSPLoc;
 }
 
-SymbolLocation toIndexLocation(const Location &Loc, std::string &URIStorage) {
-  SymbolLocation SymLoc;
+SymbolNameLocation toIndexLocation(const Location &Loc,
+                                   std::string &URIStorage) {
+  SymbolNameLocation SymLoc;
   URIStorage = Loc.uri.uri();
   SymLoc.FileURI = URIStorage.c_str();
   SymLoc.Start.setLine(Loc.range.start.line);
@@ -160,17 +161,17 @@ SymbolLocation toIndexLocation(const Location &Loc, std::string &URIStorage) {
 }
 
 // Returns the preferred location between an AST location and an index location.
-SymbolLocation getPreferredLocation(const Location &ASTLoc,
-                                    const SymbolLocation &IdxLoc,
-                                    std::string &Scratch) {
+SymbolNameLocation getPreferredLocation(const Location &ASTLoc,
+                                        const SymbolNameLocation &IdxLoc,
+                                        std::string &Scratch) {
   // Also use a mock symbol for the index location so that other fields (e.g.
   // definition) are not factored into the preference.
   Symbol ASTSym, IdxSym;
   ASTSym.ID = IdxSym.ID = SymbolID("mock_symbol_id");
-  ASTSym.CanonicalDeclaration = toIndexLocation(ASTLoc, Scratch);
-  IdxSym.CanonicalDeclaration = IdxLoc;
+  ASTSym.CanonicalDeclaration.NameLocation = toIndexLocation(ASTLoc, Scratch);
+  IdxSym.CanonicalDeclaration.NameLocation = IdxLoc;
   auto Merged = mergeSymbol(ASTSym, IdxSym);
-  return Merged.CanonicalDeclaration;
+  return Merged.CanonicalDeclaration.NameLocation;
 }
 
 std::vector<std::pair<const NamedDecl *, DeclRelationSet>>
@@ -323,8 +324,8 @@ std::vector<LocatedSymbol> findImplementors(llvm::DenseSet<SymbolID> IDs,
   Req.Subjects = std::move(IDs);
   std::vector<LocatedSymbol> Results;
   Index->relations(Req, [&](const SymbolID &Subject, const Symbol &Object) {
-    auto DeclLoc =
-        indexToLSPLocation(Object.CanonicalDeclaration, MainFilePath);
+    auto DeclLoc = indexToLSPLocation(Object.CanonicalDeclaration.NameLocation,
+                                      MainFilePath);
     if (!DeclLoc) {
       elog("Find overrides: {0}", DeclLoc.takeError());
       return;
@@ -332,7 +333,8 @@ std::vector<LocatedSymbol> findImplementors(llvm::DenseSet<SymbolID> IDs,
     Results.emplace_back();
     Results.back().Name = Object.Name.str();
     Results.back().PreferredDeclaration = *DeclLoc;
-    auto DefLoc = indexToLSPLocation(Object.Definition, MainFilePath);
+    auto DefLoc =
+        indexToLSPLocation(Object.Definition.NameLocation, MainFilePath);
     if (!DefLoc) {
       elog("Failed to convert location: {0}", DefLoc.takeError());
       return;
@@ -364,23 +366,26 @@ void enhanceLocatedSymbolsFromIndex(llvm::MutableArrayRef<LocatedSymbol> Result,
     if (R.Definition) { // from AST
       // Special case: if the AST yielded a definition, then it may not be
       // the right *declaration*. Prefer the one from the index.
-      if (auto Loc = toLSPLocation(Sym.CanonicalDeclaration, MainFilePath))
+      if (auto Loc = toLSPLocation(Sym.CanonicalDeclaration.NameLocation,
+                                   MainFilePath))
         R.PreferredDeclaration = *Loc;
 
       // We might still prefer the definition from the index, e.g. for
       // generated symbols.
       if (auto Loc = toLSPLocation(
-              getPreferredLocation(*R.Definition, Sym.Definition, Scratch),
+              getPreferredLocation(*R.Definition, Sym.Definition.NameLocation,
+                                   Scratch),
               MainFilePath))
         R.Definition = *Loc;
     } else {
-      R.Definition = toLSPLocation(Sym.Definition, MainFilePath);
+      R.Definition = toLSPLocation(Sym.Definition.NameLocation, MainFilePath);
 
       // Use merge logic to choose AST or index declaration.
-      if (auto Loc = toLSPLocation(
-              getPreferredLocation(R.PreferredDeclaration,
-                                   Sym.CanonicalDeclaration, Scratch),
-              MainFilePath))
+      if (auto Loc =
+              toLSPLocation(getPreferredLocation(
+                                R.PreferredDeclaration,
+                                Sym.CanonicalDeclaration.NameLocation, Scratch),
+                            MainFilePath))
         R.PreferredDeclaration = *Loc;
     }
   });
@@ -606,7 +611,7 @@ std::vector<LocatedSymbol> locateSymbolTextually(const SpelledWord &Word,
       return;
 
     auto MaybeDeclLoc =
-        indexToLSPLocation(Sym.CanonicalDeclaration, MainFilePath);
+        indexToLSPLocation(Sym.CanonicalDeclaration.NameLocation, MainFilePath);
     if (!MaybeDeclLoc) {
       log("locateSymbolNamedTextuallyAt: {0}", MaybeDeclLoc.takeError());
       return;
@@ -616,7 +621,8 @@ std::vector<LocatedSymbol> locateSymbolTextually(const SpelledWord &Word,
     Located.Name = (Sym.Name + Sym.TemplateSpecializationArgs).str();
     Located.ID = Sym.ID;
     if (Sym.Definition) {
-      auto MaybeDefLoc = indexToLSPLocation(Sym.Definition, MainFilePath);
+      auto MaybeDefLoc =
+          indexToLSPLocation(Sym.Definition.NameLocation, MainFilePath);
       if (!MaybeDefLoc) {
         log("locateSymbolNamedTextuallyAt: {0}", MaybeDefLoc.takeError());
         return;
@@ -1495,9 +1501,10 @@ ReferencesResult findReferences(ParsedAST &AST, Position Pos, uint32_t Limit,
           Results.HasMore = true;
           return;
         }
-        const auto LSPLocDecl =
-            toLSPLocation(Object.CanonicalDeclaration, MainFilePath);
-        const auto LSPLocDef = toLSPLocation(Object.Definition, MainFilePath);
+        const auto LSPLocDecl = toLSPLocation(
+            Object.CanonicalDeclaration.NameLocation, MainFilePath);
+        const auto LSPLocDef =
+            toLSPLocation(Object.Definition.NameLocation, MainFilePath);
         if (LSPLocDecl && LSPLocDecl != LSPLocDef) {
           ReferencesResult::Reference Result;
           Result.Loc = {std::move(*LSPLocDecl), std::nullopt};
@@ -1754,11 +1761,9 @@ static std::optional<HierarchyItem> symbolToHierarchyItem(const Symbol &S,
   HierarchyItem HI;
   HI.name = std::string(S.Name);
   HI.kind = indexSymbolKindToSymbolKind(S.SymInfo.Kind);
-  HI.selectionRange = Loc->range;
-  // FIXME: Populate 'range' correctly
-  // (https://github.com/clangd/clangd/issues/59).
-  HI.range = HI.selectionRange;
-  HI.uri = Loc->uri;
+  HI.selectionRange = Loc->first.range;
+  HI.range = Loc->second;
+  HI.uri = Loc->first.uri;
 
   return HI;
 }
diff --git a/clang-tools-extra/clangd/index/FileIndex.cpp b/clang-tools-extra/clangd/index/FileIndex.cpp
index eb9562d2b6bf81..572f4b6e4e02ce 100644
--- a/clang-tools-extra/clangd/index/FileIndex.cpp
+++ b/clang-tools-extra/clangd/index/FileIndex.cpp
@@ -132,13 +132,13 @@ FileShardedIndex::FileShardedIndex(IndexFileIn Input)
   // Attribute each Symbol to both their declaration and definition locations.
   if (Index.Symbols) {
     for (const auto &S : *Index.Symbols) {
-      auto It = Shards.try_emplace(S.CanonicalDeclaration.FileURI);
+      auto It = Shards.try_emplace(S.CanonicalDeclaration.fileURI());
       It.first->getValue().Symbols.insert(&S);
       SymbolIDToFile[S.ID] = &It.first->getValue();
       // Only bother if definition file is different than declaration file.
       if (S.Definition &&
-          S.Definition.FileURI != S.CanonicalDeclaration.FileURI) {
-        auto It = Shards.try_emplace(S.Definition.FileURI);
+          S.Definition.fileURI() != S.CanonicalDeclaration.fileURI()) {
+        auto It = Shards.try_emplace(S.Definition.fileURI());
         It.first->getValue().Symbols.insert(&S);
       }
     }
diff --git a/clang-tools-extra/clangd/index/Merge.cpp b/clang-tools-extra/clangd/index/Merge.cpp
index 8221d4b1f44405..b72d7d1fa366d7 100644
--- a/clang-tools-extra/clangd/index/Merge.cpp
+++ b/clang-tools-extra/clangd/index/Merge.cpp
@@ -26,7 +26,7 @@ bool isIndexAuthoritative(const SymbolIndex::IndexedFiles &Index,
   // We expect the definition to see the canonical declaration, so it seems to
   // be enough to check only the definition if it exists.
   const char *OwningFile =
-      S.Definition ? S.Definition.FileURI : S.CanonicalDeclaration.FileURI;
+      S.Definition ? S.Definition.fileURI() : S.CanonicalDeclaration.fileURI();
   return (Index(OwningFile) & IndexContents::Symbols) != IndexContents::None;
 }
 } // namespace
@@ -189,15 +189,16 @@ void MergedIndex::relations(
 
 // Returns true if \p L is (strictly) preferred to \p R (e.g. by file paths). If
 // neither is preferred, this returns false.
-static bool prefer(const SymbolLocation &L, const SymbolLocation &R) {
+static bool prefer(const SymbolDeclDefLocation &L,
+                   const SymbolDeclDefLocation &R) {
   if (!L)
     return false;
   if (!R)
     return true;
-  auto HasCodeGenSuffix = [](const SymbolLocation &Loc) {
+  auto HasCodeGenSuffix = [](const SymbolDeclDefLocation &Loc) {
     constexpr static const char *CodegenSuffixes[] = {".proto"};
     return llvm::any_of(CodegenSuffixes, [&](llvm::StringRef Suffix) {
-      return llvm::StringRef(Loc.FileURI).ends_with(Suffix);
+      return llvm::StringRef(Loc.fileURI()).ends_with(Suffix);
     });
   };
   return HasCodeGenSuffix(L) && !HasCodeGenSuffix(R);
@@ -211,9 +212,9 @@ Symbol mergeSymbol(const Symbol &L, const Symbol &R) {
   bool PreferR = R.Definition && !L.Definition;
   // Merge include headers only if both have definitions or both have no
   // definition; otherwise, only accumulate references of common includes.
-  assert(L.Definition.FileURI && R.Definition.FileURI);
+  assert(L.Definition.fileURI() && R.Definition.fileURI());
   bool MergeIncludes =
-      bool(*L.Definition.FileURI) == bool(*R.Definition.FileURI);
+      bool(*L.Definition.fileURI()) == bool(*R.Definition.fileURI());
   Symbol S = PreferR ? R : L;        // The target symbol we're merging into.
   const Symbol &O = PreferR ? L : R; // The "other" less-preferred symbol.
 
diff --git a/clang-tools-extra/clangd/index/Ref.h b/clang-tools-extra/clangd/index/Ref.h
index 6e383e2ade3d25..ba5362f301b8d2 100644
--- a/clang-tools-extra/clangd/index/Ref.h
+++ b/clang-tools-extra/clangd/index/Ref.h
@@ -18,6 +18,7 @@
 #include <cstdint>
 #include <set>
 #include <utility>
+#include <variant>
 
 namespace clang {
 namespace clangd {
@@ -84,7 +85,7 @@ llvm::raw_ostream &operator<<(llvm::raw_ostream &, RefKind);
 /// WARNING: Location does not own the underlying data - Copies are shallow.
 struct Ref {
   /// The source location where the symbol is named.
-  SymbolLocation Location;
+  SymbolNameLocation Location;
   RefKind Kind = RefKind::Unknown;
   /// The ID of the symbol whose definition contains this reference.
   /// For example, for a reference inside a function body, this would
@@ -182,12 +183,8 @@ template <> struct DenseMapInfo<clang::clangd::RefSlab::Builder::Entry> {
         Val.Reference.Location.Start.rep(), Val.Reference.Location.End.rep());
   }
   static bool isEqual(const Entry &LHS, const Entry &RHS) {
-    return std::tie(LHS.Symbol, LHS.Reference.Location.FileURI,
-                    LHS.Reference.Kind) ==
-               std::tie(RHS.Symbol, RHS.Reference.Location.FileURI,
-                        RHS.Reference.Kind) &&
-           LHS.Reference.Location.Start == RHS.Reference.Location.Star...
[truncated]

@ckandeler ckandeler force-pushed the decl-def-range-in-index branch from 99e7189 to 188cbf1 Compare December 5, 2024 09:29
Apart from fixing the linked issue, this is also necessary
for supporting LSP's LocationLink feature and for finding
proper insertion locations in the DefineOutline tweak.
Memory consumption of the background index grows by about ~2.5%.

Closes clangd/clangd#59
@ckandeler ckandeler force-pushed the decl-def-range-in-index branch from 188cbf1 to c442378 Compare December 5, 2024 10:11
@ckandeler
Copy link
Member Author

ping

@HighCommander4
Copy link
Collaborator

Sorry I haven't had a chance to look at this so far; I haven't quite gotten to a place where I'm keeping up with review requests.

I will try to make time for it over the coming weeks, but I've also added our other clangd maintainers as reviewers, in case they have some time for it sooner. I would particularly appreciate any guidance from @kadircet on the high-level considerations (e.g. thoughts on the tradeoff of increased memory usage vs. the additional functionality we get for it).

@kadircet
Copy link
Member

I am a little hesitant on the memory growth implications, as this won't be a fixed cost we have in static index, but also a significant increase in dynamic-index memory footprint (which grows proportionally with number of touched files in an editing session). We're going to pay some significant costs in a pretty common workflow, and not in a way that we can turn off (or only pay if we're getting some functionality). Hence I'd like to better understand what we're getting in practice (and if we can't get it any other way).

so some high-level questions:

  • is this functionality really worthwhile?
    • clangd doesn't support LocationLinks today already. do we have plans to add capabilities here?
    • regarding various .range fields in responses, including TypeHierarchyItem.range. what does it really enable for editors/lsp clients?
  • Why not invest into an implementation that just performs bracket matching in a file?
    • all of this functionality seem to just care about ~nearest enclosing brace range. parsing declarator itself is hard, braces might not be around in some edge cases where they expand from macro bodies, performing IO might be hard.
    • but depending on the use cases we want, this might actually be a viable alternative. as define-outline already does IO and has a fallback. type-hierarchy is likely to perform ~limited IO. providing a generic LocationLink capability might be hard with this approach.

@ckandeler
Copy link
Member Author

My main motivation is to get DefineOutline out of the proof-of-concept stage where it just dumps the definition somewhere in the same namespace.
Usually, the location that best fits the user's expectation is next to the definition of an adjacent declaration. If that adjacent declaration follows the one to be moved, then we want to insert the new definition in front of the existing one, which requires us to know where it starts. I don't see how bracket matching can help us get from the name of the function to the actual start of the declaration.

@kadircet
Copy link
Member

My main motivation is to get DefineOutline

I think if we're solely aiming for improvements to DefineOutline, I'd say the resource trade-off here isn't worth it.

I don't see how bracket matching can help us get from the name of the function to the actual start of the declaration.

so let's say we've got the location for foo::bar from the index, as the declaration to insert after:

...
void foo::^bar() {}
...

WDYT about an approach that just looks for first trailing {} (at the same level of indentation, i.e. ignoring anything in the function prototype) ? this might surely have false positives in cases with returning auto-types/concepts etc. but I think it's still a big enough increment without incurring any costs.

@ckandeler
Copy link
Member Author

sigh
I want to use clangd for this purpose because it produces correct results particularly with non-trivial constructs, so I'm afraid
that using such textual heuristics would kill that advantage.

I'll give it a try, though.

@HighCommander4
Copy link
Collaborator

Two quick thoughts, based on reading the discussion so far and a cursory glance at the patch:

  1. Do we need to store a decl/def range in both Symbol::Definition and Symbol::CanonicalDeclaration? We could cut the memory usage overhead of the patch in half by only having it in one of them. (@kadircet would that change your analysis of the tradeoffs at all?) If we only need one decl/def range per symbol, but sometimes it's at the definition and sometimes at the declaration, we could have a top-level field in Symbol store it.
  2. A more general musing: it would be nice to have a mechanism for storing additional information for a subset of symbols, without incurring the overhead of increasing the representation size of every symbol. If we had such a mechanism, we could cut down the overhead of this patch further, by e.g. only storing the decl/def range for functions rather than all symbol types.
    • Perhaps this could take the form of a pointer field in Symbol which for a subset of symbols points to additional data stored in the symbol slab? That would, in and of itself, not reduce the memory footprint compared to (1), since the pointer would take up 8 bytes, the same as a decl/def range stored inline. However, we've had use cases like this come up before. (An example that comes to mind is storing all definitions for symbols with multiple definitions; see this comment and the next few.) So maybe it would be forward-looking to pay the cost for that pointer once, and unlock the ability to store various extra data in the future?

@ckandeler
Copy link
Member Author

  1. A more general musing: it would be nice to have a mechanism for storing additional information for a subset of symbols, without incurring the overhead of increasing the representation size of every symbol. If we had such a mechanism, we could cut down the overhead of this patch further, by e.g. only storing the decl/def range for functions rather than all symbol types.

    • Perhaps this could take the form of a pointer field in Symbol which for a subset of symbols points to additional data stored in the symbol slab? That would, in and of itself, not reduce the memory footprint compared to (1), since the pointer would take up 8 bytes, the same as a decl/def range stored inline. However, we've had use cases like this come up before. (An example that comes to mind is storing all definitions for symbols with multiple definitions; see this comment and the next few.) So maybe it would be forward-looking to pay the cost for that pointer once, and unlock the ability to store various extra data in the future?

I notice that there are four fields in the Symbol struct that are only relevant for symbols indexed for completion. Presumably these would be candidates for such an extension block?

@kadircet
Copy link
Member

We could cut the memory usage overhead of the patch in half by only having it in one of them. (@kadircet would that change your analysis of the tradeoffs at all?)

I think that'd still be too much considering the functionality we'll get in the end. Others might not share my values here, but my guiding principle has been: We shouldn't increase costs of overall clangd infrastructure, if it isn't going to benefit ~all users and functionality.

#127359 is a good example of this. That patch cost us some memory (1%), but in return we got improvements on workflows that are ~always used, and available in pretty much all the editors (signature help & diagnostics).

The justification I have for eating costs in such features is, even if we handled them in a narrower focus, we'd still pay those costs ~always (since they're going to be used frequently by all users). Hence if the feature is worth implementing, resource costs are fine to move into core bits of clangd (it would usually result in better complexity/resource usage overall).

A more general musing: it would be nice to have a mechanism for storing additional information for a subset of symbols, without incurring the overhead of increasing the representation size of every symbol.

As a result of that principle, I am definitely in favor of such mechanisms, bearing maintenance/feature trade-offs in mind.

(For example ClangdModules was an action we were trying to take in this direction. It actually aims to provide people infrastructure to implement their desired functionality in clangd, while limiting resource and maintenance costs to only that specific feature and its users. but then we took an arrow in the knee)

@HighCommander4
Copy link
Collaborator

(For example ClangdModules was an action we were trying to take in this direction. It actually aims to provide people infrastructure to implement their desired functionality in clangd, while limiting resource and maintenance costs to only that specific feature and its users. but then we took an arrow in the knee)

Could you say more about this? What happened to ClangdModules?

@HighCommander4
Copy link
Collaborator

HighCommander4 commented Feb 19, 2025

I notice that there are four fields in the Symbol struct that are only relevant for symbols indexed for completion. Presumably these would be candidates for such an extension block?

I think that depends on what proportion of symbols are indexed for completion.

I was imagining a fairly simple, fixed-size extension block; that means the overhead of the extension block is incurred for any symbol which needs any of the fields in the extension block. If the completion-related fields are in the extension block, and if the proportion of symbols which need them is high, we are then storing this larger extension block for many symbols, potentially negating the savings we get from having them in the first place.

We could also consider more involved extension block designs which are dynamically sized to contain only the fields they need, at the expense of more complexity.

@kadircet
Copy link
Member

Could you say more about this? What happened to ClangdModules?

Sorry, apparently I even forgot its proper name, i meant clangd::FeatureModule. Nothing bad happened, we built this infra but didn't get a chance to polish and use it afterwards.
We wanted to migrate any non-core components into this bit slowly (clang-tidy, include-cleaner, even LSP features themselves) to both make clangd features more plug&play but also ensure less maintenance burden in the core infra.

@damnskippy
Copy link

Been following this thread with much interest albeit selfishly a bit since I was of the impression this PR will pave the way for supporting LocationLink. [Ref: clangd/clangd/discussions/2274]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Populate TypeHierarchyItem.range correctly for subtypes

5 participants