[clangd] Doxygen Parsing: Add proper handling of \note, \warning and \retval command + change render kind of command arguments + add preprocessing for markdown code blocks/spans #156365

tcottin · 2025-09-01T18:07:10Z

PR for some followup issues from clangd/clangd#529.

\note and \warning:
In the hover card, they are now displayed with heading and enclosing rulers.

\retval commands:
Each \retval command is now a bullet point under the return section of the hover card.

Added a Markdown preprocessing step before parsing the documentation with the doxygen parser.
This mainly replaces markdown code blocks with @code...@endcode commands.

Change arguments of doxygen commands to be rendered as monospaced

Stehsaer · 2025-09-02T15:12:02Z

It's working great as intended!

Can other commands like @details be added the same way we add paragraphs for @note and @warning?

Update: Yes we can, just add some more changes as given below. This should work well for "generic" ones that only needs a Header and its content to be shown. Compiled and tested it out, working quite well.

diff --git a/clang-tools-extra/clangd/SymbolDocumentation.cpp b/clang-tools-extra/clangd/SymbolDocumentation.cpp
index 86bd6162b..1638e4b55 100644
--- a/clang-tools-extra/clangd/SymbolDocumentation.cpp
+++ b/clang-tools-extra/clangd/SymbolDocumentation.cpp
@@ -216,6 +216,7 @@ public:
       break;
     case comments::CommandTraits::KCI_note:
     case comments::CommandTraits::KCI_warning:
+    case comments::CommandTraits::KCI_details:
       commandToHeadedParagraph(B);
       break;
     case comments::CommandTraits::KCI_retval: {

…ctly

tcottin · 2025-09-15T12:28:38Z

It's working great as intended! Can other commands like @details be added the same way we add paragraphs for @note and @warning?

Update: Yes we can, just add some more changes as given below. This should work well for "generic" ones that only needs a Header and its content to be shown. Compiled and tested it out, working quite well.
diff --git a/clang-tools-extra/clangd/SymbolDocumentation.cpp b/clang-tools-extra/clangd/SymbolDocumentation.cpp
index 86bd6162b..1638e4b55 100644
--- a/clang-tools-extra/clangd/SymbolDocumentation.cpp
+++ b/clang-tools-extra/clangd/SymbolDocumentation.cpp
@@ -216,6 +216,7 @@ public:
       break;
     case comments::CommandTraits::KCI_note:
     case comments::CommandTraits::KCI_warning:
+    case comments::CommandTraits::KCI_details:
       commandToHeadedParagraph(B);
       break;
     case comments::CommandTraits::KCI_retval: {

I am not sure if @details should get a header like this.
Doxygen just omits this command:

/// @brief a sample function
/// @note test note for myFunction
/// @details detailed description of myFunction
void myFunction();

tcottin · 2025-09-17T05:18:53Z

@emaxx-google could you take a look at this?

llvmbot · 2025-09-17T05:19:04Z

@llvm/pr-subscribers-clangd

@llvm/pr-subscribers-clang-tools-extra

Author: None (tcottin)

Changes

PR for some followup issues from clangd/clangd#529.

\note and \warning:
In the hover card, they are now displayed with heading and enclosing rulers.

\retval commands:
Each \retval command is now a bullet point under the return section of the hover card.

Patch is 36.46 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/156365.diff

6 Files Affected:

(modified) clang-tools-extra/clangd/Hover.cpp (+2-4)
(modified) clang-tools-extra/clangd/SymbolDocumentation.cpp (+214-48)
(modified) clang-tools-extra/clangd/SymbolDocumentation.h (+31-51)
(modified) clang-tools-extra/clangd/support/Markup.cpp (+27-2)
(modified) clang-tools-extra/clangd/unittests/HoverTests.cpp (+39-4)
(modified) clang-tools-extra/clangd/unittests/SymbolDocumentationTests.cpp (+509-10)

diff --git a/clang-tools-extra/clangd/Hover.cpp b/clang-tools-extra/clangd/Hover.cpp
index 9eec322fe5963..3b9baed1c2838 100644
--- a/clang-tools-extra/clangd/Hover.cpp
+++ b/clang-tools-extra/clangd/Hover.cpp
@@ -1589,13 +1589,11 @@ markup::Document HoverInfo::presentDoxygen() const {
       P.appendText(" - ");
       SymbolDoc.returnToMarkup(P);
     }
+
+    SymbolDoc.retvalsToMarkup(Output);
     Output.addRuler();
   }
 
-  // add specially handled doxygen commands.
-  SymbolDoc.warningsToMarkup(Output);
-  SymbolDoc.notesToMarkup(Output);
-
   // add any other documentation.
   SymbolDoc.docToMarkup(Output);
 
diff --git a/clang-tools-extra/clangd/SymbolDocumentation.cpp b/clang-tools-extra/clangd/SymbolDocumentation.cpp
index 9ae1ef3f828e0..ee67f9ae66443 100644
--- a/clang-tools-extra/clangd/SymbolDocumentation.cpp
+++ b/clang-tools-extra/clangd/SymbolDocumentation.cpp
@@ -13,6 +13,7 @@
 #include "clang/AST/CommentCommandTraits.h"
 #include "clang/AST/CommentVisitor.h"
 #include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringRef.h"
 
 namespace clang {
@@ -34,10 +35,20 @@ void commandToMarkup(markup::Paragraph &Out, StringRef Command,
                      StringRef Args) {
   Out.appendBoldText(commandMarkerAsString(CommandMarker) + Command.str());
   Out.appendSpace();
-  if (!Args.empty()) {
-    Out.appendEmphasizedText(Args.str());
+  if (!Args.empty())
+    Out.appendCode(Args.str());
+}
+
+template <typename T> std::string getArgText(const T *Command) {
+  std::string ArgText;
+  for (unsigned I = 0; I < Command->getNumArgs(); ++I) {
+    if (!ArgText.empty())
+      ArgText += " ";
+    ArgText += Command->getArgText(I);
   }
+  return ArgText;
 }
+
 } // namespace
 
 class ParagraphToMarkupDocument
@@ -70,12 +81,7 @@ class ParagraphToMarkupDocument
   void visitInlineCommandComment(const comments::InlineCommandComment *C) {
 
     if (C->getNumArgs() > 0) {
-      std::string ArgText;
-      for (unsigned I = 0; I < C->getNumArgs(); ++I) {
-        if (!ArgText.empty())
-          ArgText += " ";
-        ArgText += C->getArgText(I);
-      }
+      std::string ArgText = getArgText(C);
 
       switch (C->getRenderKind()) {
       case comments::InlineCommandRenderKind::Monospaced:
@@ -158,10 +164,9 @@ class ParagraphToString
   void visitInlineCommandComment(const comments::InlineCommandComment *C) {
     Out << commandMarkerAsString(C->getCommandMarker());
     Out << C->getCommandName(Traits);
-    if (C->getNumArgs() > 0) {
-      for (unsigned I = 0; I < C->getNumArgs(); ++I)
-        Out << " " << C->getArgText(I);
-    }
+    std::string ArgText = getArgText(C);
+    if (!ArgText.empty())
+      Out << " " << ArgText;
     Out << " ";
   }
 
@@ -210,16 +215,27 @@ class BlockCommentToMarkupDocument
                                 Traits)
           .visit(B->getParagraph());
       break;
+    case comments::CommandTraits::KCI_note:
+    case comments::CommandTraits::KCI_warning:
+      commandToHeadedParagraph(B);
+      break;
+    case comments::CommandTraits::KCI_retval: {
+      // The \retval command describes the return value given as its single
+      // argument in the corresponding paragraph.
+      // Note: We know that we have exactly one argument but not if it has an
+      // associated paragraph.
+      auto &P = Out.addParagraph().appendCode(getArgText(B));
+      if (B->getParagraph() && !B->getParagraph()->isWhitespace()) {
+        P.appendText(" - ");
+        ParagraphToMarkupDocument(P, Traits).visit(B->getParagraph());
+      }
+      return;
+    }
     default: {
       // Some commands have arguments, like \throws.
       // The arguments are not part of the paragraph.
       // We need reconstruct them here.
-      std::string ArgText;
-      for (unsigned I = 0; I < B->getNumArgs(); ++I) {
-        if (!ArgText.empty())
-          ArgText += " ";
-        ArgText += B->getArgText(I);
-      }
+      std::string ArgText = getArgText(B);
       auto &P = Out.addParagraph();
       commandToMarkup(P, B->getCommandName(Traits), B->getCommandMarker(),
                       ArgText);
@@ -234,7 +250,53 @@ class BlockCommentToMarkupDocument
     }
   }
 
+  void visitCodeCommand(const comments::VerbatimBlockComment *VB) {
+    std::string CodeLang = "";
+    auto *FirstLine = VB->child_begin();
+    // The \\code command has an optional language argument.
+    // This argument is currently not parsed by the clang doxygen parser.
+    // Therefore we try to extract it from the first line of the verbatim
+    // block.
+    if (VB->getNumLines() > 0) {
+      if (const auto *Line =
+              cast<comments::VerbatimBlockLineComment>(*FirstLine)) {
+        llvm::StringRef Text = Line->getText();
+        // Language is a single word enclosed in {}.
+        if (llvm::none_of(Text, llvm::isSpace) && Text.consume_front("{") &&
+            Text.consume_back("}")) {
+          // drop a potential . since this is not supported in Markdown
+          // fenced code blocks.
+          Text.consume_front(".");
+          // Language is alphanumeric or '+'.
+          CodeLang = Text.take_while([](char C) {
+                           return llvm::isAlnum(C) || C == '+';
+                         })
+                         .str();
+          // Skip the first line for the verbatim text.
+          ++FirstLine;
+        }
+      }
+    }
+
+    std::string CodeBlockText;
+
+    for (const auto *LI = FirstLine; LI != VB->child_end(); ++LI) {
+      if (const auto *Line = cast<comments::VerbatimBlockLineComment>(*LI)) {
+        CodeBlockText += Line->getText().str() + "\n";
+      }
+    }
+
+    Out.addCodeBlock(CodeBlockText, CodeLang);
+  }
+
   void visitVerbatimBlockComment(const comments::VerbatimBlockComment *VB) {
+    // The \\code command is a special verbatim block command which we handle
+    // separately.
+    if (VB->getCommandID() == comments::CommandTraits::KCI_code) {
+      visitCodeCommand(VB);
+      return;
+    }
+
     commandToMarkup(Out.addParagraph(), VB->getCommandName(Traits),
                     VB->getCommandMarker(), "");
 
@@ -262,8 +324,125 @@ class BlockCommentToMarkupDocument
   markup::Document &Out;
   const comments::CommandTraits &Traits;
   StringRef CommentEscapeMarker;
+
+  /// Emphasize the given command in a paragraph.
+  /// Uses the command name with the first letter capitalized as the heading.
+  void commandToHeadedParagraph(const comments::BlockCommandComment *B) {
+    Out.addRuler();
+    auto &P = Out.addParagraph();
+    std::string Heading = B->getCommandName(Traits).slice(0, 1).upper() +
+                          B->getCommandName(Traits).drop_front().str();
+    P.appendBoldText(Heading + ":");
+    P.appendText("  \n");
+    ParagraphToMarkupDocument(P, Traits).visit(B->getParagraph());
+    Out.addRuler();
+  }
 };
 
+void SymbolDocCommentVisitor::preprocessDocumentation(StringRef Doc) {
+  enum State {
+    Normal,
+    FencedCodeblock,
+  } State = Normal;
+  std::string CodeFence;
+
+  llvm::raw_string_ostream OS(CommentWithMarkers);
+
+  // The documentation string is processed line by line.
+  // The raw documentation string does not contain the comment markers
+  // (e.g. /// or /** */).
+  // But the comment lexer expects doxygen markers, so add them back.
+  // We need to use the /// style doxygen markers because the comment could
+  // contain the closing tag "*/" of a C Style "/** */" comment
+  // which would break the parsing if we would just enclose the comment text
+  // with "/** */".
+
+  // Escape doxygen commands inside markdown inline code spans.
+  // This is required to not let the doxygen parser interpret them as
+  // commands.
+  // Note: This is a heuristic which may fail in some cases.
+  bool InCodeSpan = false;
+
+  llvm::StringRef Line, Rest;
+  for (std::tie(Line, Rest) = Doc.split('\n'); !(Line.empty() && Rest.empty());
+       std::tie(Line, Rest) = Rest.split('\n')) {
+
+    // Detect code fence (``` or ~~~)
+    if (State == Normal) {
+      llvm::StringRef Trimmed = Line.ltrim();
+      if (Trimmed.starts_with("```") || Trimmed.starts_with("~~~")) {
+        // https://www.doxygen.nl/manual/markdown.html#md_fenced
+        CodeFence =
+            Trimmed.take_while([](char C) { return C == '`' || C == '~'; })
+                .str();
+        // Try to detect language: first word after fence. Could also be
+        // enclosed in {}
+        llvm::StringRef AfterFence =
+            Trimmed.drop_front(CodeFence.size()).ltrim();
+        // ignore '{' at the beginning of the language name to not duplicate it
+        // for the doxygen command
+        AfterFence.consume_front("{");
+        // The name is alphanumeric or '.' or '+'
+        StringRef CodeLang = AfterFence.take_while(
+            [](char C) { return llvm::isAlnum(C) || C == '.' || C == '+'; });
+
+        OS << "///@code";
+
+        if (!CodeLang.empty())
+          OS << "{" << CodeLang.str() << "}";
+
+        OS << "\n";
+
+        State = FencedCodeblock;
+        continue;
+      }
+
+      // FIXME: handle indented code blocks too?
+      // In doxygen, the indentation which triggers a code block depends on the
+      // indentation of the previous paragraph.
+      // https://www.doxygen.nl/manual/markdown.html#mddox_code_blocks
+    } else if (State == FencedCodeblock) {
+      // End of code fence
+      if (Line.ltrim().starts_with(CodeFence)) {
+        OS << "///@endcode\n";
+        State = Normal;
+        continue;
+      }
+      OS << "///" << Line << "\n";
+      continue;
+    }
+
+    // Normal line preprocessing (add doxygen markers, handle escaping)
+    OS << "///";
+
+    if (Line.empty() || Line.trim().empty()) {
+      OS << "\n";
+      // Empty lines reset the InCodeSpan state.
+      InCodeSpan = false;
+      continue;
+    }
+
+    if (Line.starts_with("<"))
+      // A comment line starting with '///<' is treated as a doxygen
+      // command. To avoid this, we add a space before the '<'.
+      OS << ' ';
+
+    for (char C : Line) {
+      if (C == '`')
+        InCodeSpan = !InCodeSpan;
+      else if (InCodeSpan && (C == '@' || C == '\\'))
+        OS << '\\';
+      OS << C;
+    }
+
+    OS << "\n";
+  }
+
+  // Close any unclosed code block
+  if (State == FencedCodeblock)
+    OS << "///@endcode\n";
+}
+
 void SymbolDocCommentVisitor::visitBlockCommandComment(
     const comments::BlockCommandComment *B) {
   switch (B->getCommandID()) {
@@ -282,36 +461,22 @@ void SymbolDocCommentVisitor::visitBlockCommandComment(
     }
     break;
   case comments::CommandTraits::KCI_retval:
-    RetvalParagraphs.push_back(B->getParagraph());
-    return;
-  case comments::CommandTraits::KCI_warning:
-    WarningParagraphs.push_back(B->getParagraph());
-    return;
-  case comments::CommandTraits::KCI_note:
-    NoteParagraphs.push_back(B->getParagraph());
+    // Only consider retval commands having an argument.
+    // The argument contains the described return value which is needed to
+    // convert it to markup.
+    if (B->getNumArgs() == 1)
+      RetvalCommands.push_back(B);
     return;
   default:
     break;
   }
 
-  // For all other commands, we store them in the UnhandledCommands map.
+  // For all other commands, we store them in the BlockCommands map.
   // This allows us to keep the order of the comments.
-  UnhandledCommands[CommentPartIndex] = B;
+  BlockCommands[CommentPartIndex] = B;
   CommentPartIndex++;
 }
 
-void SymbolDocCommentVisitor::paragraphsToMarkup(
-    markup::Document &Out,
-    const llvm::SmallVectorImpl<const comments::ParagraphComment *> &Paragraphs)
-    const {
-  if (Paragraphs.empty())
-    return;
-
-  for (const auto *P : Paragraphs) {
-    ParagraphToMarkupDocument(Out.addParagraph(), Traits).visit(P);
-  }
-}
-
 void SymbolDocCommentVisitor::briefToMarkup(markup::Paragraph &Out) const {
   if (!BriefParagraph)
     return;
@@ -324,14 +489,6 @@ void SymbolDocCommentVisitor::returnToMarkup(markup::Paragraph &Out) const {
   ParagraphToMarkupDocument(Out, Traits).visit(ReturnParagraph);
 }
 
-void SymbolDocCommentVisitor::notesToMarkup(markup::Document &Out) const {
-  paragraphsToMarkup(Out, NoteParagraphs);
-}
-
-void SymbolDocCommentVisitor::warningsToMarkup(markup::Document &Out) const {
-  paragraphsToMarkup(Out, WarningParagraphs);
-}
-
 void SymbolDocCommentVisitor::parameterDocToMarkup(
     StringRef ParamName, markup::Paragraph &Out) const {
   if (ParamName.empty())
@@ -354,7 +511,7 @@ void SymbolDocCommentVisitor::parameterDocToString(
 
 void SymbolDocCommentVisitor::docToMarkup(markup::Document &Out) const {
   for (unsigned I = 0; I < CommentPartIndex; ++I) {
-    if (const auto *BC = UnhandledCommands.lookup(I)) {
+    if (const auto *BC = BlockCommands.lookup(I)) {
       BlockCommentToMarkupDocument(Out, Traits).visit(BC);
     } else if (const auto *P = FreeParagraphs.lookup(I)) {
       ParagraphToMarkupDocument(Out.addParagraph(), Traits).visit(P);
@@ -382,5 +539,14 @@ void SymbolDocCommentVisitor::templateTypeParmDocToString(
   }
 }
 
+void SymbolDocCommentVisitor::retvalsToMarkup(markup::Document &Out) const {
+  if (RetvalCommands.empty())
+    return;
+  markup::BulletList &BL = Out.addBulletList();
+  for (const auto *P : RetvalCommands) {
+    BlockCommentToMarkupDocument(BL.addItem(), Traits).visit(P);
+  }
+}
+
 } // namespace clangd
 } // namespace clang
diff --git a/clang-tools-extra/clangd/SymbolDocumentation.h b/clang-tools-extra/clangd/SymbolDocumentation.h
index b0d3428dfce20..825c9a296df70 100644
--- a/clang-tools-extra/clangd/SymbolDocumentation.h
+++ b/clang-tools-extra/clangd/SymbolDocumentation.h
@@ -21,6 +21,7 @@
 #include "clang/AST/CommentSema.h"
 #include "clang/AST/CommentVisitor.h"
 #include "clang/Basic/SourceManager.h"
+#include "llvm/ADT/StringRef.h"
 #include "llvm/Support/raw_ostream.h"
 #include <string>
 
@@ -51,31 +52,8 @@ class SymbolDocCommentVisitor
     CommentWithMarkers.reserve(Documentation.size() +
                                Documentation.count('\n') * 3);
 
-    // The comment lexer expects doxygen markers, so add them back.
-    // We need to use the /// style doxygen markers because the comment could
-    // contain the closing the closing tag "*/" of a C Style "/** */" comment
-    // which would break the parsing if we would just enclose the comment text
-    // with "/** */".
-    CommentWithMarkers = "///";
-    bool NewLine = true;
-    for (char C : Documentation) {
-      if (C == '\n') {
-        CommentWithMarkers += "\n///";
-        NewLine = true;
-      } else {
-        if (NewLine && (C == '<')) {
-          // A comment line starting with '///<' is treated as a doxygen
-          // comment. Therefore add a space to separate the '<' from the comment
-          // marker. This allows to parse html tags at the beginning of a line
-          // and the escape marker prevents adding the artificial space in the
-          // markup documentation. The extra space will not be rendered, since
-          // we render it as markdown.
-          CommentWithMarkers += ' ';
-        }
-        CommentWithMarkers += C;
-        NewLine = false;
-      }
-    }
+    preprocessDocumentation(Documentation);
+
     SourceManagerForFile SourceMgrForFile("mock_file.cpp", CommentWithMarkers);
 
     SourceManager &SourceMgr = SourceMgrForFile.get();
@@ -114,22 +92,14 @@ class SymbolDocCommentVisitor
 
   bool hasReturnCommand() const { return ReturnParagraph; }
 
-  bool hasRetvalCommands() const { return !RetvalParagraphs.empty(); }
-
-  bool hasNoteCommands() const { return !NoteParagraphs.empty(); }
-
-  bool hasWarningCommands() const { return !WarningParagraphs.empty(); }
-
   /// Converts all unhandled comment commands to a markup document.
   void docToMarkup(markup::Document &Out) const;
   /// Converts the "brief" command(s) to a markup document.
   void briefToMarkup(markup::Paragraph &Out) const;
   /// Converts the "return" command(s) to a markup document.
   void returnToMarkup(markup::Paragraph &Out) const;
-  /// Converts the "note" command(s) to a markup document.
-  void notesToMarkup(markup::Document &Out) const;
-  /// Converts the "warning" command(s) to a markup document.
-  void warningsToMarkup(markup::Document &Out) const;
+  /// Converts the "retval" command(s) to a markup document.
+  void retvalsToMarkup(markup::Document &Out) const;
 
   void visitBlockCommandComment(const comments::BlockCommandComment *B);
 
@@ -157,6 +127,27 @@ class SymbolDocCommentVisitor
     TemplateParameters[TP->getParamNameAsWritten()] = std::move(TP);
   }
 
+  /// \brief Preprocesses the raw documentation string to prepare it for doxygen
+  /// parsing.
+  ///
+  /// This is a workaround to provide better support for markdown in
+  /// doxygen. Clang's doxygen parser e.g. does not handle markdown code blocks.
+  ///
+  /// The documentation string is preprocessed to replace some markdown
+  /// constructs with parsable doxygen commands. E.g. markdown code blocks are
+  /// replaced with doxygen \\code{.lang} ...
+  /// \\endcode blocks.
+  ///
+  /// Additionally, potential doxygen commands inside markdown
+  /// inline code spans are escaped to avoid that doxygen tries to interpret
+  /// them as commands.
+  ///
+  /// \note Although this is a workaround, it is very similar to what
+  /// doxygen itself does for markdown. In doxygen, the first parsing step is
+  /// also a markdown preprocessing step.
+  /// See https://www.doxygen.nl/manual/markdown.html
+  void preprocessDocumentation(StringRef Doc);
+
 private:
   comments::CommandTraits Traits;
   llvm::BumpPtrAllocator Allocator;
@@ -173,19 +164,13 @@ class SymbolDocCommentVisitor
   /// Paragraph of the "return" command.
   const comments::ParagraphComment *ReturnParagraph = nullptr;
 
-  /// Paragraph(s) of the "note" command(s)
-  llvm::SmallVector<const comments::ParagraphComment *> RetvalParagraphs;
-
-  /// Paragraph(s) of the "note" command(s)
-  llvm::SmallVector<const comments::ParagraphComment *> NoteParagraphs;
+  /// All the "retval" command(s)
+  llvm::SmallVector<const comments::BlockCommandComment *> RetvalCommands;
 
-  /// Paragraph(s) of the "warning" command(s)
-  llvm::SmallVector<const comments::ParagraphComment *> WarningParagraphs;
-
-  /// All the paragraphs we don't have any special handling for,
-  /// e.g. "details".
+  /// All the parsed doxygen block commands.
+  /// They might have special handling internally like \\note or \\warning
   llvm::SmallDenseMap<unsigned, const comments::BlockCommandComment *>
-      UnhandledCommands;
+      BlockCommands;
 
   /// Parsed paragaph(s) of the "param" comamnd(s)
   llvm::SmallDenseMap<StringRef, const comments::ParamCommandComment *>
@@ -198,11 +183,6 @@ class SymbolDocCommentVisitor
   /// All "free" text paragraphs.
   llvm::SmallDenseMap<unsigned, const comments::ParagraphComment *>
       FreeParagraphs;
-
-  void paragraphsToMarkup(
-      markup::Document &Out,
-      const llvm::SmallVectorImpl<const comments::ParagraphComment *>
-          &Paragraphs) const;
 };
 
 } // namespace clangd
diff --git a/clang-tools-extra/clangd/support/Markup.cpp b/clang-tools-extra/clangd/support/Markup.cpp
index 89bdc656d440f..304917de252bf 100644
--- a/clang-tools-extra/clangd/support/Markup.cpp
+++ b/clang-tools-extra/clangd/support/Markup.cpp
@@ -199,10 +199,16 @@ bool needsLeadingEscape(char C, llvm::StringRef Before, llvm::StringRef After,
   return needsLeadingEscapeMarkdown(C, After);
 }
 
-/// Escape a markdown text block.
+/// \brief Render text for markdown output.
+///
 /// If \p EscapeMarkdown is true it ensures the punctuation will not introduce
 /// any of the markdown constructs.
+///
 /// Else, markdown syntax is not escaped, only HTML tags and entities.
+/// HTML is escaped because usually clients do not support HTML rendering by
+/// default. Passing unescaped HTML will therefore often result in not showing
+/// the HTML at all.
+/// \note In markdown code spans, we do not escape anything.
 std::string renderText(llvm::StringRef Input, bool StartsLine,
                        bool EscapeMarkdown) {
   std::string R;
@@ -213,6 +219,10 @@ std::string renderText(llvm::StringRef Input, bool StartsLine,
 
   bool IsFirstLine = true;
 
+  // Inside markdown code spans, we do not escape anything when EscapeMarkdown
+  // is false.
+  bool InCodeSpan = false;
+
   for (std::tie(L...
[truncated]

Stehsaer · 2025-09-17T05:55:28Z

It's working great as intended! Can other commands like @details be added the same way we add paragraphs for @note and @warning?
Update: Yes we can, just add some more changes as given below. This should work well for "generic" ones that only needs a Header and its content to be shown. Compiled and tested it out, working quite well.
diff --git a/clang-tools-extra/clangd/SymbolDocumentation.cpp b/clang-tools-extra/clangd/SymbolDocumentation.cpp
index 86bd6162b..1638e4b55 100644
--- a/clang-tools-extra/clangd/SymbolDocumentation.cpp
+++ b/clang-tools-extra/clangd/SymbolDocumentation.cpp
@@ -216,6 +216,7 @@ public:
       break;
     case comments::CommandTraits::KCI_note:
     case comments::CommandTraits::KCI_warning:
+    case comments::CommandTraits::KCI_details:
       commandToHeadedParagraph(B);
       break;
     case comments::CommandTraits::KCI_retval: {
I am not sure if @details should get a header like this. Doxygen just omits this command:
/// @brief a sample function
/// @note test note for myFunction
/// @details detailed description of myFunction
void myFunction();

Personally, I am for the idea of letting @details have a header.

We add doxygen support to clangd not just for "preserving the original look of HTML document generated by doxygen", but to improve the overall coding experience by providing the information acquired from the doxygen comment to the user. I've been using clangd compiled from this patch (along with header of @details added, see the diff text I gave) for two weeks, and it turns out that having a dedicated header for @details greatly improve the experience for me.

Specifically, it explicitly let me know that, THIS paragraph is a brief (without header) which I must pay attention to, and THAT paragraph is the detailed description (with a header) that I may look into when I want to know about more details. This is very helpful when under quick development, because it let me acquire the MOST USEFUL information by just quickly scanning the hover. Without the header, it is harder to distinguish between the brief and the detail, especially when they are placed in the adjacent lines. And by the way, this is also the reason that I'm not in favor of that doxygen isn't adding a header for the @details paragraphs.

To sum up, because of the different goal and scenario

doxygen: generate a good looking, unified documentation, potentially to be deployed on websites
clangd: providing quick information (from doxygen comments) WHILE programming

I prefer adding a header for @details.

Attachment

How it look with a dedicated header for @details:

With @notes removed:

tcottin · 2025-09-19T20:57:01Z

I aggree, the separation of brief and details could be better. But:

One disadvantage of using the simple header approach for details is that we would rely on the user documentation to create the separation between brief and details.
Meaning, only if the \details command is used, the separation can be done.
And since the \details command does not have an effect in doxygen (except if it used to "terminate" a brief paragraph without an empty line) I can imagine that this command is not that commonly used and therefore I dont think we should generally rely on this.

Instead of using the header approach for the \details command, I think the separation should be fixed by

Implement an automatic brief extraction as specified for the doxygen setting JAVADOC_AUTOBRIEF==YES. This way we always have a known separation between the brief description and the rest of the documentation (details).
Currently the behaviour is like JAVADOC_AUTOBRIEF==NO and therefore this separation is not possible in case no \brief command was used.
Add a separate section for the detailed documentation the same way as for the return and paramater documentation.
Under the new details section we can then add the detailed description we separated from the brief in step 1.

Note: there already is a brief parser available in clang clang/include/clang/AST/CommentBriefParser.h but I think it is not generic enough to be used directly for clangd.

tcottin · 2025-09-21T10:17:42Z

@Stehsaer I implemented the separation. Please give it a try.

For simplicity, I use the first paragraph in case there is no brief command and the first paragraph is not a block command.
This is not exactly the same as with doxygen JAVADOC_AUTOBRIEF==YES but I think its good this way.

Stehsaer · 2025-09-22T16:24:08Z

@Stehsaer I implemented the separation. Please give it a try.

For simplicity, I use the first paragraph in case there is no brief command and the first paragraph is not a block command. This is not exactly the same as with doxygen JAVADOC_AUTOBRIEF==YES but I think its good this way.

Just tried it out and tested over my codebases, it's working greater than expected! Although it doesn't perfectly align with the purpose of the original @details, but it is working even better than the original, at least in all my testcases. All briefs and details inside the comments are separated nicely and expectedly.

Edit: Sorry for the incomplete testing, I've found one unexpected behavior:

///
/// @brief Some brief
///
/// @param a An integer parameter
/// @param b A floating-point parameter
/// @param c A string parameter
///
/// @note Some note to take care of...
///
void test(int a, float b, const std::string& c);

was parsed and displayed as:

Is this intended/by design?

If not: further testing and fixings are needed to cover the edge cases as much as possible. Still, this commit is a great step forward!
If yes: I wonder why an empty Details: paragraph is placed there? Does it serve the purpose of separating between brief and params, and note, warning etc.

tcottin · 2025-09-22T20:42:11Z

Showing the Details: paragraph/section like in your example is intended.

The purpose is to create an own section for the detailed description, like for Parameters and Return etc.
And in fact, this section is not empty. It "contains" the note command in your example.

In your example case the ruler after the Details: section heading is confusing maybe.
But this ruler is actually part of the @note command.
I added this ruler to try to emphasize the @note even more, like the markdown alerts would.

But we could also remove these rulers for the note, warning highlights to not separate it too much from the rest of the text.

Also note that the sections Parameters, Return, Details etc. are different than the highlighting of the note and warning commands. The highlighting is "inline" and can be used multiple times, whereas the sections are structural adaptions and unique.

E.g. the following shows all sections and 2 notes for a function documentation:

/// Detailed description.
/// @note this is a note
///
/// Here is some more text of the detailed description.
/// @note another note
///
/// @brief This is the brief description.
///
/// @return return value description.
/// @param arg A function parameter.
/// @tparam T A template type parameter.
template<typename T>
bool func(T arg);

The note commands are shown in the same place as they were written in the detailed description, but the sections all have their fixed place in the hover, although in the code they are placed completely different.

Stehsaer · 2025-09-23T14:22:48Z

Showing the Details: paragraph/section like in your example is intended.

The purpose is to create an own section for the detailed description, like for Parameters and Return etc. And in fact, this section is not empty. It "contains" the note command in your example.

In your example case the ruler after the Details: section heading is confusing maybe. But this ruler is actually part of the @note command. I added this ruler to try to emphasize the @note even more, like the markdown alerts would.

But we could also remove these rulers for the note, warning highlights to not separate it too much from the rest of the text.

Also note that the sections Parameters, Return, Details etc. are different than the highlighting of the note and warning commands. The highlighting is "inline" and can be used multiple times, whereas the sections are structural adaptions and unique.

This totally makes sense. Shall we make the Details: text a header (H5 or H6? I'm not quite sure), in order to make it bigger and separates more from other small headers like Note:?

And also, another case with unexpected behavior was found:

///
/// @brief Some brief
///
/// @param some_param Some description
/// @retval 0 on success
/// @retval 1 Kitty asleep
///
int worker_thread(const std::string& some_param);

Literally there's nothing under the Details: header, is there any empty line that is mistakenly parsed as a part of detail? Removing the trailing empty comment still yields the Details: header followed with empty content. Here are some further tests on this issue:

Simple comment with `@brief`

///
/// @brief Some brief
///
int worker_thread(const std::string& some_param);

Simple comment without `@brief`

///
/// Some brief
///
int worker_thread(const std::string& some_param);

Original case without `@brief`

///
/// Some brief
///
/// @param some_param Some description
/// @retval 0 on success
/// @retval 1 Kitty asleep
///
int worker_thread(const std::string& some_param);

…s for highlighted commands

tcottin · 2025-09-27T13:25:27Z

@Stehsaer Thanks for your feedback.

I fixed the bug with the empty details section.

Additionally I added a separate section for "Brief" now and changed all section headers to h3.
For the hover heading, h3 was initially chosen due to this vscode issue microsoft/vscode#88417 I think.
h4 and normal bold text are almost not distinguishable, hence I changed all section headings to h3 instead of bold text now.
Also removed the : in the heading.

For the note and warning highlight I removed the rulers.
I think the rulers are too confusing.

Hover with the newest changes:

/// Detailed description.
/// @note this is a note
///
/// Here is some more text of the detailed description.
/// @note another note
///
/// @brief This is the brief description.
///
/// @return return value description.
/// @param arg A function parameter.
/// @tparam T A template type parameter.
template<typename T>
bool func(T arg);

tcottin · 2025-10-01T11:14:56Z

ping @Stehsaer @emaxx-google

Stehsaer · 2025-10-01T13:17:03Z

@tcottin Sorry for the late reply. I've been testing the latest commit the last 2 days.

Feedbacks

Bugfix: the empty detail header bug was fixed and I didn't find any other edge case yet. It's working quite well.
H3 headers: I didn't see any problem with readability with H3 headers. They provide much better readability compared to the previous bold text.

Additional thoughts

By reviewing the code, I found that many texts, like the texts in the header, are hard-coded. I don't know if there is any localization support (or at least a plan if localization is not yet supported) in clangd, but I think it is worth considering to use an approach that's easier to modify or configure (eg. access the texts from the API that clangd uses to read the user configuration). What do you think?

tcottin · 2025-10-01T15:18:20Z

Thanks for the feedback!

I think adding localization support or any customizations to the hardcoded strings is out of the scope for this change.

emaxx-google

Looks good to me, thanks for working on this and for others to having tested it out! (The preprocessing part seems fragile though, however I don't have a better suggestion for this problem.)

tcottin · 2025-10-10T05:40:18Z

@emaxx-google thanks for the approval.
Could you also merge it?

HighCommander4 · 2025-10-10T05:43:45Z

@emaxx-google thanks for the approval. Could you also merge it?

I went ahead and did that. (More precisely, scheduled an auto-merge once the CI checks complete.)

…\retval command + change render kind of command arguments + add preprocessing for markdown code blocks/spans (llvm#156365) Fixes some followup issues from clangd/clangd#529. `\note` and `\warning`: In the hover card, they are now displayed with heading and enclosing rulers. `\retval` commands: Each `\retval` command is now a bullet point under the return section of the hover card. Added a Markdown preprocessing step before parsing the documentation with the doxygen parser. This mainly replaces markdown code blocks with `@code...@endcode` commands.

fix note, warning and retval command.

5e5b87c

Change arguments of doxygen commands to be rendered as monospaced

add markdown preprocessing to handle code blocks and code spans corre…

078294b

…ctly

tcottin mentioned this pull request Sep 15, 2025

Doxygen parsing missing clangd/clangd#529

Open

tcottin marked this pull request as ready for review September 17, 2025 05:18

llvmbot added clang-tools-extra clangd labels Sep 17, 2025

HighCommander4 requested a review from emaxx-google September 17, 2025 05:20

separate brief and details by using first paragraph as brief

cfbda5a

tcottin added 2 commits September 27, 2025 12:32

ignore empty freestanding paragraphs

d77969f

itroduce brief section, use h3 headings for all section, remove ruler…

ff0418a

…s for highlighted commands

tcottin mentioned this pull request Sep 27, 2025

Make the hover box more compact clangd/clangd#747

Open

emaxx-google approved these changes Oct 10, 2025

View reviewed changes

Merge branch 'main' into improve-doxygen-to-markup

1e5e870

HighCommander4 enabled auto-merge (squash) October 10, 2025 05:43

HighCommander4 merged commit 8a505a1 into llvm:main Oct 10, 2025
8 of 9 checks passed

[clangd] Doxygen Parsing: Add proper handling of \note, \warning and \retval command + change render kind of command arguments + add preprocessing for markdown code blocks/spans #156365

[clangd] Doxygen Parsing: Add proper handling of \note, \warning and \retval command + change render kind of command arguments + add preprocessing for markdown code blocks/spans #156365

Uh oh!

Conversation

tcottin commented Sep 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Stehsaer commented Sep 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tcottin commented Sep 15, 2025

Uh oh!

tcottin commented Sep 17, 2025

Uh oh!

llvmbot commented Sep 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Stehsaer commented Sep 17, 2025

Attachment

Uh oh!

tcottin commented Sep 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tcottin commented Sep 21, 2025

Uh oh!

Stehsaer commented Sep 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tcottin commented Sep 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Stehsaer commented Sep 23, 2025

Simple comment with @brief

Simple comment without @brief

Original case without @brief

Uh oh!

tcottin commented Sep 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tcottin commented Oct 1, 2025

Uh oh!

Stehsaer commented Oct 1, 2025

Feedbacks

Additional thoughts

Uh oh!

tcottin commented Oct 1, 2025

Uh oh!

emaxx-google left a comment

Choose a reason for hiding this comment

Uh oh!

tcottin commented Oct 10, 2025

Uh oh!

HighCommander4 commented Oct 10, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

tcottin commented Sep 1, 2025 •

edited

Loading

Stehsaer commented Sep 2, 2025 •

edited

Loading

llvmbot commented Sep 17, 2025 •

edited

Loading

tcottin commented Sep 19, 2025 •

edited

Loading

Stehsaer commented Sep 22, 2025 •

edited

Loading

tcottin commented Sep 22, 2025 •

edited

Loading

Simple comment with `@brief`

Simple comment without `@brief`

Original case without `@brief`

tcottin commented Sep 27, 2025 •

edited

Loading