Commit 77b1e65

evelez7 authored and asl committed
address review feedback
1 parent c3959c6 commit 77b1e65

1 file changed: +29 -30 lines changed
content/posts/2025-gsoc-clang-doc.md

Lines changed: 29 additions & 30 deletions
@@ -37,10 +37,10 @@ Here's a quick overview on Clang-Doc's architecture, which follows a map-reduce
 <img src="/img/gsoc-2025-clang-doc-architecture.png"><br/>
 </div>

-1. Visit source declarations via Clang's ASTVisitor.
-2. Serialize relevant source information into an Info (Clang-Doc's main data entity).
-3. Write Infos into bitcode, reduce, and reread.
-4. Serialize Infos into the desired format with a target backend.
+1. Visit source declarations via Clang's `ASTVisitor`.
+2. Serialize relevant source information into an `Info` (Clang-Doc's main data entity).
+3. Write `Info`s into bitcode, reduce, and reread.
+4. Serialize `Info`s into the desired format with a target backend.

 The architecture seems straightforward at a glance, but Clang-Doc has critical flaws at step 4.
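
Editorial sketch: the four steps listed in this hunk describe a map-reduce pipeline, which can be illustrated roughly as follows. This is a minimal Python model, not Clang-Doc's C++ implementation: the `Info` dataclass, its fields, and every function name here are hypothetical, and the bitcode write/reread in step 3 is skipped so only the merge itself is shown.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Info:
    """Hypothetical stand-in for a Clang-Doc Info; not the real class."""
    usr: str                                   # unique ID of the documented entity
    name: str
    comments: list = field(default_factory=list)


def visit(declarations):
    """Steps 1-2 (map): emit one partial Info per visited declaration."""
    return [
        Info(d["usr"], d["name"], list(d.get("comments", [])))
        for d in declarations
    ]


def reduce_infos(partials):
    """Step 3 (reduce): merge partial Infos that describe the same entity."""
    grouped = defaultdict(list)
    for info in partials:
        grouped[info.usr].append(info)

    merged = []
    for usr, group in grouped.items():
        combined = Info(usr, group[0].name)
        for info in group:
            combined.comments.extend(info.comments)
        merged.append(combined)
    return merged


def generate_markdown(infos):
    """Step 4: a toy backend that serializes merged Infos into one format."""
    return "\n".join(
        f"## {info.name}\n" + "\n".join(info.comments) for info in infos
    )
```

In this model, a declaration and a definition of the same function produce two partial `Info`s that the reduce step folds into one before any backend runs.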

@@ -56,7 +56,7 @@ That sounds great in principal, but the backend pipeline's execution made develo
 Unlike in LLVM, Clang-Doc doesn't have a framework like CodeGen that shares functionality across different targets.
 To document a `class`, every backend needs to independently implement logic to serialize the `class` into its target format.
 Each backend also has separate logic to write all of the documented entities to disk.
-There is also no IR where Infos can be preprocessed, which means that any organizational preprocessing done in a backend cant be shared.
+There is also no IR where `Info`s can be preprocessed, which means that any organizational preprocessing done in a backend can't be shared.

 Here's the code for serializing the bases and virtual bases of a class in the HTML backend:

@@ -107,7 +107,7 @@ There's a logical disconnect: what's serialized in YAML isn't guaranteed to refl
 ## The Good

 The good news is that Clang-Doc's recent improvements had brought in changes that could rectify these problems, with a bit more work.
-Last year's GSoC brought in great improvements that became the basis of my summer.
+[Last year's GSoC](https://blog.llvm.org/posts/2024-12-04-improve-clang-doc/) brought in great improvements that became the basis of my summer.
 First, last year's GSoC contributor landed a large performance improvement.
 I might not have been able to test Clang-Doc on Clang itself without it.

@@ -136,7 +136,7 @@ Markdown generation would be a similar case where templates would be used to aut
 This diagram models the architecture that Clang-Doc would follow given a unified JSON backend.
 Note the similarities to Clang, where our frontend (the visitation/serialization) gathers all the information we need and emits an intermediate representation (JSON).
 The JSON is then fed to the desired templates to produce our documentation, similar to how IR is used for different LLVM backends.
-Following this pattern would reduce the logic maintenance to only the JSON generation; all the formatting for HTML, Markdown, etc. would exist in template files that are very simple to change and neatly separates documentation logic from display/formatting logic.
+Following this pattern would reduce the maintenance to only the JSON generation; all the formatting for HTML, Markdown, etc. would exist in template files that are very simple to change and neatly separates documentation logic from display/formatting logic.
 Also note how much more streamlined it is compared to the previous diagram where serialization logic was separated among Clang-Doc's backends.

 Thus, I adapted the JSON logic from the Mustache backend and create a separate JSON backend.
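
Editorial sketch: the idea in this hunk, one JSON document feeding several format-specific templates, can be shown with a small Python example. It assumes a made-up JSON shape (`name`, `brief`) and uses the standard library's `string.Template` as a stand-in for Mustache, so it illustrates the separation of concerns rather than Clang-Doc's actual schema or template engine.

```python
import json
from string import Template

# Hypothetical JSON for one documented function; the field names are illustrative.
DOC_JSON = '{"name": "parse", "brief": "Parses a comment."}'

# All format-specific knowledge lives in the templates, not in the generator.
TEMPLATES = {
    "html": Template("<h2>$name</h2>\n<p>$brief</p>"),
    "md": Template("## $name\n\n$brief"),
}


def render(doc_json: str, fmt: str) -> str:
    """Feed the same JSON to whichever template the caller asks for."""
    data = json.loads(doc_json)
    return TEMPLATES[fmt].substitute(data)
```

Under this arrangement, adding an output format means adding a template entry; the code that produces the JSON does not change.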
@@ -191,25 +191,25 @@ All of the logic to order them needs to be done in the serialization to JSON its

 Previously, Clang-Doc's comments were organized exactly as in Clang's AST like the following:

-- FullComment
-  - BriefComment
-    - ParagraphComment
-      - TextComment
-      - TextComment
-  - BriefComment
-    - ParagraphComment
+- `FullComment`
+  - `BriefComment`
+    - `ParagraphComment`
+      - `TextComment`
+      - `TextComment`
+  - `BriefComment`
+    - `ParagraphComment`

-Everything was unnecessarily nested under a FullComment, and TextComments were also unnecessarily nested.
-Every non-verbatim comment's text was held in one ParagraphComment.
-Since there was only one, we could reduce some boilerplate by directly mapping to the array of TextComments.
+Everything was unnecessarily nested under a `FullComment`, and `TextComment`s were also unnecessarily nested.
+Every non-verbatim comment's text was held in one `ParagraphComment`.
+Since there was only one, we could reduce some boilerplate by directly mapping to the array of `TextComment`s.

 After the change, Clang-Doc's comments were structured like this:

-- BriefComments
-  - TextCommentArray
-  - TextCommentArray
-- ParagraphComments
-  - TextCommentArray
+- `BriefComments`
+  - `TextCommentArray`
+  - `TextCommentArray`
+- `ParagraphComments`
+  - `TextCommentArray`

 Now, we can just iterate over every type of comment, which means iterating over every array.
 This left our JSON documentation with a few more fields, since one is needed for every Doxygen command, but with easier identification of what comments exist in the documentation.
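
Editorial sketch: the restructuring described in this hunk amounts to flattening a nested comment tree into one array of text arrays per comment kind. The Python below is a rough model of that transformation. The dictionary layout (`kind`, `children`, `text`) and both function names are assumptions made for illustration; only the comment kind names come from the post.

```python
def text_array(node: dict) -> list:
    """Collect the text of every TextComment at or below `node`."""
    if node["kind"] == "TextComment":
        return [node["text"]]
    return [
        text
        for child in node.get("children", [])
        for text in text_array(child)
    ]


def flatten_comments(full_comment: dict) -> dict:
    """Map each child of the FullComment to '<Kind>s' -> list of text arrays."""
    flat = {}
    for child in full_comment["children"]:
        key = child["kind"] + "s"      # e.g. BriefComment -> BriefComments
        flat.setdefault(key, []).append(text_array(child))
    return flat
```

A consumer of the flattened form can loop over each key and each text array directly, without walking through the intermediate `FullComment` and `ParagraphComment` levels.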
@@ -255,7 +255,7 @@ We would have to parse any potential HTML in comments anyways.
 ## A Parser Solution
 Without an out-of-the-box solution, we were left with implementing our own parser.
 When I considered this in my proposal, I knew an in-tree parser would want to conform to the simplest possible standard.
-Markdown has no official standard, so I opted for CommonMark conformance.
+Markdown has no official standard, so I opted for [CommonMark](https://commonmark.org/) conformance.

 The summer ended without a complete solution since a couple weeks were spent researching whether or not this could be integrated directly in the Clang comment parser or whether we'd need to build our own solution or not.
 You can see my initial draft [here](https://github.com/llvm/llvm-project/pull/155887).
@@ -315,12 +315,11 @@ Here are the pull requests I made for refactors during the project:
 - [refactor JSON for better Mustache compatibility](https://github.com/llvm/llvm-project/pull/149588)

 # Overview
-I implemented a new JSON generator that will serve as the basis for Clang-Doc's documentation generation.
-This will vastly reduce overall lines of code and maintenance burdens.
-I added a lot of tests to increase code coverage and ensure we are serializing all the information necessary for high-quality documentation.
-I refactored our comment handling to streamline the logic that handles them and for better output in the HTML.
-I also explored options for rendering Markdown and began an implementation for a parser that I plan on working on in the future.
-Along the way, I also did some refactoring to improve code reuse and improved maintenance burdens by reducing boilerplate code.
+- I implemented a new JSON generator that will serve as the basis for Clang-Doc's documentation generation. This will vastly reduce overall lines of code and maintenance burdens.
+- I added a lot of tests to increase code coverage and ensure we are serializing all the information necessary for high-quality documentation.
+- I refactored our comment handling to streamline the logic that handles them and for better output in the HTML.
+- I also explored options for rendering Markdown and began an implementation for a parser that I plan on working on in the future.
+- Along the way, I also did some refactoring to improve code reuse and improved maintenance burdens by reducing boilerplate code.

 After my work this summer, Clang-Doc is nearly ready to switch to HTML generation via Mustache templates, which will be a huge milestone.
 It is backed by the JSON generator which will allow for a much more flexible architecture that will change how we generate other documentation formats like our existing Markdown backend.
@@ -365,7 +364,7 @@ Doxygen also displays where an entity is referenced, like where a function is in
 Clang-Doc currently has no support for this kind of behavior.

 Clang-Doc would need a preprocessing step where any reference to another entity is identified and then resolved somewhere.
-One of my mentors pointed out that it would be great to do during the reduction step where every Info is being visited anyways.
+One of my mentors pointed out that it would be great to do during the reduction step where every `Info` is being visited anyways.
 This actually wasn't something I had even considered in my proposal besides identifying that `@copydoc` wasn't supported by the comment parser.
 It's a common feature of modern documentation, so hopefully someday soon Clang-Doc can acquire it.
