Skip to content

Conversation

GuillaumeGomez
Copy link
Member

I was very bothered by the complexity of this file, in particular the handling of pending_elems which was very tricky to follow.

So instead, here comes a more sane approach: the content is store in a stack-like type which handles "levels" of HTML (ie, a macro expansion can contain other HTML tags which can themselves contain other, etc). Making it much simpler to keep track of what's going on.

r? @lolbinarycat

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. T-rustdoc-frontend Relevant to the rustdoc-frontend team, which will review and decide on the web UI/UX output. labels Sep 24, 2025
@GuillaumeGomez
Copy link
Member Author

Also need to check the impact on performance (likely slower).

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

rust-bors bot added a commit that referenced this pull request Sep 24, 2025
@rust-bors

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 24, 2025
@rust-bors
Copy link

rust-bors bot commented Sep 24, 2025

☀️ Try build successful (CI)
Build commit: 6020c97 (6020c97e3046a35e53fd9885eda164c570010a6c, parent: 15283f6fe95e5b604273d13a428bab5fc0788f5a)

@rust-timer

This comment has been minimized.

let mut closing_tag = None;
for part in &self.content {
let text: &dyn Display =
if part.needs_escape { &EscapeBodyText(&part.text) } else { &part.text };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, Either impls Display, which can be nicer than a dyn ref (and maybe slightly more performant)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, thanks!

Comment on lines +256 to +240
for part in elem.content.drain(..) {
last.content.push(part);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for part in elem.content.drain(..) {
last.content.push(part);
}
last.content.append(&mut elem.content);

Both shorter and might also be slightly more performant (can probably pre-reserve just enough capacity in target vector)

Comment on lines +282 to +266
for elem in elements {
self.elements.push(elem);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for elem in elements {
self.elements.push(elem);
}
self.elements.extend(elements);

Same deal as https://github.com/rust-lang/rust/pull/146992/files#r2376863766

@yotamofek
Copy link
Contributor

Code reads much better IMHO!

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (6020c97): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
2.2% [0.7%, 6.6%] 19
Regressions ❌
(secondary)
3.2% [0.1%, 13.3%] 17
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.2% [0.7%, 6.6%] 19

Max RSS (memory usage)

Results (primary 0.4%, secondary 12.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
6.2% [6.2%, 6.2%] 1
Regressions ❌
(secondary)
12.0% [2.6%, 31.9%] 6
Improvements ✅
(primary)
-2.5% [-2.8%, -2.2%] 2
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.4% [-2.8%, 6.2%] 3

Cycles

Results (primary 2.8%, secondary 5.9%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
4.3% [3.2%, 6.5%] 3
Regressions ❌
(secondary)
8.3% [3.6%, 12.8%] 6
Improvements ✅
(primary)
-1.6% [-1.6%, -1.6%] 1
Improvements ✅
(secondary)
-1.5% [-1.5%, -1.4%] 2
All ❌✅ (primary) 2.8% [-1.6%, 6.5%] 4

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 471.213s -> 470.794s (-0.09%)
Artifact size: 387.83 MiB -> 387.93 MiB (0.03%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Sep 24, 2025
@GuillaumeGomez
Copy link
Member Author

Code reads much better IMHO!

Agreed (unsurprisingly 😆), but sadly I think this solution is unlikely to get much better performance-wise so unlikely it'll be merged.

However I now have a much cleaner code, so I think I'll go back to the original "streaming content" but with a much cleaner approach.

@lolbinarycat
Copy link
Contributor

If I had to guess the reason for the perf regression, I would say it probably has to do with all the extra intermediate string allocations. I feel like if we had a way to delay formatting (maybe using an enum, or with dyn Display, or just make it so the final writer is given up front), this would have a lot less overhead.

@GuillaumeGomez
Copy link
Member Author

Possibly. Want to give a try pushing it even further before I try to turn this back into a streaming algorithm? Same question for you @yotamofek. 😉

Start from my branch and open PRs with your commits so we can run perf check on them.

@yotamofek
Copy link
Contributor

I'll give it a go, but my gut says the extra string allocations aren't causing the lion's share of the regressions. Worth a shot though

@bors
Copy link
Collaborator

bors commented Sep 26, 2025

☔ The latest upstream changes (presumably #147037) made this pull request unmergeable. Please resolve the merge conflicts.

@rustbot
Copy link
Collaborator

rustbot commented Sep 27, 2025

This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@bors
Copy link
Collaborator

bors commented Oct 6, 2025

☔ The latest upstream changes (presumably #147397) made this pull request unmergeable. Please resolve the merge conflicts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. T-rustdoc-frontend Relevant to the rustdoc-frontend team, which will review and decide on the web UI/UX output.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants