Protect included code snippets from "recursive" highlighting#1441
Protect included code snippets from "recursive" highlighting#1441woopsn wants to merge 2 commits intolsegal:mainfrom
Conversation
lsegal
left a comment
There was a problem hiding this comment.
Thanks for contributing! I added a few inline comments; in short, I'm all for this fix but there are a few blockers in the current approach in the form of a breaking change and regression.
Those would need to be resolved before getting this through.
| # @return [String] highlighted html | ||
| # @see #html_syntax_highlight | ||
| def parse_codeblocks(html) | ||
| def parse_codeblocks(html, internal=false) |
There was a problem hiding this comment.
nit pick here but this should be styled as internal = false with spaces.
| # To add a custom markup type, see {MarkupHelper} | ||
| # @return [String] the HTML | ||
| def htmlify(text, markup = options.markup) | ||
| def htmlify(text, markup = options.markup, internal: false) |
There was a problem hiding this comment.
Unfortunately using keyword args limits us to Ruby 3.x and would be a breaking change that could not be merged.
The good news is this could easily be changed to a positional arg.
| language ||= object.source_type | ||
|
|
||
| if options.highlight | ||
| pre_attrs = Array %(class="code #{language}") |
There was a problem hiding this comment.
This is an odd syntax. Did you simply mean pre_attrs = %(class="code #{language}")?
| output | ||
| rescue Parser::ParserSyntaxError | ||
| source =~ /^<span\s+class=/ ? source : h(source) | ||
| h(source) |
There was a problem hiding this comment.
This is a subtle but problematic change that I think will cause a regression. This change was explicitly added to permit pre-formatted HTML extra-files to be added via: yard - OTHER.html where OTHER.html might include:
<pre class="code ruby"><code class="ruby"><span class='kw'>return</span></code></pre>The output of this section would generate double encoded output if you ran it through this block:
If removing this HTML detection causes a failure in your implementation, you may have to rethink part of your approach.
8eab01f to
9273d42
Compare
9273d42 to
5d3e04f
Compare
5d3e04f to
a9babdb
Compare
|
Thanks for taking a look! I fixed the syntax and style issues you mentioned, and restored the bypass behavior for pre-formatted blocks. I'd misinterpreted what that check was for and thought it was to account for the same recursive/nested case that this branch considers. I brought the check back, but lifted it out of the rescue block and into and into yard/lib/yard/templates/helpers/html_helper.rb Lines 206 to 208 in a9babdb yard/spec/templates/helpers/html_helper_spec.rb Lines 675 to 679 in a9babdb Does this look alright? It seems slightly better to not feed non-ruby input to the parser and catch the exception, but the rest of my changes should be compatible either way. |


Description
This PR aims to fix an issue with how yard handles syntax highlighted snippets from
{include:file:...}source.When an included file contains code blocks, they'll be fed through #parse_codeblocks more than once, as the file and then the parent are processed depth first. After the first highlighting pass, the code block contains additional
<span>s and escapes (<etc), and later passes are unable to tell that these weren't part of the original source.yard's built-in ruby highlighter does work in these cases -- its tokenizer tends to raise an exception when the source looks like HTML instead of Ruby, and by inspecting the source it can tell that the block was already highlighted:
Plugins, on the other hand, are unaware that the code snippets they're given in some cases are already marked up. This leads to the final document showing a mess of escaped html within the
<pre>.[1]In this branch, data-highlighted=true is added to code snippets when they're processed through
{include:...}, allowing them to be skipped during subsequent passes through #parse_codeblocks. The attribute is then removed by the outermost #htmlify call, so it's only an internal annotation and shouldn't make it into the final output (unless a theme calls #insert_include directly).[1] This can be seen in the output of
Completed Tasks
bundle exec rakelocally (if code is attached to PR).