Skip to content
Closed
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion Lib/_markupbase.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@

_declname_match = re.compile(r'[a-zA-Z][-_.a-zA-Z0-9]*\s*').match
_declstringlit_match = re.compile(r'(\'[^\']*\'|"[^"]*")\s*').match
_commentclose = re.compile(r'--\s*>')
_commentclose = re.compile(r'--!?>')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave the \s*, even though I should double check what the HTML5 specs say exactly.

Copy link
Contributor Author

@Privat33r-dev Privat33r-dev Apr 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave the \s*, even though I should double check what the HTML5 specs say exactly.

I provided the links to HTML5 specification earlier and "\s*" mentioned nowhere, moreover, my tests with latest versions of Firefox and Chrome has shown that it's in fact an incorrect behaviour and is not considered a closing tag by modern browsers. Thus I see no reason in keeping it (nor spec, nor common practice).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://html.spec.whatwg.org/#comment-end-state is the section of the specs I was looking for. It does indeed mention the ! but not the spaces, so updating the code accordingly sounds good to me.

Do you want to add tests to check these (-->, --!>, -- >, --x>, --->, etc.) cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://html.spec.whatwg.org/#comment-end-state is the section of the specs I was looking for. It does indeed mention the ! but not the spaces, so updating the code accordingly sounds good to me.

Do you want to add tests to check these (-->, --!>, -- >, --x>, --->, etc.) cases?

I am thinking about improving the solution to even include <!-->, unexpected EOF and similar other test cases (that were mentioned in a similar PR), but at the moment, unfortunately, I am lacking time to work on this PR. Hopefully, in the week (or at the weekend at worst) I can add the test cases and change a few other parts of the code to handle even wider variety of edge cases.

_markedsectionclose = re.compile(r']\s*]\s*>')

# An analysis of the MS-Word extensions is available at
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Follow the `parsing recommendation <https://html.spec.whatwg.org/multipage/parsing.html#parse-error-incorrectly-closed-comment>`_ and `standard <https://html.spec.whatwg.org/#comments>`_ for closing comment tag in the :mod:`html.parser`. Increased compliance leads to predictable behavior, thus enhancing security.