feat: add support for additional HTML tags in RichView #78

graycreate · 2025-12-01T02:23:55Z

Summary

Add comprehensive HTML tag support for RichView component
Implement table rendering (converts HTML tables to Markdown table format)
Add strikethrough support (del, s, strike → ~~text~~)
Add underline support (u, ins → _text_)
Add superscript/subscript support (sup, sub)
Add mark/highlight support (mark → ==text==)
Add definition list support (dl, dt, dd)
Add semantic element support (abbr, cite, kbd, samp, var, small)
Add figure elements (figure, figcaption)
Add document structure elements (address, time, details, summary)
Add container elements (article, section, nav, aside, header, footer, main, caption)

This fixes the issue where V2EX posts containing tables (like https://www.v2ex.com/t/1176026) were not rendering correctly.

Test plan

Added unit tests for all new HTML tag conversions
Added tests for table conversion with thead/tbody
Added tests for strikethrough, underline, mark rendering
Added performance test for complex tables
Build succeeded
All existing tests pass

🤖 Generated with Claude Code

Add support for previously unsupported HTML tags: - Table support (table, thead, tbody, tfoot, tr, th, td) - Strikethrough (del, s, strike) - Underline (u, ins) - Superscript/subscript (sup, sub) - Mark/highlight (mark) - Definition lists (dl, dt, dd) - Semantic elements (abbr, cite, kbd, samp, var, small) - Figure elements (figure, figcaption) - Document structure (address, time, details, summary) - Container elements (article, section, nav, aside, header, footer, main, caption) Also added corresponding tests for all new HTML tags and strikethrough/highlight rendering in MarkdownRenderer. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Copilot

Pull request overview

This PR adds comprehensive HTML tag support to the RichView component, enabling proper rendering of V2EX posts that contain tables and various semantic HTML elements. The implementation converts HTML to markdown format which is then rendered with appropriate styling.

Key changes:

Implements table conversion from HTML to markdown format with support for thead/tbody
Adds support for text formatting tags (strikethrough, underline, superscript/subscript, highlight)
Adds semantic HTML element support (abbreviations, citations, keyboard input, etc.)
Includes extensive unit tests for HTML to markdown conversion
Adds performance tests for complex table rendering

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

File	Description
V2er/Sources/RichView/Converters/HTMLToMarkdownConverter.swift	Adds conversion logic for tables, strikethrough, underline, superscript/subscript, mark/highlight, definition lists, and semantic elements to markdown format
V2er/Sources/RichView/Renderers/MarkdownRenderer.swift	Implements markdown table rendering with visual styling, adds strikethrough and highlight rendering support
V2erTests/RichView/HTMLToMarkdownConverterTests.swift	Comprehensive test coverage for all new HTML tag conversions including tables, formatting tags, semantic elements, and performance tests

Copilot · 2025-12-01T02:28:18Z

V2er/Sources/RichView/Converters/HTMLToMarkdownConverter.swift

+                // Underline - no standard markdown, render as emphasized text
+                case "u", "ins":
+                    let content = try convertElement(childElement)
+                    result += "_\(content)_"


The underline HTML tags ( and <ins>) are converted to markdown format _text_, but the MarkdownRenderer does not have logic to render this format with actual underline styling. The underscore format will be interpreted as italic (similar to *text*), not underline.

Consider either:

Adding underline rendering support in renderInlineMarkdown() method to handle _text_ with underlineStyle = .single

Using a different markdown format for underline that doesn't conflict with italic (e.g., text or custom markers)

Suggested change

// Underline - no standard markdown, render as emphasized text

case "u", "ins":

let content = try convertElement(childElement)

result += "_\(content)_"

// Underline - no standard markdown, preserve as HTML for custom renderer

case "u", "ins":

let content = try convertElement(childElement)

result += "\(content)"

Copilot · 2025-12-01T02:28:18Z

V2er/Sources/RichView/Converters/HTMLToMarkdownConverter.swift

+                // Superscript/subscript - render with markers
+                case "sup":
+                    let content = try convertElement(childElement)
+                    result += "^\(content)"
+
+                case "sub":
+                    let content = try convertElement(childElement)
+                    result += "~\(content)"


Superscript and subscript HTML tags are converted to markdown format (^text and ~text), but the MarkdownRenderer does not have logic to render these formats. These will be rendered as plain text without any special styling.

Additionally, the subscript marker ~ could conflict with regular tildes in text (e.g., "~5 minutes ago"). Consider:

Adding rendering support in renderInlineMarkdown() for ^ and ~ markers

Using a more distinctive marker that's less likely to appear in regular text (e.g., ^{text} or HTML-style text)

Suggested change

// Superscript/subscript - render with markers

case "sup":

let content = try convertElement(childElement)

result += "^\(content)"

case "sub":

let content = try convertElement(childElement)

result += "~\(content)"

// Superscript/subscript - render as HTML tags to preserve formatting

case "sup":

let content = try convertElement(childElement)

result += "\(content)"

case "sub":

let content = try convertElement(childElement)

result += "\(content)"

Copilot · 2025-12-01T02:28:18Z

V2er/Sources/RichView/Renderers/MarkdownRenderer.swift

+            // Skip separator row (| --- | --- |)
+            if line.contains("---") {


The table separator detection using line.contains("---") could incorrectly skip rows that legitimately contain three consecutive hyphens in their cell content (e.g., a cell with "2020---2025" or "pre---post").

Consider a more precise check that verifies the line matches the separator pattern more strictly:

if line.range(of: #"^\|\s*---+\s*(\|\s*---+\s*)*\|$"#, options: .regularExpression) != nil {

This ensures the line consists only of pipes and hyphens in the expected separator format.

Suggested change

// Skip separator row (| --- | --- |)

if line.contains("---") {

// Skip separator row (| --- | --- | or with colons for alignment)

if line.range(of: #"^\|\s*(:?-+:?)\s*(\|\s*(:?-+:?)\s*)*\|$"#, options: .regularExpression) != nil {

Copilot · 2025-12-01T02:28:19Z

V2er/Sources/RichView/Converters/HTMLToMarkdownConverter.swift

+                let tagName = cell.tagName().lowercased()
+                if tagName == "th" || tagName == "td" {
+                    let content = try convertElement(cell)
+                        .replacingOccurrences(of: "\n", with: " ")


Table cells containing pipe characters (|) are not escaped, which will break the markdown table structure. For example, a cell containing "option A | option B" would be split into multiple columns.

Consider escaping pipe characters in cell content before building the table:

let content = try convertElement(cell) .replacingOccurrences(of: "\n", with: " ") .replacingOccurrences(of: "|", with: "\\|") // Escape pipes .trimmingCharacters(in: .whitespaces)

Suggested change

.replacingOccurrences(of: "\n", with: " ")

.replacingOccurrences(of: "\n", with: " ")

.replacingOccurrences(of: "|", with: "\\|") // Escape pipes for Markdown tables

Copilot · 2025-12-01T02:28:19Z

V2er/Sources/RichView/Converters/HTMLToMarkdownConverter.swift

+        var headerRowCount = 0
+
+        // Get all rows from thead and tbody
+        let allRows = try element.select("tr")
+
+        for row in allRows {
+            var cells: [String] = []
+            let thCells = try row.select("th")
+            let isHeaderRow = row.parent()?.tagName().lowercased() == "thead"
+                || !thCells.isEmpty()
+
+            // Get th and td cells
+            for cell in row.children() {
+                let tagName = cell.tagName().lowercased()
+                if tagName == "th" || tagName == "td" {
+                    let content = try convertElement(cell)
+                        .replacingOccurrences(of: "\n", with: " ")
+                        .trimmingCharacters(in: .whitespaces)
+                    cells.append(content)
+                }
+            }
+
+            if !cells.isEmpty {
+                rows.append(cells)
+                if isHeaderRow && headerRowCount == 0 {
+                    headerRowCount = 1
+                }


[nitpick] The headerRowCount variable is incremented when a header row is detected but is never used to influence the table conversion logic. The separator is unconditionally added after the first row (line 378) regardless of whether a header was detected.

Consider either:

Removing this unused variable and the isHeaderRow logic if it's not needed

Using it to control separator placement (though markdown tables always require a separator after the first row, so the current behavior may be intentional)

github-actions · 2025-12-01T02:29:35Z

Code Coverage Report ❌

Current coverage: 17.7%

- Preserve HTML tags for underline instead of converting to _text_ to avoid conflict with italic styling - Preserve / HTML tags for superscript/subscript instead of ^/~ markers to avoid conflicts with regular text - Use regex pattern for table separator detection to avoid false positives when cell content contains "---" - Escape pipe characters in table cells to prevent markdown table structure breakage - Remove unused headerRowCount variable and related code - Update tests to reflect new behavior - Add test for pipe escaping in table cells 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

github-actions · 2025-12-01T10:22:56Z

Code Coverage Report ❌

Current coverage: 25.62%

Copilot AI review requested due to automatic review settings December 1, 2025 02:23

github-actions bot added the size/XL label Dec 1, 2025

Copilot started reviewing on behalf of graycreate December 1, 2025 02:24 View session

Copilot finished reviewing on behalf of graycreate December 1, 2025 02:27

Copilot AI reviewed Dec 1, 2025

View reviewed changes

github-actions bot added size/XL and removed size/XL labels Dec 1, 2025

graycreate merged commit 75f5d90 into main Dec 1, 2025
6 checks passed

graycreate deleted the feature/richview-html-tags-support branch December 1, 2025 10:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: add support for additional HTML tags in RichView #78

feat: add support for additional HTML tags in RichView #78

Uh oh!

graycreate commented Dec 1, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Dec 1, 2025

Uh oh!

Copilot AI Dec 1, 2025

Uh oh!

Copilot AI Dec 1, 2025

Uh oh!

Copilot AI Dec 1, 2025

Uh oh!

Copilot AI Dec 1, 2025

Uh oh!

github-actions bot commented Dec 1, 2025

Uh oh!

github-actions bot commented Dec 1, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		// Skip separator row (\| --- \| --- \|)
		if line.contains("---") {

	.replacingOccurrences(of: "\n", with: " ")
	.replacingOccurrences(of: "\n", with: " ")
	.replacingOccurrences(of: "\|", with: "\\\|") // Escape pipes for Markdown tables

feat: add support for additional HTML tags in RichView #78

feat: add support for additional HTML tags in RichView #78

Uh oh!

Conversation

graycreate commented Dec 1, 2025

Summary

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Dec 1, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 1, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 1, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 1, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 1, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Dec 1, 2025

Code Coverage Report ❌

Uh oh!

github-actions bot commented Dec 1, 2025

Code Coverage Report ❌

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants