Skip to content

Convert adoc tables to html tables#138

Merged
james-ball-qualcomm merged 1 commit intomainfrom
132-convert-adoc-like-table-syntax
Nov 27, 2025
Merged

Convert adoc tables to html tables#138
james-ball-qualcomm merged 1 commit intomainfrom
132-convert-adoc-like-table-syntax

Conversation

@james-ball-qualcomm
Copy link
Collaborator

No description provided.

Signed-off-by: James Ball <jameball@qti.qualcomm.com>
@james-ball-qualcomm james-ball-qualcomm merged commit 22a17b0 into main Nov 27, 2025
6 checks passed
@james-ball-qualcomm james-ball-qualcomm deleted the 132-convert-adoc-like-table-syntax branch November 27, 2025 20:04
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR converts AsciiDoc tables embedded in normative rule tags from a custom text format to proper HTML tables. The conversion handles tables with optional headers and truncates large tables to a maximum of 12 rows. The implementation modifies the tags backend to prefix table cells with pipes and introduces new conversion functions to parse and transform these tables into HTML.

  • Changes the "tags" Asciidoctor backend to prefix table cells with | instead of separating them
  • Adds convert_tags_tables_to_html() function to parse the pipe-prefixed format and generate HTML tables
  • Renames html_convert_newlines() to convert_newlines_to_html() for consistency
  • Fixes test data typo (Name7 paired with Color7 instead of Color8)

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.

File Description
tools/create_normative_rules.rb Adds table-to-HTML conversion functions, renames newline conversion function, introduces MAX_TABLE_ROWS constant
converters/tags.rb Updates table cell formatting to prefix with pipes instead of separating
tests/norm-rule/test.adoc Fixes test data inconsistency and updates comment for clarity
tests/norm-rule/expected/. Updates expected outputs to reflect new HTML table format

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1214 to +1230
# This pattern matches strings that:
# - Start with a non-pipe, non-whitespace character
# - Then contain zero or more non-pipe characters (can include internal spaces)
# - End with a non-pipe, non-whitespace character
#
# All leading/trailing whitespace is removed.
#
# Examples:
# "| H1 | H2".scan(/[^|\s][^|]*[^|\s]/)
# => ["H1", "H2"]
#
# "| ABC | DEF GHI |".scan(/[^|\s][^|]*[^|\s]/)
# => ["ABC", "DEF GHI"] # Note: internal space preserved
#
# "| Name | Value |".scan(/[^|\s][^|]*[^|\s]/)
# => ["Name", "Value"] # Leading/trailing spaces removed
text.scan(/[^|\s][^|]*[^|\s]/)
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern /[^|\s][^|]*[^|\s]/ requires at least two non-whitespace, non-pipe characters, which means it will fail to match single-character cell content. For example, a cell containing just "A" or "1" would not be extracted.

To fix this, the pattern should be changed to /[^|\s]([^|]*[^|\s])?/ to match either:

  • A single non-pipe, non-whitespace character, OR
  • A non-pipe, non-whitespace character followed by zero or more non-pipe characters, ending with a non-pipe, non-whitespace character

Alternatively, use a simpler approach: /\|([^|]*)/ and then trim whitespace from each captured group.

Suggested change
# This pattern matches strings that:
# - Start with a non-pipe, non-whitespace character
# - Then contain zero or more non-pipe characters (can include internal spaces)
# - End with a non-pipe, non-whitespace character
#
# All leading/trailing whitespace is removed.
#
# Examples:
# "| H1 | H2".scan(/[^|\s][^|]*[^|\s]/)
# => ["H1", "H2"]
#
# "| ABC | DEF GHI |".scan(/[^|\s][^|]*[^|\s]/)
# => ["ABC", "DEF GHI"] # Note: internal space preserved
#
# "| Name | Value |".scan(/[^|\s][^|]*[^|\s]/)
# => ["Name", "Value"] # Leading/trailing spaces removed
text.scan(/[^|\s][^|]*[^|\s]/)
# This pattern matches everything between pipes, including single-character and empty cells.
# All leading/trailing whitespace is removed from each cell.
#
# Examples:
# "| H1 | H2".scan(/\|([^|]*)/)
# => [["H1 "], [" H2"]]
# .map(&:strip) => ["H1", "H2"]
#
# "| A | 1 |".scan(/\|([^|]*)/)
# => [["A "], [" 1"]]
# .map(&:strip) => ["A", "1"]
#
# "| Name | Value |".scan(/\|([^|]*)/)
# => [[" Name "], [" Value "]]
# .map(&:strip) => ["Name", "Value"]
text.scan(/\|([^|]*)/).map { |m| m[0].strip }

Copilot uses AI. Check for mistakes.
Comment on lines +1168 to +1204
ret = text # Default to input

text.gsub(/(.*?)===\n(.+)\n===/m) do
# Found a "tags" formatted table
heading = $1.chomp # Remove trailing newline
rows = $2.split("\n") # Split into array of rows

ret = "<table>".dup # Start html table

# Add heading if present
heading_cells = extract_tags_table_cells(heading)
unless heading_cells.empty?
ret << "<thead>"
ret << "<tr>"
ret << heading_cells.map { |cell| "<th>#{cell}</th>" }.join("")
ret << "</tr>"
ret << "</thead>"
end

# Add each row
ret << "<tbody>"
rows.each_with_index do |row,index|
if index < MAX_TABLE_ROWS
ret << "<tr>"
row_cells = extract_tags_table_cells(row)
ret << row_cells.map { |cell| "<td>#{cell}</td>" }.join("")
ret << "</tr>"
elsif index == MAX_TABLE_ROWS
ret << "<tr>"
row_cells = extract_tags_table_cells(row)
ret << row_cells.map { |cell| "<td>...</td>" }.join("")
ret << "</tr>"
end
end

ret << "</tbody>"
ret << "</table>" # End html table
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function does not return the result of gsub. The variable ret is assigned inside the gsub block (line 1175), but this assignment is scoped to the block. The function returns the original ret (initialized to text on line 1168), not the transformed HTML table.

To fix this, assign the result of gsub to ret:

ret = text.gsub(/(.*?)===\n(.+)\n===/m) do
  html = "<table>".dup
  # ... build up html instead of ret ...
  html
end
return ret
Suggested change
ret = text # Default to input
text.gsub(/(.*?)===\n(.+)\n===/m) do
# Found a "tags" formatted table
heading = $1.chomp # Remove trailing newline
rows = $2.split("\n") # Split into array of rows
ret = "<table>".dup # Start html table
# Add heading if present
heading_cells = extract_tags_table_cells(heading)
unless heading_cells.empty?
ret << "<thead>"
ret << "<tr>"
ret << heading_cells.map { |cell| "<th>#{cell}</th>" }.join("")
ret << "</tr>"
ret << "</thead>"
end
# Add each row
ret << "<tbody>"
rows.each_with_index do |row,index|
if index < MAX_TABLE_ROWS
ret << "<tr>"
row_cells = extract_tags_table_cells(row)
ret << row_cells.map { |cell| "<td>#{cell}</td>" }.join("")
ret << "</tr>"
elsif index == MAX_TABLE_ROWS
ret << "<tr>"
row_cells = extract_tags_table_cells(row)
ret << row_cells.map { |cell| "<td>...</td>" }.join("")
ret << "</tr>"
end
end
ret << "</tbody>"
ret << "</table>" # End html table
ret = text.gsub(/(.*?)===\n(.+)\n===/m) do
# Found a "tags" formatted table
heading = $1.chomp # Remove trailing newline
rows = $2.split("\n") # Split into array of rows
html = "<table>".dup # Start html table
# Add heading if present
heading_cells = extract_tags_table_cells(heading)
unless heading_cells.empty?
html << "<thead>"
html << "<tr>"
html << heading_cells.map { |cell| "<th>#{cell}</th>" }.join("")
html << "</tr>"
html << "</thead>"
end
# Add each row
html << "<tbody>"
rows.each_with_index do |row,index|
if index < MAX_TABLE_ROWS
html << "<tr>"
row_cells = extract_tags_table_cells(row)
html << row_cells.map { |cell| "<td>#{cell}</td>" }.join("")
html << "</tr>"
elsif index == MAX_TABLE_ROWS
html << "<tr>"
row_cells = extract_tags_table_cells(row)
html << row_cells.map { |cell| "<td>...</td>" }.join("")
html << "</tr>"
end
end
html << "</tbody>"
html << "</table>" # End html table
html

Copilot uses AI. Check for mistakes.
elsif index == MAX_TABLE_ROWS
ret << "<tr>"
row_cells = extract_tags_table_cells(row)
ret << row_cells.map { |cell| "<td>...</td>" }.join("")
Copy link

Copilot AI Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On line 1197, row_cells is extracted but then not used. Instead, line 1198 creates ... for every cell, which means the number of ... placeholders will match the column count. However, this defeats the purpose of extracting row_cells on line 1197.

The code should either:

  1. Remove line 1197 entirely since row_cells is not used, OR
  2. Use row_cells.size to determine how many ... cells to create

Currently, the unused variable may cause confusion about the code's intent.

Suggested change
ret << row_cells.map { |cell| "<td>...</td>" }.join("")
ret << row_cells.size.times.map { "<td>...</td>" }.join("")

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Convert adoc-like table syntax emitted by tags backend to HTML table

2 participants