Merged
Changes from 6 commits
5 changes: 3 additions & 2 deletions tests/norm-rule/expected/test-norm-rules.adoc
@@ -12,13 +12,14 @@
| [Zicsr, ABC] | Rule Instances
| inside inline a| link:test.html#norm:inline[norm:inline]

- .2+| no_tag
- | Normative rule *without* tag/tags | Rule's 'summary' property
+ .3+| no_tag
+ | Normative rule *without* tag/tags and *nested **bold** cases*. | Rule's 'summary' property
| This normative rule has no references to the standard. This should only be used in extraordinary circumstances.
It does include a link to <<table1>> (another normative rule).
Has basic adoc formatting such as *bold*, ita__lics__, `monospace`, 2^superscript^, ~subscript~, [.underline]#underline#,
and &le; (Unicode text for less-than-equals-to) and &#8800; (Unicode decimal value for not-equal-to).
| Rule's 'note' property
+ | Let's try a nested *_bold italics_* case or all 3 *_`bold italic monospace`_* too. | Rule's 'description' property

.1+| inline-with-hash
| includes a hash # symbol. a| link:test.html#norm:inline-with-hash[norm:inline-with-hash]
8 changes: 6 additions & 2 deletions tests/norm-rule/expected/test-norm-rules.html
@@ -130,14 +130,18 @@ <h3>my-chapter_name</h3>
<td><a href="test.html#norm:inline">norm:inline</a></td>
</tr>
<tr>
- <td rowspan=2 id="no_tag">no_tag</td>
- <td>Normative rule <b>without</b> tag/tags</td>
+ <td rowspan=3 id="no_tag">no_tag</td>
+ <td>Normative rule <b>without</b> tag/tags and <b>nested <b>bold</b> cases</b>.</td>
<td>Rule's "summary" property</td>
</tr>
<tr>
<td>This normative rule has no references to the standard. This should only be used in extraordinary circumstances.<br>It does include a link to <a href="#table1">table1</a> (another normative rule).<br>Has basic adoc formatting such as <b>bold</b>, ita<i>lics</i>, <code>monospace</code>, 2<sup>superscript</sup>, <sub>subscript</sub>, <span class="underline">underline</span>,<br>and &#8804; (Unicode text for less-than-equals-to) and &#8800; (Unicode decimal value for not-equal-to).<br></td>
<td>Rule's "note" property</td>
</tr>
+ <tr>
+ <td>Let's try a nested <b><i>bold italics</i></b> case or all 3 <b><i><code>bold italic monospace</code></i></b> too.</td>
+ <td>Rule's "description" property</td>
+ </tr>
<tr>
<td rowspan=1 id="inline-with-hash">inline-with-hash</td>
<td>includes a hash # symbol.</td>
3 changes: 2 additions & 1 deletion tests/norm-rule/expected/test-norm-rules.json
@@ -23,8 +23,9 @@
"name": "no_tag",
"def_filename": "tests/norm-rule/test.yaml",
"chapter_name": "my-chapter_name",
- "summary": "Normative rule *without* tag/tags",
+ "summary": "Normative rule *without* tag/tags and *nested **bold** cases*.",
"note": "This normative rule has no references to the standard. This should only be used in extraordinary circumstances.\nIt does include a link to <<table1>> (another normative rule).\nHas basic adoc formatting such as *bold*, ita__lics__, `monospace`, 2^superscript^, ~subscript~, [.underline]#underline#,\nand &le; (Unicode text for less-than-equals-to) and &#8800; (Unicode decimal value for not-equal-to).\n",
+ "description": "Let's try a nested *_bold italics_* case or all 3 *_`bold italic monospace`_* too.",
"tags": []
},
{
Binary file modified tests/norm-rule/expected/test-norm-rules.xlsx
Binary file not shown.
3 changes: 2 additions & 1 deletion tests/norm-rule/test.yaml
@@ -14,7 +14,8 @@ normative_rule_definitions:
instances: [Zicsr, ABC]
tag: "norm:inline"
- name: no_tag
- summary: Normative rule *without* tag/tags
+ summary: Normative rule *without* tag/tags and *nested **bold** cases*.
+ description: Let's try a nested *_bold italics_* case or all 3 *_`bold italic monospace`_* too.
note: |
This normative rule has no references to the standard. This should only be used in extraordinary circumstances.
It does include a link to <<table1>> (another normative rule).
73 changes: 49 additions & 24 deletions tools/create_normative_rules.rb
@@ -568,16 +568,16 @@ module Adoc2HTML

# Apply constrained formatting pair transformation
# Single delimiter, bounded by whitespace/punctuation
- # Matches: *text*, _text_, ^text^, ~text~
# Example: "That is *strong* stuff!" or "This is *strong*!"
#
# @param text [String] The text to transform
- # @param delimiter [String] The formatting delimiter (e.g., '*', '_', '^', '~')
+ # @param delimiter [String] The formatting delimiter (e.g., '*', '_', '`')
+ # @param recursive [Boolean] Whether to recursively process nested formatting
# @yield [content] Block that transforms the captured content
# @yieldparam content [String] The text between the delimiters
# @yieldreturn [String] The transformed content
# @return [String] The text with formatting applied
- def constrained_format_pattern(text, delimiter, &block)
+ def constrained_format_pattern(text, delimiter, recursive: false, &block)
escaped_delimiter = Regexp.escape(delimiter)
# (?:^|\s) - start of line or space before
# \K - keep assertion (excludes preceding pattern from match)
@@ -586,32 +586,41 @@ def constrained_format_pattern(text, delimiter, &block)
# #{escaped_delimiter} - single closing mark
# (?=[,;".?!\s]|$) - followed by punctuation, space, or end of line
pattern = /(?:^|\s)\K#{escaped_delimiter}(\S(?:(?!\s).*?(?<!\s))?)#{escaped_delimiter}(?=[,;".?!\s]|$)/
- text.gsub(pattern) { block.call($1) }
+ text.gsub(pattern) do
+ content = $1
+ # Recursively process nested formatting if enabled
+ content = convert_nested(content) if recursive
+ block.call(content)
+ end
end
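
Review note: the constrained pattern is easier to reason about in isolation. The sketch below copies the regular expression from this hunk into a standalone helper; `constrained_pattern` and `bold_constrained` are hypothetical names used only for illustration and are not part of `create_normative_rules.rb`.

```ruby
# Standalone copy of the constrained pattern shown in the diff.
def constrained_pattern(delimiter)
  d = Regexp.escape(delimiter)
  /(?:^|\s)\K#{d}(\S(?:(?!\s).*?(?<!\s))?)#{d}(?=[,;".?!\s]|$)/
end

# Hypothetical helper: wraps each constrained *...* span in <b> tags.
def bold_constrained(text)
  text.gsub(constrained_pattern("*")) { "<b>#{$1}</b>" }
end

puts bold_constrained("That is *strong* stuff!") # => That is <b>strong</b> stuff!
puts bold_constrained("This is *strong*!")       # => This is <b>strong</b>!
puts bold_constrained("Sara*h*")                 # => Sara*h* (no whitespace or line start before the opening mark)
```

The last call shows why a separate unconstrained form is needed: a single delimiter in the middle of a word is left untouched.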

# Apply unconstrained formatting pair transformation
# Double delimiter, can be used anywhere
- # Matches: **text**, __text__, ^^text^^, ~~text~~
# Example: "Sara**h**" or "**man**ual"
#
# @param text [String] The text to transform
- # @param delimiter [String] The formatting delimiter (e.g., '*', '_', '^', '~')
+ # @param delimiter [String] The formatting delimiter (e.g., '*', '_', '`')
+ # @param recursive [Boolean] Whether to recursively process nested formatting
# @yield [content] Block that transforms the captured content
# @yieldparam content [String] The text between the delimiters
# @yieldreturn [String] The transformed content
# @return [String] The text with formatting applied
- def unconstrained_format_pattern(text, delimiter, &block)
+ def unconstrained_format_pattern(text, delimiter, recursive: false, &block)
escaped_delimiter = Regexp.escape(delimiter)
# #{escaped_delimiter}{2} - double opening mark
# (.+?) - any text (non-greedy)
# #{escaped_delimiter}{2} - double closing mark
pattern = /#{escaped_delimiter}{2}(.+?)#{escaped_delimiter}{2}/
- text.gsub(pattern) { block.call($1) }
+ text.gsub(pattern) do
+ content = $1
+ # Recursively process nested formatting if enabled
+ content = convert_nested(content) if recursive
+ block.call(content)
+ end
end
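
Review note: a minimal sketch of the unconstrained pattern on its own, using the two examples from the doc comment. The helper names `unconstrained_pattern` and `bold_unconstrained` are assumptions made for this illustration; only the regular expression is taken from the hunk.

```ruby
# Standalone copy of the unconstrained (double-delimiter) pattern from the diff.
def unconstrained_pattern(delimiter)
  d = Regexp.escape(delimiter)
  /#{d}{2}(.+?)#{d}{2}/
end

# Hypothetical helper: wraps each **...** span in <b> tags.
def bold_unconstrained(text)
  text.gsub(unconstrained_pattern("*")) { "<b>#{$1}</b>" }
end

puts bold_unconstrained("Sara**h**")  # => Sara<b>h</b>
puts bold_unconstrained("**man**ual") # => <b>man</b>ual
puts bold_unconstrained("*x*")        # => *x* (single marks are left for the constrained pass)
```

Because there is no boundary requirement, this form matches inside words, which the constrained form does not.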

# Apply superscript/subscript formatting transformation
# Single delimiter, can be used anywhere, but text must be continuous (no spaces)
- # Matches: ^text^, ~text~ where text contains no spaces
# Example: "2^32^" or "X~i~"
#
# @param text [String] The text to transform
@@ -625,26 +634,43 @@ def continuous_format_pattern(text, delimiter, &block)
# #{escaped_delimiter} - single opening mark
# (\S+?) - continuous non-space text (no spaces allowed)
# #{escaped_delimiter} - single closing mark
+ # Note: Superscript/subscript don't support nesting in AsciiDoc
pattern = /#{escaped_delimiter}(\S+?)#{escaped_delimiter}/
text.gsub(pattern) { block.call($1) }
end
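
Review note: the continuous pattern can be exercised the same way. In this sketch the `tag` parameter is a simplification introduced for illustration; the method in the diff takes a block instead.

```ruby
# Standalone sketch of the continuous (no-spaces) pattern from the diff.
# `tag` is a hypothetical parameter; the real method yields to a block.
def continuous(text, delimiter, tag)
  d = Regexp.escape(delimiter)
  text.gsub(/#{d}(\S+?)#{d}/) { "<#{tag}>#{$1}</#{tag}>" }
end

puts continuous("2^32^", "^", "sup")   # => 2<sup>32</sup>
puts continuous("X~i~", "~", "sub")    # => X<sub>i</sub>
puts continuous("a ^b c^", "^", "sup") # => a ^b c^ (content with a space does not match)
```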

- # Convert bold notation: *foo* -> <b>foo</b>
- def convert_bold(text)
- text = constrained_format_pattern(text, "*") { |content| "<b>#{content}</b>" }
- text = unconstrained_format_pattern(text, "*") { |content| "<b>#{content}</b>" }
+ # Convert formatting within already-captured content.
+ # This processes unconstrained (double delimiters) first, then constrained (single delimiters),
+ # which is an order based on delimiter type, not on innermost-to-outermost nesting.
+ def convert_nested(text)
+ result = text.dup
+ # Process unconstrained first (double delimiters)
+ result = unconstrained_format_pattern(result, "*", recursive: true) { |content| "<b>#{content}</b>" }
+ result = unconstrained_format_pattern(result, "_", recursive: true) { |content| "<i>#{content}</i>" }
+ result = unconstrained_format_pattern(result, "`", recursive: true) { |content| "<code>#{content}</code>" }
+ # Then process constrained (single delimiters)
+ result = constrained_format_pattern(result, "*", recursive: true) { |content| "<b>#{content}</b>" }
+ result = constrained_format_pattern(result, "_", recursive: true) { |content| "<i>#{content}</i>" }
+ result = constrained_format_pattern(result, "`", recursive: true) { |content| "<code>#{content}</code>" }
+ result
end
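
Review note: to check how the recursion unwinds, here is a self-contained sketch that mirrors the shape of `convert_nested` and the two pattern helpers with recursion always on. The module name `NestedSketch` and the shortened method names are hypothetical. The input is the `description` string added to `test.yaml`, and the expected result is the cell text in the expected HTML fixture.

```ruby
# Hypothetical standalone module; mirrors the diff but is not the project's file.
module NestedSketch
  module_function

  # Constrained pair: single delimiter bounded by whitespace/punctuation.
  def constrained(text, delimiter, &block)
    d = Regexp.escape(delimiter)
    pattern = /(?:^|\s)\K#{d}(\S(?:(?!\s).*?(?<!\s))?)#{d}(?=[,;".?!\s]|$)/
    text.gsub(pattern) { block.call(convert_nested($1)) }
  end

  # Unconstrained pair: double delimiter, allowed anywhere.
  def unconstrained(text, delimiter, &block)
    d = Regexp.escape(delimiter)
    text.gsub(/#{d}{2}(.+?)#{d}{2}/) { block.call(convert_nested($1)) }
  end

  # Unconstrained delimiters first, then constrained, as in the diff.
  def convert_nested(text)
    result = text.dup
    result = unconstrained(result, "*") { |c| "<b>#{c}</b>" }
    result = unconstrained(result, "_") { |c| "<i>#{c}</i>" }
    result = unconstrained(result, "`") { |c| "<code>#{c}</code>" }
    result = constrained(result, "*") { |c| "<b>#{c}</b>" }
    result = constrained(result, "_") { |c| "<i>#{c}</i>" }
    constrained(result, "`") { |c| "<code>#{c}</code>" }
  end
end

input = "Let's try a nested *_bold italics_* case or all 3 *_`bold italic monospace`_* too."
puts NestedSketch.convert_nested(input)
# => Let's try a nested <b><i>bold italics</i></b> case or all 3 <b><i><code>bold italic monospace</code></i></b> too.
```

Each recursive call receives only the captured content, which is strictly shorter than the text that matched, so the recursion terminates.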

- # Convert italics notation: _bar_ -> <i>bar</i>
- def convert_italics(text)
- text = constrained_format_pattern(text, "_") { |content| "<i>#{content}</i>" }
- text = unconstrained_format_pattern(text, "_") { |content| "<i>#{content}</i>" }
+ # Convert unconstrained bold, italics, monospaces notation.
+ # For example, **foo**bar -> <b>foo</b>bar
+ # Supports nesting when recursive: true
+ def convert_unconstrained(text)
+ text = unconstrained_format_pattern(text, "*", recursive: true) { |content| "<b>#{content}</b>" }
+ text = unconstrained_format_pattern(text, "_", recursive: true) { |content| "<i>#{content}</i>" }
+ unconstrained_format_pattern(text, "`", recursive: true) { |content| "<code>#{content}</code>" }
end

- # Convert monospace notation: `zort` -> <code>zort</code>
- def convert_monospace(text)
- text = constrained_format_pattern(text, "`") { |content| "<code>#{content}</code>" }
- text = unconstrained_format_pattern(text, "`") { |content| "<code>#{content}</code>" }
+ # Convert constrained bold, italics, monospaces notation.
+ # For example, *foo* -> <b>foo</b>
+ # Supports nesting when recursive: true
+ def convert_constrained(text)
+ text = constrained_format_pattern(text, "*", recursive: true) { |content| "<b>#{content}</b>" }
+ text = constrained_format_pattern(text, "_", recursive: true) { |content| "<i>#{content}</i>" }
+ constrained_format_pattern(text, "`", recursive: true) { |content| "<code>#{content}</code>" }
end

# Convert superscript notation: 2^32^ -> 2<sup>32</sup>
@@ -729,9 +755,8 @@ def convert_unicode_names(text)
# Apply all format conversions (keeping numeric entities).
def convert(text)
result = text.dup
- result = convert_bold(result)
- result = convert_italics(result)
- result = convert_monospace(result)
+ result = convert_unconstrained(result)
+ result = convert_constrained(result)
result = convert_superscript(result)
result = convert_subscript(result)
result = convert_underline(result)
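
Review note: the new pass order in `convert` (all unconstrained delimiters, then all constrained ones) can be checked against the `summary` fixture with a compact sketch. This is a table-driven simplification, not the project's code: `TAGS`, `wrap`, `unconstrained_pass`, `constrained_pass`, and `convert_emphasis` are hypothetical names, and the superscript, subscript, and underline steps are omitted.

```ruby
# Delimiter-to-tag table (insertion order is the processing order).
TAGS = { "*" => "b", "_" => "i", "`" => "code" }.freeze

def wrap(tag, content)
  "<#{tag}>#{content}</#{tag}>"
end

# Pass 1: double delimiters, allowed anywhere in the text.
def unconstrained_pass(text)
  TAGS.reduce(text) do |acc, (delim, tag)|
    d = Regexp.escape(delim)
    acc.gsub(/#{d}{2}(.+?)#{d}{2}/) { wrap(tag, convert_emphasis($1)) }
  end
end

# Pass 2: single delimiters bounded by whitespace/punctuation.
def constrained_pass(text)
  TAGS.reduce(text) do |acc, (delim, tag)|
    d = Regexp.escape(delim)
    pattern = /(?:^|\s)\K#{d}(\S(?:(?!\s).*?(?<!\s))?)#{d}(?=[,;".?!\s]|$)/
    acc.gsub(pattern) { wrap(tag, convert_emphasis($1)) }
  end
end

# Unconstrained first, then constrained; captured content is processed recursively.
def convert_emphasis(text)
  constrained_pass(unconstrained_pass(text))
end

puts convert_emphasis("Normative rule *without* tag/tags and *nested **bold** cases*.")
# => Normative rule <b>without</b> tag/tags and <b>nested <b>bold</b> cases</b>.
```

In this sketch the `**bold**` span is consumed by the first pass, so the single-`*` pattern in the second pass only ever sees the outer pair.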