Inline: Escaped characters stripped vs preserved at word boundaries #278

@nlopes

Problem

acdc and asciidoctor handle backslash escapes differently at word boundaries.

Reproduction

Text with \^caret at word boundary.
Also \*asterisk at boundary.
Mid-word es\*cape test.

acdc output:

<p>Text with ^caret at word boundary.</p>
<p>Also *asterisk at boundary.</p>
<p>Mid-word es*cape test.</p>

asciidoctor output:

<p>Text with \^caret at word boundary.</p>
<p>Also \*asterisk at boundary.</p>
<p>Mid-word es\*cape test.</p>

Analysis

  • acdc: Always strips the backslash, outputs the literal character
  • asciidoctor: Keeps the backslash when it is not actually escaping anything; in the reproduction above that covers all three lines, including the mid-word case

The difference is subtle: when \* is not actually preventing formatting (*text with no closing * would not be bold anyway), asciidoctor leaves the backslash in the output as a literal character.

Expected behavior

Match asciidoctor: strip the backslash only when it actually prevents a formatting interpretation.

Implementation notes

This requires changes to inline processing in acdc-parser/src/grammar/inline_processing.rs or the escape handling in converters/core/src/substitutions.rs.

The logic needs to check: "Would this character sequence have been interpreted as formatting if the backslash wasn't there?" Only strip the backslash if yes.
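A minimal sketch of that check, to make the idea concrete. This is not acdc's or asciidoctor's actual code: the names `MARKERS`, `would_format` and `process_escapes` are hypothetical, and "would be formatting" is reduced to a simplified rule (the same marker appears again later, with content between the two that neither starts nor ends with whitespace). The real constrained/unconstrained matching rules are stricter and would replace `would_format`.

```rust
/// Characters treated as inline formatting markers in this sketch (an assumption,
/// not the full set either processor recognises).
const MARKERS: [char; 6] = ['*', '_', '`', '#', '^', '~'];

/// Simplified stand-in for "would this have been formatting without the backslash?".
/// `rest` is the text right after the opening marker. Returns true when the same
/// marker closes a non-empty span that neither starts nor ends with whitespace.
fn would_format(rest: &str, marker: char) -> bool {
    match rest.find(marker) {
        Some(end) => {
            let inner = &rest[..end];
            !inner.is_empty()
                && !inner.starts_with(char::is_whitespace)
                && !inner.ends_with(char::is_whitespace)
        }
        None => false,
    }
}

/// Drops a backslash only when it sits in front of a marker that would otherwise
/// have opened a formatting span; every other backslash is kept verbatim.
pub fn process_escapes(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.char_indices().peekable();
    while let Some((_, c)) = chars.next() {
        if c == '\\' {
            if let Some(&(j, next)) = chars.peek() {
                let after = &input[j + next.len_utf8()..];
                if MARKERS.contains(&next) && would_format(after, next) {
                    // The escape is doing real work: drop the backslash and let
                    // the marker through as a literal on the next iteration.
                    continue;
                }
            }
        }
        out.push(c);
    }
    out
}
```

Under these assumptions, the three reproduction lines come back unchanged (backslash preserved, as in the asciidoctor output), while an input such as `\*bold* text` loses its backslash and yields the literal `*bold* text`.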

Workaround

Use an inline passthrough so both processors emit the same text: pass:[*] for a bare asterisk, or pass:[\*] to keep the backslash.

Priority

Low - edge case, passthrough workaround available.
