
sushi compose markdown processing fix#79

Open
Bhavesh2404 wants to merge 1 commit into dev from markdown_processing_fix


@Bhavesh2404 Bhavesh2404 commented Dec 26, 2025

Problem: Markdown processing does not work for nested syntax such as {<bold-300|blue-500|text>}

Markdown Analysis: Old View System vs New Compose System

OLD VIEW SYSTEM (MarkdownParser.java)

Architecture: TWO-PHASE Processing

Phase 1: Stack-Based Delimiter Matching (getPreProcessedSpannableBuilder)

  • Processes: **bold**, _italic_, ~~strikethrough~~
  • Uses a stack to match opening/closing delimiters
  • Runs first, converts delimiters to spans
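
To make the stack-based phase concrete, here is a minimal sketch of delimiter matching on a plain string. It is an illustration only, not the code in getPreProcessedSpannableBuilder: the names Span, Parsed, and matchDelimiters are hypothetical, styles are plain labels, and unmatched delimiters are simply dropped.

```kotlin
// Hypothetical sketch, not the actual MarkdownParser.java implementation.
data class Span(val start: Int, val end: Int, val style: String)

data class Parsed(val text: String, val spans: List<Span>)

// Longer delimiters come first so "**" is not mistaken for something shorter.
private val DELIMITERS = listOf("**" to "bold", "~~" to "strikethrough", "_" to "italic")

fun matchDelimiters(input: String): Parsed {
    val out = StringBuilder()
    val spans = mutableListOf<Span>()
    // Each entry holds an open delimiter and the output offset where its content starts.
    val stack = ArrayDeque<Pair<String, Int>>()
    var i = 0
    while (i < input.length) {
        val delimiter = DELIMITERS.firstOrNull { input.startsWith(it.first, i) }
        if (delimiter == null) {
            out.append(input[i])
            i++
            continue
        }
        val (token, style) = delimiter
        if (stack.isNotEmpty() && stack.last().first == token) {
            // Same delimiter is on top of the stack: close it and record the span.
            val (_, start) = stack.removeLast()
            spans += Span(start, out.length, style)
        } else {
            // Otherwise treat it as an opening delimiter.
            stack.addLast(token to out.length)
        }
        i += token.length
    }
    // Delimiters still on the stack were never closed; this sketch drops them.
    return Parsed(out.toString(), spans)
}

fun main() {
    println(matchDelimiters("a **b** _c_"))
    println(matchDelimiters("**a _b_**"))
}
```

Because the delimiters are consumed here, the regex phase that follows only ever sees text without **, _, or ~~ markers.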

Phase 2: Regex-Based Transformation (parseRegexTransformation)

  • Processes complex patterns with regex
  • Critical Order (lines 167-172):
    1. FontWeightProcessor - <weight-size|text> ← Runs FIRST
    2. ForeGroundColorProcessor - {color|text} ← Runs SECOND
    3. BackgroundColorProcessor - {bg:color|text}

Why {<bold-300|blue-500|text>} WORKS in old system:

Input: "{<bold-300|blue-500|You saved>}"

Step 1: FontWeightProcessor runs FIRST
- Regex: (\<)(.+?)(\|)(.+?)(\>)
- Matches: <bold-300|blue-500|You saved>
- TEXT_GROUP (4): "blue-500|You saved"  ← Extracts EVERYTHING after |
- Applies: bold-300 span
- Output: "{blue-500|You saved}"

Step 2: ForeGroundColorProcessor runs SECOND
- Regex: (\{)(.+?)(\|)(.+?)(\})
- Matches: {blue-500|You saved}
- COLOR_GROUP (2): "blue-500"
- TEXT_GROUP (4): "You saved"
- Applies: blue-500 color
- Output: "You saved" (with bold + color)

NEW COMPOSE SYSTEM (MarkdownParser.kt)

Architecture: SINGLE-PHASE Sequential Processing

All processors run in a fixed chain (lines 38-47):

  1. BoldProcessor - **text**
  2. ItalicProcessor - _text_
  3. StrikethroughProcessor - ~~text~~
  4. ColorProcessor - {color|text} ← Runs FIRST (among complex)
  5. FontWeightProcessor - <weight-size|text> ← Runs SECOND
  6. LinkProcessor - [text](url)
  7. UnderlineAnnotaterProcessor - <u>text<u>

Why {<bold-300|blue-500|text>} FAILS in new system:

Input: "{<bold-300|blue-500|You saved>}"

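Step 1: ColorProcessor runs FIRST
- Regex: (\{)(.+?)(\|)((.|\\n)+?)(\})
- Matches: {<bold-300|blue-500|You saved>}
- COLOR_GROUP (2): "<bold-300"  ← Lazy match stops at the first |; invalid color name!
- TEXT_GROUP (4): "blue-500|You saved>"
- parseColor("<bold-300") fails
- Transformation skipped
- Output: "{<bold-300|blue-500|You saved>}" (unchanged)

Step 2: FontWeightProcessor runs too late
- A regex search is not blocked by the surrounding {}, so <bold-300|blue-500|You saved> can still match
- If it does, bold-300 is applied and the text becomes "{blue-500|You saved}"
- ColorProcessor has already run, so the remaining {color|text} wrapper is never processed
- Output: color is not applied and raw markup stays visible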

Key Architectural Differences

Aspect                Old View System                     New Compose System
Processing Model      Two-phase (delimiter, then regex)   Single-phase sequential
Font vs Color Order   Font FIRST, Color SECOND            Color FIRST, Font SECOND
Regex Complexity      .+? (non-greedy)                    ((.|\\n)+?) (multiline support)
Nested Syntax         {<font|color|text>} works           <font|{color|text}> works
Error Handling        Silent failures                     kotlin.runCatching
Platform              Android View (Java/Kotlin)          Compose Multiplatform (Kotlin)

Solution Options

Since the backend response cannot be changed, here are three approaches:

Option 1: Swap Processor Order (Recommended)

Change MarkdownParser.kt to match old system's order:

MarkdownParser.Builder()
    .processor(BoldProcessor())
    .processor(ItalicProcessor())
    .processor(StrikethroughProcessor())
    .processor(FontWeightProcessor())  // Move BEFORE ColorProcessor
    .processor(ColorProcessor())       // Move AFTER FontWeightProcessor
    .processor(LinkProcessor())
    .processor(UnderlineAnnotaterProcessor())
    .build()

Pros: Maintains backward compatibility with existing markdown
Cons: May break any new code expecting current order
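
The effect of the order can be checked with a string-level model of the chain. This is a sketch under stated assumptions, not the real processors: Styled, wrapperProcessor, and parse are hypothetical names, styles are tracked as plain labels for the whole string rather than as spans, a spec counts as valid when it looks like name-number, and the regexes are the ones quoted in this description with \n in place of the escaped \\n.

```kotlin
// Hypothetical string-level model; the real processors operate on AnnotatedString.
data class Styled(val text: String, val styles: List<String> = emptyList())

typealias Processor = (Styled) -> Styled

private val FONT = Regex("""(<)(.+?)(\|)((.|\n)+?)(>)""")
private val COLOR = Regex("""(\{)(.+?)(\|)((.|\n)+?)(\})""")

// Assumption: a valid spec looks like "bold-300" or "blue-500".
private val SPEC = Regex("""[a-z]+-\d+""")

fun wrapperProcessor(regex: Regex, label: String): Processor = { input ->
    val applied = mutableListOf<String>()
    val text = regex.replace(input.text) { match ->
        val spec = match.groupValues[2]
        if (SPEC.matches(spec)) {
            applied += "$label:$spec"
            match.groupValues[4]   // unwrap: keep only the text group
        } else {
            match.value            // invalid spec: leave the match untouched
        }
    }
    Styled(text, input.styles + applied)
}

val fontProcessor = wrapperProcessor(FONT, "font")
val colorProcessor = wrapperProcessor(COLOR, "color")

// Runs the processors strictly in list order, once each.
fun parse(input: String, chain: List<Processor>): Styled =
    chain.fold(Styled(input)) { acc, processor -> processor(acc) }

fun main() {
    val backendText = "{<bold-300|blue-500|You saved>}"
    val inverseNesting = "<bold-300|{blue-500|You saved}>"

    println(parse(backendText, listOf(fontProcessor, colorProcessor)))
    println(parse(backendText, listOf(colorProcessor, fontProcessor)))
    println(parse(inverseNesting, listOf(fontProcessor, colorProcessor)))
}
```

In this model the font-first order resolves both {<font|color|text>} and <font|{color|text}>, which suggests the swap need not break text written for the current order. That would still need to be confirmed against the real processors.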

Option 2: Syntax Preprocessing Layer

Add a preprocessor that converts old syntax to new:

fun preprocessOldSyntax(text: String): String {
    // Convert {<weight-size|color|text>} → <weight-size|{color|text}>
    return text.replace(
        Regex("""\{<([^|]+)\|([^|]+)\|([^>]+)>\}""")
    ) { match ->
        val weight = match.groupValues[1]
        val color = match.groupValues[2]
        // Named "content" so it does not shadow the "text" parameter
        val content = match.groupValues[3]
        "<$weight|{$color|$content}>"
    }
}

Pros: No processor order changes, isolated fix
Cons: Extra processing overhead, may miss edge cases
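
A quick way to see what the preprocessor covers and what it misses is to run it on a few inputs. The function below repeats the Option 2 regex so the snippet is self-contained; the sample strings other than the backend example are made up for illustration.

```kotlin
fun preprocessOldSyntax(text: String): String =
    // Convert {<weight-size|color|text>} → <weight-size|{color|text}>
    text.replace(Regex("""\{<([^|]+)\|([^|]+)\|([^>]+)>\}""")) { match ->
        val weight = match.groupValues[1]
        val color = match.groupValues[2]
        val content = match.groupValues[3]
        "<$weight|{$color|$content}>"
    }

fun main() {
    // Backend example: both tokens on the line are rewritten.
    println(preprocessOldSyntax(
        "{<bold-300|blue-500|You saved ₹220, including ₹41 with>} {<bold-300|cider-500|Gold>}"
    ))

    // A plain {color|text} token is left alone.
    println(preprocessOldSyntax("{blue-500|Gold}"))

    // Edge case: a '>' inside the text stops [^>]+ early, so nothing is rewritten.
    println(preprocessOldSyntax("{<bold-300|blue-500|a > b>}"))
}
```

The last call illustrates the "may miss edge cases" concern: any text containing > is silently left in the old syntax.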

Option 3: Smart Nested Pattern Detection

Enhance ColorProcessor to detect and skip nested patterns:

override fun process(props: MarkdownParserProps, src: AnnotatedString): AnnotatedString {
    var result = src

    REGEX.findAll(src).forEach { matchResult ->
        val colorGroup = matchResult.groups[COLOR_GROUP]?.value

        // Skip if the color group contains an angle bracket (nested font syntax)
        if (colorGroup?.contains('<') == true) {
            return@forEach  // Skip this match
        }

        // Process normally and update result...
    }

    return result
}

Pros: Surgical fix, no global changes
Cons: Complex logic, may affect performance; skipping alone does not apply the color unless a color pass also runs after FontWeightProcessor

Recommended Approach

Option 1 (Swap Processor Order) is the best solution because:

  1. Maintains backward compatibility with all existing markdown from backend
  2. Minimal code changes - just reorder processors
  3. Matches proven behavior of old View system
  4. No performance overhead compared to preprocessing
  5. Lower risk of missing edge cases than regex preprocessing

The Problem in Detail

Your backend sends: "{<bold-300|blue-500|You saved ₹220, including ₹41 with>} {<bold-300|cider-500|Gold>}"

This has nested syntax: outer curly braces {} wrapping inner angle brackets <>

Current (Broken) Processing Flow

Processor Order:

  1. BoldProcessor (**text**)
  2. ItalicProcessor (_text_)
  3. StrikethroughProcessor (~~text~~)
  4. ColorProcessor ({color|text}) ← Problem starts here
  5. FontWeightProcessor (<weight-size|text>)
  6. LinkProcessor
  7. UnderlineAnnotaterProcessor

Execution Step-by-Step:

Input Text:

"{<bold-300|blue-500|You saved>}"

Step 1-3: BoldProcessor, ItalicProcessor, StrikethroughProcessor

  • Look for **, _, ~~ patterns
  • None found
  • Text unchanged: "{<bold-300|blue-500|You saved>}"

Step 4: ColorProcessor runs

ColorProcessor regex: (\{)(.+?)(\|)((.|\\n)+?)(\})

This matches:

  • Opening brace: {
  • COLOR_GROUP (group 2): <bold-300 ← PROBLEM!
  • Pipe separator: |
  • TEXT_GROUP (group 4): blue-500|You saved>
  • Closing brace: }

The regex captures <bold-300 as the "color" parameter because the lazy (.+?) takes everything between { and the first |. The rest, up to the closing }, lands in TEXT_GROUP.

ColorProcessor calls parseColor("<bold-300"):

// In ColorProcessor.kt parseColor() method
private fun parseColor(color: String): Color? {
    // Check for direct color name
    var parsedColor: ColorSpec? = ColorName.fromColorName(color)
    // ColorName.fromColorName("<bold-300") → null
    
    if (color.contains("-")) {
        // Split by "-"
        val colorObjectString = color.split("-")
        // Result: ["<bold", "300"]
        
        // Try to parse as "colorName-variation"
        if (colorObjectString.size == 2) {
            val name = ColorName.fromColorName("<bold")  // null
            val tint = ColorVariation.fromInt(300)        // valid
            // name is null, so no color is produced
        }
        
        // Try to parse as "colorName-variation-alpha"
        if (colorObjectString.size == 3) {
            // Not reached: the split produced only two parts
        }
    }
    
    return null  // Parsing failed!
}

Since parsing fails, the transformation is skipped. Text remains: "{<bold-300|blue-500|You saved>}"
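
The parsing rule can be sketched without the design-system types. Everything here is a stand-in: this ColorName enum, ParsedColor, and the three sample colors are hypothetical, and the real parseColor resolves against SushiTheme.colors rather than returning a data class.

```kotlin
// Hypothetical stand-ins for the design-system types named in this description.
enum class ColorName {
    BLUE, CIDER, RED;

    companion object {
        fun fromColorName(name: String): ColorName? =
            values().firstOrNull { it.name.equals(name, ignoreCase = true) }
    }
}

data class ParsedColor(val name: ColorName, val variation: Int? = null, val alpha: Int? = null)

// Accepts "name", "name-variation", or "name-variation-alpha"; anything else yields null.
fun parseColor(spec: String): ParsedColor? {
    val parts = spec.split("-")
    if (parts.size > 3) return null
    val name = ColorName.fromColorName(parts[0]) ?: return null
    val variation = parts.getOrNull(1)?.let { it.toIntOrNull() ?: return null }
    val alpha = parts.getOrNull(2)?.let { it.toIntOrNull() ?: return null }
    return ParsedColor(name, variation, alpha)
}

fun main() {
    println(parseColor("blue-500"))   // a valid name-variation spec
    println(parseColor("<bold-300"))  // null: "<bold" is not a color name
}
```

Any spec that starts with < fails at the name lookup, so a color group polluted by font syntax is rejected no matter how the rest of it splits.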

Step 5: FontWeightProcessor runs

FontWeightProcessor regex: (\<)(.+?)(\|)((.|\\n)+?)(\>)

This looks for pattern: <...>

In "{<bold-300|blue-500|You saved>}", the angle brackets sit inside the curly braces. A regex search is not blocked by surrounding characters, so, assuming the processor uses a find-style search (as the REGEX.findAll call in Option 3 suggests), the pattern can still match:

  • FONT_GROUP (group 2): bold-300
  • TEXT_GROUP (group 4): blue-500|You saved

If it matches, bold-300 is applied and the text becomes "{blue-500|You saved}". The problem is ordering: ColorProcessor has already run and does not run again, so the remaining {color|text} wrapper is never processed.

Either way, blue-500 is never applied and raw markup stays visible ← NOT FULLY STYLED

Step 6-7: LinkProcessor, UnderlineAnnotaterProcessor

  • No matches
  • Final output: raw markup still visible, color not applied

Fixed (Working) Processing Flow

New Processor Order (after swap):

  1. BoldProcessor (**text**)
  2. ItalicProcessor (_text_)
  3. StrikethroughProcessor (~~text~~)
  4. FontWeightProcessor (<weight-size|text>) ← Now runs FIRST
  5. ColorProcessor ({color|text}) ← Now runs SECOND
  6. LinkProcessor
  7. UnderlineAnnotaterProcessor

Execution Step-by-Step:

Input Text:

"{<bold-300|blue-500|You saved>}"

Step 1-3: BoldProcessor, ItalicProcessor, StrikethroughProcessor

  • No matches
  • Text unchanged: "{<bold-300|blue-500|You saved>}"

Step 4: FontWeightProcessor runs FIRST

FontWeightProcessor regex: (\<)(.+?)(\|)((.|\\n)+?)(\>)

This matches:

  • Opening angle bracket: <
  • FONT_GROUP (group 2): bold-300
  • Pipe separator: |
  • TEXT_GROUP (group 4): blue-500|You saved ← Takes EVERYTHING after first | until >
  • Closing angle bracket: >

Key insight: The regex captures blue-500|You saved as the TEXT_GROUP because the lazy ((.|\\n)+?) keeps expanding (past the second |) until it reaches the first >.

Now FontWeightProcessor processes this:

// Extract font data
val fontDataList = "bold-300".split("-")
// Result: ["bold", "300"]

// FONT_WEIGHT_INDEX = 0: "bold"
// FONT_SIZE_INDEX = 1: "300"

// Get font properties
val fontWeight = getFontWeight("bold")  // FontWeight.Bold
val fontSize = getFontSize("300", props) // SushiTextSize300

// Build transformed text
val transformedText = "blue-500|You saved"  // From TEXT_GROUP

// Apply styling and replace
// Replace: "<bold-300|blue-500|You saved>" 
// With: "blue-500|You saved"
// Add SpanStyle: FontWeight.Bold + fontSize=300

// Result: AnnotatedString "blue-500|You saved" carrying the bold-300 SpanStyle

After FontWeightProcessor, the text becomes:

"{blue-500|You saved}"  // Now has bold-300 styling

The outer {} remain, but the inner <> pattern has been consumed and replaced with styled text.

Step 5: ColorProcessor runs SECOND

ColorProcessor regex: (\{)(.+?)(\|)((.|\\n)+?)(\})

This matches:

  • Opening brace: {
  • COLOR_GROUP (group 2): blue-500 ← NOW VALID!
  • Pipe separator: |
  • TEXT_GROUP (group 4): You saved
  • Closing brace: }

ColorProcessor calls parseColor("blue-500"):

private fun parseColor(color: String): Color? {
    // Direct color-name lookup finds nothing for "blue-500"
    var parsedColor: ColorSpec? = ColorName.fromColorName(color)

    if (color.contains("-")) {
        val colorObjectString = color.split("-")
        // Result: ["blue", "500"]
        
        if (colorObjectString.size == 2) {
            val name = ColorName.fromColorName("blue")  // ColorName.Blue ✓
            val tint = ColorVariation.fromInt(500)      // Variation500 ✓
            
            parsedColor = getColor(name, tint, SushiTheme.colors)
            // Returns Color(blue-500) ✓
        }
    }
    return parsedColor?.value  // Valid color!
}

Parsing succeeds! ColorProcessor transforms:

// Replace: "{blue-500|You saved}"
// With: "You saved"
// Add SpanStyle: color=blue-500

// Result: AnnotatedString "You saved" carrying both the bold-300 and blue-500 SpanStyles

After ColorProcessor, the text becomes:

"You saved"  // Has BOTH bold-300 AND blue-500 styling ✓

Step 6-7: LinkProcessor, UnderlineAnnotaterProcessor

  • No matches
  • Final output: "You saved" with bold-300 font styling + blue-500 color ← FULLY STYLED!

Why Order Matters: The Critical Difference

Current Order (Color → Font): FAILS

Input:  {<bold-300|blue-500|text>}
        ↓
ColorProcessor sees: COLOR="<bold-300", TEXT="blue-500|text>"
        ↓
parseColor("<bold-300") → null (invalid)
        ↓
Skipped, text unchanged: {<bold-300|blue-500|text>}
        ↓
FontWeightProcessor runs after ColorProcessor, so {color|...} is never revisited
        ↓
Output: color not applied, raw markup still visible

Fixed Order (Font → Color): WORKS

Input:  {<bold-300|blue-500|text>}
        ↓
FontWeightProcessor sees: FONT="bold-300", TEXT="blue-500|text"
        ↓
Applies bold-300, removes <>
        ↓
Output: {blue-500|text} (with bold-300 styling)
        ↓
ColorProcessor sees: COLOR="blue-500", TEXT="text"
        ↓
parseColor("blue-500") → Color ✓
        ↓
Applies blue-500, removes {}
        ↓
Output: text (with bold-300 + blue-500 styling)

The Regex Behavior Explained

Both processors use non-greedy matching (.+?) but capture different amounts:

FontWeightProcessor: (\<)(.+?)(\|)((.|\\n)+?)(\>)

  • Stops at first | for FONT_GROUP
  • Captures everything until > for TEXT_GROUP
  • This means <bold-300|blue-500|text> extracts:
    • FONT: bold-300
    • TEXT: blue-500|text ← Includes the second | and everything after

ColorProcessor: (\{)(.+?)(\|)((.|\\n)+?)(\})

  • Stops at first | for COLOR_GROUP
  • Captures everything until } for TEXT_GROUP
  • This means {blue-500|text} extracts:
    • COLOR: blue-500
    • TEXT: text

When FontWeightProcessor runs first, it "unwraps" the inner <> layer, leaving a clean {color|text} pattern for ColorProcessor to handle correctly.
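
The capture behaviour described above can be reproduced with kotlin.text.Regex. This is a standalone check, not code from the processors: the patterns are the ones quoted in this description (with \n in place of the escaped \\n), the greedy variant exists only for contrast, and captureGroups is a hypothetical helper.

```kotlin
// Patterns as quoted in this description; the greedy variant is for contrast only.
val lazyFont = Regex("""(<)(.+?)(\|)((.|\n)+?)(>)""")
val greedyFont = Regex("""(<)(.+)(\|)((.|\n)+?)(>)""")
val lazyColor = Regex("""(\{)(.+?)(\|)((.|\n)+?)(\})""")

// Returns (group 2, group 4) of the first match, or null when nothing matches.
fun captureGroups(regex: Regex, input: String): Pair<String, String>? =
    regex.find(input)?.let { it.groupValues[2] to it.groupValues[4] }

fun main() {
    // Lazy: group 2 stops at the first '|', so the text group keeps the second '|'.
    println(captureGroups(lazyFont, "<bold-300|blue-500|text>"))
    // prints (bold-300, blue-500|text)

    // Greedy: group 2 runs to the last '|' and swallows the color.
    println(captureGroups(greedyFont, "<bold-300|blue-500|text>"))
    // prints (bold-300|blue-500, text)

    // After the font wrapper is removed, the color pattern sees a clean token.
    println(captureGroups(lazyColor, "{blue-500|text}"))
    // prints (blue-500, text)
}
```

The lazy quantifier on group 2 is what lets the font pass leave blue-500|text intact for the color pass.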

Summary

Swapping the processor order fixes the issue because:

  1. FontWeightProcessor processes the inner layer first - extracts and applies <bold-300|...>, consuming the angle brackets
  2. This exposes clean {color|text} syntax - no more nested brackets confusing the ColorProcessor
  3. ColorProcessor then processes successfully - parses valid color names and applies styling
  4. Result: Both font weight/size AND color are applied correctly

This matches the proven behavior of the old View system, ensuring backward compatibility with all existing backend markdown.


Before:

[image]

After:

[image]
