| 1 | +# Markdown + MathJax Integration Guide |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This document describes how the markdown-it upgrade integrates with MathJax for LaTeX math rendering support. |
| 6 | + |
| 7 | +## Key Principle: Separation of Concerns |
| 8 | + |
| 9 | +**The Markdown parser and MathJax are independent layers:** |
| 10 | + |
| 11 | +``` |
| 12 | +┌──────────────────────────────────────────┐ |
| 13 | +│ User Input (Markdown + LaTeX) │ |
| 14 | +│ "The equation $E = mc^2$ is famous" │ |
| 15 | +└──────────────────────────────────────────┘ |
| 16 | + ↓ |
| 17 | +┌──────────────────────────────────────────┐ |
| 18 | +│ Markdown Parser (markdown-it) │ |
| 19 | +│ - Converts markdown → HTML │ |
| 20 | +│ - PRESERVES $...$ blocks verbatim │ |
| 21 | +│ - Does NOT process LaTeX │ |
| 22 | +└──────────────────────────────────────────┘ |
| 23 | + ↓ |
| 24 | +┌──────────────────────────────────────────┐ |
| 25 | +│ HTML Output │ |
| 26 | +│ <p>The equation $E = mc^2$ is famous</p>│ |
| 27 | +└──────────────────────────────────────────┘ |
| 28 | + ↓ |
| 29 | +┌──────────────────────────────────────────┐ |
| 30 | +│ MathJax (Client-side) │ |
| 31 | +│ - Scans HTML for $...$ delimiters │ |
| 32 | +│ - Renders LaTeX → formatted math │ |
| 33 | +│ - Independent of markdown │ |
| 34 | +└──────────────────────────────────────────┘ |
| 35 | +``` |
| 36 | + |
| 37 | +## Problem: Markdown Processing Breaks LaTeX |
| 38 | + |
| 39 | +Without protection, markdown parsers process text inside `$...$` delimiters: |
| 40 | + |
| 41 | +### Example 1: Underscores → Emphasis |
| 42 | +```markdown |
| 43 | +Input: $a_b + c_d$ |
| 44 | +Wrong: $a<em>b + c</em>d$ ← Markdown paired the two underscores as italic |
| 45 | +Right: $a_b + c_d$ ← Preserved verbatim for MathJax |
| 46 | +``` |
| 47 | + |
| 48 | +### Example 2: Asterisks → Emphasis |
| 49 | +```markdown |
| 50 | +Input: $x*y*z$ |
| 51 | +Wrong: $x<em>y</em>z$ ← Markdown processed the * pair as italic |
| 52 | +Right: $x*y*z$ ← Preserved verbatim |
| 53 | +``` |
| 54 | + |
| 55 | +### Example 3: Subscripts/Superscripts |
| 56 | +```markdown |
| 57 | +Input: $x_{123}^{456}$ |
| 58 | +Wrong: $x<sub>{123}</sub><sup>{456}</sup>$ ← HTML tags injected into the LaTeX |
| 59 | +Right: $x_{123}^{456}$ ← LaTeX preserved |
| 60 | +``` |
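
The failure mode is easy to reproduce outside any real parser. The sketch below uses a deliberately naive, regex-based emphasis pass as a stand-in for a Markdown parser that has no notion of math delimiters. `toy_emphasis` is a hypothetical name, and this is not how markdown-it works internally; it only shows why unprotected `$...$` content gets mangled. The inputs are in the spirit of the examples above.

```python
import re


def toy_emphasis(text: str) -> str:
    """Toy stand-in for a Markdown emphasis pass (NOT markdown-it)."""
    # Pair up asterisks, then underscores, with no awareness of $...$ spans.
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    return re.sub(r"_(.+?)_", r"<em>\1</em>", text)


for source in ("$x*y*z$", "$a_b + c_d$"):
    print(source, "->", toy_emphasis(source))
```

Both inputs come back with `<em>` tags inside the dollar signs, which MathJax cannot interpret as LaTeX.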
| 61 | + |
| 62 | +## Solution: Math Delimiter Protection |
| 63 | + |
| 64 | +### Two-Part Approach |
| 65 | + |
| 66 | +#### 1. Code-Friendly Mode (Disables Emphasis) |
| 67 | + |
| 68 | +When `ENABLE_MATHJAX = True`, code-friendly mode is enabled automatically. It disables markdown-it's `emphasis` rule, which handles both `*` and `_` markers: |
| 69 | + |
| 70 | +**Backend (Python):** |
| 71 | +```python |
| 72 | +# askbot/utils/markup.py |
| 73 | +if askbot_settings.ENABLE_MATHJAX or askbot_settings.MARKUP_CODE_FRIENDLY: |
| 74 | +    md.disable('emphasis')  # Disables both * and _ emphasis |
| 75 | +``` |
| 76 | + |
| 77 | +**Frontend (JavaScript):** |
| 78 | +```javascript |
| 79 | +// askbot/media/wmd/askbot_converter.js |
| 80 | +if (settings.mathjaxEnabled || settings.markupCodeFriendly) { |
| 81 | +    this._md.disable('emphasis'); |
| 82 | +} |
| 83 | +``` |
| 84 | + |
| 85 | +#### 2. Math Delimiter Protection (Treats Math as Verbatim) |
| 86 | + |
| 87 | +Add custom markdown-it rules to detect and preserve `$...$` and `$$...$$` blocks: |
| 88 | + |
| 89 | +**Backend (Python):** |
| 90 | +```python |
| 91 | +# askbot/utils/markdown_plugins/math_protect.py |
| 92 | +def math_inline_rule(state, silent): |
| 93 | +    """Detect $...$ and create a verbatim token.""" |
| 94 | +    if state.src[state.pos] != '$' or (end := state.src.find('$', state.pos + 1)) <= state.pos + 1: |
| 95 | +        return False  # not math, no closing $, or empty $$ |
| 96 | +    if not silent:  # create token with content preserved |
| 97 | +        state.push('math_inline', '', 0).content = state.src[state.pos:end + 1] |
| 98 | +    state.pos = end + 1  # resume parsing after the closing $ |
| 99 | +    return True |
| 100 | + |
| 101 | +def math_protect_plugin(md): |
| 102 | +    md.inline.ruler.before('escape', 'math_inline', math_inline_rule) |
| 103 | +    md.add_render_rule('math_inline', lambda self, tokens, idx, options, env: tokens[idx].content) |
| 104 | +``` |
| 105 | + |
| 106 | +**Frontend (JavaScript):** |
| 107 | +```javascript |
| 108 | +// askbot/media/wmd/askbot_converter.js |
| 109 | +AskbotMarkdownConverter.prototype._protectMathDelimiters = function() { |
| 110 | +    function mathInlineRule(state, silent) { |
| 111 | +        // Detect $...$ |
| 112 | +        // Create token with verbatim content |
| 113 | +    } |
| 114 | + |
| 115 | +    this._md.inline.ruler.before('escape', 'math_inline', mathInlineRule); |
| 116 | +    this._md.renderer.rules.math_inline = function(tokens, idx) { |
| 117 | +        return tokens[idx].content; // Render as-is |
| 118 | +    }; |
| 119 | +}; |
| 120 | +``` |
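
The scanning step behind "detect `$...$`" can be sketched without markdown-it at all. The following is a minimal, standard-library-only illustration of the simple matching described in this guide (a `$` opens math, the next `$` closes it, an unclosed `$` stays ordinary text). `split_math` is a hypothetical helper, not part of askbot or markdown-it, and it gives display math (`$$...$$`) no special treatment.

```python
def split_math(text: str) -> list[tuple[str, str]]:
    """Split text into ("text", ...) and ("math", ...) segments."""
    segments: list[tuple[str, str]] = []
    pos = 0
    while pos < len(text):
        start = text.find("$", pos)
        end = text.find("$", start + 1) if start != -1 else -1
        if start == -1 or end == -1:
            # No complete $...$ pair remains: the rest is ordinary text.
            segments.append(("text", text[pos:]))
            break
        if start > pos:
            segments.append(("text", text[pos:start]))
        # Keep the delimiters so MathJax can find the span later.
        segments.append(("math", text[start:end + 1]))
        pos = end + 1
    return segments


for kind, chunk in split_math("The equation $E = mc^2$ is famous"):
    print(f"{kind:>4}: {chunk!r}")
```

Only the `text` segments would be handed to Markdown processing; the `math` segments are emitted unchanged, which is the same division of labour the inline rule achieves with a dedicated token type.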
| 121 | + |
| 122 | +## Implementation Locations |
| 123 | + |
| 124 | +### Phase 1: Backend (Python) |
| 125 | + |
| 126 | +**Files Modified:** |
| 127 | +- `askbot/utils/markup.py` - Main converter configuration |
| 128 | +- `askbot/utils/markdown_plugins/math_protect.py` - NEW plugin (to be created) |
| 129 | + |
| 130 | +**Changes:** |
| 131 | +```python |
| 132 | +# In get_md_converter(): |
| 133 | + |
| 134 | +# 1. Auto-enable code-friendly mode |
| 135 | +if askbot_settings.ENABLE_MATHJAX or askbot_settings.MARKUP_CODE_FRIENDLY: |
| 136 | +    md.disable('emphasis') |
| 137 | + |
| 138 | +# 2. Protect math delimiters |
| 139 | +if askbot_settings.ENABLE_MATHJAX: |
| 140 | +    from askbot.utils.markdown_plugins.math_protect import math_protect_plugin |
| 141 | +    md.use(math_protect_plugin) |
| 142 | +``` |
| 143 | + |
| 144 | +### Phase 2: Frontend (JavaScript) |
| 145 | + |
| 146 | +**Files Modified:** |
| 147 | +- `askbot/media/wmd/askbot_converter.js` - Add `_protectMathDelimiters()` method |
| 148 | + |
| 149 | +**Changes:** |
| 150 | +```javascript |
| 151 | +// In _configureSettings(): |
| 152 | + |
| 153 | +// 1. Code-friendly mode |
| 154 | +if (settings.mathjaxEnabled || settings.markupCodeFriendly) { |
| 155 | +    this._md.disable('emphasis'); |
| 156 | +} |
| 157 | + |
| 158 | +// 2. Math delimiter protection |
| 159 | +if (settings.mathjaxEnabled) { |
| 160 | +    this._protectMathDelimiters(); |
| 161 | +} |
| 162 | +``` |
| 163 | + |
| 164 | +### Phase 3: Testing |
| 165 | + |
| 166 | +**Test Cases Added:** |
| 167 | + |
| 168 | +**Backend Tests** (`askbot/tests/test_markdown_integration.py`): |
| 169 | +- `test_mathjax_math_delimiters_preserved()` - Check $...$ preserved |
| 170 | +- `test_mathjax_underscores_not_emphasis()` - Check underscores work |
| 171 | +- `test_mathjax_complex_latex()` - Check complex expressions |
| 172 | + |
| 173 | +**Edge Case Tests** (`askbot/tests/test_markdown_edge_cases.py`): |
| 174 | +- `test_mathjax_inline_math_preserved()` - Inline math |
| 175 | +- `test_mathjax_display_math_preserved()` - Display math |
| 176 | +- `test_mathjax_underscores_not_processed()` - Underscore handling |
| 177 | + |
| 178 | +## Configuration |
| 179 | + |
| 180 | +### Settings Interaction |
| 181 | + |
| 182 | +| Setting | Effect | |
| 183 | +|---------|--------| |
| 184 | +| `ENABLE_MATHJAX = False` | Normal markdown (underscores create emphasis) | |
| 185 | +| `ENABLE_MATHJAX = True` | Code-friendly mode + math protection enabled | |
| 186 | +| `MARKUP_CODE_FRIENDLY = True` | Code-friendly mode (`emphasis` rule disabled, so neither `_` nor `*` creates emphasis) | |
| 187 | + |
| 188 | +### Automatic Behavior |
| 189 | + |
| 190 | +When admin enables MathJax in settings (`ENABLE_MATHJAX = True`): |
| 191 | + |
| 192 | +1. ✅ Code-friendly mode activates automatically |
| 193 | +2. ✅ Emphasis rule disabled (`_` and `*` stay literal) |
| 194 | +3. ✅ Math delimiter protection enabled |
| 195 | +4. ✅ LaTeX content preserved verbatim |
| 196 | +5. ✅ MathJax can render math on client-side |
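
The interaction between the two settings can be written down as a small truth function. This is a sketch under assumptions: `markup_features` and its dictionary keys are hypothetical names chosen for illustration, and the logic simply mirrors the two `if` conditions shown under Implementation Locations.

```python
def markup_features(enable_mathjax: bool, markup_code_friendly: bool) -> dict[str, bool]:
    """Hypothetical helper mirroring the settings table above."""
    return {
        # The emphasis rule is switched off when either setting is on.
        "emphasis_rule_enabled": not (enable_mathjax or markup_code_friendly),
        # Delimiter protection depends on ENABLE_MATHJAX alone.
        "math_protection": enable_mathjax,
    }


for mathjax in (False, True):
    for code_friendly in (False, True):
        print(mathjax, code_friendly, markup_features(mathjax, code_friendly))
```

Note that `MARKUP_CODE_FRIENDLY` on its own turns emphasis off without adding math protection, while `ENABLE_MATHJAX` implies both.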
| 197 | + |
| 198 | +## Testing Strategy |
| 199 | + |
| 200 | +### Manual Testing Checklist |
| 201 | + |
| 202 | +**Backend (Python):** |
| 203 | +```bash |
| 204 | +cd testproject/ |
| 205 | +python manage.py shell |
| 206 | +# The remaining lines are entered in the Python shell: |
| 207 | +from askbot.utils.markup import get_md_converter |
| 208 | +md = get_md_converter() |
| 209 | + |
| 210 | +# Test 1: Inline math preserved |
| 211 | +text = "The equation $E = mc^2$ is famous" |
| 212 | +html = md.render(text) |
| 213 | +assert '$E = mc^2$' in html |
| 214 | + |
| 215 | +# Test 2: Underscores not processed |
| 216 | +text = "$a_b$ and $x_{123}$" |
| 217 | +html = md.render(text) |
| 218 | +assert '$a_b$' in html |
| 219 | +assert '<em>' not in html |
| 220 | +assert '<sub>' not in html |
| 221 | + |
| 222 | +# Test 3: Display math preserved |
| 223 | +text = "$$\\int_0^1 x dx$$" |
| 224 | +html = md.render(text) |
| 225 | +assert '$$' in html |
| 226 | +assert '\\int' in html |
| 227 | +``` |
| 228 | + |
| 229 | +**Frontend (JavaScript):** |
| 230 | +```javascript |
| 231 | +// In browser console |
| 232 | +var converter = new AskbotMarkdownConverter(); |
| 233 | + |
| 234 | +// Test 1: Inline math |
| 235 | +var html = converter.makeHtml("The equation $E = mc^2$ is famous"); |
| 236 | +console.log(html); // Should contain $E = mc^2$ |
| 237 | + |
| 238 | +// Test 2: Underscores |
| 239 | +var html = converter.makeHtml("$a_b$ and $x_{123}$"); |
| 240 | +console.log(html); // Should NOT have <em> or <sub> tags |
| 241 | + |
| 242 | +// Test 3: MathJax rendering (if enabled) |
| 243 | +// After typing in editor, MathJax should render the math |
| 244 | +``` |
| 245 | + |
| 246 | +### Automated Testing |
| 247 | + |
| 248 | +**Run backend tests:** |
| 249 | +```bash |
| 250 | +cd testproject/ |
| 251 | +python manage.py test askbot.tests.test_markdown_integration -k mathjax |
| 252 | +python manage.py test askbot.tests.test_markdown_edge_cases -k mathjax |
| 253 | +``` |
| 254 | + |
| 255 | +**Run frontend tests:** |
| 256 | +```bash |
| 257 | +# Browser-based testing (Selenium/Playwright) |
| 258 | +python manage.py test askbot.tests.test_markdown_frontend -k mathjax |
| 259 | +``` |
| 260 | + |
| 261 | +## Edge Cases and Gotchas |
| 262 | + |
| 263 | +### 1. Dollar Signs in Regular Text |
| 264 | + |
| 265 | +**Problem:** `$100 and $200` contains two `$` signs, so simple matching treats `$100 and $` as math |
| 266 | + |
| 267 | +**Solution:** |
| 268 | +- Math detection requires a closing `$` |
| 269 | +- A lone `$100` won't match (no closing delimiter) |
| 270 | +- When the same paragraph contains two or more literal `$`, write each one as `\$` |
| 271 | + |
| 272 | +### 2. Escaped Dollar Signs |
| 273 | + |
| 274 | +**Problem:** How to write literal `$` in text? |
| 275 | + |
| 276 | +**Solution:** |
| 277 | +- A single `$` is fine as long as the same paragraph has no second `$` to pair with |
| 278 | +- Otherwise use `\$` to be explicit |
| 279 | + |
| 280 | +### 3. Nested Delimiters |
| 281 | + |
| 282 | +**Problem:** `$outer $inner$ outer$` |
| 283 | + |
| 284 | +**Solution:** |
| 285 | +- Current implementation uses simple matching |
| 286 | +- First `$` matches with next `$`, so the example yields `$outer $` and `$ outer$`, with `inner` left as plain text |
| 287 | +- Nested delimiters not supported (rare in LaTeX) |
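
The first three edge cases all follow from the same pairing rule, which a one-line regular expression can approximate. This sketch is an illustration only: `SIMPLE_MATH` and `math_spans` are hypothetical names, and the regex stands in for the "first `$` pairs with the next `$`" behaviour rather than reproducing the plugin's code.

```python
import re

# Approximates simple matching: a `$`, one or more non-`$` characters, a `$`.
SIMPLE_MATH = re.compile(r"\$[^$]+\$")


def math_spans(text: str) -> list[str]:
    """Return the substrings that simple matching would treat as math."""
    return SIMPLE_MATH.findall(text)


for sample in ("It costs $100", "$100 and $200", "$outer $inner$ outer$"):
    print(repr(sample), "->", math_spans(sample))
```

A single price produces no span, two prices in one string produce the accidental span `$100 and $`, and the nested example splits into `$outer $` and `$ outer$`.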
| 288 | + |
| 289 | +### 4. Display Math on Own Line |
| 290 | + |
| 291 | +**Problem:** `$$...$$` should be block-level, not inline |
| 292 | + |
| 293 | +**Solution:** |
| 294 | +- Put `$$` on separate lines: |
| 295 | +  ```markdown |
| 296 | +  Text before |
| 297 | + |
| 298 | +  $$ |
| 299 | +  \int_0^1 x dx |
| 300 | +  $$ |
| 301 | + |
| 302 | +  Text after |
| 303 | +  ``` |
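
Placing each `$$` on its own line also makes display math easy to recognise at the block level. As a rough sketch of what such detection could look like (`find_display_math` is a hypothetical helper, not an existing askbot or markdown-it function), the code below pairs up lines that consist solely of `$$`:

```python
def find_display_math(lines: list[str]) -> list[tuple[int, int]]:
    """Return (open, close) line-index pairs for `$$` fences on their own lines."""
    blocks: list[tuple[int, int]] = []
    open_index = None
    for index, line in enumerate(lines):
        if line.strip() != "$$":
            continue
        if open_index is None:
            open_index = index  # opening fence
        else:
            blocks.append((open_index, index))  # closing fence
            open_index = None
    return blocks


source = ["Text before", "", "$$", r"\int_0^1 x dx", "$$", "", "Text after"]
print(find_display_math(source))
```

An opening `$$` without a closing partner is simply ignored here; a real block rule would need to decide whether to treat that as text or to run to the end of the document.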
| 304 | + |
| 305 | +## Relationship to MathJax Upgrade |
| 306 | + |
| 307 | +This math protection is **orthogonal** to the MathJax v2→v4 upgrade: |
| 308 | + |
| 309 | +| Concern | Markdown Upgrade | MathJax Upgrade | |
| 310 | +|---------|------------------|-----------------| |
| 311 | +| **Scope** | Preserve LaTeX in HTML | Render LaTeX visually | |
| 312 | +| **When** | During markdown→HTML | After HTML loaded | |
| 313 | +| **Where** | Server + client | Client only | |
| 314 | +| **Plugin** | Math delimiter protection | MathJax library | |
| 315 | + |
| 316 | +**Both projects are independent:** |
| 317 | +- Markdown upgrade works with MathJax v2 OR v4 |
| 318 | +- MathJax upgrade doesn't require markdown upgrade |
| 319 | +- But doing both together is ideal |
| 320 | + |
| 321 | +## Migration Notes |
| 322 | + |
| 323 | +### For Existing Installations |
| 324 | + |
| 325 | +**Before markdown-it upgrade:** |
| 326 | +- MathJax works (if configured) |
| 327 | +- Some edge cases may fail (e.g., complex underscores) |
| 328 | + |
| 329 | +**After markdown-it upgrade:** |
| 330 | +- MathJax still works |
| 331 | +- Better math delimiter protection |
| 332 | +- More reliable with complex LaTeX |
| 333 | + |
| 334 | +**No content re-rendering needed:** |
| 335 | +- Math delimiters already in database |
| 336 | +- MathJax processes them client-side |
| 337 | +- No server-side changes to stored HTML |
| 338 | + |
| 339 | +## Future Enhancements |
| 340 | + |
| 341 | +### Potential Improvements (Not in Current Scope) |
| 342 | + |
| 343 | +1. **Server-Side Math Rendering** |
| 344 | + - Render LaTeX → SVG on server |
| 345 | + - Better SEO, faster initial load |
| 346 | + - Complexity: High, security concerns |
| 347 | + |
| 348 | +2. **Alternative Delimiters** |
| 349 | + - Support `\(...\)` and `\[...\]` |
| 350 | + - CommonMark math extension |
| 351 | + - Complexity: Medium |
| 352 | + |
| 353 | +3. **Math Block Plugin** |
| 354 | + - Dedicated fenced code blocks: ` ```math ` |
| 355 | + - Clearer syntax |
| 356 | + - Complexity: Low |
| 357 | + |
| 358 | +4. **Syntax Validation** |
| 359 | + - Validate LaTeX syntax in editor |
| 360 | + - Show errors before save |
| 361 | + - Complexity: Medium |
| 362 | + |
| 363 | +## References |
| 364 | + |
| 365 | +### Documentation |
| 366 | +- [MathJax Documentation](https://docs.mathjax.org/) |
| 367 | +- [markdown-it Documentation](https://markdown-it.github.io/) |
| 368 | +- [CommonMark Spec](https://spec.commonmark.org/) |
| 369 | + |
| 370 | +### Related Tasks |
| 371 | +- `tasks/markdown-upgrade-phase1-backend.md` - Backend implementation |
| 372 | +- `tasks/markdown-upgrade-phase2-frontend.md` - Frontend implementation |
| 373 | +- `tasks/markdown-upgrade-phase3-testing.md` - Testing strategy |
| 374 | +- `tasks/upgrade-mathjax.md` - MathJax v4 upgrade plan |
| 375 | + |
| 376 | +### Code Locations |
| 377 | +- `askbot/conf/markup.py:54-65` - MathJax settings |
| 378 | +- `askbot/jinja2/meta/bottom_scripts.html:192-205` - MathJax loading |
| 379 | +- `askbot/media/wmd/askbot_converter.js` - Editor preview |
| 380 | +- `askbot/utils/markup.py` - Backend markdown conversion |
| 381 | + |
| 382 | +## Summary |
| 383 | + |
| 384 | +**Key Takeaways:** |
| 385 | + |
| 386 | +1. ✅ Markdown parser does NOT render math - it preserves the delimiters for MathJax |
| 387 | +2. ✅ Two mechanisms: code-friendly mode + delimiter protection |
| 388 | +3. ✅ Works on both backend (Python) and frontend (JavaScript) |
| 389 | +4. ✅ Automatic when `ENABLE_MATHJAX = True` |
| 390 | +5. ✅ No breaking changes to existing math content |
| 391 | +6. ✅ Independent of MathJax version (v2/v3/v4) |
| 392 | + |
| 393 | +**The relationship is simple:** |
| 394 | +- Markdown creates HTML with `$...$` preserved |
| 395 | +- MathJax renders `$...$` into beautiful math |
| 396 | +- They never need to "talk" to each other |