Commit 1e01ee2

added plan tasks/upgrade-mathjax.md; updated the markdown-upgrade plans for the latex support
1 parent d487c1e commit 1e01ee2

5 files changed

+1036
-72
lines changed

Lines changed: 396 additions & 0 deletions
@@ -0,0 +1,396 @@
1+
# Markdown + MathJax Integration Guide
2+
3+
## Overview
4+
5+
This document describes how the markdown-it upgrade integrates with MathJax for LaTeX math rendering support.
6+
7+
## Key Principle: Separation of Concerns
8+
9+
**Markdown parser and MathJax are independent layers:**
10+
11+
```
12+
┌──────────────────────────────────────────┐
13+
│ User Input (Markdown + LaTeX) │
14+
│ "The equation $E = mc^2$ is famous" │
15+
└──────────────────────────────────────────┘
16+
17+
┌──────────────────────────────────────────┐
18+
│ Markdown Parser (markdown-it) │
19+
│ - Converts markdown → HTML │
20+
│ - PRESERVES $...$ blocks verbatim │
21+
│ - Does NOT process LaTeX │
22+
└──────────────────────────────────────────┘
23+
24+
┌──────────────────────────────────────────┐
25+
│ HTML Output │
26+
│ <p>The equation $E = mc^2$ is famous</p>│
27+
└──────────────────────────────────────────┘
28+
29+
┌──────────────────────────────────────────┐
30+
│ MathJax (Client-side) │
31+
│ - Scans HTML for $...$ delimiters │
32+
│ - Renders LaTeX → formatted math │
33+
│ - Independent of markdown │
34+
└──────────────────────────────────────────┘
35+
```
36+
37+
## Problem: Markdown Processing Breaks LaTeX
38+
39+
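Without protection, a markdown parser treats the text inside `$...$` delimiters as ordinary markdown, so emphasis markers in the LaTeX can be converted to HTML. The exact outcome depends on the parser's emphasis rules; the examples below illustrate the failure mode: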
40+
41+
### Example 1: Underscores → Emphasis
42+
```markdown
43+
Input: $a_b$
44+
Wrong: $a<em>b</em>$ ← Markdown processed underscore as italic
45+
Right: $a_b$ ← Preserved verbatim for MathJax
46+
```
47+
48+
### Example 2: Asterisks → Emphasis
```markdown
Input: $x*y*z$
Wrong: $x<em>y</em>z$ ← Markdown processed the * pair as emphasis
Right: $x*y*z$ ← Preserved verbatim
```
54+
55+
### Example 3: Subscripts/Superscripts
56+
```markdown
57+
Input: $x_{123}^{456}$
58+
Wrong: $x<sub>{123}</sub><sup>{456}</sup>$ ← Converted to HTML tags
59+
Right: $x_{123}^{456}$ ← LaTeX preserved
60+
```
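
The failure mode in the examples above can be reproduced with a toy emphasis pass. This is only an illustration under stated assumptions: `naive_emphasis` is a hypothetical regex stand-in, not markdown-it's actual emphasis algorithm, and `math_preserved` is a hypothetical helper for checking that math spans survive conversion.

```python
import re

def naive_emphasis(text):
    """Toy emphasis pass: wraps *x* or _x_ in <em> tags, ignoring math."""
    return re.sub(r"([*_])(.+?)\1", r"<em>\2</em>", text)

def math_preserved(source, html):
    """True if every $...$ span from the source appears unchanged in html."""
    spans = re.findall(r"\$[^$]+\$", source)
    return all(span in html for span in spans)

source = "Let $x*y*z$ be given"
html = naive_emphasis(source)
print(html)                          # Let $x<em>y</em>z$ be given
print(math_preserved(source, html))  # False
```

A check along the lines of `math_preserved` could be reused in the backend tests to assert that the real converter leaves math spans untouched.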
61+
62+
## Solution: Math Delimiter Protection
63+
64+
### Two-Part Approach
65+
66+
#### 1. Code-Friendly Mode (Disables Emphasis)

When `ENABLE_MATHJAX = True`, code-friendly mode is enabled automatically. Disabling markdown-it's `emphasis` rule turns off both `*` and `_` emphasis, not only underscores:
69+
70+
**Backend (Python):**
71+
```python
72+
# askbot/utils/markup.py
73+
if askbot_settings.ENABLE_MATHJAX or askbot_settings.MARKUP_CODE_FRIENDLY:
74+
md.disable('emphasis') # Disables both * and _ emphasis
75+
```
76+
77+
**Frontend (JavaScript):**
78+
```javascript
79+
// askbot/media/wmd/askbot_converter.js
80+
if (settings.mathjaxEnabled || settings.markupCodeFriendly) {
81+
this._md.disable('emphasis');
82+
}
83+
```
84+
85+
#### 2. Math Delimiter Protection (Treats Math as Verbatim)
86+
87+
Add custom markdown-it rules to detect and preserve `$...$` and `$$...$$` blocks:
88+
89+
**Backend (Python):**
90+
```python
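# askbot/utils/markdown_plugins/math_protect.py
from markdown_it.common.utils import escapeHtml

def math_inline_rule(state, silent):
    """Detect $...$ and emit a token holding the span verbatim."""
    start = state.pos
    if state.src[start] != '$':
        return False
    # Sketch: simple matching, the next '$' closes the span
    end = state.src.find('$', start + 1, state.posMax)
    if end in (-1, start + 1):
        return False  # unclosed or empty span: leave it as ordinary text
    if not silent:
        token = state.push('math_inline', '', 0)
        token.content = state.src[start:end + 1]  # delimiters included
    state.pos = end + 1
    return True

def render_math_inline(self, tokens, idx, options, env):
    # Escape <, >, & so LaTeX such as $a < b$ cannot inject HTML
    return escapeHtml(tokens[idx].content)

def math_protect_plugin(md):
    md.inline.ruler.before('escape', 'math_inline', math_inline_rule)
    md.add_render_rule('math_inline', render_math_inline)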
```
105+
106+
**Frontend (JavaScript):**
107+
```javascript
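// askbot/media/wmd/askbot_converter.js
AskbotMarkdownConverter.prototype._protectMathDelimiters = function() {
    var escapeHtml = this._md.utils.escapeHtml;

    function mathInlineRule(state, silent) {
        var start = state.pos;
        if (state.src.charCodeAt(start) !== 0x24 /* $ */) { return false; }
        // Sketch: simple matching, the next '$' closes the span
        var end = state.src.indexOf('$', start + 1);
        if (end === -1 || end === start + 1 || end >= state.posMax) { return false; }
        if (!silent) {
            var token = state.push('math_inline', '', 0);
            token.content = state.src.slice(start, end + 1); // delimiters included
        }
        state.pos = end + 1;
        return true;
    }

    this._md.inline.ruler.before('escape', 'math_inline', mathInlineRule);
    this._md.renderer.rules.math_inline = function(tokens, idx) {
        // Escape <, >, & so LaTeX such as $a < b$ cannot inject HTML
        return escapeHtml(tokens[idx].content);
    };
};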
```
121+
122+
## Implementation Locations
123+
124+
### Phase 1: Backend (Python)
125+
126+
**Files Modified:**
127+
- `askbot/utils/markup.py` - Main converter configuration
128+
- `askbot/utils/markdown_plugins/math_protect.py` - NEW plugin (to be created)
129+
130+
**Changes:**
131+
```python
132+
# In get_md_converter():
133+
134+
# 1. Auto-enable code-friendly mode
135+
if askbot_settings.ENABLE_MATHJAX or askbot_settings.MARKUP_CODE_FRIENDLY:
136+
md.disable('emphasis')
137+
138+
# 2. Protect math delimiters
139+
if askbot_settings.ENABLE_MATHJAX:
140+
from askbot.utils.markdown_plugins.math_protect import math_protect_plugin
141+
md.use(math_protect_plugin)
142+
```
143+
144+
### Phase 2: Frontend (JavaScript)
145+
146+
**Files Modified:**
147+
- `askbot/media/wmd/askbot_converter.js` - Add `_protectMathDelimiters()` method
148+
149+
**Changes:**
150+
```javascript
151+
// In _configureSettings():
152+
153+
// 1. Code-friendly mode
154+
if (settings.mathjaxEnabled || settings.markupCodeFriendly) {
155+
this._md.disable('emphasis');
156+
}
157+
158+
// 2. Math delimiter protection
159+
if (settings.mathjaxEnabled) {
160+
this._protectMathDelimiters();
161+
}
162+
```
163+
164+
### Phase 3: Testing
165+
166+
**Test Cases Added:**
167+
168+
**Backend Tests** (`askbot/tests/test_markdown_integration.py`):
169+
- `test_mathjax_math_delimiters_preserved()` - Check $...$ preserved
170+
- `test_mathjax_underscores_not_emphasis()` - Check underscores work
171+
- `test_mathjax_complex_latex()` - Check complex expressions
172+
173+
**Edge Case Tests** (`askbot/tests/test_markdown_edge_cases.py`):
174+
- `test_mathjax_inline_math_preserved()` - Inline math
175+
- `test_mathjax_display_math_preserved()` - Display math
176+
- `test_mathjax_underscores_not_processed()` - Underscore handling
177+
178+
## Configuration
179+
180+
### Settings Interaction
181+
182+
| Setting | Effect |
|---------|--------|
| `ENABLE_MATHJAX = False` | Normal markdown (`*` and `_` create emphasis, unless `MARKUP_CODE_FRIENDLY` is set) |
| `ENABLE_MATHJAX = True` | Code-friendly mode + math protection enabled |
| `MARKUP_CODE_FRIENDLY = True` | Code-friendly mode (emphasis rule disabled) |
187+
188+
### Automatic Behavior
189+
190+
When admin enables MathJax in settings (`ENABLE_MATHJAX = True`):
191+
192+
1. ✅ Code-friendly mode activates automatically
193+
2. ✅ Emphasis rule disabled (both `*` and `_`)
194+
3. ✅ Math delimiter protection enabled
195+
4. ✅ LaTeX content preserved verbatim
196+
5. ✅ MathJax can render math on client-side
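
The way the two settings combine can be sketched as a small pure function. This is an illustration of the logic described in this section, not Askbot code: `MarkdownFeatures` and `resolve_features` are hypothetical names, and only the two boolean inputs come from the settings above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MarkdownFeatures:
    emphasis_enabled: bool   # False when code-friendly mode is active
    math_protection: bool    # True when $...$ spans are kept verbatim

def resolve_features(enable_mathjax: bool, markup_code_friendly: bool) -> MarkdownFeatures:
    """Mirror the rule: MathJax implies code-friendly mode plus math protection."""
    code_friendly = enable_mathjax or markup_code_friendly
    return MarkdownFeatures(
        emphasis_enabled=not code_friendly,
        math_protection=enable_mathjax,
    )

print(resolve_features(enable_mathjax=True, markup_code_friendly=False))
print(resolve_features(enable_mathjax=False, markup_code_friendly=True))
```

Keeping this decision in one place would make it easier to hold the Python and JavaScript converters in agreement.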
197+
198+
## Testing Strategy
199+
200+
### Manual Testing Checklist
201+
202+
**Backend (Python):**
203+
```bash
204+
cd testproject/
205+
python manage.py shell
206+
207+
from askbot.utils.markup import get_md_converter
208+
md = get_md_converter()
209+
210+
# Test 1: Inline math preserved
211+
text = "The equation $E = mc^2$ is famous"
212+
html = md.render(text)
213+
assert '$E = mc^2$' in html
214+
215+
# Test 2: Underscores not processed
216+
text = "$a_b$ and $x_{123}$"
217+
html = md.render(text)
218+
assert '$a_b$' in html
219+
assert '<em>' not in html
220+
assert '<sub>' not in html
221+
222+
# Test 3: Display math preserved
223+
text = "$$\\int_0^1 x dx$$"
224+
html = md.render(text)
225+
assert '$$' in html
226+
assert '\\int' in html
227+
```
228+
229+
**Frontend (JavaScript):**
230+
```javascript
231+
// In browser console
232+
var converter = new AskbotMarkdownConverter();
233+
234+
// Test 1: Inline math
235+
var html = converter.makeHtml("The equation $E = mc^2$ is famous");
236+
console.log(html); // Should contain $E = mc^2$
237+
238+
// Test 2: Underscores
239+
var html = converter.makeHtml("$a_b$ and $x_{123}$");
240+
console.log(html); // Should NOT have <em> or <sub> tags
241+
242+
// Test 3: MathJax rendering (if enabled)
243+
// After typing in editor, MathJax should render the math
244+
```
245+
246+
### Automated Testing
247+
248+
**Run backend tests:**
249+
```bash
250+
cd testproject/
251+
python manage.py test askbot.tests.test_markdown_integration -k mathjax
252+
python manage.py test askbot.tests.test_markdown_edge_cases -k mathjax
253+
```
254+
255+
**Run frontend tests:**
256+
```bash
257+
# Browser-based testing (Selenium/Playwright)
258+
python manage.py test askbot.tests.test_markdown_frontend -k mathjax
259+
```
260+
261+
## Edge Cases and Gotchas
262+
263+
### 1. Dollar Signs in Regular Text
264+
265+
**Problem:** `$100 and $200` might be treated as math
266+
267+
**Solution:**
268+
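- Math detection requires a closing `$`, so a lone `$100` won't match (no closing delimiter)
- Two dollar signs in one paragraph, as in `$100 and $200`, do pair up under simple matching: the span `$100 and $` is kept verbatim and MathJax may typeset it
- To write literal dollar signs in that situation, escape each one as `\$`; a single unpaired `$` works fine as-is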
271+
272+
### 2. Escaped Dollar Signs
273+
274+
**Problem:** How to write literal `$` in text?
275+
276+
**Solution:**
277+
- Just use `$` - it's fine if not followed by matching `$`
278+
- Or use `\$` if you want to be explicit
279+
280+
### 3. Nested Delimiters
281+
282+
**Problem:** `$outer $inner$ outer$`
283+
284+
**Solution:**
285+
- Current implementation uses simple matching
286+
- First `$` matches with next `$`
287+
- Nested delimiters not supported (rare in LaTeX)
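
The simple matching described above can be sketched in plain Python to show what it does with nested delimiters and with prices. This is an illustration only: `find_math_spans` is a hypothetical helper that mimics "pair each `$` with the next `$`", not the plugin itself.

```python
def find_math_spans(text):
    """Return the $...$ spans found by pairing each '$' with the next '$'."""
    spans = []
    pos = 0
    while True:
        start = text.find("$", pos)
        if start == -1:
            break
        end = text.find("$", start + 1)
        if end == -1:
            break  # no closing delimiter: the rest is ordinary text
        if end == start + 1:
            pos = end  # empty span "$$": retry from the second '$'
            continue
        spans.append(text[start:end + 1])
        pos = end + 1
    return spans

print(find_math_spans("$outer $inner$ outer$"))  # ['$outer $', '$ outer$']
print(find_math_spans("$100 and $200"))          # ['$100 and $']
print(find_math_spans("It costs $100"))          # []
```

Note that in the nested case the word `inner` ends up outside both spans, which is why nested delimiters are listed as unsupported.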
288+
289+
### 4. Display Math on Own Line
290+
291+
**Problem:** `$$...$$` should be block-level, not inline
292+
293+
**Solution:**
294+
- Put `$$` on separate lines:
295+
```markdown
296+
Text before
297+
298+
$$
299+
\int_0^1 x dx
300+
$$
301+
302+
Text after
303+
```
304+
305+
## Relationship to MathJax Upgrade
306+
307+
This math protection is **orthogonal** to the MathJax v2→v4 upgrade:
308+
309+
| Concern | Markdown Upgrade | MathJax Upgrade |
310+
|---------|------------------|-----------------|
311+
| **Scope** | Preserve LaTeX in HTML | Render LaTeX visually |
312+
| **When** | During markdown→HTML | After HTML loaded |
313+
| **Where** | Server + client | Client only |
314+
| **Plugin** | Math delimiter protection | MathJax library |
315+
316+
**Both projects are independent:**
317+
- Markdown upgrade works with MathJax v2 OR v4
318+
- MathJax upgrade doesn't require markdown upgrade
319+
- But doing both together is ideal
320+
321+
## Migration Notes
322+
323+
### For Existing Installations
324+
325+
**Before markdown-it upgrade:**
326+
- MathJax works (if configured)
327+
- Some edge cases may fail (e.g., complex underscores)
328+
329+
**After markdown-it upgrade:**
330+
- MathJax still works
331+
- Better math delimiter protection
332+
- More reliable with complex LaTeX
333+
334+
**No content re-rendering needed:**
335+
- Math delimiters already in database
336+
- MathJax processes them client-side
337+
- No server-side changes to stored HTML
338+
339+
## Future Enhancements
340+
341+
### Potential Improvements (Not in Current Scope)
342+
343+
1. **Server-Side Math Rendering**
344+
- Render LaTeX → SVG on server
345+
- Better SEO, faster initial load
346+
- Complexity: High, security concerns
347+
348+
2. **Alternative Delimiters**
349+
- Support `\(...\)` and `\[...\]`
350+
- CommonMark math extension
351+
- Complexity: Medium
352+
353+
3. **Math Block Plugin**
354+
- Dedicated fenced code blocks: ` ```math `
355+
- Clearer syntax
356+
- Complexity: Low
357+
358+
4. **Syntax Validation**
359+
- Validate LaTeX syntax in editor
360+
- Show errors before save
361+
- Complexity: Medium
362+
363+
## References
364+
365+
### Documentation
366+
- [MathJax Documentation](https://docs.mathjax.org/)
367+
- [markdown-it Documentation](https://markdown-it.github.io/)
368+
- [CommonMark Spec](https://spec.commonmark.org/)
369+
370+
### Related Tasks
371+
- `tasks/markdown-upgrade-phase1-backend.md` - Backend implementation
372+
- `tasks/markdown-upgrade-phase2-frontend.md` - Frontend implementation
373+
- `tasks/markdown-upgrade-phase3-testing.md` - Testing strategy
374+
- `tasks/upgrade-mathjax.md` - MathJax v4 upgrade plan
375+
376+
### Code Locations
377+
- `askbot/conf/markup.py:54-65` - MathJax settings
378+
- `askbot/jinja2/meta/bottom_scripts.html:192-205` - MathJax loading
379+
- `askbot/media/wmd/askbot_converter.js` - Editor preview
380+
- `askbot/utils/markup.py` - Backend markdown conversion
381+
382+
## Summary
383+
384+
**Key Takeaways:**
385+
386+
1. ✅ Markdown parser does NOT render MathJax - it preserves delimiters
387+
2. ✅ Two mechanisms: code-friendly mode + delimiter protection
388+
3. ✅ Works on both backend (Python) and frontend (JavaScript)
389+
4. ✅ Automatic when `ENABLE_MATHJAX = True`
390+
5. ✅ No breaking changes to existing math content
391+
6. ✅ Independent of MathJax version (v2/v3/v4)
392+
393+
**The relationship is simple:**
394+
- Markdown creates HTML with `$...$` preserved
395+
- MathJax renders `$...$` into beautiful math
396+
- They never need to "talk" to each other
