Experimental 'batteries included' client-side markdown parser & renderer written in pure TypeScript.
This same readme as a demo: md2.at
There are already many excellent, battle-tested Markdown parsing and rendering libraries and utilities available in the JS/TS ecosystem. However, none of them fully suited my daily work in other contexts, where ease of use, lightness and privacy are essential requirements.
So I decided to create a tool that would let me visualize any Markdown in an accessible way with as little effort as possible. It also served as a nice reminder and a bit of a learning experience in working with the lower-level capabilities of modern JS/TS, and I think it can also serve as an example of how JIT-compiled JavaScript can take advantage of a contiguous memory layout for storing state. Although not optimized yet, it can already make quite a difference in performance and memory usage.
I have tried this with quite large .md files (100+ MB) that contained pretty much only code. Even though basic syntax highlighting is built in and those blocks are fairly heavy to render, they performed surprisingly well, even on my phone.
As an example of using the parser/renderer, I created a small service, md2.at, which is just client-side TypeScript on free static hosting (render.com). This small "service" lets me append the URL of any publicly available .md file to the service's URL and get a shareable, embeddable visualization of that Markdown.
This example service is still in very early stages but it is going to stay
Aimed for making markdown visualizations more accessible while maintaining efficiency and privacy.
- Zero external dependencies in build
The implementation keeps external dependencies out of the hot path and focuses on predictable, byte-level processing. Both HTML and Canvas renderers are included so the same parse result can be examined in different output backends.
The project is still in an early phase; the public API and packaging will evolve before the planned publication later this year.
The intent of this repository is not to compete with broad Markdown frameworks but to provide accessible visualisation while keeping the ratio between performance and supported features reasonable.
- Single-pass parsing implemented with byte spans rather than string slicing.
- No regular expressions in the core parser; all matching is done with explicit scans.
- Two renderers: an HTML renderer that emits escaped markup and a Canvas renderer for visual inspection.
- Arena-style byte buffer to reduce allocations while building output.
- Test coverage that includes golden tests, property-based fuzzing, and targeted benchmarks.
- GitHub Flavored Markdown coverage for tables, task items, and strikethrough.
- URL allowlisting and HTML escaping enabled by default.
- ESM exports suitable for browser bundlers and server-side usage.
- Optional dark/light UI presets with persisted preference and theme builder integration.
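To illustrate what "explicit scans instead of regular expressions" can look like, here is a minimal sketch that detects an ATX heading by walking bytes. The function name headingLevel and its exact rules are assumptions made for this example, not the library's actual implementation.

```typescript
// Hypothetical helper (not part of the smdp API): detect "# Heading" by scanning bytes.
const HASH = 0x23; // '#'
const SPACE = 0x20; // ' '

/**
 * Returns the heading level (1-6) when the line in [s, e) starts with
 * one to six '#' bytes followed by a space, otherwise 0.
 */
function headingLevel(u8: Uint8Array, s: number, e: number): number {
  let i = s;
  while (i < e && u8[i] === HASH) i++;
  const level = i - s;
  if (level < 1 || level > 6) return 0;
  // Require a space after the marker so "#tag" is not treated as a heading.
  if (i >= e || u8[i] !== SPACE) return 0;
  return level;
}

const headingLine = new TextEncoder().encode("### Title");
console.log(headingLevel(headingLine, 0, headingLine.length)); // 3
```

The scan touches each byte at most once and allocates nothing, which is the property the feature list refers to.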
- Headings (H1-H6): # Heading
- Blockquotes: > Quote
- Lists:
  - Unordered: - Item or * Item or + Item
  - Ordered: 1. Item
  - Task Lists: - [ ] Unchecked or - [x] Checked
- Horizontal Rules: --- or ***
- Code Blocks: Fenced with ``` or ~~~
- Inline Code: `code`
- Emphasis: *italic* or _italic_
- Strong: **bold** or __bold__
- Strikethrough: ~~struck~~
- Links: [text](url)
- Images: ![alt](url)
- Autolinks: Automatic linking of http://, https://, and www. URLs
- Tables: | Header | Header |\n|--------|--------|\n| Cell | Cell |
- Info Blocks: ::: info, ::: warning, ::: error, ::: success
The syntax highlighting language specs are not complete and probably still contain many issues, a few of which I'm already aware of and working to fix.
To keep things lightweight, correct grammars for different languages will probably become an optional, plugin-based feature in the future.
So far built-in basic syntax highlighters cover the following languages:
- JavaScript / TypeScript
- Python
- Java
- C / C++
- C#
- Go
- Rust
- Swift
- Kotlin
- Scala
- Dart
- Ruby
- PHP
- Shell scripts (bash/sh/zsh)
- PowerShell
- Lua
- Perl
- Haskell
- Elixir
- Erlang
- Clojure
- R
- SQL
- JSON
- YAML
- TOML
- INI / config files
- Dockerfile
- Make / Makefile
- F#
- HTML / XML / SVG
Additional languages can be registered at runtime with registerHighlightLanguage.
I have an experimental setup that uses precompiled language specs at runtime to reduce the overhead of compiling them, but this is not an optimal way to do things and might look bad, as the code contains a block of base64-encoded binary data that is consumed by the highlighter. This code is used to generate the precompiled.ts file.
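As a rough illustration of the idea (the real layout of the precompiled data is not documented here, so everything below is an assumption), a numeric table can be shipped as a base64 string and decoded once at startup. The helper names decodeBase64 and toU16LE are hypothetical.

```typescript
// Hypothetical sketch: decode a base64 string into bytes, then read it as a
// little-endian Uint16 table. The actual precompiled format may differ.
const B64_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

function decodeBase64(b64: string): Uint8Array {
  // Ignore trailing '=' padding; every remaining character carries 6 bits.
  let end = b64.length;
  while (end > 0 && b64.charCodeAt(end - 1) === 0x3d) end--;
  const out = new Uint8Array(Math.floor((end * 6) / 8));
  let acc = 0; // bit accumulator
  let bits = 0; // number of valid bits currently in acc
  let o = 0;
  for (let i = 0; i < end; i++) {
    const v = B64_ALPHABET.indexOf(b64.charAt(i));
    if (v < 0) throw new Error("invalid base64 character at index " + i);
    acc = ((acc << 6) | v) & 0xffff;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out[o++] = (acc >> bits) & 0xff;
    }
  }
  return out;
}

function toU16LE(bytes: Uint8Array): Uint16Array {
  // Read explicitly as little-endian so the result does not depend on the host CPU.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = new Uint16Array(bytes.byteLength >> 1);
  for (let i = 0; i < out.length; i++) out[i] = view.getUint16(i * 2, true);
  return out;
}

const demoTable = toU16LE(decodeBase64("AQACAA=="));
console.log(Array.from(demoTable)); // [ 1, 2 ]
```

Decoding by hand keeps the sketch free of environment-specific helpers; in a browser, atob would do the first step.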
Should work in both browser and SSR environments.
import { MDParser, u8 } from 'smdp';
const parser = new MDParser({
  // Security: disable raw HTML blocks by default
  allowRawHtml: false,
  // Custom URL allowlist (optional)
  urlAllowlist: (url) => url.startsWith('https://') || url.startsWith('mailto:'),
});
const markdown = '# Hello World\n\nThis is **bold** text with ~~strikethrough~~ and `code`.';
parser.parse(u8(markdown)).then(html => {
  console.log(html);
  // Output: <h1>Hello World</h1>\n<p>This is <strong>bold</strong> text with <del>strikethrough</del> and <code>code</code>.</p>\n
});
Works only in browser, still work in progress
import { MDParser, u8 } from 'smdp';
const parser = new MDParser();
const canvas = document.createElement('canvas');
canvas.width = 800;
const markdown = `# Hello Canvas

This is **bold** text with ~~strikethrough~~.

- [ ] Task list item
- [x] Completed task

| Header 1 | Header 2 |
|----------|----------|
| Cell 1 | Cell 2 |

\`\`\`javascript
function hello() {
  console.log('world');
}
\`\`\``;
parser.renderToCanvas(u8(markdown), canvas);
document.body.appendChild(canvas);
// In Node.js or SSR environments, only HTML parsing is available
import { MDParser, u8 } from 'smdp';
const parser = new MDParser();
const markdown = '# Server-Side Rendering\n\nWorks without DOM APIs.';
parser.parse(u8(markdown)).then(html => {
console.log(html);
});
// Canvas rendering is not available in SSR environments
// parser.renderToCanvas(u8(markdown), canvas); // ❌ Not available
Use /book/<entry-url> to treat a markdown document as a book entry that links to other chapters.
- github.com/.../blob/... chapter links are automatically converted to raw.githubusercontent.com/... for fetching.
- Relative chapter links (for example ./chapter-2.md) are resolved against each chapter file URL.
- Linked markdown chapters are discovered and prefetched in the background.
Example:
https://md2.at/book/https://github.com/owner/repo/blob/main/docs/README.md
When a chapter link is opened, the selected part is stored in ?part=<chapter-url> so deep links remain shareable.
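The URL handling described for book mode can be sketched with the standard URL class. The helper names toRawGithubUrl and resolveChapter are hypothetical and only illustrate the two rules (blob-to-raw conversion and relative chapter resolution); the service's own code may handle more cases.

```typescript
// Hypothetical helpers illustrating the book-mode URL rules.

/** Convert a github.com "blob" URL to its raw.githubusercontent.com form. */
function toRawGithubUrl(input: string): string {
  const url = new URL(input);
  if (url.hostname !== "github.com") return input;
  // Expected path shape: /owner/repo/blob/ref/path/to/file.md
  const parts = url.pathname.split("/").filter((p) => p.length > 0);
  if (parts.length < 5 || parts[2] !== "blob") return input;
  const [owner, repo, , ...rest] = parts;
  return `https://raw.githubusercontent.com/${owner}/${repo}/${rest.join("/")}`;
}

/** Resolve a relative chapter link against the URL of the chapter containing it. */
function resolveChapter(link: string, chapterUrl: string): string {
  return new URL(link, chapterUrl).href;
}

const entryUrl = toRawGithubUrl("https://github.com/owner/repo/blob/main/docs/README.md");
console.log(entryUrl);
// https://raw.githubusercontent.com/owner/repo/main/docs/README.md
console.log(resolveChapter("./chapter-2.md", entryUrl));
// https://raw.githubusercontent.com/owner/repo/main/docs/chapter-2.md
```

URLs that do not match the blob pattern are returned unchanged, so non-GitHub sources pass through untouched.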
import { highlightCodeBlock } from 'smdp/highlight';
const code = 'function fibonacci(n) {\n if (n <= 1) return n;\n return fibonacci(n - 1) + fibonacci(n - 2);\n}';
const highlighted = highlightCodeBlock(new TextEncoder().encode(code), 'javascript');
console.log(new TextDecoder().decode(highlighted));
import { createThemeBuilder } from 'smdp/theme';
const builder = createThemeBuilder()
  .withMeta({ colorScheme: 'light', fontFamily: '"IBM Plex Sans", system-ui, sans-serif' })
  .withTokens({
    bgBase: '#f5f6fa',
    textPrimary: '#1f2933',
    accent: '#2563eb',
    codeKw: '#7c3aed',
  });
// Option 1: apply directly to the current document
builder.apply(); // defaults to document.documentElement
// Option 2: inject scoped CSS (useful for SSR or style encapsulation)
const themeCss = builder.buildCss(':root');
The demo includes a palette button that opens a theme editor. The editor uses the same ThemeBuilder helper exposed through the public API and updates CSS variables in place.
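For readers unfamiliar with token-driven theming, the sketch below shows one way a token map could be turned into scoped CSS custom properties. It is not the ThemeBuilder implementation: the function name tokensToCss and the camelCase to --kebab-case naming scheme are assumptions made for this example.

```typescript
// Hypothetical sketch: turn a token map into a CSS rule with custom properties.
// The variable naming scheme (bgBase -> --bg-base) is an assumption.
function tokensToCss(selector: string, tokens: Record<string, string>): string {
  const lines = Object.keys(tokens).map((name) => {
    const prop = "--" + name.replace(/[A-Z]/g, (m) => "-" + m.toLowerCase());
    return `  ${prop}: ${tokens[name]};`;
  });
  return `${selector} {\n${lines.join("\n")}\n}`;
}

console.log(tokensToCss(":root", { bgBase: "#f5f6fa", codeKw: "#7c3aed" }));
// :root {
//   --bg-base: #f5f6fa;
//   --code-kw: #7c3aed;
// }
```

Emitting a plain string keeps the same function usable on the server, where no document is available to apply styles to.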
- Privacy: there is no telemetry or analytics built into the code. Requests occur only when loading external Markdown that the user explicitly asks to load from a trusted source.
- Licensing: the entire codebase is released under the MIT License.
- AI usage: we highly value carefully hand-crafted code while recognising that LLMs, applied with intent and review, can accelerate exploration without diluting quality.
The parser is split into logical modules:
- types.ts: TypeScript type definitions and interfaces
- constants.ts: Pre-encoded HTML tags and styling constants
- utils.ts: Byte-level utility functions for parsing
- arena.ts: Memory-efficient HTML buffer with geometric growth
- line-parser.ts: Line span generator for input splitting
- inline-parser.ts: Inline token generator (emphasis, code, links, etc.)
- block-parser.ts: Block-level structure parser (headings, lists, code blocks, etc.)
- html-renderer.ts: HTML output renderer
- canvas-renderer.ts: Canvas output renderer
- index.ts: Main MDParser class and public API
The core pipeline is built around byte ranges rather than strings. The process is:
- Line segmentation: lineSpans walks the Uint8Array, recording start/end offsets for each line. No copies are made, and the raw array is never converted to strings at this stage.
- Block parsing: blocks iterates through the line spans once, emitting events such as heading, listOpen, listItem, codeOpen, etc. Indentation, fences, and info blocks are resolved here. Since block parsing is single-pass, nested structures (lists-in-lists, blockquotes) are tracked via a small stack structure.
- Inline parsing: For ranges that require inline formatting (links, emphasis, code spans), inlineTokens performs another byte-level pass within the line boundaries. It produces typed tokens (text, link, img, code, autolink, strike, ...). Multiple passes are avoided by piggybacking on the already segmented line spans.
- Rendering: Both renderers consume the block/inlines event stream without reparsing. The HTML renderer writes directly into an arena-like buffer (see arena.ts), which grows geometrically to limit reallocations. The Canvas renderer replays the same stream into 2D drawing commands, relying on the same inline tokenization for highlighting and styling.
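The line segmentation step can be sketched as a generator over byte offsets. This is a simplified stand-in written for illustration; the name lineSpansSketch, the Span shape, and the CRLF handling are assumptions, and the real lineSpans may differ.

```typescript
// Hypothetical sketch of line segmentation over bytes (not the library's lineSpans).
interface Span {
  s: number; // inclusive start offset
  e: number; // exclusive end offset
}

function* lineSpansSketch(u8: Uint8Array): Generator<Span> {
  let s = 0;
  for (let i = 0; i < u8.length; i++) {
    if (u8[i] === 0x0a /* '\n' */) {
      // Drop a preceding '\r' so CRLF input yields the same spans as LF input.
      const e = i > s && u8[i - 1] === 0x0d ? i - 1 : i;
      yield { s, e };
      s = i + 1;
    }
  }
  // Last line without a trailing newline.
  if (s < u8.length) yield { s, e: u8.length };
}

const sampleDoc = new TextEncoder().encode("# Title\r\n\nBody");
console.log(Array.from(lineSpansSketch(sampleDoc)));
// [ { s: 0, e: 7 }, { s: 9, e: 9 }, { s: 10, e: 14 } ]
```

Only offsets are produced; later stages index back into the same Uint8Array, so no line is ever copied or decoded at this point.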
Important details:
- Writer: The HTML renderer calls HtmlArena.writeEscaped and related methods that operate on byte slices, so writing out HTML stays allocation-friendly and avoids intermediate strings. Only at the end is Uint8Array converted back to a string (TextDecoder).
- Syntax highlighting: The highlighting path is decoupled from the markdown parser. When a fenced code block is found, the captured byte ranges are passed to highlightCodeBlock. Highlighting uses a generative tokenizer compiled from language specs (or precompiled data), then writes markup via the same arena-like approach.
- Canvas rendering: renderToCanvasFromBlocks shares the block event stream but renders into a canvas context. It keeps cached font measurements, performs line-wrapping per block, and triggers a rerender when images finish loading. Virtual scrolling is used when the rendered height exceeds twice the viewport.
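To make the arena idea concrete, here is a minimal sketch of a growable byte buffer with an escaping writer. The class name ByteArena and its methods are assumptions modelled on the description above; HtmlArena's actual API and escaping rules may differ.

```typescript
// Hypothetical sketch of an arena-like output buffer (not the real HtmlArena).
const ENC = new TextEncoder();
const AMP = ENC.encode("&amp;");
const LT = ENC.encode("&lt;");
const GT = ENC.encode("&gt;");
const QUOT = ENC.encode("&quot;");

// Map an HTML-significant byte to its pre-encoded entity, or null for other bytes.
function entityFor(c: number | undefined): Uint8Array | null {
  switch (c) {
    case 0x26: return AMP; // '&'
    case 0x3c: return LT; // '<'
    case 0x3e: return GT; // '>'
    case 0x22: return QUOT; // '"'
    default: return null;
  }
}

class ByteArena {
  private buf: Uint8Array;
  private len = 0;

  constructor(initialCapacity = 64) {
    this.buf = new Uint8Array(initialCapacity);
  }

  private reserve(extra: number): void {
    const needed = this.len + extra;
    if (needed <= this.buf.length) return;
    // Geometric growth: double until the request fits, so appends are amortized O(1).
    let cap = Math.max(1, this.buf.length);
    while (cap < needed) cap *= 2;
    const next = new Uint8Array(cap);
    next.set(this.buf.subarray(0, this.len));
    this.buf = next;
  }

  writeBytes(bytes: Uint8Array): void {
    this.reserve(bytes.length);
    this.buf.set(bytes, this.len);
    this.len += bytes.length;
  }

  /** Copy src[s, e), replacing & < > " with entities; plain runs are copied in bulk. */
  writeEscaped(src: Uint8Array, s: number, e: number): void {
    let run = s;
    for (let i = s; i < e; i++) {
      const ent = entityFor(src[i]);
      if (ent === null) continue;
      this.writeBytes(src.subarray(run, i)); // flush the unescaped run
      this.writeBytes(ent);
      run = i + 1;
    }
    this.writeBytes(src.subarray(run, e));
  }

  /** Decode once, at the very end. */
  toString(): string {
    return new TextDecoder().decode(this.buf.subarray(0, this.len));
  }
}

const arena = new ByteArena(8);
const unsafeText = ENC.encode('a < b & "c"');
arena.writeBytes(ENC.encode("<p>"));
arena.writeEscaped(unsafeText, 0, unsafeText.length);
arena.writeBytes(ENC.encode("</p>"));
console.log(arena.toString()); // <p>a &lt; b &amp; &quot;c&quot;</p>
```

The deliberately small initial capacity forces several doublings in this example, while the only string allocation happens in toString.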
// High-level structure: see src/parser/index.ts
export class MDParser {
  async parse(u8arr: Uint8Array) {
    return renderHTMLFromBlocks(u8arr, this.options);
  }
  renderToCanvas(u8arr: Uint8Array, canvas: HTMLCanvasElement) {
    renderToCanvasFromBlocks(u8arr, canvas, this.options);
  }
}
// renderHTMLFromBlocks (simplified) in src/parser/html-renderer.ts
for (const ev of blocks(u8)) {
  switch (ev.type) {
    case 'heading':
      arena.writeBytes(TAG.hPre[ev.level - 1]);
      renderInline(u8, ev.s, ev.e, arena, options);
      arena.writeBytes(TAG.hClose[ev.level - 1]);
      break;
    case 'codeOpen':
      codeBuffer = [];
      break;
    case 'codeText':
      codeBuffer.push({ s: ev.s, e: ev.e });
      break;
    case 'codeClose': {
      const highlighted = await highlightCodeBlock(join(codeBuffer), codeLang);
      arena.writeBytes(highlighted);
      codeBuffer = null;
      break;
    }
    // ...other block types (lists, blockquotes, tables, info blocks)
  }
}
// inlineTokens (see src/parser/inline-parser.ts) walks a byte slice and emits tokens
if (c === 0x5b /* '[' */) {
  const close = findBracket(u8, i + 1, e, 0x5d); // matching ']'
  // Only treat this as a link when '(' immediately follows ']'
  if (close !== -1 && close + 1 < e && u8[close + 1] === 0x28) {
    const hrefStart = close + 2; // first byte after '('
    const hrefEnd = findBracket(u8, hrefStart, e, 0x29); // matching ')'
    if (hrefEnd !== -1) {
      tokens.push({ kind: 'link', textS: i + 1, textE: close, hrefS: hrefStart, hrefE: hrefEnd });
    }
  }
}
// Canvas renderer consumes the same events (src/parser/canvas-renderer.ts)
for (const ev of blocks(u8)) {
  switch (ev.type) {
    case 'paraLine':
      renderInlineToCanvas(ev.s, ev.e, ctx, currentX, currentY);
      break;
    case 'img': {
      const src = resolveUrlRelativeToBase(...);
      const cached = loadImage(src, rerender);
      drawImageOrPlaceholder(cached, ctx, currentX, currentY);
      break;
    }
    // ...other block rendering
  }
}
}
- Predictable performance: Byte-range processing and arena-like buffers keep allocations low, which shows up in the included micro-benchmarks (npm run test:bench).
- Single-pass correctness: Blocks are identified without backtracking, and inline parsing respects boundaries established by the block layer (for example, emphasis is never resolved inside code spans).
- Separation of concerns: HTML and Canvas renderers consume the same block/inline events so new renderers (e.g., PDF or terminal) can be added without touching the parser core.
- Themeable UI: The public theme builder feeds both the default UI and consumer customizations; the new light/dark presets are simply predefined token sets.
- Streaming input: Although the parser is single-pass, it still expects the full Uint8Array. Enabling incremental parsing (e.g., processing chunks from a stream) would reduce memory spikes for very large documents.
- Error recovery: Inline parsing errs on the side of stopping at malformed constructs. Better error recovery could keep rendering intact even when Markdown is intentionally or accidentally broken.
- Extensibility hooks: Callbacks for custom block/inline tokens could be surfaced. Today, extensions require forking the parser.
- Canvas accessibility: The Canvas renderer focuses on presentation. To serve assistive technologies, a hybrid mode that emits both Canvas and hidden HTML (or ARIA descriptions) would close the accessibility gap.
- More grammars: The highlighting pipeline accepts additional grammars, but coverage remains limited to the precompiled set. Expanding that library or providing an easier authoring path is on the roadmap.
MDParser: Main parser class.
parse: Parses Markdown (as Uint8Array) and returns a Promise that resolves to an HTML string. Pass overrides.baseUrl to rewrite relative links and image sources against the fetched document's origin.
renderToCanvas: Renders Markdown (as Uint8Array) to an HTML5 Canvas.
u8: Utility function to convert a string to Uint8Array using UTF-8 encoding.
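Because the whole pipeline works on UTF-8 bytes, span offsets are byte offsets rather than string indices. The sketch below uses a stand-in named toU8 (an assumed equivalent built on the standard TextEncoder, not the library's own helper) to show where the two diverge.

```typescript
// Assumed equivalent of a string-to-bytes helper, built on the standard TextEncoder.
const toU8 = (text: string): Uint8Array => new TextEncoder().encode(text);

const asciiBytes = toU8("# Hi");
const accentedBytes = toU8("# H\u00e9"); // "# Hé"
console.log(asciiBytes.length); // 4
console.log(accentedBytes.length); // 5, because "é" takes two bytes in UTF-8

// Slicing by byte offsets and decoding only the needed range:
console.log(new TextDecoder().decode(accentedBytes.subarray(2))); // Hé
```

Any code that maps spans back to text should therefore slice the Uint8Array first and decode afterwards, instead of indexing into the original string.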
- InlineToken: Token types for inline parsing
- BlockEvent: Event types for block parsing
- LineSpan: Line position information
- TextStyle: Styling information for canvas rendering
- DrawResult: Canvas drawing result coordinates
You can also use the individual parsers and renderers:
- lineSpans(u8: Uint8Array): Generator yielding line spans
- inlineTokens(u8: Uint8Array, s: number, e: number): Generator yielding inline tokens
- blocks(u8: Uint8Array): Generator yielding block events
- renderHTMLFromBlocks(u8: Uint8Array, options?: ParserOptions): Render blocks to HTML
- renderToCanvasFromBlocks(u8: Uint8Array, canvas: HTMLCanvasElement, options?: ParserOptions): Render blocks to canvas
The project uses modern TypeScript with strict type checking enabled:
- ES2022 target
- ESNext modules
- Strict mode enabled
- Bundler module resolution
- Comprehensive linting rules
Test suites include golden comparisons, property-based checks, and micro-benchmarks:
# Run all tests
npm test
# Run specific test suites
npm run test:golden # Golden tests for parser output
npm run test:property # Property-based tests for parser invariants
npm run test:bench # Performance benchmarks
# Watch mode for development
npm run test:watch
This is designed to work with Vite or similar modern bundlers.
npm install
npm run dev
npm run build
MIT