A configurable Markdown parser and HTML renderer built from scratch using regex for inline formatting.
- Regular Expressions: Using the `regex` crate for pattern matching
- Enums for AST: Representing document structure
- Trait System: Creating extensible rendering architecture
- Module Organization: Structuring larger projects with multiple modules
- Builder Pattern: Ergonomic configuration design
- Integration Testing: Testing complete workflows
- Documentation: Writing comprehensive doc comments
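
The builder-style configuration listed above can be sketched roughly as follows. The method names `Config::new`, `with_full_html`, and `with_max_header_level` come from this README's own usage; the field names, field types, default values, and the clamping behaviour are assumptions made for illustration and may differ from the actual `types.rs`.

```rust
/// Hypothetical configuration type. Field names and defaults are
/// assumptions made for this sketch, not the project's real definition.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub input_path: String,
    pub output_path: String,
    pub full_html: bool,
    pub max_header_level: u8,
}

impl Config {
    /// Starts from assumed defaults: fragment output, all six header levels.
    pub fn new(input_path: &str, output_path: &str) -> Self {
        Config {
            input_path: input_path.to_string(),
            output_path: output_path.to_string(),
            full_html: false,
            max_header_level: 6,
        }
    }

    /// Each `with_*` method takes `self` by value and returns it,
    /// which is what allows the calls to be chained.
    pub fn with_full_html(mut self, full_html: bool) -> Self {
        self.full_html = full_html;
        self
    }

    /// Clamps the level to the 1..=6 range that HTML headings allow
    /// (an assumed policy; returning an error would also be reasonable).
    pub fn with_max_header_level(mut self, level: u8) -> Self {
        self.max_header_level = level.clamp(1, 6);
        self
    }
}
```

Taking `self` by value keeps call sites short and avoids a separate builder type, at the cost of not being able to validate the combination of options in a final `build()` step.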

- Headers (`#` through `######`)
- Paragraphs
- Unordered lists (`-`)
- Bold (`**text**`)
- Italic (`*text*`)
- Code (`` `text` ``)
- Links (`[text](url)`)
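
As a rough illustration of how the block-level constructs above might be recognised, here is a minimal line classifier that uses only the standard library. The `Element` enum is a simplified stand-in for the project's `MarkdownElement`, and `parse_line` together with its `max_header_level` fallback rule are hypothetical; the real `parser.rs` may work differently.

```rust
/// Simplified stand-in for the project's `MarkdownElement` enum.
#[derive(Debug, PartialEq)]
pub enum Element {
    Header(u8, String), // level, text
    Paragraph(String),
    ListItem(String),
}

/// Classifies one non-empty line of Markdown.
/// `max_header_level` is a hypothetical limit: headers deeper than it
/// fall back to paragraphs instead of producing an error.
pub fn parse_line(line: &str, max_header_level: u8) -> Element {
    let trimmed = line.trim();

    // Unordered list item: "- " followed by the item text.
    if let Some(item) = trimmed.strip_prefix("- ") {
        return Element::ListItem(item.trim().to_string());
    }

    // Header: one to six '#' characters followed by a space.
    // '#' is a single byte, so the count is also a valid byte offset.
    let hashes = trimmed.chars().take_while(|&c| c == '#').count();
    if (1..=6).contains(&hashes) && hashes <= usize::from(max_header_level) {
        if let Some(text) = trimmed[hashes..].strip_prefix(' ') {
            return Element::Header(hashes as u8, text.trim().to_string());
        }
    }

    // Anything else is treated as paragraph text.
    Element::Paragraph(trimmed.to_string())
}
```

Inline constructs (bold, italic, code, links) would then be applied to the text carried by each element.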

`cargo run` converts `test.md` to `text.html` using the default configuration.
Input (Markdown):
# Main Title
This is a paragraph with **bold** and *italic* text.
- First item
- Second item with `code`
- Third [link](https://example.com)

Output (HTML):
<h1>Main Title</h1>
<p>This is a paragraph with <strong>bold</strong> and <em>italic</em> text.</p>
<ul>
<li>First item</li>
<li>Second item with <code>code</code></li>
<li>Third <a href="https://example.com">link</a></li>
</ul>

markdown_to_html_converter/
├── src/
│ ├── lib.rs # Public API, re-exports
│ ├── types.rs # Core types (Config, MarkdownElement, Renderer trait)
│ ├── parser.rs # Markdown parsing logic
│ ├── html.rs # HTML rendering implementation
│ └── file.rs # File I/O operations
├── tests/
│ └── integration_test.rs
└── Cargo.toml

#[derive(Debug)]
pub enum MarkdownElement {
    Header(u8, String), // level, text
    Paragraph(String),
    List(String),
}

pub trait Renderer {
    fn render(&self, elements: &[MarkdownElement]) -> Result<String>;
    fn render_element(&self, element: &MarkdownElement) -> Result<String>;
}

let config = Config::new("input.md", "output.html")
    .with_full_html(true)
    .with_max_header_level(4);

use regex::Regex;
use std::sync::LazyLock;

// Compiled once, on first access.
static BOLD_REGEX: LazyLock<Regex> =
    LazyLock::new(|| Regex::new(r"\*\*([^*]+)\*\*").unwrap());

pub fn parse_inner(text: &str) -> String {
    let replaced = BOLD_REGEX
        .replace_all(text, "<strong>$1</strong>")
        .to_string();
    // ... more replacements
    replaced
}

pub fn group_list(html_el: &[String]) -> Vec<String> {
    let mut new_html = Vec::new();
    let mut new_group: Vec<&str> = Vec::new();
    for (i, cur) in html_el.iter().enumerate() {
        if cur.starts_with("<li>") {
            new_group.push(cur);
        } else {
            new_html.push(cur.clone());
        }
        // Wrap the group when the next element isn't a list item,
        // including when the input ends on a list item.
        let next_is_item = html_el
            .get(i + 1)
            .is_some_and(|next| next.starts_with("<li>"));
        if !next_is_item && !new_group.is_empty() {
            new_html.push(format!("<ul>\n{}\n</ul>", new_group.join("\n")));
            new_group.clear();
        }
    }
    new_html
}

use markdown_to_html_converter::{Config, HtmlRenderer};
use std::fs;
use tempfile::tempdir;

#[test]
fn test_full_conversion_pipeline() {
    // The temporary directory is removed when `dir` is dropped.
    let dir = tempdir().unwrap();
    let input_path = dir.path().join("test.md");
    let output_path = dir.path().join("test.html");
    fs::write(&input_path, "# Test Header\n\nParagraph").unwrap();

    let config = Config::new(
        input_path.to_str().unwrap(),
        output_path.to_str().unwrap(),
    );
    let renderer = HtmlRenderer::new(config);
    renderer.convert_file().unwrap();

    let output = fs::read_to_string(&output_path).unwrap();
    assert!(output.contains("<h1>Test Header</h1>"));
}

- LazyLock for Regex: Compiling regex patterns once, on first use
- Trait Objects: Creating extensible systems with trait-based architecture
- Module Privacy: Using `pub` strategically for clean APIs
- Doc Comments: Writing comprehensive documentation with examples
- Error Propagation: Using `anyhow::Result` for flexible error handling
- Integration Testing: Testing complete workflows with `tempfile`
- Re-exports: Using `lib.rs` to create convenient public APIs
- String Replacement: Chaining `.replace_all()` for multiple transformations
- Parser validation for different markdown constructs
- Header level validation
- Individual element rendering
- Full file conversion pipeline
- Error handling for invalid inputs
- Temporary file management with `tempfile`
use markdown_to_html_converter::{Config, HtmlRenderer, parse_md};
// Parse markdown
let config = Config::default();
let elements = parse_md(markdown_content, &config)?;
// Render to HTML
let renderer = HtmlRenderer::new(config);
let html = renderer.render(&elements)?;
// Or convert file directly
let config = Config::new("input.md", "output.html");
let renderer = HtmlRenderer::new(config);
renderer.convert_file()?;

- Code blocks with syntax highlighting
- Ordered lists (numbered)
- Blockquotes
- Horizontal rules
- Tables
- Images
- Nested lists
- Better handling of nested formatting (e.g., bold inside italic)
- Custom rendering backends (LaTeX, plain text)
- Incremental parsing for large documents
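
To make the "custom rendering backends" idea concrete, here is a minimal sketch of how a second backend could plug into a trait-based design. Everything in it is a simplified assumption: `Element` stands in for `MarkdownElement`, the trait is infallible instead of returning `anyhow::Result`, and `HtmlBackend`, `PlainTextBackend`, and `render_with` are hypothetical names that do not exist in the project.

```rust
/// Simplified stand-in for the project's `MarkdownElement` enum.
pub enum Element {
    Header(u8, String),
    Paragraph(String),
}

/// Reduced version of a renderer trait: infallible, so no `Result`.
pub trait Renderer {
    fn render_element(&self, element: &Element) -> String;

    /// Default method: renders every element and joins them with newlines.
    fn render(&self, elements: &[Element]) -> String {
        elements
            .iter()
            .map(|e| self.render_element(e))
            .collect::<Vec<_>>()
            .join("\n")
    }
}

pub struct HtmlBackend;
pub struct PlainTextBackend;

impl Renderer for HtmlBackend {
    fn render_element(&self, element: &Element) -> String {
        match element {
            Element::Header(level, text) => format!("<h{level}>{text}</h{level}>"),
            Element::Paragraph(text) => format!("<p>{text}</p>"),
        }
    }
}

impl Renderer for PlainTextBackend {
    fn render_element(&self, element: &Element) -> String {
        match element {
            // Plain text has no heading markup, so headers are upper-cased.
            Element::Header(_, text) => text.to_uppercase(),
            Element::Paragraph(text) => text.clone(),
        }
    }
}

/// Accepts any backend through a trait object.
pub fn render_with(backend: &dyn Renderer, elements: &[Element]) -> String {
    backend.render(elements)
}
```

Because callers only depend on `&dyn Renderer`, a LaTeX backend would be one more `impl` block with no changes to the parsing side.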

- Complex nested formatting requires lookahead/lookbehind regex (not supported in Rust's `regex` crate)
- Example: `**bold with *italic* inside**` may not render perfectly
- Workaround: Use simpler formatting or implement a proper parser
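
One way the "proper parser" workaround could look is a small recursive scanner that replaces the regex pass for bold and italic. This is only a sketch under stated assumptions: `render_inline` and `find_closing` are hypothetical helpers, not part of the project, and the code ignores escaping, `***` runs, inline code, and links.

```rust
/// Converts `**bold**` and `*italic*` to HTML, recursing into the
/// delimited text so that nested formatting is handled.
pub fn render_inline(text: &str) -> String {
    let mut out = String::new();
    let mut rest = text;

    while let Some(start) = rest.find('*') {
        out.push_str(&rest[..start]);
        let after = &rest[start..];

        // Try the two-character delimiter first so "**" is not read as two "*".
        let (delim, tag) = if after.starts_with("**") {
            ("**", "strong")
        } else {
            ("*", "em")
        };
        let body = &after[delim.len()..];

        match find_closing(body, delim) {
            Some(end) => {
                let inner = render_inline(&body[..end]);
                out.push_str(&format!("<{tag}>{inner}</{tag}>"));
                rest = &body[end + delim.len()..];
            }
            None => {
                // Unmatched delimiter: emit it literally and continue.
                out.push_str(delim);
                rest = body;
            }
        }
    }
    out.push_str(rest);
    out
}

/// Returns the byte offset of the closing `delim` in `body`. For the single
/// `*` delimiter, any `**` pair is skipped so that bold markers inside
/// italic text are not mistaken for the closing marker.
fn find_closing(body: &str, delim: &str) -> Option<usize> {
    if delim == "**" {
        return body.find("**");
    }
    let bytes = body.as_bytes();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'*' {
            if bytes.get(i + 1) == Some(&b'*') {
                i += 2; // part of a "**" marker, not a closing "*"
                continue;
            }
            return Some(i);
        }
        i += 1;
    }
    None
}
```

Recursing on the text between a matched pair is what lets the inner `*italic*` be processed after the outer `**` has been consumed, which a single non-overlapping regex pass cannot do.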
- Chapter 7: Managing Growing Projects with Packages, Crates, and Modules
- Chapter 10: Generic Types, Traits, and Lifetimes
- Chapter 14: More About Cargo and Crates.io
[dependencies]
anyhow = "1.0"
regex = "1.10"
[dev-dependencies]
tempfile = "3.8"

Status: ✅ Completed | Difficulty: Intermediate-Advanced