Showcase: Advanced Markdown Chunker – content-aware Markdown chunking for Dify RAG #29635

asukhodko · 2025-12-14T10:17:03Z

Hi everyone,

I’ve been working with Markdown-based knowledge bases in Dify for a while, and I kept bumping into the same issues with naive chunking: code blocks getting split in the middle, nested lists broken apart, and chunks that don’t really match how the doc is structured.

So I ended up building a plugin called Advanced Markdown Chunker and wanted to share it here.

The idea is to make Markdown chunking a bit smarter and more aligned with real-world docs. Instead of using a single fixed strategy, the plugin looks at the document and decides which of four internal strategies fits best for that particular file:

a code-aware mode for API docs / technical references / tutorials with lots of fenced code blocks;
a list-aware mode for changelogs, release notes, feature lists, outlines, etc.;
a more structural mode for docs with deeper heading hierarchies;
and a fallback strategy for mixed or simple Markdown that doesn’t strongly lean in any direction.
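
To make the selection idea concrete, here is a rough sketch of how a per-document strategy pick could work. It is not the plugin's actual code: the function name `pick_strategy`, the strategy labels and every threshold are assumptions made up for illustration.

```python
import re

FENCE_RE = re.compile(r"^(`{3,}|~{3,})")
LIST_RE = re.compile(r"^\s*(?:[-*+]|\d+[.)])\s+")
HEADING_RE = re.compile(r"^(#{1,6})\s+\S")


def pick_strategy(markdown: str) -> str:
    """Return 'code', 'list', 'structural' or 'fallback' (hypothetical labels)."""
    lines = markdown.splitlines()
    total = max(len(lines), 1)
    code_lines = list_lines = headings = max_depth = 0
    fence = None  # opening fence marker while we are inside a code block

    for line in lines:
        stripped = line.strip()
        opening = FENCE_RE.match(stripped)
        if fence is None and opening:
            fence = opening.group(1)
            code_lines += 1
            continue
        if fence is not None:
            code_lines += 1
            # A closing fence repeats the fence character and carries no other text.
            if stripped.startswith(fence) and set(stripped) == {fence[0]}:
                fence = None
            continue
        if LIST_RE.match(line):
            list_lines += 1
        heading = HEADING_RE.match(line)
        if heading:
            headings += 1
            max_depth = max(max_depth, len(heading.group(1)))

    # Illustrative thresholds only; real values would need tuning on real docs.
    if code_lines / total >= 0.3:
        return "code"
    if list_lines / total >= 0.4:
        return "list"
    if headings >= 3 and max_depth >= 3:
        return "structural"
    return "fallback"
```

With these made-up thresholds, a short API page that is mostly one fenced block lands in the code-aware mode, while a flat changelog of bullet items lands in the list-aware mode.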

On top of that, it tries hard not to break Markdown in silly places. Code blocks, tables and lists are kept intact, and chunks “remember” which headers they belong to so you don’t lose the document structure when you index it. Neighbouring chunks also share some overlap (up to ~35%) so context doesn’t abruptly stop at a boundary.
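
A minimal sketch of the two ideas in that paragraph, keeping fenced code blocks atomic while tracking the header path, then packing blocks into chunks with a capped overlap. Everything here (`Block`, `split_blocks`, `pack_chunks`, the character-based size limit, treating overlap as a plain tail of the previous chunk) is an assumption for illustration and may differ from what the plugin does. Tables and lists are not handled specially in this sketch.

```python
import re
from dataclasses import dataclass

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)")


@dataclass
class Block:
    text: str
    header_path: tuple


def split_blocks(markdown):
    """Split on blank lines, but never inside a fenced code block."""
    blocks, buf, path = [], [], []
    in_fence = False

    def flush():
        if buf:
            blocks.append(Block("\n".join(buf), tuple(path)))
            buf.clear()

    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            if not in_fence:
                flush()              # close the paragraph before the fence
                buf.append(line)
                in_fence = True
            else:
                buf.append(line)
                in_fence = False
                flush()              # the whole fenced block becomes one Block
            continue
        if in_fence:
            buf.append(line)         # blank lines inside code are kept
            continue
        heading = HEADING_RE.match(line)
        if heading:
            flush()
            level = len(heading.group(1))
            del path[level - 1:]     # drop headers at this level or deeper
            path.append(heading.group(2))
        elif not line.strip():
            flush()
        else:
            buf.append(line)
    flush()
    return blocks


def pack_chunks(blocks, max_chars=1000, overlap_ratio=0.35):
    """Greedily pack whole blocks; oversized blocks stay whole rather than split."""
    groups, current = [], []
    for block in blocks:
        size = sum(len(b.text) for b in current)
        if current and size + len(block.text) > max_chars:
            groups.append(current)
            current = []
        current.append(block)
    if current:
        groups.append(current)

    chunks, prev_text = [], ""
    for group in groups:
        text = "\n\n".join(b.text for b in group)
        budget = int(len(text) * overlap_ratio)  # cap overlap at ~35% of the chunk
        overlap = prev_text[-budget:] if budget and prev_text else ""
        chunks.append({"text": text,
                       "header_path": group[0].header_path,
                       "overlap_prev": overlap})
        prev_text = text
    return chunks
```

The overlap is stored next to the chunk text here instead of being merged into it, so the chunk body itself always starts and ends on a block boundary.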

Each chunk can optionally include some metadata like:

content type (text/code/list/table),
the header path,
source line numbers,
a bit of previous/next overlap context.

That tends to help a lot when you’re debugging retrieval or building filters on top of a vector store. All parsing and chunking run fully locally inside Dify – no external APIs involved and no LLM needed just to split Markdown.
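
As a rough illustration of the metadata listed above and the kind of filtering it enables, here is a small sketch. The field names, the `ChunkMetadata` class and the `filter_chunks` helper are hypothetical and do not reflect the plugin's real output schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ChunkMetadata:
    content_type: str                  # "text", "code", "list" or "table"
    header_path: List[str] = field(default_factory=list)
    start_line: int = 0                # source line numbers, 1-based
    end_line: int = 0
    overlap_prev: str = ""             # tail of the previous chunk
    overlap_next: str = ""             # head of the next chunk


def filter_chunks(
    chunks: List[Tuple[str, ChunkMetadata]],
    content_type: Optional[str] = None,
    under_header: Optional[str] = None,
) -> List[Tuple[str, ChunkMetadata]]:
    """Keep chunks matching the given content type and/or header in their path."""
    result = []
    for text, meta in chunks:
        if content_type is not None and meta.content_type != content_type:
            continue
        if under_header is not None and under_header not in meta.header_path:
            continue
        result.append((text, meta))
    return result


# Example records, invented for this sketch.
chunks = [
    ("Run the installer.", ChunkMetadata("text", ["Guide", "Install"], 5, 5)),
    ("```bash\npip install x\n```", ChunkMetadata("code", ["Guide", "Install"], 7, 9)),
    ("- fixed a bug", ChunkMetadata("list", ["Changelog"], 20, 20)),
]
code_only = filter_chunks(chunks, content_type="code")
install_section = filter_chunks(chunks, under_header="Install")
```

The same kind of predicate could be expressed as a metadata filter in a vector store, for example restricting a query to code chunks under one section.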

This is mainly aimed at Markdown-heavy RAG setups: docs, API/SDK guides with a lot of code, changelogs / release notes, technical specs, architecture docs, etc.

Links:

GitHub repo: https://github.com/asukhodko/dify-markdown-chunker
Plugin PR to dify-plugins (for the Marketplace): "feat(tools): add markdown_chunker - Advanced Markdown Chunker" (dify-plugins#1685)
