**Effortlessly convert raw Markdown to Telegram plain text + MessageEntity pairs.**
Say goodbye to MarkdownV2 escaping headaches! This library parses Markdown (including LLM output, GitHub READMEs, etc.)
and produces (text, entities) tuples that can be sent directly via the Telegram Bot API, with no parse_mode needed.
- Handles input of any length: long text can be split automatically into multiple messages.
- Entity offsets are measured in UTF-16 code units, exactly as Telegram requires.
- We also support LaTeX-to-Unicode conversion, expandable block quotes, and Mermaid diagram rendering.
- Built on pyromark (Rust pulldown-cmark bindings) for speed and correctness.
Note
v1.0.0 is a breaking change from 0.x. The output is now (str, list[MessageEntity]) instead of a MarkdownV2 string.
The old markdownify() and standardize() functions have been removed.
Currently in release candidate. Install with pip install telegramify-markdown --pre to try it.
The default pip install telegramify-markdown (without --pre) still installs the stable 0.5.x version.
[Screenshots of rendered output: convert(), convert(), telegramify()]
Requires Python 3.10+. Currently in release candidate; use the pre-release flag for your package manager.

    # uv (recommended)
    uv add telegramify-markdown --prerelease=allow
    uv add "telegramify-markdown[mermaid]" --prerelease=allow

    # pip
    pip install telegramify-markdown --pre
    pip install "telegramify-markdown[mermaid]" --pre

    # PDM
    pdm add telegramify-markdown --prerelease
    pdm add "telegramify-markdown[mermaid]" --prerelease

    # Poetry
    poetry add telegramify-markdown --allow-prereleases
    poetry add "telegramify-markdown[mermaid]" --allow-prereleases

- If you just want to send static text and don't want to worry about formatting, use `convert()`.
- If you are developing an LLM application or need to send potentially very long text, use `telegramify()`.
- If you need to split `convert()` output manually, use `split_entities()`.


    from telebot import TeleBot
    from telegramify_markdown import convert

    bot = TeleBot("YOUR_TOKEN")
    md = "**Bold**, _italic_, and `code`."
    text, entities = convert(md)
    bot.send_message(
        chat_id,
        text,
        entities=[e.to_dict() for e in entities],
    )

No `parse_mode` parameter is needed: Telegram reads the entities directly.

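To see what Telegram reads in place of `parse_mode`, it helps to look at what an entity encodes: a type plus an offset and a length into the plain text. The sketch below uses hand-written entity dicts for the sample sentence (an assumption for illustration; the exact list returned by `convert()` may differ) and a hypothetical helper, `entity_text`, that slices the text by UTF-16 offsets.

```python
def entity_text(text: str, offset: int, length: int) -> str:
    """Return the substring an entity covers, using UTF-16 code-unit offsets."""
    raw = text.encode("utf-16-le")  # two bytes per UTF-16 code unit, no BOM
    return raw[offset * 2 : (offset + length) * 2].decode("utf-16-le")


# Plain text once the Markdown markers are stripped (illustrative).
text = "Bold, italic, and code."

# Hand-written entities in the Bot API's MessageEntity shape (illustrative).
entities = [
    {"type": "bold", "offset": 0, "length": 4},
    {"type": "italic", "offset": 6, "length": 6},
    {"type": "code", "offset": 18, "length": 4},
]

covered = [entity_text(text, e["offset"], e["length"]) for e in entities]
print(covered)
```

Because the formatting lives in the entity list rather than in the text, the text itself never needs escaping.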
For LLM output or long documents, telegramify() splits text, extracts code blocks as files,
and renders Mermaid diagrams as images:

    import asyncio

    from telebot import TeleBot
    from telegramify_markdown import telegramify
    from telegramify_markdown.content import ContentType

    bot = TeleBot("YOUR_TOKEN")

    md = """
    # Report
    Here is some analysis with **bold** and _italic_ text.
    ```python
    print("hello world")
    ```
    And a diagram:
    ```mermaid
    graph TD
    A-->B
    ```
    """

    async def send():
        results = await telegramify(md, max_message_length=4090)
        for item in results:
            if item.content_type == ContentType.TEXT:
                bot.send_message(
                    chat_id,
                    item.text,
                    entities=[e.to_dict() for e in item.entities],
                )
            elif item.content_type == ContentType.PHOTO:
                bot.send_photo(
                    chat_id,
                    (item.file_name, item.file_data),
                    caption=item.caption_text or None,
                    caption_entities=[e.to_dict() for e in item.caption_entities] or None,
                )
            elif item.content_type == ContentType.FILE:
                bot.send_document(
                    chat_id,
                    (item.file_name, item.file_data),
                    caption=item.caption_text or None,
                    caption_entities=[e.to_dict() for e in item.caption_entities] or None,
                )

    asyncio.run(send())

If you use `convert()` but need to split long output yourself:


    from telegramify_markdown import convert, split_entities

    text, entities = convert(long_markdown)
    for chunk_text, chunk_entities in split_entities(text, entities, max_utf16_len=4096):
        bot.send_message(
            chat_id,
            chunk_text,
            entities=[e.to_dict() for e in chunk_entities],
        )

Customize heading symbols, link symbols, and expandable citation behavior:


    from telegramify_markdown.config import get_runtime_config

    cfg = get_runtime_config()
    cfg.markdown_symbol.heading_level_1 = "📌"
    cfg.markdown_symbol.link = "🔗"
    cfg.cite_expandable = True  # Long quotes become expandable_blockquote

    # For clean output without emoji heading prefixes:
    # cfg.markdown_symbol.heading_level_1 = ""
    # cfg.markdown_symbol.heading_level_2 = ""
    # cfg.markdown_symbol.heading_level_3 = ""
    # cfg.markdown_symbol.heading_level_4 = ""

`convert()` is synchronous. It converts a Markdown string to plain text and a list of `MessageEntity` objects.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `markdown` | `str` | required | Raw Markdown text |
| `latex_escape` | `bool` | `True` | Convert LaTeX `\(...\)` and `\[...\]` to Unicode symbols |

Returns (text, entities) where text is plain text and entities is a list of MessageEntity.

`telegramify()` is async and runs the full pipeline: it converts Markdown, splits long messages, extracts code blocks as files, and renders Mermaid diagrams as images.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `content` | `str` | required | Raw Markdown text |
| `max_message_length` | `int` | `4096` | Max UTF-16 code units per text message |
| `latex_escape` | `bool` | `True` | Convert LaTeX to Unicode |

Returns an ordered list of Text, File, or Photo objects.

`split_entities()` splits text and entities into chunks within a UTF-16 length limit. It splits at newline boundaries; an entity that spans a split point is clipped so that each chunk carries its own part of it.
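
The splitting rule described above can be sketched in a few lines. This is a simplified illustration, not the library's implementation: `split_sketch` is a hypothetical name, entities are plain dicts, and it assumes no single line is longer than the limit.

```python
def utf16_len(s: str) -> int:
    """Length of s in UTF-16 code units."""
    return len(s.encode("utf-16-le")) // 2


def split_sketch(text, entities, max_utf16_len):
    """Yield (chunk_text, chunk_entities) pairs, splitting only at line ends."""
    # 1. Pick chunk boundaries in UTF-16 units, adding whole lines greedily.
    #    Assumption: every individual line fits within max_utf16_len.
    bounds = []
    start = pos = 0
    for line in text.splitlines(keepends=True):
        n = utf16_len(line)
        if pos > start and pos - start + n > max_utf16_len:
            bounds.append((start, pos))
            start = pos
        pos += n
    if pos > start:
        bounds.append((start, pos))

    # 2. Cut the text and clip each entity to the chunks it overlaps.
    raw = text.encode("utf-16-le")
    for start, end in bounds:
        chunk = raw[start * 2 : end * 2].decode("utf-16-le")
        clipped = []
        for e in entities:
            lo = max(e["offset"], start)
            hi = min(e["offset"] + e["length"], end)
            if lo < hi:  # the entity overlaps this chunk
                clipped.append({**e, "offset": lo - start, "length": hi - lo})
        yield chunk, clipped


text = "aaaa\nbbbb\ncccc\n"
entities = [{"type": "bold", "offset": 2, "length": 10}]
chunks = list(split_sketch(text, entities, max_utf16_len=5))
for chunk_text, chunk_entities in chunks:
    print(repr(chunk_text), chunk_entities)
```

Each clipped entity is re-based so that its offset is relative to the start of its own chunk, which is what the Bot API expects for every message sent.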

    @dataclasses.dataclass(slots=True)
    class MessageEntity:
        type: str                    # "bold", "italic", "code", "pre", "text_link", etc.
        offset: int                  # Start position in UTF-16 code units
        length: int                  # Length in UTF-16 code units
        url: str | None              # For "text_link" entities
        language: str | None         # For "pre" entities (code block language)
        custom_emoji_id: str | None  # For "custom_emoji" entities

        def to_dict(self) -> dict: ...

| Class | Fields | Description |
|---|---|---|
| `Text` | `text`, `entities`, `content_trace` | A text message segment |
| `File` | `file_name`, `file_data`, `caption_text`, `caption_entities`, `content_trace` | An extracted code block |
| `Photo` | `file_name`, `file_data`, `caption_text`, `caption_entities`, `content_trace` | A rendered Mermaid diagram |

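
One plausible shape for `MessageEntity.to_dict()` is to emit only the fields that are set, since the Bot API treats `url`, `language`, and `custom_emoji_id` as optional. The sketch below uses a stand-in class, `EntitySketch`, and is an assumption about the behavior rather than the library's actual code.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class EntitySketch:
    """Stand-in for MessageEntity, for illustration only."""

    type: str
    offset: int
    length: int
    url: Optional[str] = None
    language: Optional[str] = None
    custom_emoji_id: Optional[str] = None

    def to_dict(self) -> dict:
        # Drop unset optional fields so the payload stays minimal.
        return {k: v for k, v in dataclasses.asdict(self).items() if v is not None}


link = EntitySketch(type="text_link", offset=0, length=4, url="https://example.com")
print(link.to_dict())
```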

`utf16_len()` returns the length of a string in UTF-16 code units (what Telegram uses for offsets).
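
The distinction matters for characters outside the Basic Multilingual Plane, such as most emoji: Python counts each as one character, but UTF-16 encodes it as two code units. A minimal stand-in is shown below; it makes no claim about how the library implements `utf16_len()`, and the name `utf16_units` is hypothetical.

```python
def utf16_units(s: str) -> int:
    """Count UTF-16 code units: encode without a BOM, two bytes per unit."""
    return len(s.encode("utf-16-le")) // 2


sample = "hi 👍"
print(len(sample), utf16_units(sample))  # 4 Python characters, 5 UTF-16 units
```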

- Headings (Levels 1-6: H1-H2 bold+underline, H3-H4 bold, H5-H6 italic; H1-H4 with emoji prefix)
- `**Bold**`, `*Italic*`, `~~Strikethrough~~`
- `||Spoiler||`
- `[Links](url)` and images
- Telegram custom emoji ``
- Inline `code` and fenced code blocks
- Block quotes `>` (with expandable citation support)
- Tables (rendered as monospace `pre` blocks)
- Ordered and unordered lists
- Task lists `- [x]` / `- [ ]`
- Horizontal rules `---`
- LaTeX math `\(...\)` and `\[...\]` (converted to Unicode)
- Mermaid diagrams (rendered as images, requires `[mermaid]` extra)

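
The heading rule at the top of this list can be written as a small lookup. This only restates the list; `heading_entity_types` and `heading_has_emoji_prefix` are hypothetical helpers, not part of the library's API.

```python
def heading_entity_types(level: int) -> list[str]:
    """Map a Markdown heading level (1-6) to Telegram entity types."""
    if not 1 <= level <= 6:
        raise ValueError("heading level must be between 1 and 6")
    if level <= 2:
        return ["bold", "underline"]  # H1-H2
    if level <= 4:
        return ["bold"]  # H3-H4
    return ["italic"]  # H5-H6


def heading_has_emoji_prefix(level: int) -> bool:
    """H1-H4 carry an emoji prefix by default; H5-H6 do not."""
    return 1 <= level <= 4


for level in range(1, 7):
    print(level, heading_entity_types(level), heading_has_emoji_prefix(level))
```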
Copy this block into your AI assistant's context (e.g. CLAUDE.md, Cursor Rules, etc.) to get
accurate code generation for telegramify-markdown:

# telegramify-markdown integration guide

## Install

    uv add telegramify-markdown --prerelease=allow  # or: pip install telegramify-markdown --pre

## API (v1.0.0+): outputs plain text + MessageEntity, NOT MarkdownV2 strings

### convert(): sync, single message

    from telegramify_markdown import convert

    text, entities = convert("**bold** and _italic_")
    bot.send_message(chat_id, text, entities=[e.to_dict() for e in entities])
    # Do NOT set parse_mode; entities replace it entirely.

### telegramify(): async, auto-splits long text, extracts code blocks as files

    from telegramify_markdown import telegramify
    from telegramify_markdown.content import ContentType

    # Call from inside an async function.
    results = await telegramify(md, max_message_length=4090)
    for item in results:
        if item.content_type == ContentType.TEXT:
            bot.send_message(chat_id, item.text, entities=[e.to_dict() for e in item.entities])
        elif item.content_type == ContentType.FILE:
            bot.send_document(chat_id, (item.file_name, item.file_data))
        elif item.content_type == ContentType.PHOTO:
            bot.send_photo(chat_id, (item.file_name, item.file_data))

### split_entities(): manual splitting for convert() output

    from telegramify_markdown import convert, split_entities

    text, entities = convert(long_md)
    for chunk_text, chunk_entities in split_entities(text, entities, max_utf16_len=4096):
        bot.send_message(chat_id, chunk_text, entities=[e.to_dict() for e in chunk_entities])

### Configuration

    from telegramify_markdown.config import get_runtime_config

    cfg = get_runtime_config()
    cfg.markdown_symbol.heading_level_1 = "📌"
    cfg.cite_expandable = True

## Critical rules

- entities must be passed as list[dict] via [e.to_dict() for e in entities], NEVER as JSON string
- NEVER set parse_mode when sending with entities: they are mutually exclusive
- All entity offsets are UTF-16 code units. Use utf16_len() to measure text length.
- Requires Python 3.10+

This library is inspired by npm:telegramify-markdown.
LaTeX escape is inspired by latex2unicode and @yym68686.
This project is licensed under the MIT License; see the LICENSE file for details.


