Voice transcription meets autonomous AI agent. Like OpenClaw for your voice memos — transcribe, chat, and let an AI agent work with your files, all through Telegram.
Self-host with your own API keys. Free forever. MIT licensed.

Transcription + AI agent in one bot. Two things that belong together but nobody combines:
- Transcription — send a voice memo, get a transcript with speaker labels and timestamps. Any language, any length. That's the entry point.
- Autonomous AI agent — the same bot holds a conversation, summarizes long recordings, and spawns a Claude Code Agent that reads your stored files to answer questions. Like OpenClaw, but specialized for voice-first workflows.
Your Telegram chat becomes a personal AI workspace where voice memos, transcripts, files, and conversations all live together.
- iPhone Voice Memos is the best recorder. One tap from the lock screen. No app to open, no meeting to join. It never crashes. Nothing else comes close.
- But voice memos are a dead end. You have hundreds on your phone. You'll never listen to them again.
- Recording apps solve the wrong problem. Granola, Otter, Fireflies — they replace Voice Memos instead of building on it.
- AI assistants don't know your context. ChatGPT, OpenClaw — powerful, but they can't search last Tuesday's meeting or find what your manager said about the deadline.
- What's missing is the bridge. Transcription that feeds into an AI agent with memory of everything you've recorded and stored.
Send anything to the Telegram bot. It figures out what to do.
| You send | Bot does |
|---|---|
| Voice memo or audio/video file | Transcribes with speaker labels + timestamps. Long recordings get an AI summary. |
| Text message | AI chat — ask questions, get help, have a conversation. |
| Text about your files — "what did we discuss yesterday?" | Searches your stored transcripts and files, answers with context. |
| Any other file (PDF, image, doc...) | Stores it for you. Ask about it later. |
All files (audio, transcripts, uploads) are stored locally and optionally synced to S3. With S3 sync enabled, they are restored after a container restart.
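The routing table above can be sketched as a small dispatch function. This is a minimal illustration, not the code in `telegram_bot.py`: the `Incoming` type, the `route` function, the route names, and the `mentions_files` predicate are all hypothetical, and how the real bot decides that a text message is "about your files" is not documented here.

```python
from dataclasses import dataclass
from pathlib import PurePath
from typing import Callable, Optional

# Extensions treated as audio/video (taken from the formats this README lists).
MEDIA_EXTENSIONS = {".m4a", ".mp3", ".ogg", ".wav", ".mp4", ".mov"}


@dataclass
class Incoming:
    """Hypothetical, simplified view of one Telegram message."""
    text: Optional[str] = None
    filename: Optional[str] = None
    is_voice: bool = False


def route(msg: Incoming, mentions_files: Callable[[str], bool]) -> str:
    """Pick a handler name for a message, mirroring the table rows."""
    if msg.is_voice:
        return "transcribe"
    if msg.filename is not None:
        suffix = PurePath(msg.filename).suffix.lower()
        return "transcribe" if suffix in MEDIA_EXTENSIONS else "store"
    if msg.text is not None:
        # Assumption: the caller supplies the "is this about my files?" check.
        return "file_qa" if mentions_files(msg.text) else "chat"
    return "ignore"
```

Passing the file-question check in as a function keeps the sketch neutral about whether that decision is made by keywords, an LLM call, or something else.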
- Record with Apple Voice Memos (or any voice recorder on your phone)
- Share the recording directly to Telegram — no opening another app, no exporting, no emailing yourself
- Read the full transcript — with speaker labels and timestamps — right in the chat
Works with any language. Handles multiple speakers. Transcripts come back in under a minute.
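To make "speaker labels and timestamps" concrete, here is one way a diarized transcript could be rendered as chat text. The exact layout the bot sends is not specified in this README, so the `Utterance` type, the `[MM:SS] Speaker X:` format, and both function names are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Utterance:
    """Hypothetical diarized segment: who spoke, when (ms), and what."""
    speaker: str
    start_ms: int
    text: str


def format_timestamp(ms: int) -> str:
    """Render milliseconds as MM:SS, or H:MM:SS once past an hour."""
    total_seconds = ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes:02d}:{seconds:02d}"


def format_transcript(utterances: Iterable[Utterance]) -> str:
    """Join utterances into one line per speaker turn."""
    return "\n".join(
        f"[{format_timestamp(u.start_ms)}] Speaker {u.speaker}: {u.text}"
        for u in utterances
    )
```

For example, `format_transcript([Utterance("A", 1500, "Hi")])` produces `[00:01] Speaker A: Hi`.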
You need two API keys to get started. Both are free:
| Key | What it's for | Where to get it | Cost |
|---|---|---|---|
| `TELEGRAM_BOT_TOKEN` | Receive and reply to messages | Message @BotFather on Telegram, send `/newbot` | Free |
| `ASSEMBLY_API_KEY` | Transcription with speaker labels | assemblyai.com/app/account | Free tier included |

Optional keys (for AI + storage features):

| Key | What it unlocks |
|---|---|
| `OPENAI_API_KEY` | AI chat + summarization. Works with any OpenAI-compatible API (OpenAI, OpenRouter, DigitalOcean, etc.) |
| `OPENAI_BASE_URL` | Custom endpoint (default: https://api.openai.com/v1) |
| `OPENAI_MODEL` | Model for chat + summarization (default: gpt-4o-mini) |
| `GLM_API_KEY` | File analysis via Claude Agent SDK. Ask questions about your stored files. |
| `GLM_MODEL` | Model for file analysis (default: glm-4-plus) |
| `ANTHROPIC_BASE_URL` | Anthropic-compatible endpoint (default: https://api.z.ai/api/anthropic). Works with Z.AI, Anthropic, or any compatible provider. |
| `S3_BUCKET` | S3 storage sync: all files mirrored to S3, restored on container restart |
| `BOT_NAME` | Storage prefix (default: transcribe-bot) |
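
The keys and defaults above can be read into a single settings object at startup. The sketch below is not the bot's actual configuration code; `Settings` and `load_settings` are hypothetical names, and only the environment variable names and default values come from the tables.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED_KEYS = ("TELEGRAM_BOT_TOKEN", "ASSEMBLY_API_KEY")


@dataclass(frozen=True)
class Settings:
    telegram_bot_token: str
    assembly_api_key: str
    openai_api_key: Optional[str]
    openai_base_url: str
    openai_model: str
    glm_api_key: Optional[str]
    glm_model: str
    anthropic_base_url: str
    s3_bucket: Optional[str]
    bot_name: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Validate the two required keys and apply the documented defaults."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing required keys: " + ", ".join(missing))
    return Settings(
        telegram_bot_token=env["TELEGRAM_BOT_TOKEN"],
        assembly_api_key=env["ASSEMBLY_API_KEY"],
        # Optional features stay disabled (None) when their key is unset.
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        openai_base_url=env.get("OPENAI_BASE_URL") or "https://api.openai.com/v1",
        openai_model=env.get("OPENAI_MODEL") or "gpt-4o-mini",
        glm_api_key=env.get("GLM_API_KEY") or None,
        glm_model=env.get("GLM_MODEL") or "glm-4-plus",
        anthropic_base_url=env.get("ANTHROPIC_BASE_URL") or "https://api.z.ai/api/anthropic",
        s3_bucket=env.get("S3_BUCKET") or None,
        bot_name=env.get("BOT_NAME") or "transcribe-bot",
    )
```

Failing fast on the two required keys gives a clear error at startup instead of a confusing failure on the first message.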
- Open Telegram and search for @BotFather
- Send `/newbot`
- Pick a name for your bot (e.g., "My Transcriber")
- Pick a username (e.g., `my_transcriber_bot`)
- BotFather gives you an API token. Copy it.
git clone <repo-url> && cd ai-meeting-notes-agent
cp .env.example .env

Edit .env and fill in your API keys:
ASSEMBLY_API_KEY=your_assemblyai_key
TELEGRAM_BOT_TOKEN=your_bot_token
Start the bot:
uv run telegram_bot.py

That's it. Send a voice memo to your bot on Telegram and get a transcript back.
docker compose up -d

The bot uses polling (no inbound ports needed), so it runs anywhere Docker runs: a $5 VPS, a Raspberry Pi, or your laptop.
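
Polling means the bot makes outbound requests to Telegram instead of waiting for Telegram to call it, which is why no inbound port is required. The sketch below shows the idea using the Bot API's `getUpdates` method with its `offset` and `timeout` parameters. It is an illustration only: the bot itself may rely on a Telegram library for this, and `fetch_updates`, `next_offset`, and `poll_once` are hypothetical names.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, List

API_ROOT = "https://api.telegram.org"


def fetch_updates(token: str, offset: int, timeout: int = 30) -> List[dict]:
    """One long-poll call: blocks up to `timeout` seconds waiting for updates."""
    query = urllib.parse.urlencode({"offset": offset, "timeout": timeout})
    url = f"{API_ROOT}/bot{token}/getUpdates?{query}"
    with urllib.request.urlopen(url, timeout=timeout + 10) as response:
        payload = json.load(response)
    return payload.get("result", [])


def next_offset(updates: List[dict], current: int) -> int:
    """Sending highest update_id + 1 tells Telegram earlier updates are handled."""
    if not updates:
        return current
    return max(update["update_id"] for update in updates) + 1


def poll_once(fetch: Callable[[int], List[dict]],
              handle: Callable[[dict], None],
              offset: int) -> int:
    """Fetch a batch, handle each update, and return the offset for the next call."""
    updates = fetch(offset)
    for update in updates:
        handle(update)
    return next_offset(updates, offset)
```

A real loop would call `poll_once` repeatedly with `functools.partial(fetch_updates, token)` as `fetch`, feeding each returned offset into the next call.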
Fork this repo, add secrets in GitHub repo settings, and push. It deploys automatically.
Required secrets:
| GitHub Secret | Value |
|---|---|
| `AWS_ACCESS_KEY_ID` | Your AWS access key |
| `AWS_SECRET_ACCESS_KEY` | Your AWS secret key |
| `ASSEMBLY_API_KEY` | Your AssemblyAI key |
| `TELEGRAM_BOT_TOKEN` | Your bot token from BotFather |

Optional secrets (for AI + storage features):
| GitHub Secret | Value |
|---|---|
| `OPENAI_API_KEY` | OpenAI-compatible API key for chat + summarization |
| `OPENAI_BASE_URL` | Custom endpoint URL |
| `OPENAI_MODEL` | Model name |
| `GLM_API_KEY` | API key for file analysis |
| `GLM_MODEL` | Model name for file analysis |
| `ANTHROPIC_BASE_URL` | Anthropic-compatible endpoint URL |
| `S3_BUCKET` | S3 bucket name for file sync |
| `BOT_NAME` | Storage prefix |

Every push to main builds and deploys to ECS Fargate. You can also trigger it manually from the Actions tab.
Set your API keys as environment variables in the Railway dashboard.
Got a folder full of voice memos you never transcribed? Do them all at once:
uv run transcribe.py -f /path/to/recordings/

Works with voice notes from Telegram, iPhone Voice Memos, and any standard audio/video format: .m4a, .mp3, .ogg, .wav, .mp4, .mov, and more.
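
As a rough picture of what a batch run has to do first, the sketch below collects every supported recording in a folder. It does not reproduce `transcribe.py`: `find_recordings` is a hypothetical helper, the extension set covers only the formats named above, and searching subfolders recursively is an assumption.

```python
from pathlib import Path
from typing import List, Union

# Only the formats named in this README; the real tool accepts more.
SUPPORTED_EXTENSIONS = {".m4a", ".mp3", ".ogg", ".wav", ".mp4", ".mov"}


def find_recordings(folder: Union[str, Path]) -> List[Path]:
    """Return supported audio/video files under `folder`, sorted by path."""
    root = Path(folder)
    return sorted(
        path
        for path in root.rglob("*")  # assumption: include subfolders
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Lower-casing the suffix means files such as `MEMO.M4A` are picked up alongside `memo.m4a`.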
Like OpenClaw, the bot uses a messaging platform (Telegram) as the primary interface, with modular AI capabilities behind it:
- Telegram bot (`telegram_bot.py`): message router (voice → transcription, text → conversation, files → storage)
- Transcription: AssemblyAI with speaker diarization, auto language detection, multi-format support
- Conversation: OpenAI-compatible LLM for chat, summarization, and Q&A
- Claude Code Agent: autonomous agent (via Claude Agent SDK) that reads your stored files and answers questions with full context
- Storage: unified `data/{bot_name}/YYYY/MM/DD/` structure, identical paths locally and on S3
- S3 sync: bidirectional; pulls from S3 on startup, pushes after every write
- Web dashboard: React pipeline visualization showing module status, deployment info, and live configuration
- Personalized notes — each participant gets notes relevant to them
- Calendar integration — auto-match recordings to meetings
- Team workspaces — shared transcripts across a group
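
The dated storage layout mentioned above (`data/{bot_name}/YYYY/MM/DD/`) can be sketched as a single path-building function. Because the same relative path is used locally and on S3, one function can serve both. `storage_key` is a hypothetical name, and whether the S3 object key includes the leading `data/` segment is an assumption.

```python
from datetime import date
from pathlib import PurePosixPath


def storage_key(bot_name: str, filename: str, day: date) -> str:
    """Build the data/{bot_name}/YYYY/MM/DD/{filename} path for a stored file."""
    return str(
        PurePosixPath(
            "data",
            bot_name,
            f"{day:%Y}",
            f"{day:%m}",
            f"{day:%d}",
            filename,
        )
    )
```

For example, `storage_key("transcribe-bot", "memo.m4a", date(2025, 2, 19))` gives `data/transcribe-bot/2025/02/19/memo.m4a`. Using `PurePosixPath` keeps forward slashes on every platform, which matches S3 key conventions.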
See TECHNICAL.md for detailed architecture, configuration, Google Drive watcher, and deployment instructions.