Translate README from Chinese to English

Codegass · Codegass · commit 00b9f31c9339 · 2025-04-21T03:03:01.000-04:00
diff --git a/README.md b/README.md
@@ -1,144 +1,144 @@
 # N0Mail (M2 Alpha)
 
-AI Email Assistant based on `m2design.mdc`.
+AI Email Assistant
 
-## 目标 (Goals - M2 Alpha)
+## Goals (M2 Alpha)
 
-*   **自动简报**: 每日 08:00 自动生成 Markdown 简报 (`brief.md`) 保存本地目录 `~/.n0mail/briefs/`
-*   **手动触发**: CLI `n0mail brief run [--date]` 随时生成，15 min 缓存确保幂等
-*   **交互追问**: CLI `n0mail chat` REPL：RAG 检索本地邮件 → GPT-4o 流式回答
-*   **离线存储**: 本地 SQLite + ChromaDB 保存最近 45 天邮件元数据、正文、标签、摘要、嵌入
+*   **Automatic Briefing**: Generates a Markdown briefing (`brief.md`) daily at 08:00 and saves it to the local directory `~/.n0mail/briefs/`.
+*   **Manual Trigger**: Run the CLI command `n0mail brief run [--date]` to generate a briefing on demand; a 15-minute cache keeps repeated runs idempotent.
+*   **Interactive Chat**: The `n0mail chat` REPL runs RAG retrieval over local emails, then streams answers from GPT-4o.
+*   **Offline Storage**: Store metadata, body, labels, summaries, and embeddings of the last 45 days of emails locally using SQLite + ChromaDB.
 
-## 功能 (Features - M2 Alpha)
+## Features (M2 Alpha)
 
-*   ✅ **F-1: Gmail OAuth**: `n0mail auth google` 完成 PKCE 流程 → 保存 token 于 `keyring`
-*   ✅ **F-2: 邮件同步**: `n0mail sync run` 使用 `historyId` (默认) 或时间范围 (`--days`) 增量抓取，或全量 (`--full`) 获取最新邮件。写入 DB 并生成嵌入向量存入 ChromaDB。
-*   ✅ **F-3: 零样本分类**: `n0mail process classify` GPT-4o function-call → `label` 字段写回 (默认处理所有未分类邮件)
-*   ✅ **F-4: 邮件摘要**: `n0mail process summarize` GPT-4o 摘要 → `summary` 字段写回 (Bulk/Promo 可选跳过)
-*   ✅ **F-5: 简报拼装**: `n0mail brief compose` 根据本地数据规则生成简报 / `n0mail brief generate` 使用 OpenAI 生成简报。
-*   ⏳ **F-6: 自动生成**: Cron (`n0mail cron enable`) → 调 `brief run --today`
-*   ✅ **F-7: CLI 交互**: `n0mail chat`：RAG 检索 → GPT-4o Stream
-*   ⏳ **F-8: 命令补全**: `/open id`, `/copy`, `/retry`
-*   ⏳ **F-9: 缓存策略**: `brief_cache` 表
+*   ✅ **F-1: Gmail OAuth**: `n0mail auth google` completes the PKCE flow → saves the token in `keyring`.
+*   ✅ **F-2: Email Sync**: `n0mail sync run` fetches incrementally by `historyId` (default) or by date range (`--days`), or does a full fetch of the latest emails (`--full`). It writes to the DB and stores the generated embeddings in ChromaDB.
+*   ✅ **F-3: Zero-Shot Classification**: `n0mail process classify` uses GPT-4o function calling → writes the `label` field back (processes all unclassified emails by default).
+*   ✅ **F-4: Email Summarization**: `n0mail process summarize` uses GPT-4o for summarization → writes the `summary` field back (optionally skips Bulk/Promo).
+*   ✅ **F-5: Briefing Composition**: `n0mail brief compose` generates a rule-based briefing from local data; `n0mail brief generate` uses OpenAI to generate the briefing.
+*   ⏳ **F-6: Automatic Generation**: Cron (`n0mail cron enable`) → calls `brief run --today`.
+*   ✅ **F-7: CLI Interaction**: `n0mail chat`: RAG retrieval → GPT-4o streaming.
+*   ⏳ **F-8: Command Completion**: `/open id`, `/copy`, `/retry`.
+*   ⏳ **F-9: Caching Strategy**: `brief_cache` table.
 
-## 使用方法 (Usage)
+## Usage
 
-1.  **安装依赖**: 
+1.  **Install Dependencies**:
     ```bash
     pip install poetry
     poetry install # Includes markdownify, beautifulsoup4
     ```
 
-2.  **配置**: 
-    *   从 Google Cloud Console 下载 OAuth 客户端 ID (桌面应用类型) 的 `credentials.json` 文件，并将其重命名或保存为 `client_secret_....json` 放到项目根目录。
-    *   创建 `.env` 文件在项目根目录，并添加你的 API 密钥 (根据你选择的 Provider):
+2.  **Configuration**:
+    *   Download the `credentials.json` file for an OAuth client ID (Desktop application type) from the Google Cloud Console and save it as `client_secret_....json` in the project root directory.
+    *   Create a `.env` file in the project root directory and add your API key (depending on the provider you choose):
         ```dotenv
-        # --- OpenAI (默认) ---
+        # --- OpenAI (Default) ---
         # OPENAI_API_KEY_N0MAIL="sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
-        
+
         # --- Ollama ---
         # LLM_PROVIDER=ollama 
         # EMBEDDING_PROVIDER=ollama
-        # OLLAMA_HOST="http://localhost:11434" # Ollama 服务地址
-        # # 指定 Ollama 模型 (示例)
+        # OLLAMA_HOST="http://localhost:11434" # Ollama service address
+        # # Specify Ollama models (example)
         # CLASSIFY_DEFAULT_MODEL=llama3:8b
         # SUMMARIZE_DEFAULT_MODEL=llama3:8b
-        # EMBEDDING_DEFAULT_MODEL=nomic-embed-text # 确保已 pull
+        # EMBEDDING_DEFAULT_MODEL=nomic-embed-text # Ensure it's pulled
         # BRIEF_DEFAULT_MODEL=llama3:instruct
         # CHAT_DEFAULT_MODEL=llama3:instruct
-        
-        # --- 也可以混合使用 Provider ---
+
+        # --- Can also mix providers ---
         # LLM_PROVIDER=openai
         # EMBEDDING_PROVIDER=ollama 
         # OPENAI_API_KEY_N0MAIL="sk-xxxxxxxx"
         # EMBEDDING_DEFAULT_MODEL=nomic-embed-text
         # OLLAMA_HOST="http://localhost:11434"
         ```
-    *   **(重要)** 确保 `client_secret_....json` 和 `.env*` 已添加到 `.gitignore` 文件中。
-    *   **(环境变量说明)**:
-        *   `LLM_PROVIDER`: 设置用于聊天、分类、摘要、简报生成的提供商。支持: `openai` (默认), `ollama`。
-        *   `EMBEDDING_PROVIDER`: 设置用于生成嵌入向量的提供商。支持: `openai` (默认), `ollama`。如果未设置，默认跟随 `LLM_PROVIDER`。
-        *   `OPENAI_API_KEY_N0MAIL`: OpenAI API 密钥 (如果使用 `openai` 提供商)。
-        *   `OLLAMA_HOST`: Ollama 服务地址 (如果使用 `ollama` 提供商)，默认 `http://localhost:11434`。
-        *   `CLASSIFY_DEFAULT_MODEL`: 分类默认模型 (默认: `gpt-4o-mini`)。
-        *   `SUMMARIZE_DEFAULT_MODEL`: 摘要默认模型 (默认: `gpt-4o-mini`)。
-        *   `EMBEDDING_DEFAULT_MODEL`: 嵌入默认模型 (OpenAI 默认: `text-embedding-3-small`, Ollama 需指定)。
-        *   `BRIEF_DEFAULT_MODEL`: 简报生成默认模型 (默认: `gpt-4o`)。
-        *   `CHAT_DEFAULT_MODEL`: 聊天默认模型 (默认: `gpt-4o`)。
-        *   (命令行中的 `--model` 或 `--embed-model` 选项会覆盖这些默认值)。
-
-3.  **运行命令** (使用 `poetry run n0mail <command>`):
+    *   **(Important)** Ensure `client_secret_....json` and `.env*` are added to your `.gitignore` file.
+    *   **(Environment Variables)**:
+        *   `LLM_PROVIDER`: Sets the provider for chat, classification, summarization, and briefing generation. Supported: `openai` (default), `ollama`.
+        *   `EMBEDDING_PROVIDER`: Sets the provider for generating embeddings. Supported: `openai`, `ollama`. If not set, it follows `LLM_PROVIDER` (which itself defaults to `openai`).
+        *   `OPENAI_API_KEY_N0MAIL`: OpenAI API key (if using `openai` provider).
+        *   `OLLAMA_HOST`: Ollama service address (if using `ollama` provider), default `http://localhost:11434`.
+        *   `CLASSIFY_DEFAULT_MODEL`: Default model for classification (default: `gpt-4o-mini`).
+        *   `SUMMARIZE_DEFAULT_MODEL`: Default model for summarization (default: `gpt-4o-mini`).
+        *   `EMBEDDING_DEFAULT_MODEL`: Default model for embeddings (OpenAI default: `text-embedding-3-small`; with Ollama, a model must be specified).
+        *   `BRIEF_DEFAULT_MODEL`: Default model for briefing generation (default: `gpt-4o`).
+        *   `CHAT_DEFAULT_MODEL`: Default model for chat (default: `gpt-4o`).
+        *   (Command-line options such as `--model` or `--embed-model` override these defaults.)
+
+3.  **Run Commands** (using `poetry run n0mail <command>`):
     ```bash
-    # --- 帮助 --- 
+    # --- Help --- 
     poetry run n0mail --help
     poetry run n0mail auth --help
     poetry run n0mail sync --help
     poetry run n0mail process --help
     poetry run n0mail brief --help
 
-    # --- 认证 --- 
-    # 首次运行进行 Google 授权
+    # --- Authentication --- 
+    # First run: authorize with Google
     poetry run n0mail auth google
-    # 强制重新授权
+    # Force re-authorization
     poetry run n0mail auth google --force
 
-    # --- 同步 --- 
-    # 增量同步 (默认模式, 基于上次记录)
+    # --- Sync --- 
+    # Incremental sync (default mode, based on last record)
     poetry run n0mail sync run 
-    # 同步过去7天的邮件 (最多3000封)
+    # Sync emails from the past 7 days (max 3000)
     poetry run n0mail sync run --days 7
-    # 同步过去3天的邮件，最多只处理100封
+    # Sync emails from the past 3 days, processing at most 100
     poetry run n0mail sync run --days 3 --max-emails 100
-    # 强制全量同步最新的3000封 (忽略天数和历史)
+    # Force full sync of the latest 3000 emails (ignores days and history)
     poetry run n0mail sync run --full
-    # 强制全量同步最新的50封
+    # Force full sync of the latest 50 emails
     poetry run n0mail sync run --full --max-emails 50
-    # 同步时不生成嵌入向量
+    # Sync without generating embeddings
     poetry run n0mail sync run --no-embed
-    # 指定嵌入时使用的文本分割块大小和重叠
+    # Specify chunk size and overlap for embedding text splitting
     poetry run n0mail sync run --chunk-size 8000 --chunk-overlap 100
 
-    # --- 处理 (需要OpenAI Key) --- 
-    # 分类所有未分类邮件
+    # --- Process (Requires OpenAI Key) --- 
+    # Classify all unclassified emails
     poetry run n0mail process classify
-    # 限制数量、强制重新分类、指定模型
+    # Limit the count, force reclassification, specify the model
     poetry run n0mail process classify --max-emails 10 --reclassify --model gpt-4o-mini
 
-    # 摘要所有未摘要邮件 (默认跳过 Bulk/Promo)
+    # Summarize all unsummarized emails (skips Bulk/Promo by default)
     poetry run n0mail process summarize
-    # 限制数量、强制重新摘要、指定模型
+    # Limit the count, force re-summarization, specify the model
     poetry run n0mail process summarize --max-emails 10 --resummarize --model gpt-4o-mini
-    # 摘要邮件, 不跳过 Bulk/Promo
+    # Summarize emails, do not skip Bulk/Promo
     poetry run n0mail process summarize --no-skip-bulk
 
-    # --- 简报 --- 
-    # 生成过去1天的简报并打印 (基于规则)
+    # --- Briefing --- 
+    # Generate a rule-based briefing for the past day and print it
     poetry run n0mail brief compose
-    # 生成过去3天的简报并保存到文件
+    # Generate a briefing for the past 3 days and save it to a file
     poetry run n0mail brief compose --days 3 --output ~/briefs/$(date +%Y-%m-%d)-brief.md 
 
-    # 生成过去1天的简报并打印 (使用 OpenAI，需要 API Key)
+    # Generate a briefing for the past day with OpenAI and print it (requires an API key)
     poetry run n0mail brief generate
-    # 使用AI生成过去3天的简报, 包含 Bulk/Promo, 使用 gpt-4o-mini 模型, 保存到文件
+    # Generate an AI briefing for the past 3 days, including Bulk/Promo, using the gpt-4o-mini model, and save it to a file
     poetry run n0mail brief generate --days 3 --include-bulk --model gpt-4o-mini --output ~/briefs/$(date +%Y-%m-%d)-ai-brief.md
-    # 使用AI生成简报，限制发送给 AI 的邮件数量为 15
+    # Generate an AI briefing, limiting the emails sent to the model to 15
     poetry run n0mail brief generate --max-emails 15
 
-    # --- 交互式聊天 (需要配置 Provider) ---
-    # 启动聊天会话 (使用配置的默认模型)
+    # --- Interactive Chat (Requires Provider Config) ---
+    # Start chat session (uses configured default model)
     poetry run n0mail chat
-    # 指定聊天和嵌入模型, 并开启 Debug 输出
+    # Specify chat and embedding models, enable Debug output
     poetry run n0mail chat --chat-model llama3:instruct --embedding-model nomic-embed-text --debug
-    # 指定初始简报的天数
+    # Specify number of days for initial briefing
     poetry run n0mail chat --brief-days 7
-    # (在聊天中输入 /quit 或 /exit 退出, /help 查看可用命令 - 如果实现)
+    # (Enter /quit or /exit in chat to leave, /help to see available commands - if implemented)
 
-    # --- 其他 --- 
-    # 查看版本
+    # --- Other --- 
+    # Check version
     poetry run n0mail version
     ```
 
-## 开发
+## Development
 
-*   运行测试: `poetry run pytest`
-*   代码检查/格式化: `poetry run ruff check . --fix` 和 `poetry run ruff format .`
+*   Run tests: `poetry run pytest`
+*   Lint/Format code: `poetry run ruff check . --fix` and `poetry run ruff format .`
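
Note on the environment-variable section of the README: the precedence it describes (a command-line option overrides an environment variable, which overrides the built-in default, with `EMBEDDING_PROVIDER` following `LLM_PROVIDER` when unset) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not N0Mail's actual code: `README_DEFAULTS`, `resolve_providers`, and `resolve_model` are hypothetical names, and raising an error when Ollama has no embedding model configured is only one guess at how "must be specified" might be enforced.

```python
from typing import Mapping, Optional, Tuple

# Default model names copied from the README's environment-variable list.
README_DEFAULTS = {
    "CLASSIFY_DEFAULT_MODEL": "gpt-4o-mini",
    "SUMMARIZE_DEFAULT_MODEL": "gpt-4o-mini",
    "EMBEDDING_DEFAULT_MODEL": "text-embedding-3-small",
    "BRIEF_DEFAULT_MODEL": "gpt-4o",
    "CHAT_DEFAULT_MODEL": "gpt-4o",
}


def resolve_providers(env: Mapping[str, str]) -> Tuple[str, str]:
    """Return (llm_provider, embedding_provider) from an environment mapping."""
    # LLM_PROVIDER defaults to "openai" when unset or empty.
    llm = (env.get("LLM_PROVIDER") or "openai").strip().lower()
    # EMBEDDING_PROVIDER follows LLM_PROVIDER when unset or empty.
    embedding = (env.get("EMBEDDING_PROVIDER") or llm).strip().lower()
    return llm, embedding


def resolve_model(
    key: str, env: Mapping[str, str], cli_value: Optional[str] = None
) -> str:
    """Pick a model name: CLI option, then environment variable, then default."""
    if cli_value:
        return cli_value
    if env.get(key):
        return env[key].strip()
    _, embedding_provider = resolve_providers(env)
    # Assumption: with Ollama there is no fallback embedding model.
    if key == "EMBEDDING_DEFAULT_MODEL" and embedding_provider == "ollama":
        raise ValueError("EMBEDDING_DEFAULT_MODEL must be set when using Ollama")
    return README_DEFAULTS[key]


if __name__ == "__main__":
    import os

    # os.environ is a Mapping[str, str], so it can be passed directly.
    print(resolve_providers(os.environ))
    print(resolve_model("CHAT_DEFAULT_MODEL", os.environ))
```

Passing the environment in as a mapping, rather than reading `os.environ` inside each function, keeps the lookup easy to test with plain dictionaries.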