docs: split bilingual readme and improve community metadata

cuixiaoling · cuixiaoling · commit 08d89e01ae55 · 2026-02-27T17:33:43.000+08:00
diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml
@@ -0,0 +1,25 @@
+name: Bug report
+description: Report a reproducible bug in adapter behavior
+title: "[Bug]: "
+labels: [bug]
+body:
+  - type: textarea
+    id: summary
+    attributes:
+      label: Summary
+      description: What happened?
+    validations:
+      required: true
+  - type: textarea
+    id: steps
+    attributes:
+      label: Steps to reproduce
+      description: Provide minimal input and command
+    validations:
+      required: true
+  - type: textarea
+    id: expected
+    attributes:
+      label: Expected behavior
+    validations:
+      required: true
diff --git a/.github/ISSUE_TEMPLATE/feature_request.yml b/.github/ISSUE_TEMPLATE/feature_request.yml
@@ -0,0 +1,19 @@
+name: Feature request
+description: Propose an enhancement aligned with skill boundaries
+title: "[Feature]: "
+labels: [enhancement]
+body:
+  - type: textarea
+    id: problem
+    attributes:
+      label: Problem
+      description: What workflow is blocked today?
+    validations:
+      required: true
+  - type: textarea
+    id: proposal
+    attributes:
+      label: Proposal
+      description: What should be added or changed?
+    validations:
+      required: true
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
@@ -0,0 +1,11 @@
+## Summary
+
+## What changed
+
+- 
+
+## Validation
+
+- [ ] `pytest -q`
+- [ ] CLI demo command executed
+- [ ] Docs updated if interfaces changed
diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md
@@ -0,0 +1,9 @@
+# Code of Conduct
+
+This project follows a simple standard: be respectful, constructive, and professional.
+
+- Be specific when reporting issues.
+- Focus on behavior and reproducible facts.
+- Assume good intent and avoid personal attacks.
+
+Maintainers may moderate discussions and contributions that violate this policy.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -0,0 +1,17 @@
+# Contributing
+
+## Setup
+
+```bash
+python3 -m venv .venv
+source .venv/bin/activate
+pip install -e ".[dev]"
+pytest -q
+```
+
+## Contribution Rules
+
+- Keep behavior aligned with `SKILL.md` and `references/segmentation-rules.md`.
+- Preserve code/image/key-term fidelity.
+- Include tests for behavior changes in segmentation or preservation logic.
+- Update docs (`README.md`, `README.zh-CN.md`, examples) when interfaces change.
diff --git a/README.md b/README.md
@@ -1,30 +1,32 @@
 # mdf-material-adapter
 
-Production-ready adapter that turns noisy course transcripts/documents into stable lesson segmentation candidates for MarkdownFlow generation.
+[中文文档](./README.zh-CN.md)
 
-## 适用场景
+Production-ready adapter that transforms noisy transcripts or course docs into stable lesson segmentation candidates for MarkdownFlow generation while preserving code blocks, images, and key terms.
 
-- 逐字稿噪声高（口头填充词、重复句）但需要稳定切课。
-- 文档混合代码块、图片、术语，需要保真迁移到后续教学脚本。
-- 在 MarkdownFlow 课节生成前，需要一个中间结构层。
+## Use Cases
 
-## 非适用场景
+- Raw transcripts are noisy (filler words, repeated phrases) and need deterministic lesson splits.
+- Source material mixes prose, code blocks, and images that must be preserved.
+- You need a structured intermediate artifact before MarkdownFlow lesson script generation.
 
-- 不适合用于改写课程结论或改变原始事实口径。
-- 不替代完整教学设计，不直接产出最终授课脚本。
-- 不做多模态 OCR 提取（仅处理文本中已有图片/代码标记）。
+## Non-Goals
 
-## Quickstart
+- It does not rewrite course conclusions or alter factual claims.
+- It is not a full instructional design system and does not output final teaching scripts.
+- It does not perform OCR or extract text from binary assets.
+
+## Quickstart (3 minutes)
 
 ```bash
-python -m venv .venv
+python3 -m venv .venv
 source .venv/bin/activate
-pip install -e .
+pip install -e ".[dev]"
 mdf-material-adapter --input examples/sample_input.md --output output.json
-python -m json.tool output.json | head -n 60
+python -m json.tool output.json | head -n 50
 ```
 
-## 输出 JSON 示例
+## JSON Output Example
 
 ```json
 {
@@ -48,30 +50,30 @@ python -m json.tool output.json | head -n 60
   "lesson_candidates": [
     {
       "lesson_id": "lesson-01",
-      "core_question": "你认为这段的核心问题是什么？"
+      "core_question": "What is the core learning question in this segment?"
     }
   ]
 }
 ```
 
-## 设计：Skill Core + Adapters
+## Design: Skill Core + Adapters
 
-- Skill Core (`src/mdf_material_adapter/core.py`): 去噪、语义分段、保真块索引、迁移线索。
-- CLI Adapter (`src/mdf_material_adapter/cli.py`): 文件输入输出，供脚本和流水线调用。
+- Skill Core (`src/mdf_material_adapter/core.py`): denoising, semantic segmentation, immutable block indexing, migration cues.
+- CLI Adapter (`src/mdf_material_adapter/cli.py`): file-in/file-out interface for scripts and pipelines.
 - Ecosystem Adapters:
   - OpenClaw: `tool.json` + `examples/openclaw_demo.md`
   - Claude: `examples/claude_function_calling.md`
   - Codex: `scripts/codex_task.md`
 
-## 与 AI-Shifu 的关系
+## Relationship to AI-Shifu
 
-该项目是 AI-Shifu 课程生产链路中的“资料适配”步骤之一，用于在正式教学脚本生成前提供稳定中间结构。
+This repository is one step in AI-Shifu's course production pipeline: adapting raw material before lesson-level script generation.
 
 - Website: https://ai-shifu.com
 
 ## Development
 
 ```bash
-pip install -e .[dev]
+pip install -e ".[dev]"
 pytest -q
 ```
diff --git a/README.zh-CN.md b/README.zh-CN.md
@@ -0,0 +1,73 @@
+# mdf-material-adapter
+
+[English README](./README.md)
+
+一个面向生产的课程资料适配器：将噪声逐字稿/课程文档转换为可稳定切分课节的中间结构，并保留代码块、图片与关键术语。
+
+## 适用场景
+
+- 逐字稿噪声高（口头填充词、重复句），但需要稳定切课。
+- 文档混合代码、图片、术语，需要高保真迁移。
+- 在 MarkdownFlow 授课脚本生成前，需要结构化中间层。
+
+## 非适用场景
+
+- 不用于改写课程结论或改变事实口径。
+- 不替代完整教学设计，不直接产出最终授课脚本。
+- 不做 OCR 或二进制资源文本提取。
+
+## 快速开始（3 分钟）
+
+```bash
+python3 -m venv .venv
+source .venv/bin/activate
+pip install -e ".[dev]"
+mdf-material-adapter --input examples/sample_input.md --output output.json
+python -m json.tool output.json | head -n 50
+```
+
+## 输出 JSON 示例
+
+```json
+{
+  "meta": {
+    "adapter": "mdf-material-adapter",
+    "version": "0.1.0",
+    "segment_count": 4
+  },
+  "ordered_segments": [
+    {
+      "segment_id": "seg-001",
+      "segment_type": "concept",
+      "preserve_block": "no"
+    },
+    {
+      "segment_id": "seg-002",
+      "segment_type": "code",
+      "preserve_block": "yes"
+    }
+  ]
+}
+```
+
+## 设计：Skill Core + Adapters
+
+- Skill Core（`src/mdf_material_adapter/core.py`）：去噪、语义分段、保真块索引、迁移线索。
+- CLI Adapter（`src/mdf_material_adapter/cli.py`）：文件输入输出，便于流水线调用。
+- 生态适配：
+  - OpenClaw：`tool.json` + `examples/openclaw_demo.md`
+  - Claude：`examples/claude_function_calling.md`
+  - Codex：`scripts/codex_task.md`
+
+## 与 AI-Shifu 的关系
+
+本项目是 AI-Shifu 课程生产链路中的“资料适配”步骤之一，用于在正式授课脚本生成前提供稳定中间结构。
+
+- 官网：https://ai-shifu.com
+
+## 开发
+
+```bash
+pip install -e ".[dev]"
+pytest -q
+```
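
For readers checking the documented output shape, here is a minimal sketch of how a downstream script might pick out the segments that must be carried over verbatim. It assumes the JSON shape shown in the README example above (`ordered_segments` entries with `segment_id`, `segment_type`, and a `preserve_block` value of `"yes"` or `"no"`). The helper name `preserved_segment_ids` is hypothetical and not part of this repository, and real adapter output may contain more fields than the sample.

```python
import json


def preserved_segment_ids(payload: dict) -> list[str]:
    """Return the IDs of segments flagged for verbatim preservation.

    Assumes the README's documented shape: a top-level "ordered_segments"
    list whose items carry "segment_id" and "preserve_block" ("yes"/"no").
    """
    segments = payload.get("ordered_segments", [])
    return [
        seg["segment_id"]
        for seg in segments
        if seg.get("preserve_block") == "yes"
    ]


# Sample data copied from the README's JSON output example (illustrative only).
SAMPLE = json.loads(
    """
    {
      "meta": {
        "adapter": "mdf-material-adapter",
        "version": "0.1.0",
        "segment_count": 4
      },
      "ordered_segments": [
        {"segment_id": "seg-001", "segment_type": "concept", "preserve_block": "no"},
        {"segment_id": "seg-002", "segment_type": "code", "preserve_block": "yes"}
      ]
    }
    """
)

if __name__ == "__main__":
    # Lists the segments a later pipeline stage should not rewrite.
    print(preserved_segment_ids(SAMPLE))
```

To run the same check against a real file, replace `SAMPLE` with `json.load(open("output.json"))` after running the Quickstart command. Whether every segment always includes `preserve_block` is not stated in the README, so the sketch uses `.get()` rather than direct indexing for that field.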