release: v0.3.2 (97% coverage, automated CI publishing, CONTRIBUTING.md)

liuxiaotong · claude · happy-otter · liuxiaotong · commit 4bb7382e63d2 · 2026-02-08T19:43:35.000+08:00
- Test coverage 96% → 97% (3399 tests); spec_analyzer 67% → 100%
- GitHub Actions automated release: tag push → PyPI + GitHub Release
- Add pip cache to CI to speed up runs
- Add CONTRIBUTING.md contributor guide
- Slim down the `full` dependency group in pyproject.toml (it now reuses the `all` group)

Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
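
This release bumps the version string in three files (`pyproject.toml`, `src/datarecipe/__init__.py`, and `src/datarecipe/cli/__init__.py`), so a consistency check before tagging may be useful. The sketch below is not part of this commit: `VERSION_PATTERNS`, `extract_version`, and `find_mismatches` are hypothetical names, and the regular expressions assume the version strings keep the form shown in the diff.

```python
from __future__ import annotations

import re

# Hypothetical patterns, one per file that carries the version string.
# They assume the exact spellings that appear in this commit's diff.
VERSION_PATTERNS = {
    "pyproject.toml": r'^version\s*=\s*"([^"]+)"',
    "src/datarecipe/__init__.py": r'^__version__\s*=\s*"([^"]+)"',
    "src/datarecipe/cli/__init__.py": r'version_option\(version="([^"]+)"',
}


def extract_version(pattern: str, text: str) -> str | None:
    """Return the first version string matched by `pattern`, or None."""
    match = re.search(pattern, text, flags=re.MULTILINE)
    return match.group(1) if match else None


def find_mismatches(contents: dict[str, str], expected: str) -> list[str]:
    """Return the paths whose version is missing or differs from `expected`.

    `contents` maps each path in VERSION_PATTERNS to that file's text.
    """
    return [
        path
        for path, pattern in VERSION_PATTERNS.items()
        if extract_version(pattern, contents.get(path, "")) != expected
    ]
```

A caller would read the three files into a dict and fail the release if `find_mismatches(contents, "0.3.2")` returns a non-empty list. Deriving the version from a single source would avoid the check entirely, but that is a larger change than this commit makes.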
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -20,6 +20,7 @@ jobs:
         uses: actions/setup-python@v5
         with:
           python-version: ${{ matrix.python-version }}
+          cache: "pip"
 
       - name: Install dependencies
         run: |
diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
@@ -0,0 +1,38 @@
+name: Release
+
+on:
+  push:
+    tags: ["v*"]
+
+jobs:
+  publish:
+    runs-on: ubuntu-latest
+    permissions:
+      id-token: write
+      contents: write
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+          cache: "pip"
+
+      - name: Install build tools
+        run: pip install build
+
+      - name: Build package
+        run: python -m build
+
+      - name: Publish to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          password: ${{ secrets.PYPI_API_TOKEN }}
+
+      - name: Create GitHub Release
+        uses: softprops/action-gh-release@v2
+        with:
+          generate_release_notes: true
+          files: dist/*
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -0,0 +1,89 @@
+# Contributing to DataRecipe
+
+## Development Setup
+
+```bash
+# Clone and install
+git clone https://github.com/liuxiaotong/data-recipe.git
+cd data-recipe
+python -m venv .venv
+source .venv/bin/activate
+make dev          # Install with all dependencies
+
+# Install pre-commit hooks
+make hooks
+```
+
+## Development Workflow
+
+```bash
+make lint         # Run ruff linter
+make format       # Auto-format code
+make test         # Run tests (3294+ tests)
+make cov          # Run tests with coverage (96%+, minimum 80%)
+make typecheck    # Run mypy type checking
+make ci           # Run full CI pipeline locally
+```
+
+## Testing
+
+- Use `unittest.TestCase` style
+- Mock all external dependencies (LLM APIs, network requests, file I/O)
+- Place tests in `tests/test_<module>.py`
+- Aim for 90%+ coverage on new code
+
+```bash
+# Run specific test file
+pytest tests/test_analyzer.py -v
+
+# Run with coverage for a specific module
+pytest tests/ --cov=datarecipe.analyzer --cov-report=term-missing
+```
+
+## Code Style
+
+- **Formatter**: ruff (line-length 100)
+- **Target**: Python 3.10+ (`X | None` instead of `Optional[X]`)
+- **Imports**: sorted by ruff (`I` rule)
+- Pre-commit hooks enforce formatting on every commit
+
+## Project Structure
+
+```
+src/datarecipe/
+├── cli/                # CLI commands (7 modules)
+├── core/               # Deep analysis engine
+├── analyzers/          # Spec + LLM dataset analyzers
+├── generators/         # Document generators (markdown/JSON)
+├── extractors/         # Rubrics + prompt extraction
+├── parsers/            # PDF/Word/image parsing
+├── cost/               # Cost estimation models
+├── knowledge/          # Knowledge base + dataset catalog
+├── sources/            # Data sources (HuggingFace, GitHub, web)
+├── providers/          # Deployment providers
+├── constants.py        # Shared constants
+└── schema.py           # Data models
+```
+
+## Commit Messages
+
+Use conventional commit format in Chinese or English:
+
+```
+feat: 新增功能描述
+fix: 修复问题描述
+test: 测试相关
+docs: 文档更新
+chore: 构建/工具链
+refactor: 重构
+```
+
+## Releasing
+
+Releases are automated via GitHub Actions. To release:
+
+1. Update version in `pyproject.toml`, `src/datarecipe/__init__.py`, `src/datarecipe/cli/__init__.py`
+2. Update `CHANGELOG.md`
+3. Commit and tag: `git tag -a v0.X.Y -m "v0.X.Y"`
+4. Push: `git push origin main --tags`
+5. GitHub Actions will auto-publish to PyPI and create a GitHub Release
diff --git a/README.md b/README.md
@@ -6,8 +6,8 @@
 
 [![PyPI](https://img.shields.io/pypi/v/knowlyr-datarecipe?color=blue&v=1)](https://pypi.org/project/knowlyr-datarecipe/)
 [![Python 3.10+](https://img.shields.io/badge/python-3.10%E2%80%933.13-blue.svg)](https://www.python.org/downloads/)
-[![Tests](https://img.shields.io/badge/tests-3294_passed-brightgreen.svg)](#开发)
-[![Coverage](https://img.shields.io/badge/coverage-96%25-brightgreen.svg)](#开发)
+[![Tests](https://img.shields.io/badge/tests-3399_passed-brightgreen.svg)](#开发)
+[![Coverage](https://img.shields.io/badge/coverage-97%25-brightgreen.svg)](#开发)
 [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
 [![MCP](https://img.shields.io/badge/MCP-10_Tools-purple.svg)](#mcp-server)
 
@@ -406,7 +406,7 @@ make format
 make hooks
 ```
 
-**Test coverage**: 35+ test files, 3294 test cases, 96% statement coverage.
+**Test coverage**: 35+ test files, 3399 test cases, 97% statement coverage.
 
 **CI**: GitHub Actions, supporting Python 3.10 / 3.11 / 3.12 / 3.13, with an 80% coverage threshold.
 
diff --git a/pyproject.toml b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "knowlyr-datarecipe"
-version = "0.3.1"
+version = "0.3.2"
 description = "AI dataset 'ingredients label' analyzer - reverse-engineer datasets, estimate costs, analyze quality, and generate production workflows"
 readme = "README.md"
 license = {text = "MIT"}
@@ -75,13 +75,7 @@ all = [
     "mcp>=1.0.0",
 ]
 full = [
-    "anthropic>=0.18.0",
-    "openai>=1.0.0",
-    "pymupdf>=1.23.0",
-    "sentence-transformers>=2.2.0",
-    "datasets>=2.14.0",
-    "datasketch>=1.6.0",
-    "jinja2>=3.1.0",
+    "knowlyr-datarecipe[all]",
     "tenacity>=8.0.0",
     "label-studio>=1.10.0",
     "scikit-learn>=1.3.0",
diff --git a/src/datarecipe/__init__.py b/src/datarecipe/__init__.py
@@ -1,6 +1,6 @@
 """DataRecipe - AI dataset analyzer: reverse-engineer datasets, estimate costs, analyze quality, generate workflows."""
 
-__version__ = "0.3.1"
+__version__ = "0.3.2"
 
 from datarecipe.analyzer import DatasetAnalyzer
 from datarecipe.schema import (
diff --git a/src/datarecipe/cli/__init__.py b/src/datarecipe/cli/__init__.py
@@ -12,7 +12,7 @@
 
 # Main CLI group
 @click.group()
-@click.version_option(version="0.3.1", prog_name="datarecipe")
+@click.version_option(version="0.3.2", prog_name="datarecipe")
 def main():
     """DataRecipe - Analyze AI dataset ingredients, estimate costs, and generate workflows."""
     pass
diff --git a/tests/test_cli_commands.py b/tests/test_cli_commands.py
@@ -92,7 +92,7 @@ def test_main_help(self):
     def test_main_version(self):
         result = self.runner.invoke(main, ["--version"])
         self.assertEqual(result.exit_code, 0)
-        self.assertIn("0.3.1", result.output)
+        self.assertIn("0.3.2", result.output)
 
     def test_main_no_args_shows_usage(self):
         result = self.runner.invoke(main, [])
diff --git a/tests/test_spec_analyzer.py b/tests/test_spec_analyzer.py