Commit f8f9c97

docs: address PR review — enrich CLAUDE.md and use dynamic discovery for pipelines/models
- CLAUDE.md: add project structure tree, code style conventions, testing guidance, PR guidelines, and dynamic discovery section (inspired by langchain AGENTS.md)
- inference_api.md: replace hardcoded pipeline/model tables with instructions to read __all__ from source __init__.py files, avoiding staleness when pipelines/models are added or removed
1 parent 354698a commit f8f9c97

2 files changed: +63 -33 lines changed

CLAUDE.md

Lines changed: 57 additions & 4 deletions
@@ -14,14 +14,45 @@ pytest tests/ # Tests (resource-intensive skipped by default)
 pre-commit run --all-files # Lint/format
 ```
 
-## Architecture
+## Project Structure
+
+```
+PaddleOCR/
+├── paddleocr/           # Public API (3.x) — what users import
+│   ├── __init__.py      # Top-level exports (__all__ is the source of truth)
+│   ├── _pipelines/      # High-level pipelines (OCR, PPStructureV3, etc.)
+│   ├── _models/         # Individual model wrappers (TextDetection, etc.)
+│   └── _cli.py          # CLI entry point
+├── ppocr/               # Internal training framework (not user-facing)
+│   ├── modeling/        # Model architectures (Backbone, Neck, Head)
+│   ├── data/            # Data loading and augmentation
+│   ├── losses/          # Loss functions
+│   ├── metrics/         # Evaluation metrics
+│   └── postprocess/     # Post-processing
+├── tools/               # Train/infer/eval scripts (tools/train.py)
+├── configs/             # YAML configs organized by task (det/, rec/, table/, etc.)
+├── deploy/              # Deployment (C++, Docker, ONNX, mobile)
+├── tests/               # Tests (models/ + pipelines/)
+└── agent_docs/          # Detailed AI-readable documentation
+```
 
 Two layers — understand which you're working in:
 
-- **`paddleocr/`** — Public API (3.x). `_pipelines/` has high-level pipelines (OCR, PPStructureV3), `_models/` has individual model wrappers (TextDetection, TextRecognition). Users import from here.
-- **`ppocr/`** — Internal training framework. Model architectures, data loading, losses, metrics, postprocessing. Used by `tools/train.py`, not by end users.
+- **`paddleocr/`** — Public API (3.x). `_pipelines/` has high-level pipelines, `_models/` has individual model wrappers. Users import from here.
+- **`ppocr/`** — Internal training framework. Used by `tools/train.py`, not by end users.
+
+## Discovering Available Pipelines & Models
+
+**Do NOT rely on hardcoded lists.** Always discover dynamically from source:
+
+- **Pipelines**: Read `__all__` in `paddleocr/_pipelines/__init__.py`
+- **Models**: Read `__all__` in `paddleocr/_models/__init__.py`
+- **All public exports**: Read `__all__` in `paddleocr/__init__.py`
+
+Each pipeline inherits from `PaddleXPipelineWrapper` (in `_pipelines/base.py`).
+Each model inherits from `PaddleXPredictorWrapper` (in `_models/base.py`).
 
-Other directories: `tools/` (train/infer/eval scripts), `configs/` (YAML configs by task), `deploy/` (C++, Docker, ONNX, mobile), `tests/` (models/ + pipelines/).
+To understand a specific pipeline or model, read its source file in the corresponding directory.
 
 ## Critical: 3.x API Only
 
@@ -31,6 +62,28 @@ PaddleOCR 3.x is **not backwards compatible** with 2.x. Never generate 2.x-style
 - `PPStructure` is removed — use `PPStructureV3`
 - For single-task inference, use model classes (`TextDetection`, `TextRecognition`) not `det`/`rec` params
 
+## Code Style & Conventions
+
+- Follow existing patterns in the file you're modifying
+- Use type hints for function signatures
+- Use `pre-commit run --all-files` to lint before committing — this runs ruff, trailing whitespace fixes, and other checks
+- Error messages should be clear and actionable
+- No `eval()`, `exec()`, or `pickle` on user-controlled input
+
+## Testing
+
+- Tests live in `tests/` with subdirectories `models/` and `pipelines/`
+- Run with `pytest tests/` — resource-intensive tests are skipped by default
+- When adding a new pipeline or model, add corresponding tests
+- Test the public API (`.predict()`, result object methods), not internal implementation details
+
+## PR & Commit Guidelines
+
+- PR titles: concise, lowercase, descriptive of what changed
+- PR descriptions: explain the "why", not just the "what"
+- Keep PRs focused — one logical change per PR
+- Ensure `pre-commit run --all-files` passes before pushing
+
 ## Detailed Docs
 
 Read these as needed — don't load them all upfront:

agent_docs/inference_api.md

Lines changed: 6 additions & 29 deletions
@@ -15,38 +15,15 @@ for res in result:
 
 ## Available Pipelines
 
-All imported from `paddleocr`:
-
-| Pipeline | Purpose |
-|----------|---------|
-| `PaddleOCR` | Full OCR (detection + recognition) |
-| `PaddleOCRVL` | Vision-language OCR (v1, v1.5) |
-| `PPStructureV3` | Document structure: tables, formulas, layout |
-| `PPChatOCRv4Doc` | LLM-powered document analysis |
-| `DocUnderstanding` | VLM-based document QA |
-| `FormulaRecognitionPipeline` | Math formula recognition |
-| `SealRecognition` | Seal text detection + recognition |
-| `TableRecognitionPipelineV2` | Table structure recognition |
-| `DocPreprocessor` | Orientation, unwarping |
-| `PPDocTranslation` | Document translation |
+All imported from `paddleocr`. **To get the current list**, read `__all__` in `paddleocr/_pipelines/__init__.py`.
+
+Common pipelines include `PaddleOCR` (full OCR), `PPStructureV3` (document structure), `DocUnderstanding` (VLM-based QA), but the authoritative list lives in the source. Each pipeline has its own file in `paddleocr/_pipelines/` — read the file to understand its constructor parameters and capabilities.
 
 ## Available Individual Models
 
-| Model | Purpose |
-|-------|---------|
-| `TextDetection` | Detect text regions |
-| `TextRecognition` | Recognize text content |
-| `LayoutDetection` | Detect document layout regions |
-| `TableClassification` | Classify table types |
-| `TableCellsDetection` | Detect table cells |
-| `TableStructureRecognition` | Recognize table structure |
-| `SealTextDetection` | Detect seal text |
-| `FormulaRecognition` | Recognize formulas |
-| `ChartParsing` | Parse charts |
-| `DocVLM` | Document vision-language model |
-| `DocImgOrientationClassification` | Classify document orientation |
-| `TextImageUnwarping` | Unwarp distorted text images |
-| `TextLineOrientationClassification` | Classify text line orientation |
+All imported from `paddleocr`. **To get the current list**, read `__all__` in `paddleocr/_models/__init__.py`.
+
+Common models include `TextDetection`, `TextRecognition`, `LayoutDetection`, but the authoritative list lives in the source. Each model has its own file in `paddleocr/_models/` — read the file to understand its parameters and default model names.
 
 ## PaddleOCR Constructor Parameters
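
The dynamic discovery this commit asks for (reading `__all__` from the package `__init__.py` files instead of trusting a hardcoded table) could be automated roughly as follows. This is a minimal sketch, not part of the commit: the helper name `read_dunder_all` is hypothetical, and it assumes each `__init__.py` defines `__all__` as a plain literal list or tuple assignment. It parses the file with the standard `ast` module, so nothing from PaddleOCR is imported or executed.

```python
import ast
from pathlib import Path


def read_dunder_all(init_file):
    """Return the names in a module-level ``__all__`` without importing the module.

    Hypothetical helper for illustration. Only the plain form
    ``__all__ = ["A", "B"]`` is handled; augmented forms such as
    ``__all__ += [...]`` are ignored.
    """
    tree = ast.parse(Path(init_file).read_text(encoding="utf-8"))
    for node in tree.body:
        if isinstance(node, ast.Assign):
            targets = [t.id for t in node.targets if isinstance(t, ast.Name)]
            if "__all__" in targets:
                # literal_eval accepts an AST node and only evaluates literals.
                return list(ast.literal_eval(node.value))
    return []


if __name__ == "__main__":
    # Paths are the ones named in the diff above; run from the repository root.
    for rel in (
        "paddleocr/_pipelines/__init__.py",
        "paddleocr/_models/__init__.py",
        "paddleocr/__init__.py",
    ):
        path = Path(rel)
        if path.exists():
            print(rel, read_dunder_all(path))
```

Static parsing is used here rather than `import paddleocr` so the listing works in a checkout where the heavy runtime dependencies are not installed. If the real `__init__.py` files build `__all__` dynamically, this sketch would return an incomplete list and importing the package would be the more reliable route.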
