# DeepSeek OCR mode (requires weights under /path/to/deepseek-ocr/DeepSeek-OCR)
-
./dependency_setup/setup_glossapi.sh \
-
--mode deepseek \
+
# DeepSeek OCR runtime (uv-managed)
+
./dependency_setup/setup_deepseek_uv.sh \
--venv dependency_setup/.venvs/deepseek \
-
--weights-dir /path/to/deepseek-ocr \
+
--model-root /path/to/deepseek-ocr-2-model \
+
--download-model \
--run-tests --smoke-test
```

-
Pass `--download-deepseek` if you need the script to fetch weights automatically; otherwise it looks for `${REPO_ROOT}/deepseek-ocr/DeepSeek-OCR` unless you override `--weights-dir`. Check `dependency_setup/dependency_notes.md` for the latest pins, caveats, and validation history. The script also installs the Rust extensions in editable mode so local changes are picked up immediately.
+
`setup_glossapi.sh --mode deepseek` now delegates to the same uv-based installer. `setup_deepseek_uv.sh` uses `uv venv` + `uv sync`, installs the Rust extensions in editable mode, and can download `deepseek-ai/DeepSeek-OCR-2` with `huggingface_hub`.

**DeepSeek runtime checklist**
-
- Run `python -m glossapi.ocr.deepseek.preflight` (from your DeepSeek venv) to fail fast if the CLI would fall back to the stub.
-
- Export these to force the real CLI and avoid silent stub output:
+
- Run `python -m glossapi.ocr.deepseek.preflight` from the DeepSeek venv to fail fast before OCR.
+
- Export these to force the real runtime and avoid silent stub output:
- The default fallback locations already point at the in-repo Transformers runner and `${REPO_ROOT}/deepseek-ocr-2-model/DeepSeek-OCR-2`.
+
- `flash-attn` is optional. The runner uses `flash_attention_2` when available and falls back to `eager` otherwise.
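
The `flash-attn` fallback described in the last checklist item can be illustrated with a minimal sketch. This is not the runner's actual code: `pick_attn_implementation` is a hypothetical helper, and it assumes that "available" simply means the `flash_attn` package is importable. The returned strings are values accepted by the `attn_implementation` argument of Transformers' `from_pretrained`.

```python
import importlib.util


def pick_attn_implementation(module_name: str = "flash_attn") -> str:
    """Return "flash_attention_2" if `module_name` is importable, else "eager".

    Hypothetical helper for illustration only; the real runner may use a
    different availability check (for example, also verifying the GPU).
    """
    # find_spec returns None when a top-level module cannot be located.
    has_flash = importlib.util.find_spec(module_name) is not None
    return "flash_attention_2" if has_flash else "eager"


if __name__ == "__main__":
    # Prints whichever backend this environment would select.
    print(pick_attn_implementation())
```

The result could then be passed as `attn_implementation=pick_attn_implementation()` when loading the model, so that environments without `flash-attn` still run, only more slowly.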

## Choose Your Install Path

| Scenario | Commands | Notes |
| --- | --- | --- |
| Pip users | `pip install glossapi` | Fast vanilla evaluation with minimal dependencies. |
-
| Mode automation (recommended) | `./dependency_setup/setup_glossapi.sh --mode {vanilla\|rapidocr\|deepseek}` | Creates an isolated venv per mode, installs Rust crates, and can run the relevant pytest subset. |
+
| Docling environment | `./dependency_setup/setup_glossapi.sh --mode docling` | Creates the main GlossAPI venv for extraction, cleaning, sectioning, and enrichment. |
+
| DeepSeek environment | `./dependency_setup/setup_deepseek_uv.sh` | Creates a separate uv-managed OCR runtime pinned to the tested Transformers/Torch stack. |
| Manual editable install | `pip install -e .` after cloning | Keep this if you prefer to manage dependencies by hand. |