Commit 0190322

Add Python 3.12 support and modern package management
This commit adds support for Python 3.12 and provides an alternative installation method using UV, while maintaining full backward compatibility with existing pip-based workflows.

Changes:
- Add pyproject.toml with PEP 621 compliant project metadata
- Fix numpy compatibility: constrain to <2.0 for Python 3.12 support
- Relax torch/torchvision constraints to allow Python 3.12 compatible versions
- Fix OpenCV version constraints to avoid known compatibility issues
- Add UV as an optional, faster installation method in README
- Update .gitignore for modern Python tooling

All changes maintain backward compatibility with pip and requirements.txt. Tested with Python 3.12.9 on macOS with the Dolphin-1.5 model.
1 parent fcbf033 commit 0190322
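
As a rough illustration of what a range constraint such as `numpy>=1.26.4,<2.0` admits, the sketch below checks plain release versions against a comma-separated specifier using only the standard library. The helper names (`parse_version`, `pad`, `satisfies`) are hypothetical and not part of this commit. The sketch assumes purely numeric versions and ignores pre-releases, `~=`, `!=`, and wildcards; real tooling such as pip or UV implements the full version-specifier rules.

```python
def parse_version(text):
    """Turn a plain release string such as '1.26.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def pad(a, b):
    """Right-pad the shorter tuple with zeros so (2, 0) equals (2, 0, 0)."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def satisfies(version, spec):
    """Check a release version against a spec like '>=1.26.4,<2.0'."""
    # Two-character operators come first so '>=' is not mistaken for '>'.
    ops = {
        ">=": lambda a, b: a >= b,
        "<=": lambda a, b: a <= b,
        "==": lambda a, b: a == b,
        ">": lambda a, b: a > b,
        "<": lambda a, b: a < b,
    }
    for clause in spec.split(","):
        clause = clause.strip()
        op = next(o for o in ops if clause.startswith(o))
        left, right = pad(parse_version(version), parse_version(clause[len(op):]))
        if not ops[op](left, right):
            return False
    return True


NUMPY_SPEC = ">=1.26.4,<2.0"  # taken from the dependency list in this commit

print(satisfies("1.26.4", NUMPY_SPEC))  # True: the lower bound is inclusive
print(satisfies("2.0.0", NUMPY_SPEC))   # False: the upper bound is exclusive
```

Every clause must hold for the version to be accepted, which is why 2.0.0 is rejected even though it satisfies the lower bound.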

3 files changed, with 55 additions and 6 deletions

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -152,3 +152,7 @@ Desktop.ini

 fusion_result.json
 kernel_meta/
+
+# UV package manager
+uv.lock
+.python-version

README.md

Lines changed: 30 additions & 6 deletions
@@ -121,7 +121,7 @@ Try our demo on [Demo-Dolphin](https://huggingface.co/spaces/ByteDance/Dolphin).
 3. Download the pre-trained models of *Dolphin-v2*:

 Visit our Huggingface [model card](https://huggingface.co/ByteDance/Dolphin-v2), or download model by:
-
+
 ```bash
 # Download the model from Hugging Face Hub
 git lfs install
@@ -131,27 +131,51 @@ Try our demo on [Demo-Dolphin](https://huggingface.co/spaces/ByteDance/Dolphin).
 huggingface-cli download ByteDance/Dolphin-v2 --local-dir ./hf_model
 ```

+### Alternative: Using UV
+
+For faster dependency resolution, you can use [UV](https://docs.astral.sh/uv/) as an alternative to pip:
+
+1. Install UV:
+```bash
+# On macOS and Linux
+curl -LsSf https://astral.sh/uv/install.sh | sh
+
+# On Windows
+powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
+```
+
+2. Install dependencies:
+```bash
+uv sync
+```
+
+3. Download the model:
+```bash
+uv run huggingface-cli download ByteDance/Dolphin-v2 --local-dir ./hf_model
+```
+
 ## ⚡ Inference

 Dolphin provides two inference frameworks with support for two parsing granularities:
 - **Page-level Parsing**: Parse the entire document page into a structured JSON and Markdown format
 - **Element-level Parsing**: Parse individual document elements (text, table, formula)

+**Note:** If you installed using UV, prefix all python commands with `uv run`, e.g., `uv run python demo_page.py ...`

 ### 📄 Page-level Parsing

 ```bash
 # Process a single document image
 python demo_page.py --model_path ./hf_model --save_dir ./results \
---input_path ./demo/page_imgs/page_1.png
+--input_path ./demo/page_imgs/page_1.png

 # Process a single document pdf
 python demo_page.py --model_path ./hf_model --save_dir ./results \
---input_path ./demo/page_imgs/page_6.pdf
+--input_path ./demo/page_imgs/page_6.pdf

 # Process all documents in a directory
 python demo_page.py --model_path ./hf_model --save_dir ./results \
---input_path ./demo/page_imgs
+--input_path ./demo/page_imgs

 # Process with custom batch size for parallel element decoding
 python demo_page.py --model_path ./hf_model --save_dir ./results \
@@ -173,14 +197,14 @@ python demo_element.py --model_path ./hf_model --save_dir ./results \
 # Process a single document image
 python demo_layout.py --model_path ./hf_model --save_dir ./results \
 --input_path ./demo/page_imgs/page_1.png \
-
+
 # Process a single PDF document
 python demo_layout.py --model_path ./hf_model --save_dir ./results \
 --input_path ./demo/page_imgs/page_6.pdf \

 # Process all documents in a directory
 python demo_layout.py --model_path ./hf_model --save_dir ./results \
---input_path ./demo/page_imgs
+--input_path ./demo/page_imgs
 ````


pyproject.toml

Lines changed: 21 additions & 0 deletions
@@ -1,3 +1,24 @@
+[project]
+name = "dolphin"
+version = "1.5.0"
+description = "Document Image Parsing via Heterogeneous Anchor Prompting"
+readme = "README.md"
+requires-python = ">=3.9"
+license = { text = "MIT" }
+dependencies = [
+"numpy>=1.26.4,<2.0",
+"omegaconf==2.3.0",
+"opencv-python>=4.8.0,<4.11",
+"opencv-python-headless>=4.5.5,<4.6",
+"pillow>=9.3.0",
+"timm==0.5.4",
+"torch>=2.1.0",
+"torchvision>=0.16.0",
+"transformers==4.47.0",
+"accelerate==1.6.0",
+"pymupdf==1.26",
+]
+
 [tool.black]
 line-length = 120
 include = '\.pyi?$'
