Commit 6fa8756

Add Python 3.12 support and modern package management
This commit adds support for Python 3.12 and provides an alternative installation method using UV, while maintaining full backward compatibility with existing pip-based workflows.

Changes:
- Add pyproject.toml with PEP 621 compliant project metadata
- Fix numpy compatibility: constrain to <2.0 for Python 3.12 support
- Relax torch/torchvision constraints to allow Python 3.12 compatible versions
- Fix OpenCV version constraints to avoid known compatibility issues
- Add UV as an optional, faster installation method in README
- Update .gitignore for modern Python tooling

All changes maintain backward compatibility with pip and requirements.txt.

Tested with Python 3.12.9 on macOS with the Dolphin-1.5 model.
1 parent 40c68da commit 6fa8756

4 files changed, +55 -9 lines changed

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -152,3 +152,7 @@ Desktop.ini
 
 fusion_result.json
 kernel_meta/
+
+# UV package manager
+uv.lock
+.python-version

README.md

Lines changed: 29 additions & 8 deletions
@@ -134,9 +134,6 @@ Try our demo on [Demo-Dolphin](https://huggingface.co/spaces/ByteDance/Dolphin).
 ```
 
 3. Download the pre-trained models of *Dolphin-1.5*:
-
-Visit our Huggingface [model card](https://huggingface.co/ByteDance/Dolphin-1.5), or download model by:
-
 ```bash
 # Download the model from Hugging Face Hub
 git lfs install
@@ -146,27 +143,51 @@ Try our demo on [Demo-Dolphin](https://huggingface.co/spaces/ByteDance/Dolphin).
 huggingface-cli download ByteDance/Dolphin-1.5 --local-dir ./hf_model
 ```
 
+### Alternative: Using UV
+
+For faster dependency resolution, you can use [UV](https://docs.astral.sh/uv/) as an alternative to pip:
+
+1. Install UV:
+```bash
+# On macOS and Linux
+curl -LsSf https://astral.sh/uv/install.sh | sh
+
+# On Windows
+powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
+```
+
+2. Install dependencies:
+```bash
+uv sync
+```
+
+3. Download the model:
+```bash
+uv run huggingface-cli download ByteDance/Dolphin-1.5 --local-dir ./hf_model
+```
+
 ## ⚡ Inference
 
 Dolphin provides two inference frameworks with support for two parsing granularities:
 - **Page-level Parsing**: Parse the entire document page into a structured JSON and Markdown format
 - **Element-level Parsing**: Parse individual document elements (text, table, formula)
 
+**Note:** If you installed using UV, prefix all python commands with `uv run`, e.g., `uv run python demo_page.py ...`
 
 ### 📄 Page-level Parsing
 
 ```bash
 # Process a single document image
 python demo_page.py --model_path ./hf_model --save_dir ./results \
-    --input_path ./demo/page_imgs/page_1.png
+    --input_path ./demo/page_imgs/page_1.png
 
 # Process a single document pdf
 python demo_page.py --model_path ./hf_model --save_dir ./results \
-    --input_path ./demo/page_imgs/page_6.pdf
+    --input_path ./demo/page_imgs/page_6.pdf
 
 # Process all documents in a directory
 python demo_page.py --model_path ./hf_model --save_dir ./results \
-    --input_path ./demo/page_imgs
+    --input_path ./demo/page_imgs
 
 # Process with custom batch size for parallel element decoding
 python demo_page.py --model_path ./hf_model --save_dir ./results \
@@ -188,14 +209,14 @@ python demo_element.py --model_path ./hf_model --save_dir ./results \
 # Process a single document image
 python demo_layout.py --model_path ./hf_model --save_dir ./results \
     --input_path ./demo/page_imgs/page_1.png \
-
+
 # Process a single PDF document
 python demo_layout.py --model_path ./hf_model --save_dir ./results \
     --input_path ./demo/page_imgs/page_6.pdf \
 
 # Process all documents in a directory
 python demo_layout.py --model_path ./hf_model --save_dir ./results \
-    --input_path ./demo/page_imgs
+    --input_path ./demo/page_imgs
 ````
 
 
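
The note added to the README says that UV users should prefix every Python command with `uv run`. A minimal sketch of that rule follows, assuming a hypothetical helper `build_command` that is not part of the repository; the script name and flags are copied from the README commands in this diff.

```python
import shlex


def build_command(script, model_path, save_dir, input_path, use_uv=False):
    """Return the argv list for one of the demo scripts.

    Hypothetical helper for illustration only; the flag names are the
    ones shown in the README hunk.
    """
    cmd = [
        "python", script,
        "--model_path", model_path,
        "--save_dir", save_dir,
        "--input_path", input_path,
    ]
    # An environment created with `uv sync` is entered through `uv run`.
    return ["uv", "run", *cmd] if use_uv else cmd


if __name__ == "__main__":
    argv = build_command(
        "demo_page.py", "./hf_model", "./results",
        "./demo/page_imgs/page_1.png", use_uv=True,
    )
    print(shlex.join(argv))
```

Building the command as a list keeps the pip and UV paths identical apart from the two-element prefix, which is the only difference the README note describes.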

pyproject.toml

Lines changed: 21 additions & 0 deletions
@@ -1,3 +1,24 @@
+[project]
+name = "dolphin"
+version = "1.5.0"
+description = "Document Image Parsing via Heterogeneous Anchor Prompting"
+readme = "README.md"
+requires-python = ">=3.9"
+license = { text = "MIT" }
+dependencies = [
+    "numpy>=1.26.4,<2.0",
+    "omegaconf==2.3.0",
+    "opencv-python>=4.8.0,<4.11",
+    "opencv-python-headless>=4.5.5,<4.6",
+    "pillow>=9.3.0",
+    "timm==0.5.4",
+    "torch>=2.1.0",
+    "torchvision>=0.16.0",
+    "transformers==4.47.0",
+    "accelerate==1.6.0",
+    "pymupdf==1.26",
+]
+
 [tool.black]
 line-length = 120
 include = '\.pyi?$'

requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-numpy==1.24.4
+numpy==1.26.4
 omegaconf==2.3.0
 opencv-python==4.11.0.86
 opencv-python-headless==4.5.5.64
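
Because `uv sync` resolves from pyproject.toml while pip reads requirements.txt, it may be worth checking that the pins shown in this hunk fall inside the ranges declared in pyproject.toml. The sketch below uses a deliberately simplified comparison of plain numeric versions, not a full PEP 440 implementation; `as_tuple`, `satisfies`, and `conflicting_pins` are hypothetical helpers, and only the four packages visible in this hunk are covered.

```python
import re

# Ranges copied from the pyproject.toml hunk.
PYPROJECT_RANGES = {
    "numpy": ">=1.26.4,<2.0",
    "omegaconf": "==2.3.0",
    "opencv-python": ">=4.8.0,<4.11",
    "opencv-python-headless": ">=4.5.5,<4.6",
}

# Pins copied from the four requirements.txt lines visible in this hunk.
REQUIREMENTS_PINS = {
    "numpy": "1.26.4",
    "omegaconf": "2.3.0",
    "opencv-python": "4.11.0.86",
    "opencv-python-headless": "4.5.5.64",
}


def as_tuple(version, width=4):
    """Turn '4.11.0.86' into (4, 11, 0, 86), zero-padded to `width` parts."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version, spec):
    """Check a numeric version against comma-separated ==, >=, <=, <, > clauses."""
    v = as_tuple(version)
    for clause in spec.split(","):
        op, bound = re.fullmatch(r"\s*(==|>=|<=|<|>)\s*([\d.]+)\s*", clause).groups()
        b = as_tuple(bound)
        ok = {"==": v == b, ">=": v >= b, "<=": v <= b, "<": v < b, ">": v > b}[op]
        if not ok:
            return False
    return True


def conflicting_pins(pins=REQUIREMENTS_PINS, ranges=PYPROJECT_RANGES):
    """Return the names whose requirements.txt pin lies outside the pyproject range."""
    return [name for name, version in pins.items() if not satisfies(version, ranges[name])]


if __name__ == "__main__":
    print(conflicting_pins())  # prints ['opencv-python']
```

Under this comparison the pin `opencv-python==4.11.0.86` lies outside the `<4.11` range, so the pip and UV paths would install different OpenCV versions. The commit does not say whether that difference is intended.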
