DeepSeek-OCR Portable

DeepSeek-OCR Portable is an easy-to-use offline OCR tool based on the DeepSeek-OCR 1280×1280 mode. It lets you run OCR without installing Python, Miniconda, or the CUDA Toolkit, and without configuring environment variables. Double-click the launcher and the deep-learning OCR pipeline does the rest.

📋 Features

Performs OCR on images, including complex structures like tables, formulas, figures, and references
Output formats:
- Markdown OCR result (<filename>.md)
- Annotated image with bounding boxes (<filename>_with_boxes.jpg)
"Green software" (portable) mode: models and dependencies are fetched automatically with no manual downloads, and the tool runs fully offline after the initial setup

🖥️ System Requirements

Windows 10 or Windows 11
NVIDIA GPU (≥ 4GB VRAM)
NVIDIA Driver ≥ 560.35
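
The requirements above can be checked before the first run. The following is a minimal sketch, assuming nvidia-smi (which ships with the NVIDIA driver) is on the PATH. The helper names parse_version, meets_requirements, and query_gpu are hypothetical and not part of this project, and the reported memory total may differ slightly from a card's nominal size.

```python
import subprocess

MIN_DRIVER = (560, 35)   # minimum driver version listed in the requirements
MIN_VRAM_MIB = 4 * 1024  # 4 GB of VRAM, expressed in MiB


def parse_version(text):
    """Turn a version string such as '560.94' into a comparable tuple (560, 94)."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_requirements(driver_version, vram_mib):
    """Return True if the driver and VRAM satisfy the documented minimums."""
    return parse_version(driver_version) >= MIN_DRIVER and vram_mib >= MIN_VRAM_MIB


def query_gpu():
    """Ask nvidia-smi for the driver version and total memory (MiB) of the first GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    driver, memory = out.splitlines()[0].split(",")
    return driver.strip(), int(memory.strip())
```

On a machine with an NVIDIA GPU, meets_requirements(*query_gpu()) reports whether the documented minimums are met.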

🚀 How to Use

Double-click init.bat. The first run downloads the models and dependencies, which may take a long time.
A file selection window will appear — select the image you want to OCR
After processing, two files will be generated in the original image's directory:
- Markdown OCR result: <original_filename>.md
- Image with bounding boxes: <original_filename>_with_boxes.jpg

No need to install Python, Miniconda, or configure environment variables — all dependencies are automatically resolved!
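
The file selection step can be sketched with tkinter, which the project uses for its file selection UI. This is an illustration only: pick_image, is_supported_image, and the IMAGE_SUFFIXES set are assumptions made for the example, not code from run_ocr.py.

```python
from pathlib import Path

# Assumed set of image extensions; the project may accept a different set.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def is_supported_image(path):
    """Return True if the file extension looks like a supported image type."""
    return Path(path).suffix.lower() in IMAGE_SUFFIXES


def pick_image():
    """Open a native file dialog and return the chosen path, or None if cancelled."""
    # Imported lazily so the helper above works on systems without a display.
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.withdraw()  # hide the empty root window; only the dialog is shown
    patterns = tuple(f"*{suffix}" for suffix in sorted(IMAGE_SUFFIXES))
    path = filedialog.askopenfilename(
        title="Select an image to OCR",
        filetypes=[("Images", patterns), ("All files", "*.*")],
    )
    root.destroy()
    return path or None  # the dialog returns an empty value when cancelled
```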

📁 Project Structure

DeepSeek-OCR Portable/
├── env/                      # Portable Python environment
├── models/
│   └── DeepSeek-OCR/         # DeepSeek OCR model files
├── init.bat                  # One-click launch script (double-click to run)
├── run_ocr.bat               # Quick offline launch script (requires pre-downloaded models)
├── requirements.txt          # Python dependencies list
├── required_model_files.json # Model file list
├── check_model_files.py      # Model file existence checker
├── download_model_files.py   # Model download script
├── run_ocr.py                # OCR core logic
├── README.md                 # Documentation
├── README_zh.md              # Chinese documentation
└── LICENSE                   # MIT License
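
To illustrate what a model file existence check could look like, here is a minimal sketch. It assumes, purely for the example, that required_model_files.json is a flat JSON array of file names relative to the model directory; the real manifest format and the real check_model_files.py may differ. The function name missing_model_files is hypothetical.

```python
import json
from pathlib import Path


def missing_model_files(manifest_path, model_dir):
    """Return the manifest entries that are absent from model_dir."""
    # Assumption: the manifest is a JSON array of relative file names.
    names = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    return [name for name in names if not (Path(model_dir) / name).is_file()]
```

An empty result would mean every listed file is present, so the download step could be skipped.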

⚠️ Notes

The initial download is large (~10GB) and may be slow, so please be patient
If you encounter "out of memory" errors:
- Close other GPU-intensive applications
- Lower IMAGE_SIZE in run_ocr.py to 1024 or 640
Currently Windows-only (no macOS/Linux support)
An NVIDIA GPU is required (the build is based on CUDA 12.8); AMD GPUs and CPU execution are not supported
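
The out-of-memory advice amounts to stepping down through the three resolutions named above. A small sketch of that fallback order follows; SIZE_STEPS and next_smaller_size are hypothetical names used only for illustration. The upstream DeepSeek-OCR presets appear to keep BASE_SIZE equal to IMAGE_SIZE when cropping is off, so it may be worth lowering both together.

```python
# Resolutions named in the notes, from largest to smallest.
SIZE_STEPS = [1280, 1024, 640]


def next_smaller_size(current):
    """Return the next lower resolution to try, or None if already at the smallest."""
    for size in SIZE_STEPS:
        if size < current:
            return size
    return None
```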

📝 Example Output

Suppose you select an image named document.jpg. After OCR processing, the following files will be generated in the original image's directory:

document.md — OCR result in Markdown format
document_with_boxes.jpg — Original image with detection boxes overlaid
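
The naming rule in this example can be expressed in a few lines of pathlib. This is a sketch of the convention described above, not the project's actual code; output_paths is a hypothetical helper.

```python
from pathlib import Path


def output_paths(image_path):
    """Derive the Markdown and boxed-image paths that sit next to the input image."""
    image = Path(image_path)
    markdown = image.with_suffix(".md")                      # document.jpg -> document.md
    boxed = image.with_name(f"{image.stem}_with_boxes.jpg")  # -> document_with_boxes.jpg
    return markdown, boxed
```

Both results stay in the input image's directory because only the file name changes.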

🧠 Technical Details

Uses Hugging Face's transformers library to load local models
Uses torch.bfloat16 to reduce GPU memory usage
Uses tkinter for file selection UI
All dependencies are bundled in the env/ directory for true portability
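
The memory saving from torch.bfloat16 follows from simple arithmetic: each weight takes 2 bytes instead of the 4 used by float32. The sketch below shows that calculation and one way a local model might be loaded. The names weights_gib and load_local_model are hypothetical, and the loading calls follow the general transformers pattern (with trust_remote_code, since DeepSeek-OCR ships custom model code) rather than the actual contents of run_ocr.py.

```python
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weights_gib(num_params, dtype):
    """Approximate size of the model weights in GiB for a given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


def load_local_model(model_dir):
    """Load tokenizer and model from a local directory and move the model to the GPU."""
    # Heavy imports are kept inside the function so weights_gib works without them.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        model_dir, trust_remote_code=True, local_files_only=True
    )
    model = AutoModel.from_pretrained(
        model_dir, trust_remote_code=True, local_files_only=True
    )
    # bfloat16 halves the weight memory relative to float32.
    model = model.eval().cuda().to(torch.bfloat16)
    return tokenizer, model
```

For any parameter count, weights_gib(n, "bfloat16") is exactly half of weights_gib(n, "float32"); activations and other runtime buffers come on top of that figure.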

📌 Developer Notes

You can customize the OCR behavior by modifying these parameters in run_ocr.py:

PROMPT = "<image>\n<|grounding|>Convert the document to markdown with full structure, including "
IMAGE_SIZE = 1280
BASE_SIZE = 1280
CROP_MODE = False
SAVE_RESULTS = True
TEST_COMPRESS = False
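
As an illustration of where these parameters might end up: the upstream DeepSeek-OCR model exposes an infer method through its remote code, and the names above map naturally onto its keyword arguments. The sketch below is an assumption about that wiring, not a copy of run_ocr.py; build_infer_kwargs and run_ocr are hypothetical helpers, and the prompt is passed in as-is rather than reconstructed here.

```python
def build_infer_kwargs(image_file, output_dir, prompt,
                       image_size=1280, base_size=1280, crop_mode=False,
                       save_results=True, test_compress=False):
    """Collect the tunable parameters into keyword arguments for model.infer."""
    return {
        "prompt": prompt,
        "image_file": str(image_file),
        "output_path": str(output_dir),
        "base_size": base_size,
        "image_size": image_size,
        "crop_mode": crop_mode,
        "save_results": save_results,
        "test_compress": test_compress,
    }


def run_ocr(model, tokenizer, **infer_kwargs):
    """Call the model's infer method (assumed to match the upstream signature)."""
    return model.infer(tokenizer, **infer_kwargs)
```

Keeping the parameters in one dictionary makes it easy to retry with a smaller image_size and base_size if the GPU runs out of memory.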

❤️ Acknowledgments

This project is built on DeepSeek-OCR. Special thanks to the DeepSeek team for open-sourcing this high-quality OCR model.

Feel free to contribute, report issues, or improve this project!

Triangle-GitHub/DeepSeek-OCR-Portable
