Commit 4ab8dcd

GLM-OCR README.md and concurrency modification (#549)
Co-authored-by: AlexKer <AlexKer@users.noreply.github.com>
1 parent de1ef4c commit 4ab8dcd

File tree

2 files changed (+76, -1 lines)


glm-ocr/README.md

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
# GLM-OCR Truss Model

This is a Truss deployment of the [GLM-OCR](https://huggingface.co/zai-org/GLM-OCR) model for optical character recognition, running on the vLLM engine and served on an L4 GPU on Baseten. With only 0.9B parameters, GLM-OCR delivers strong OCR performance while remaining lightweight enough for high-concurrency and edge deployments.

GLM-OCR integrates the CogViT visual encoder, a lightweight cross-modal connector with efficient token downsampling, and a GLM-0.5B language decoder. It supports a two-stage pipeline (layout analysis + parallel recognition) for processing complex documents.

## Quick Start

### 1. Deploy to Baseten

```bash
# Clone this repo and cd into this folder
git clone https://github.com/basetenlabs/truss-examples.git
cd truss-examples/glm-ocr

# Deploy the model
truss push --publish
# This assumes you have truss installed; if not, follow the instructions here:
# https://docs.baseten.co/development/model/build-your-first-model
```

### 2. Test with the OpenAI Client

Replace the `api_key` and `base_url` in `test.py` with your specific deployment credentials and URL.

```bash
pip install openai
python test.py
```

### 3. Project Structure

```
glm-ocr/
├── config.yaml   # Truss configuration
├── test.py       # Test script (OpenAI client)
└── README.md     # Documentation
```

## Model Information

- **Model**: [zai-org/GLM-OCR](https://huggingface.co/zai-org/GLM-OCR)
- **Parameters**: 0.9B
- **Framework**: vLLM (OpenAI-compatible API)
- **GPU**: L4 (24GB)
- **API**: OpenAI Chat Completions (`/v1/chat/completions`)

## Usage

### Using the OpenAI Client

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_BASETEN_API_KEY",
    base_url="https://model-XXXX.api.baseten.co/deployment/YYYY/sync/v1",
)

response = client.chat.completions.create(
    model="zai-org/GLM-OCR",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/document.png"}},
            {"type": "text", "text": "Text Recognition:"},
        ],
    }],
    max_tokens=4096,
)

print(response.choices[0].message.content)
```

The model accepts images via URL or base64-encoded data URIs, and returns recognized text in markdown format.
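The base64 path can be sketched as follows. The `to_data_uri` helper and the placeholder PNG bytes are illustrative, not part of this repo; in practice you would read the bytes from your document image.

```python
import base64

def to_data_uri(image_bytes: bytes, mime: str = "image/png") -> str:
    """Wrap raw image bytes in a base64 data URI the endpoint accepts."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"

# Placeholder bytes standing in for a real image file,
# e.g. open("page.png", "rb").read()
image_bytes = b"\x89PNG\r\n\x1a\n"

# Same message shape as the URL-based example above, but with an inline image.
message = {
    "role": "user",
    "content": [
        {"type": "image_url", "image_url": {"url": to_data_uri(image_bytes)}},
        {"type": "text", "text": "Text Recognition:"},
    ],
}
print(message["content"][0]["image_url"]["url"][:22])  # → data:image/png;base64,
```

This message dict can be passed directly in the `messages` list of `client.chat.completions.create`.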

glm-ocr/config.yaml

Lines changed: 1 addition & 1 deletion
@@ -28,6 +28,6 @@ resources:
   accelerator: L4
   use_gpu: true
 runtime:
-  predict_concurrency: 128
+  predict_concurrency: 32
 secrets:
   hf_access_token: null
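The lowered `predict_concurrency` caps how many requests each replica handles at once; a client processing many pages can match it with a bounded fan-out. A minimal sketch, assuming a `run_batch` helper and a stand-in callable (neither is part of this repo; in practice the callable would wrap `client.chat.completions.create` for one image and return the recognized text):

```python
from concurrent.futures import ThreadPoolExecutor

def run_batch(ocr_call, pages, max_in_flight=32):
    """Run one OCR request per page, never more than max_in_flight at a time."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # pool.map preserves input order in its results
        return list(pool.map(ocr_call, pages))

# Stand-in callable instead of a real network request.
results = run_batch(lambda page: f"ocr:{page}", ["p1.png", "p2.png"])
print(results)  # → ['ocr:p1.png', 'ocr:p2.png']
```

Keeping `max_in_flight` at or below the replica's `predict_concurrency` avoids queueing requests server-side.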

0 commit comments