<p><em>HTML visualization showing metrics and comparison results from the OCR pipeline</em></p>
</div>
## 🔮 Use Cases

- **Document Processing Automation**: Extract structured data from invoices, receipts, and forms
- **Content Digitization**: Convert scanned documents and books into searchable digital content
- **Regulatory Compliance**: Extract and validate information from compliance documents
- **Data Migration**: Convert legacy paper documents into structured digital formats
- **Research & Analysis**: Extract data from academic papers, reports, and publications

## 🌟 Key Features

- **End-to-end workflow management** from evaluation to production deployment

OmniReader supports a wide range of OCR models.

> ⚠️ Note: For production deployments, we recommend using the non-GGUF hosted model versions via their respective APIs for better performance and accuracy; the Ollama models are provided primarily for convenience.
### 🔧 OCR Processor Configuration
OmniReader supports multiple OCR processors to handle different models:
1. **litellm**: For using LiteLLM-compatible models, including those from Mistral and other providers.

   - Set API keys for your providers (e.g., `MISTRAL_API_KEY`)
   - **Important**: When using `litellm` as the processor, you must specify the `provider` field in your model configuration.

2. **ollama**: For running local models through Ollama.

   - Requires: [Ollama](https://ollama.com/) installed and running
   - Set `OLLAMA_HOST` (defaults to "http://localhost:11434/api/generate")
   - Local models must be pulled before use with `ollama pull model_name` (see the example after this list)

3. **openai**: For using OpenAI models like GPT-4o.

   - Set `OPENAI_API_KEY` environment variable

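For example, if you plan to run the Ollama models used elsewhere in this guide, pull them first:

```bash
ollama pull gemma3:27b
ollama pull llava-phi3
ollama pull granite3.2-vision
```
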
Example model configurations in your `configs/batch_pipeline.yaml`:
```yaml
models_registry:
  - name: "gpt-4o-mini"
    shorthand: "gpt4o"
    ocr_processor: "openai"
    # No provider needed for OpenAI

  - name: "gemma3:27b"
    shorthand: "gemma3"
    ocr_processor: "ollama"
    # No provider needed for Ollama

  - name: "mistral/pixtral-12b-2409"
    shorthand: "pixtral"
    ocr_processor: "litellm"
    provider: "mistral" # Provider field required for litellm processor
```
To add your own models, extend the `models_registry` with the appropriate processor and provider configurations based on the model source.
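
For instance, here is a sketch of registering an additional litellm-served model. The Anthropic model name and provider string below are illustrative assumptions rather than part of OmniReader's shipped config; check LiteLLM's provider docs for the exact identifiers:

```yaml
models_registry:
  # ...existing entries...
  - name: "claude-3-5-sonnet-20241022" # hypothetical addition
    shorthand: "claude"
    ocr_processor: "litellm"
    provider: "anthropic" # required whenever ocr_processor is "litellm"
```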
## 🛠️ Project Structure
```
omni-reader/
│
├── app.py               # Streamlit UI for interactive document processing
├── assets/              # Sample images for OCR
├── configs/             # YAML configuration files
├── ground_truth_texts/  # Text files containing ground truth for evaluation
```
### Set Up Your Environment
Configure your API keys:

```bash
export MISTRAL_API_KEY=your_mistral_api_key # for litellm models from Mistral
export OPENAI_API_KEY=your_openai_api_key   # for OpenAI models
export OLLAMA_HOST=base_url_for_ollama_host # defaults to "http://localhost:11434/api/generate"
```

### Run OmniReader

```bash
# Run the batch pipeline (default)
python run.py

# Run the evaluation pipeline
python run.py --eval

# Run with a custom config file
python run.py --config my_custom_config.yaml

# Run with custom input
python run.py --image-folder ./my_images

# List ground truth files
python run.py --list-ground-truth-files
```

### Interactive UI

Launch the Streamlit app:

```bash
streamlit run app.py
```

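If the default port is taken, Streamlit's standard server flag applies here as well (8502 is an arbitrary choice):

```bash
streamlit run app.py --server.port 8502
```
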
## ☁️ Cloud Deployment
OmniReader supports storing artifacts remotely and executing pipelines on cloud infrastructure. For this example, we'll use AWS, but you can use any cloud provider you want.

### AWS Setup

1. **Install required integrations**:

   ```bash
   zenml integration install aws s3
   ```

2. **Set up your AWS credentials**:

   - Create an IAM role with appropriate permissions (S3, ECR, SageMaker); a demo-grade sketch follows
   - Configure your role ARN and region

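   One way to grant those permissions is attaching AWS managed policies with the AWS CLI. This is a broad, demo-grade sketch with `<ROLE_NAME>` as a placeholder; scope the policies down for production:

   ```bash
   aws iam attach-role-policy --role-name <ROLE_NAME> \
     --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
   aws iam attach-role-policy --role-name <ROLE_NAME> \
     --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess
   aws iam attach-role-policy --role-name <ROLE_NAME> \
     --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
   ```
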
3. **Register an AWS service connector**:

   ```bash
   zenml service-connector register aws_connector \
     --type aws \
     --auth-method iam-role \
     --role_arn=<ROLE_ARN> \
     --region=<YOUR_REGION> \
     --aws_access_key_id=<YOUR_ACCESS_KEY_ID> \
     --aws_secret_access_key=<YOUR_SECRET_ACCESS_KEY>
   ```

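   To confirm the connector is registered, ZenML's standard listing command can be used:

   ```bash
   zenml service-connector list
   ```
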
4. **Configure stack components**:

   a. **S3 Artifact Store**:

   ```bash
   zenml artifact-store register s3_artifact_store \
     -f s3 \
     --path=s3://<YOUR_BUCKET_NAME> \
     --connector aws_connector
   ```

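   Once the components are registered, they are typically combined into a stack and activated. A sketch assuming ZenML's default orchestrator; the stack name is a placeholder, and you'd swap in a remote orchestrator for full cloud execution:

   ```bash
   zenml stack register aws_stack \
     -a s3_artifact_store \
     -o default \
     --set
   ```
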
For detailed configuration options and other components, refer to the [ZenML documentation](https://docs.zenml.io).