python run.py --image-paths assets/image1.jpg assets/image2.jpg --ground-truth manual --ground-truth-texts "Text from image 1" "Text from image 2"
# List available ground truth files
python run.py --list-ground-truth-files
# For quicker processing of a single image without metadata or artifact tracking
python run_compare_ocr.py --image assets/your_image.jpg --model both
```
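The `--image-paths` and `--ground-truth-texts` flags take parallel lists, so each image is presumably paired with its reference text by position. A minimal sketch of how such arguments could be parsed and paired — this is an illustration, not the actual `run.py` implementation:

```python
import argparse

# Illustrative parser for the flags shown above; the real run.py may differ.
parser = argparse.ArgumentParser()
parser.add_argument("--image-paths", nargs="+", default=[])
parser.add_argument("--ground-truth", default=None)
parser.add_argument("--ground-truth-texts", nargs="+", default=[])

args = parser.parse_args([
    "--image-paths", "assets/image1.jpg", "assets/image2.jpg",
    "--ground-truth", "manual",
    "--ground-truth-texts", "Text from image 1", "Text from image 2",
])

# Pair each image with its manual ground-truth text by position.
pairs = list(zip(args.image_paths, args.ground_truth_texts))
```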
The OCR comparison pipeline consists of the following components:
### Steps
1. **Model 1 OCR Step**: Processes images with the first model (default: Ollama Gemma3)
2. **Model 2 OCR Step**: Processes images with the second model (default: Pixtral 12B)
3. **Ground Truth Step**: Optional step that uses a reference model for evaluation (default: GPT-4o Mini)
4. **Evaluation Step**: Compares results and calculates metrics
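As a sketch of the kind of metric the evaluation step might compute, here is a character-level similarity comparison using only Python's standard library (the actual step's metrics may differ):

```python
from difflib import SequenceMatcher

def char_similarity(ocr_text: str, ground_truth: str) -> float:
    """Character-level similarity ratio between OCR output and reference text."""
    return SequenceMatcher(None, ocr_text, ground_truth).ratio()

def compare_models(output_1: str, output_2: str, ground_truth: str) -> dict:
    """Score both models' OCR outputs against the same ground truth."""
    return {
        "model_1_similarity": char_similarity(output_1, ground_truth),
        "model_2_similarity": char_similarity(output_2, ground_truth),
    }

# Example: model 2 matches the reference exactly, model 1 misreads one character.
scores = compare_models("Hello wor1d", "Hello world", "Hello world")
```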
The pipeline now supports configurable models, allowing you to easily swap out the models used for OCR comparison and ground truth generation via the YAML configuration file.
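A YAML configuration along these lines might look like the following. The key names here are illustrative assumptions, not the repository's actual schema; check the project's config file for the exact keys:

```yaml
# Illustrative only — actual key names may differ.
parameters:
  model_1: "gemma3"            # first OCR model, served via Ollama
  model_2: "pixtral-12b"       # second OCR model
  ground_truth_model: "gpt-4o-mini"  # optional reference model for evaluation
```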