-**Note**: The `sample_labels_3.csv` contains ground truth for only 1 of the 3 sample documents. For full dataset evaluation, use `sr_refactor_labels_5_5_25.csv`.
+**Note**: The `sample_labels_3.csv` contains ground truth for 3 sample documents. For full dataset evaluation, use `sr_refactor_labels_5_5_25.csv`.

 **What this does:**
 - Loads ground truth labels from CSV
 - Matches documents by doc_id
-- Performs doc-by-doc comparison using Stickler
+- Performs doc-by-doc comparison using SticklerEvaluationService
 - Saves individual comparison results
 - Aggregates metrics across all documents
 - Generates comprehensive evaluation report

+**Why use the simplified script?**
+- 260 lines vs 671 lines (61% less code)
+- Easier to understand and modify
+- No temporary file overhead
+- Direct integration with SticklerEvaluationService
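
The steps listed under "What this does" can be pictured with a minimal Python sketch. This is not the script from the repository and it does not call `SticklerEvaluationService`, whose API is not shown in this diff. The helper names (`load_labels`, `compare_doc`, `evaluate`), the CSV layout with a `doc_id` column, and the exact-match scoring are all assumptions made for illustration only.

```python
import csv
import io
from statistics import mean


def load_labels(csv_text):
    """Parse CSV text into {doc_id: {field: value}}.

    Assumes (hypothetically) one row per document and a `doc_id` column.
    """
    labels = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        doc_id = row.pop("doc_id")
        labels[doc_id] = row
    return labels


def compare_doc(expected, predicted):
    """Hypothetical stand-in for the real comparison step.

    Returns the fraction of expected fields whose predicted value matches exactly.
    """
    if not expected:
        return 0.0
    hits = sum(1 for key, value in expected.items() if predicted.get(key) == value)
    return hits / len(expected)


def evaluate(labels, predictions):
    """Match documents by doc_id, score each one, then aggregate.

    Documents missing from either side are skipped.
    """
    per_doc = {
        doc_id: compare_doc(labels[doc_id], predictions[doc_id])
        for doc_id in labels
        if doc_id in predictions
    }
    overall = mean(per_doc.values()) if per_doc else 0.0
    return per_doc, overall
```

In the real script the per-document comparison is delegated to `SticklerEvaluationService`; the sketch only shows the surrounding load, match, and aggregate flow.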