Commit af2c222

cau-git, Saidgurbuz, and Copilot authored
feat: campaign tools (#139)
* Initial version of campaign tools

Signed-off-by: Christoph Auer <[email protected]>

* Make mypy pass

Signed-off-by: Christoph Auer <[email protected]>

* Upgrade deps

Signed-off-by: Christoph Auer <[email protected]>

* add script to combine results in an excel sheet

Signed-off-by: Saidgurbuz <[email protected]>

* Pass mypy checks

Signed-off-by: Christoph Auer <[email protected]>

* Add element statistics to layout evaluator

Signed-off-by: Christoph Auer <[email protected]>

* Prepare to_value parsing

Signed-off-by: Christoph Auer <[email protected]>

* Refactor and update CVAT to Docling conversion and visualisation logic

Signed-off-by: Christoph Auer <[email protected]>

* Update docling_eval/cvat_tools/validator.py

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Christoph Auer <[email protected]>

* Update docling_eval/cvat_tools/parser.py

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Christoph Auer <[email protected]>

* Cleanup

Signed-off-by: Christoph Auer <[email protected]>

* Improve excel eval consolidation

Signed-off-by: Christoph Auer <[email protected]>

---------

Signed-off-by: Christoph Auer <[email protected]>
Signed-off-by: Saidgurbuz <[email protected]>
Signed-off-by: Christoph Auer <[email protected]>
Co-authored-by: Saidgurbuz <[email protected]>
Co-authored-by: Copilot <[email protected]>
1 parent bc60093 commit af2c222

22 files changed: +3429 -1025 lines changed
Lines changed: 177 additions & 0 deletions
@@ -0,0 +1,177 @@
# CVAT Evaluation Pipeline Utility

A flexible pipeline for evaluating CVAT annotations that converts CVAT XML files to DoclingDocument format and runs layout and document structure evaluations.

## Features

- Convert CVAT XML annotations to DoclingDocument JSON format
- Create ground truth datasets from CVAT annotations
- Create prediction datasets for evaluation
- Run layout and document structure evaluations
- Support for step-by-step or end-to-end execution
- Configurable evaluation modalities

## Requirements

The utility requires the following inputs:
1. **Images Directory**: Directory containing PNG image files
2. **Ground Truth XML**: CVAT XML file with ground truth annotations
3. **Prediction XML**: CVAT XML file with prediction annotations (different from ground truth)
4. **Output Directory**: Directory where all pipeline outputs will be saved

## Usage

### Command Line Interface

```bash
python cvat_evaluation_pipeline.py <images_dir> <output_dir> [OPTIONS]
```

### Required Arguments

- `images_dir`: Directory containing PNG image files
- `output_dir`: Output directory for pipeline results

### Optional Arguments

- `--gt-xml PATH`: Path to ground truth CVAT XML file (passed for the `gt` and `full` steps in the examples below)
- `--pred-xml PATH`: Path to prediction CVAT XML file (passed for the `pred` and `full` steps in the examples below)
- `--step {gt,pred,eval,full}`: Pipeline step to run (default: full)
- `--modalities {layout,document_structure}`: Evaluation modalities to run (default: both)
- `--verbose, -v`: Enable verbose logging

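The interface above could be declared with `argparse` roughly as follows. This is a minimal sketch derived only from the arguments listed here; the parser inside `cvat_evaluation_pipeline.py` may be organised differently, and `build_parser` is a hypothetical name.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented CLI (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="CVAT evaluation pipeline (illustrative sketch)"
    )
    # Positional arguments documented under "Required Arguments"
    parser.add_argument("images_dir", type=Path)
    parser.add_argument("output_dir", type=Path)
    # Options documented under "Optional Arguments"
    parser.add_argument("--gt-xml", type=Path, default=None)
    parser.add_argument("--pred-xml", type=Path, default=None)
    parser.add_argument(
        "--step", choices=["gt", "pred", "eval", "full"], default="full"
    )
    # Assumption: several modalities can be passed at once; default is both
    parser.add_argument(
        "--modalities",
        nargs="+",
        choices=["layout", "document_structure"],
        default=["layout", "document_structure"],
    )
    parser.add_argument("--verbose", "-v", action="store_true")
    return parser
```

Parsing `["imgs", "out", "--step", "gt", "--gt-xml", "gt.xml"]` with this sketch leaves `modalities` at its two-element default and sets `step` to `"gt"`.
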
## Examples

### 1. Run Full Pipeline

Convert both ground truth and prediction CVAT XMLs, create datasets, and run evaluations:

```bash
python cvat_evaluation_pipeline.py \
    /path/to/images \
    /path/to/output \
    --gt-xml /path/to/ground_truth.xml \
    --pred-xml /path/to/predictions.xml
```

### 2. Run Step by Step

**Step 1: Create Ground Truth Dataset**
```bash
python cvat_evaluation_pipeline.py \
    /path/to/images \
    /path/to/output \
    --gt-xml /path/to/ground_truth.xml \
    --step gt
```

**Step 2: Create Prediction Dataset**
```bash
python cvat_evaluation_pipeline.py \
    /path/to/images \
    /path/to/output \
    --pred-xml /path/to/predictions.xml \
    --step pred
```

**Step 3: Run Evaluation**
```bash
python cvat_evaluation_pipeline.py \
    /path/to/images \
    /path/to/output \
    --step eval
```

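The three steps above can also be driven from Python, stopping at the first step that fails. The sketch below only assembles and runs the same command lines shown in this section; it assumes `cvat_evaluation_pipeline.py` sits in the current working directory, and `build_step_commands` and `run_steps` are hypothetical helper names.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def build_step_commands(
    images_dir: Path, output_dir: Path, gt_xml: Path, pred_xml: Path
) -> List[List[str]]:
    """Return the argv lists for the gt, pred and eval steps, in order."""
    base = [
        sys.executable,
        "cvat_evaluation_pipeline.py",  # assumed to be in the working directory
        str(images_dir),
        str(output_dir),
    ]
    return [
        base + ["--gt-xml", str(gt_xml), "--step", "gt"],
        base + ["--pred-xml", str(pred_xml), "--step", "pred"],
        base + ["--step", "eval"],
    ]


def run_steps(commands: List[List[str]]) -> None:
    """Run each step in order; a non-zero exit raises CalledProcessError."""
    for command in commands:
        subprocess.run(command, check=True)
```
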
### 3. Run Specific Evaluation Modalities

Run only layout evaluation:
```bash
python cvat_evaluation_pipeline.py \
    /path/to/images \
    /path/to/output \
    --gt-xml /path/to/ground_truth.xml \
    --pred-xml /path/to/predictions.xml \
    --modalities layout
```

Run only document structure evaluation:
```bash
python cvat_evaluation_pipeline.py \
    /path/to/images \
    /path/to/output \
    --gt-xml /path/to/ground_truth.xml \
    --pred-xml /path/to/predictions.xml \
    --modalities document_structure
```

## Output Structure

The pipeline creates the following directory structure in the output directory:

```
output_dir/
├── ground_truth_json/          # Ground truth DoclingDocument JSON files
│   ├── gt_image1.json
│   └── gt_image2.json
├── predictions_json/           # Prediction DoclingDocument JSON files
│   ├── pred_image1.json
│   └── pred_image2.json
├── gt_dataset/                 # Ground truth dataset
│   ├── test/
│   └── visualizations/
├── eval_dataset/               # Evaluation dataset
│   ├── test/
│   └── visualizations/
└── evaluation_results/         # Evaluation results
    ├── layout_evaluation/
    └── document_structure_evaluation/
```

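After a run, a quick check that the directories in the tree above exist can catch a step that was skipped. The following is a minimal sketch; `EXPECTED_DIRS` and `missing_output_dirs` are hypothetical names, and when only one modality is evaluated the other `evaluation_results/` subdirectory may legitimately be absent.

```python
from pathlib import Path
from typing import List

# Directory names copied from the output tree documented above
EXPECTED_DIRS = [
    "ground_truth_json",
    "predictions_json",
    "gt_dataset/test",
    "gt_dataset/visualizations",
    "eval_dataset/test",
    "eval_dataset/visualizations",
    "evaluation_results/layout_evaluation",
    "evaluation_results/document_structure_evaluation",
]


def missing_output_dirs(output_dir: Path) -> List[str]:
    """Return the expected subdirectories that do not exist under output_dir."""
    return [rel for rel in EXPECTED_DIRS if not (output_dir / rel).is_dir()]
```
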
## Pipeline Steps Explained

### Step 1: Ground Truth Dataset Creation
- Converts ground truth CVAT XML to DoclingDocument JSON format
- Creates a ground truth dataset using FileDatasetBuilder
- Generates visualizations for quality inspection

### Step 2: Prediction Dataset Creation
- Converts prediction CVAT XML to DoclingDocument JSON format
- Creates a prediction dataset using FilePredictionProvider
- Links predictions to the ground truth dataset for evaluation

### Step 3: Evaluation
- Runs layout evaluation (mean Average Precision metrics)
- Runs document structure evaluation (edit distance metrics)
- Saves detailed evaluation results and visualizations

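Mean Average Precision for layout is conventionally built on the intersection-over-union (IoU) between a predicted and a ground truth bounding box. The sketch below shows that generic building block only; it is not taken from docling-eval, and the `(left, top, right, bottom)` box convention and the name `iou` are assumptions for illustration.

```python
from typing import Tuple

# Assumed box convention: (left, top, right, bottom) with right >= left, bottom >= top
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Width and height of the overlap; clamped to zero when the boxes are disjoint
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection

    # Two degenerate (zero-area) boxes have no meaningful overlap
    return intersection / union if union > 0 else 0.0
```

A prediction is typically counted as a match when its IoU with a ground truth box of the same class exceeds a chosen threshold; precision is then averaged over recall levels and classes.
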
## Error Handling

The utility handles errors as follows:
- Validates input paths and file existence
- Provides clear error messages for missing requirements
- Continues processing other files if individual conversions fail
- Logs warnings for failed conversions without stopping the pipeline

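The "continue on failure" behaviour described above follows a common pattern: wrap each per-file conversion in its own `try` block, log a warning, and move on. The sketch below illustrates that pattern under assumptions; `convert_all` and the `convert` callable are hypothetical and are not the utility's actual functions.

```python
import logging
from pathlib import Path
from typing import Callable, Iterable, List, Tuple

logger = logging.getLogger(__name__)


def convert_all(
    paths: Iterable[Path], convert: Callable[[Path], None]
) -> Tuple[List[Path], List[Path]]:
    """Apply convert to every path; return (converted, failed) without aborting."""
    converted: List[Path] = []
    failed: List[Path] = []
    for path in paths:
        try:
            convert(path)
            converted.append(path)
        except Exception as exc:
            # A single bad file is reported as a warning and does not stop the loop
            logger.warning("Conversion failed for %s: %s", path, exc)
            failed.append(path)
    return converted, failed
```
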
## Logging

Log messages are timestamped and use the following levels:
- INFO level: Progress updates and results
- WARNING level: Non-critical issues (e.g., failed conversions)
- ERROR level: Critical errors that stop execution
- Pass the `--verbose` flag to enable DEBUG level logging

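A timestamped setup with a verbose switch like the one described above can be configured with the standard `logging` module. This is a minimal sketch, assuming Python 3.8 or later for `force=True`; the format string and the name `configure_logging` are assumptions, not the utility's actual configuration.

```python
import logging


def configure_logging(verbose: bool = False) -> None:
    """Configure root logging: DEBUG when verbose, INFO otherwise."""
    logging.basicConfig(
        level=logging.DEBUG if verbose else logging.INFO,
        # Assumed format: timestamp, level name, then the message
        format="%(asctime)s - %(levelname)s - %(message)s",
        # Replace any handlers installed by an earlier basicConfig call
        force=True,
    )
```
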
## Integration with Existing Codebase

This utility is designed to work with the existing docling-eval framework and uses:
- `docling_eval.cvat_tools.cvat_to_docling` for CVAT conversion
- `docling_eval.dataset_builders.file_dataset_builder` for dataset creation
- `docling_eval.prediction_providers.file_provider` for prediction datasets
- `docling_eval.cli.main.evaluate` for running evaluations

## Tips for Best Results

1. **Image Naming**: Ensure PNG files have consistent naming that matches the CVAT annotations
2. **XML Validation**: Verify that both ground truth and prediction XML files are valid CVAT exports
3. **Output Space**: Ensure sufficient disk space for intermediate JSON files and datasets
4. **Step-by-Step**: For large datasets, consider running steps separately for better resource management
5. **Visualization**: Check the generated visualizations to verify conversion quality

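The image naming tip can be checked before running the pipeline. CVAT XML exports list images as `<image name="...">` elements, so the names can be compared against the PNG files on disk. The sketch below is illustrative: `unmatched_image_names` is a hypothetical helper, and comparing on the base file name only is an assumption about how names should match.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Set


def unmatched_image_names(xml_path: Path, images_dir: Path) -> Set[str]:
    """Return image names referenced in the CVAT XML with no PNG in images_dir."""
    root = ET.parse(xml_path).getroot()

    annotated: Set[str] = set()
    for image_elem in root.findall(".//image"):
        name = image_elem.get("name")
        if name:
            # Assumption: match on the base file name, ignoring any folder prefix
            annotated.add(Path(name).name)

    available = {path.name for path in images_dir.glob("*.png")}
    return annotated - available
```

An empty result means every annotated image has a matching PNG; any names returned point to files that are missing or named differently.
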
Lines changed: 143 additions & 0 deletions
@@ -0,0 +1,143 @@
#!/usr/bin/env python3
"""
Script to collect images from CVAT XML annotation file.

This script:
1. Parses a CVAT XML annotation file to extract image filenames
2. Searches for these images in subdirectories containing cvat_tasks folders
3. Only considers subdirectories that contain a 'cvat_tasks' folder
4. Copies found images to an output directory
"""

import argparse
import shutil
import sys
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import List, Set


def extract_image_filenames(xml_path: Path) -> Set[str]:
    """Extract image filenames from CVAT XML file."""
    try:
        tree = ET.parse(xml_path)
        root = tree.getroot()

        # Find all image elements and extract their 'name' attributes
        image_filenames = set()
        for image_elem in root.findall(".//image"):
            name_attr = image_elem.get("name")
            if name_attr:
                image_filenames.add(name_attr)

        return image_filenames
    except ET.ParseError as e:
        print(f"Error parsing XML file: {e}", file=sys.stderr)
        sys.exit(1)
    except Exception as e:
        print(f"Unexpected error reading XML file: {e}", file=sys.stderr)
        sys.exit(1)


def find_images_in_subdirectories(
    root_dir: Path, image_filenames: Set[str]
) -> dict[str, Path]:
    """Find images in subdirectories that contain 'cvat_tasks' folder."""
    found_images = {}

    # Walk through all subdirectories
    for subdir in root_dir.rglob("*"):
        if not subdir.is_dir():
            continue

        # Check if this subdirectory contains a 'cvat_tasks' folder
        cvat_tasks_path = subdir / "cvat_tasks"
        if not cvat_tasks_path.exists() or not cvat_tasks_path.is_dir():
            continue

        # Search recursively within this subdirectory for images
        for image_filename in image_filenames:
            # Look for the image in this directory and all its subdirectories
            for potential_image_path in subdir.rglob(image_filename):
                if potential_image_path.is_file():
                    found_images[image_filename] = potential_image_path
                    break  # Found this image, move to next filename

    return found_images


def copy_images_to_output(found_images: dict[str, Path], output_dir: Path) -> None:
    """Copy found images to output directory."""
    output_dir.mkdir(parents=True, exist_ok=True)

    copied_count = 0
    for image_filename, source_path in found_images.items():
        dest_path = output_dir / image_filename

        try:
            shutil.copy2(source_path, dest_path)
            print(f"Copied: {source_path} -> {dest_path}")
            copied_count += 1
        except Exception as e:
            print(f"Error copying {source_path}: {e}", file=sys.stderr)

    print(f"\nSuccessfully copied {copied_count} images to {output_dir}")


def main():
    parser = argparse.ArgumentParser(
        description="Collect images from CVAT XML annotation file"
    )
    parser.add_argument("xml_file", type=Path, help="Path to CVAT XML annotation file")
    parser.add_argument(
        "root_dir", type=Path, help="Root directory to search for images"
    )
    parser.add_argument(
        "output_dir", type=Path, help="Output directory for collected images"
    )

    args = parser.parse_args()

    # Validate input file exists
    if not args.xml_file.exists():
        print(f"Error: XML file '{args.xml_file}' does not exist", file=sys.stderr)
        sys.exit(1)

    if not args.root_dir.exists():
        print(
            f"Error: Root directory '{args.root_dir}' does not exist", file=sys.stderr
        )
        sys.exit(1)

    print(f"Parsing XML file: {args.xml_file}")
    image_filenames = extract_image_filenames(args.xml_file)
    print(f"Found {len(image_filenames)} image filenames in XML")

    print(f"Searching for images in: {args.root_dir}")
    found_images = find_images_in_subdirectories(args.root_dir, image_filenames)
    print(
        f"Found {len(found_images)} images in subdirectories with 'cvat_tasks' folders"
    )

    if not found_images:
        print("No images found. Exiting.")
        return

    # Show which images were found
    print("\nFound images:")
    for filename, path in found_images.items():
        print(f"  {filename} -> {path}")

    # Show missing images
    missing_images = image_filenames - set(found_images.keys())
    if missing_images:
        print(f"\nMissing images ({len(missing_images)}):")
        for filename in sorted(missing_images):
            print(f"  {filename}")

    print(f"\nCopying images to: {args.output_dir}")
    copy_images_to_output(found_images, args.output_dir)


if __name__ == "__main__":
    main()
