| 1 | +# CVAT Evaluation Pipeline Utility |
| 2 | + |
| 3 | +A flexible pipeline for evaluating CVAT annotations that converts CVAT XML files to DoclingDocument format and runs layout and document structure evaluations. |
| 4 | + |
| 5 | +## Features |
| 6 | + |
| 7 | +- Convert CVAT XML annotations to DoclingDocument JSON format |
| 8 | +- Create ground truth datasets from CVAT annotations |
| 9 | +- Create prediction datasets for evaluation |
| 10 | +- Run layout and document structure evaluations |
| 11 | +- Support for step-by-step or end-to-end execution |
| 12 | +- Configurable evaluation modalities |
| 13 | + |
| 14 | +## Requirements |
| 15 | + |
| 16 | +The utility requires the following inputs: |
| 17 | +1. **Images Directory**: Directory containing PNG image files |
| 18 | +2. **Ground Truth XML**: CVAT XML file with ground truth annotations |
| 19 | +3. **Prediction XML**: CVAT XML file with prediction annotations (a separate file from the ground truth XML) |
| 20 | +4. **Output Directory**: Directory where all pipeline outputs will be saved |
| 21 | + |
| 22 | +## Usage |
| 23 | + |
| 24 | +### Command Line Interface |
| 25 | + |
| 26 | +```bash |
| 27 | +python cvat_evaluation_pipeline.py <images_dir> <output_dir> [OPTIONS] |
| 28 | +``` |
| 29 | + |
| 30 | +### Required Arguments |
| 31 | + |
| 32 | +- `images_dir`: Directory containing PNG image files |
| 33 | +- `output_dir`: Output directory for pipeline results |
| 34 | + |
| 35 | +### Optional Arguments |
| 36 | + |
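| 37 | +- `--gt-xml PATH`: Path to the ground truth CVAT XML file (passed for the `gt` and `full` steps in the examples in this document) |
| 38 | +- `--pred-xml PATH`: Path to the prediction CVAT XML file (passed for the `pred` and `full` steps in the examples in this document) |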
| 39 | +- `--step {gt,pred,eval,full}`: Pipeline step to run (default: full) |
| 40 | +- `--modalities {layout,document_structure}`: Evaluation modalities to run (default: both) |
| 41 | +- `--verbose, -v`: Enable verbose logging |
| 42 | + |
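The arguments listed above could be declared with `argparse` roughly as follows. This is a minimal sketch reconstructed from the documented flags, not the parser in `cvat_evaluation_pipeline.py` itself; in particular, accepting several values for `--modalities` via `nargs="+"` is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of a parser mirroring the documented CLI (assumed, not the real one)."""
    parser = argparse.ArgumentParser(prog="cvat_evaluation_pipeline.py")
    # Required positional arguments
    parser.add_argument("images_dir", help="Directory containing PNG image files")
    parser.add_argument("output_dir", help="Output directory for pipeline results")
    # Optional arguments
    parser.add_argument("--gt-xml", default=None, help="Ground truth CVAT XML file")
    parser.add_argument("--pred-xml", default=None, help="Prediction CVAT XML file")
    parser.add_argument(
        "--step", choices=["gt", "pred", "eval", "full"], default="full"
    )
    parser.add_argument(
        "--modalities",
        nargs="+",  # assumption: one or more modalities may be given
        choices=["layout", "document_structure"],
        default=["layout", "document_structure"],
    )
    parser.add_argument("--verbose", "-v", action="store_true")
    return parser


args = build_parser().parse_args(
    ["/path/to/images", "/path/to/output", "--gt-xml", "gt.xml", "--step", "gt"]
)
```

With this sketch, omitting `--step` and `--modalities` yields the documented defaults (`full` and both modalities).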
| 43 | +## Examples |
| 44 | + |
| 45 | +### 1. Run Full Pipeline |
| 46 | + |
| 47 | +Convert both ground truth and prediction CVAT XMLs, create datasets, and run evaluations: |
| 48 | + |
| 49 | +```bash |
| 50 | +python cvat_evaluation_pipeline.py \ |
| 51 | + /path/to/images \ |
| 52 | + /path/to/output \ |
| 53 | + --gt-xml /path/to/ground_truth.xml \ |
| 54 | + --pred-xml /path/to/predictions.xml |
| 55 | +``` |
| 56 | + |
| 57 | +### 2. Run Step by Step |
| 58 | + |
| 59 | +**Step 1: Create Ground Truth Dataset** |
| 60 | +```bash |
| 61 | +python cvat_evaluation_pipeline.py \ |
| 62 | + /path/to/images \ |
| 63 | + /path/to/output \ |
| 64 | + --gt-xml /path/to/ground_truth.xml \ |
| 65 | + --step gt |
| 66 | +``` |
| 67 | + |
| 68 | +**Step 2: Create Prediction Dataset** |
| 69 | +```bash |
| 70 | +python cvat_evaluation_pipeline.py \ |
| 71 | + /path/to/images \ |
| 72 | + /path/to/output \ |
| 73 | + --pred-xml /path/to/predictions.xml \ |
| 74 | + --step pred |
| 75 | +``` |
| 76 | + |
| 77 | +**Step 3: Run Evaluation** |
| 78 | +```bash |
| 79 | +python cvat_evaluation_pipeline.py \ |
| 80 | + /path/to/images \ |
| 81 | + /path/to/output \ |
| 82 | + --step eval |
| 83 | +``` |
| 84 | + |
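The three step-by-step invocations above can also be scripted. The helper below is a hypothetical convenience wrapper (the name `build_step_command` is not part of the utility); it only assembles the command lines already shown, using the documented flags.

```python
import sys


def build_step_command(step, images_dir, output_dir, gt_xml=None, pred_xml=None):
    """Assemble the argument list for one pipeline step (hypothetical helper)."""
    cmd = [
        sys.executable,
        "cvat_evaluation_pipeline.py",
        str(images_dir),
        str(output_dir),
    ]
    if gt_xml is not None:
        cmd += ["--gt-xml", str(gt_xml)]
    if pred_xml is not None:
        cmd += ["--pred-xml", str(pred_xml)]
    cmd += ["--step", step]
    return cmd


# The same sequence as the three examples above.
commands = [
    build_step_command("gt", "/path/to/images", "/path/to/output",
                       gt_xml="/path/to/ground_truth.xml"),
    build_step_command("pred", "/path/to/images", "/path/to/output",
                       pred_xml="/path/to/predictions.xml"),
    build_step_command("eval", "/path/to/images", "/path/to/output"),
]
```

Each entry can then be passed to `subprocess.run(cmd, check=True)` so that a failing step stops the sequence before the next one starts.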
| 85 | +### 3. Run Specific Evaluation Modalities |
| 86 | + |
| 87 | +Run only layout evaluation: |
| 88 | +```bash |
| 89 | +python cvat_evaluation_pipeline.py \ |
| 90 | + /path/to/images \ |
| 91 | + /path/to/output \ |
| 92 | + --gt-xml /path/to/ground_truth.xml \ |
| 93 | + --pred-xml /path/to/predictions.xml \ |
| 94 | + --modalities layout |
| 95 | +``` |
| 96 | + |
| 97 | +Run only document structure evaluation: |
| 98 | +```bash |
| 99 | +python cvat_evaluation_pipeline.py \ |
| 100 | + /path/to/images \ |
| 101 | + /path/to/output \ |
| 102 | + --gt-xml /path/to/ground_truth.xml \ |
| 103 | + --pred-xml /path/to/predictions.xml \ |
| 104 | + --modalities document_structure |
| 105 | +``` |
| 106 | + |
| 107 | +## Output Structure |
| 108 | + |
| 109 | +The pipeline creates the following directory structure in the output directory: |
| 110 | + |
| 111 | +``` |
| 112 | +output_dir/ |
| 113 | +├── ground_truth_json/ # Ground truth DoclingDocument JSON files |
| 114 | +│ ├── gt_image1.json |
| 115 | +│ └── gt_image2.json |
| 116 | +├── predictions_json/ # Prediction DoclingDocument JSON files |
| 117 | +│ ├── pred_image1.json |
| 118 | +│ └── pred_image2.json |
| 119 | +├── gt_dataset/ # Ground truth dataset |
| 120 | +│ ├── test/ |
| 121 | +│ └── visualizations/ |
| 122 | +├── eval_dataset/ # Evaluation dataset |
| 123 | +│ ├── test/ |
| 124 | +│ └── visualizations/ |
| 125 | +└── evaluation_results/ # Evaluation results |
| 126 | + ├── layout_evaluation/ |
| 127 | + └── document_structure_evaluation/ |
| 128 | +``` |
| 129 | + |
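To confirm that a run produced the layout shown above, a small check like the following can help. It is a sketch: the directory names are taken from the tree above, while the helper name `missing_outputs` is made up for illustration.

```python
from pathlib import Path

# Sub-directories taken from the documented output tree.
EXPECTED_DIRS = [
    "ground_truth_json",
    "predictions_json",
    "gt_dataset/test",
    "eval_dataset/test",
    "evaluation_results",
]


def missing_outputs(output_dir):
    """Return the expected sub-directories that do not exist under output_dir."""
    root = Path(output_dir)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

After running only `--step gt`, for example, one would expect the prediction and evaluation directories to still be reported as missing.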
| 130 | +## Pipeline Steps Explained |
| 131 | + |
| 132 | +### Step 1: Ground Truth Dataset Creation |
| 133 | +- Converts ground truth CVAT XML to DoclingDocument JSON format |
| 134 | +- Creates a ground truth dataset using FileDatasetBuilder |
| 135 | +- Generates visualizations for quality inspection |
| 136 | + |
| 137 | +### Step 2: Prediction Dataset Creation |
| 138 | +- Converts prediction CVAT XML to DoclingDocument JSON format |
| 139 | +- Creates a prediction dataset using FilePredictionProvider |
| 140 | +- Links predictions to the ground truth dataset for evaluation |
| 141 | + |
| 142 | +### Step 3: Evaluation |
| 143 | +- Runs layout evaluation (mean Average Precision metrics) |
| 144 | +- Runs document structure evaluation (edit distance metrics) |
| 145 | +- Saves detailed evaluation results and visualizations |
| 146 | + |
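As background for the "edit distance metrics" mentioned in Step 3, the sketch below computes a plain Levenshtein distance between two sequences. It is only an illustration of the general idea; the exact metric and the representation that docling-eval compares are not specified here, and may differ.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete x
                    curr[j - 1] + 1,     # insert y
                    prev[j - 1] + cost,  # substitute x with y (or match)
                )
            )
        prev = curr
    return prev[-1]


# Example on hypothetical label sequences: one element is missing in the prediction.
gt_labels = ["title", "text", "table"]
pred_labels = ["title", "table"]
distance = edit_distance(gt_labels, pred_labels)
```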
| 147 | +## Error Handling |
| 148 | + |
| 149 | +The utility handles errors as follows: |
| 150 | +- Validates input paths and file existence |
| 151 | +- Provides clear error messages for missing requirements |
| 152 | +- Continues processing other files if individual conversions fail |
| 153 | +- Logs warnings for failed conversions without stopping the pipeline |
| 154 | + |
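The continue-on-failure behaviour described above can be sketched as a loop that logs a warning and moves on. The names `convert_all` and `convert` are hypothetical stand-ins for illustration, not functions from the utility or from docling-eval.

```python
import logging

logger = logging.getLogger("cvat_pipeline_sketch")


def convert_all(items, convert):
    """Apply convert() to every item; collect failures instead of aborting."""
    results, failed = {}, []
    for item in items:
        try:
            results[item] = convert(item)
        except Exception as exc:  # keep going if a single conversion fails
            logger.warning("Conversion failed for %s: %s", item, exc)
            failed.append(item)
    return results, failed
```

Returning the failed items alongside the results lets the caller report how many conversions were skipped once the run finishes.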
| 155 | +## Logging |
| 156 | + |
| 157 | +The utility provides detailed logging with timestamps: |
| 158 | +- INFO level: Progress updates and results |
| 159 | +- WARNING level: Non-critical issues (e.g., failed conversions) |
| 160 | +- ERROR level: Critical errors that stop execution |
| 161 | +- Use the `--verbose` flag to enable DEBUG-level logging |
| 162 | + |
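A logging setup matching the levels above might look like this. It is a minimal sketch: the format string is an assumption, and the utility's actual configuration may differ.

```python
import logging


def configure_logging(verbose=False):
    """Configure root logging with timestamps; DEBUG when verbose, else INFO."""
    level = logging.DEBUG if verbose else logging.INFO
    logging.basicConfig(
        level=level,
        format="%(asctime)s - %(levelname)s - %(message)s",  # assumed format
        force=True,  # replace any handlers configured earlier (Python 3.8+)
    )
    return level
```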
| 163 | +## Integration with Existing Codebase |
| 164 | + |
| 165 | +This utility is designed to work with the existing docling-eval framework and uses: |
| 166 | +- `docling_eval.cvat_tools.cvat_to_docling` for CVAT conversion |
| 167 | +- `docling_eval.dataset_builders.file_dataset_builder` for dataset creation |
| 168 | +- `docling_eval.prediction_providers.file_provider` for prediction datasets |
| 169 | +- `docling_eval.cli.main.evaluate` for running evaluations |
| 170 | + |
| 171 | +## Tips for Best Results |
| 172 | + |
| 173 | +1. **Image Naming**: Ensure PNG files have consistent naming that matches the CVAT annotations |
| 174 | +2. **XML Validation**: Verify that both ground truth and prediction XML files are valid CVAT exports |
| 175 | +3. **Output Space**: Ensure sufficient disk space for intermediate JSON files and datasets |
| 176 | +4. **Step-by-Step**: For large datasets, consider running steps separately for better resource management |
| 177 | +5. **Visualization**: Check the generated visualizations to verify conversion quality |
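
The image naming tip can be checked before running the pipeline. The sketch below assumes a "CVAT for images" export in which each annotated image appears as an `<image name="...">` element; the helper name `unmatched_images` is made up for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def unmatched_images(images_dir, cvat_xml):
    """Compare PNG file names on disk with image names in a CVAT XML export.

    Returns (annotated_but_missing_on_disk, on_disk_but_not_annotated).
    """
    root = ET.parse(cvat_xml).getroot()
    # Keep only the file name, in case the export stores a relative path.
    annotated = {Path(el.get("name", "")).name for el in root.iter("image")}
    on_disk = {p.name for p in Path(images_dir).glob("*.png")}
    return sorted(annotated - on_disk), sorted(on_disk - annotated)
```

Both returned lists being empty indicates that every annotated image has a matching PNG file and vice versa.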