zjykzj/DataFlow-CV

DataFlow-CV

Where Vibe Coding meets CV data. 🌊 Convert & visualize datasets. Built with the flow of Claude Code.

A data processing library for computer vision datasets, focusing on format conversion and visualization between LabelMe, COCO, and YOLO formats. Provides both a CLI and Python API.

Project Structure

dataflow/
├── __init__.py              # Package exports and convenience functions
├── cli.py                   # Command-line interface
├── config.py                # Configuration management
├── convert/                 # Format conversion module
│   ├── __init__.py
│   ├── base.py             # Converter base class
│   ├── coco_and_yolo.py    # COCO ↔ YOLO converters
│   ├── coco_and_labelme.py # COCO ↔ LabelMe converters
│   └── yolo_and_labelme.py # YOLO ↔ LabelMe converters
├── visualize/               # Annotation visualization module
│   ├── __init__.py
│   ├── base.py            # Visualizer base class
│   ├── generic.py         # Generic visualizer base class using label handlers
│   ├── yolo.py            # YOLO annotation visualizer
│   ├── coco.py            # COCO annotation visualizer
│   └── labelme.py         # LabelMe annotation visualizer
└── label/                   # Label format handlers module
    ├── __init__.py
    ├── yolo.py            # YOLO format handler
    ├── coco.py            # COCO format handler
    └── labelme.py         # LabelMe format handler
tests/
├── __init__.py
├── convert/                # Conversion tests
│   ├── __init__.py
│   ├── test_coco_to_yolo.py
│   ├── test_yolo_to_coco.py
│   ├── test_coco_to_labelme.py
│   ├── test_labelme_to_coco.py
│   ├── test_labelme_to_yolo.py
│   └── test_yolo_to_labelme.py
├── visualize/              # Visualization tests
│   ├── __init__.py
│   ├── test_yolo.py
│   ├── test_coco.py
│   ├── test_labelme.py
│   └── test_generic.py    # Generic visualizer tests
└── run_tests.py           # Test runner
samples/
├── __init__.py
├── example_usage.py       # Quick usage demonstration
├── template.py            # Example template for creating new examples
├── cli/                   # CLI usage examples
│   ├── __init__.py
│   ├── convert/
│   │   ├── cli_coco_to_yolo.py
│   │   ├── cli_yolo_to_coco.py
│   │   ├── cli_coco_to_labelme.py
│   │   ├── cli_labelme_to_coco.py
│   │   ├── cli_labelme_to_yolo.py
│   │   └── cli_yolo_to_labelme.py
│   └── visualize/
│       ├── cli_yolo.py
│       ├── cli_coco.py
│       └── cli_labelme.py
└── api/                   # Python API examples
    ├── __init__.py
    ├── convert/
    │   ├── api_coco_to_yolo.py
    │   ├── api_yolo_to_coco.py
    │   ├── api_coco_to_labelme.py
    │   ├── api_labelme_to_coco.py
    │   ├── api_labelme_to_yolo.py
    │   └── api_yolo_to_labelme.py
    └── visualize/
        ├── api_yolo.py
        ├── api_coco.py
        └── api_labelme.py
docs/                       # Data format documentation
├── README.md              # Documentation index
├── yolo.md                # YOLO format specification
├── labelme.md             # LabelMe format specification
└── coco.md                # COCO format specification

Requirements

Core Dependencies

  • Python 3.8 or higher
  • Linux environment (POSIX compatible, assumes POSIX paths)
  • click >= 8.1.0 – CLI framework
  • numpy >= 2.0.0 – numerical operations
  • opencv-python >= 4.8.0 – image processing (optional, used for some image operations)
  • Pillow >= 10.0.0 – image reading (optional, used for reading image dimensions)

Quick Start

Installation

# Regular installation from source
pip install .

# Install from PyPI
pip install dataflow-cv

Editable Installation (Development Mode)

Because of a setuptools compatibility issue, use python setup.py develop instead of pip install -e . for an editable installation:

# Editable installation (development mode)
python setup.py develop

# After editable installation, use python -m dataflow.cli instead of the dataflow command
python -m dataflow.cli --help

Build System

The project uses setuptools with a pyproject.toml configuration. Distribution packages are built with python -m build.

# Build wheel and source distribution
python -m build

# Install from built wheel
pip install dist/dataflow_cv-*.whl

Command Line Usage

Global options: --verbose (-v) for progress output, --overwrite to replace existing files.

# COCO to YOLO conversion (use --segmentation for polygon annotations)
dataflow convert coco2yolo annotations.json output_dir/
dataflow convert coco2yolo annotations.json output_dir/ --segmentation

# YOLO to COCO conversion
dataflow convert yolo2coco images/ labels/ classes.names output.json

# COCO to LabelMe conversion (use --segmentation for polygon annotations)
dataflow convert coco2labelme annotations.json output_dir/
dataflow convert coco2labelme annotations.json output_dir/ --segmentation

# LabelMe to COCO conversion
dataflow convert labelme2coco labels/ classes.names output.json

# LabelMe to YOLO conversion (use --segmentation for polygon annotations)
dataflow convert labelme2yolo labels/ output_dir/
dataflow convert labelme2yolo labels/ output_dir/ --segmentation

# YOLO to LabelMe conversion
dataflow convert yolo2labelme images/ labels/ classes.names output_dir/

# Visualize YOLO annotations (use --save to export images)
dataflow visualize yolo images/ labels/ classes.names
dataflow visualize yolo images/ labels/ classes.names --save output_dir/

# Visualize COCO annotations (use --save to export images)
dataflow visualize coco images/ annotations.json
dataflow visualize coco images/ annotations.json --save output_dir/

# Visualize LabelMe annotations (use --save to export images)
dataflow visualize labelme images/ labels/
dataflow visualize labelme images/ labels/ --save output_dir/

# Show configuration
dataflow config

# Get help
dataflow --help
dataflow convert coco2yolo --help
dataflow visualize yolo --help
dataflow visualize labelme --help

See the CLI Reference below for detailed usage.

Python API Usage

import dataflow

# COCO to YOLO conversion (pass segmentation=True for polygon annotations)
result = dataflow.coco_to_yolo("annotations.json", "output_dir")
result = dataflow.coco_to_yolo("annotations.json", "output_dir", segmentation=True)
print(f"Processed {result['images_processed']} images")

# YOLO to COCO conversion
result = dataflow.yolo_to_coco("images/", "labels/", "classes.names", "output.json")
print(f"Generated {result['annotations_processed']} annotations")

# Additional conversions (import converters directly)
from dataflow.convert import (
    CocoToLabelMeConverter,
    LabelMeToCocoConverter,
    LabelMeToYoloConverter,
    YoloToLabelMeConverter
)

# COCO to LabelMe conversion
converter = CocoToLabelMeConverter()
result = converter.convert("annotations.json", "output_dir/", segmentation=True)
print(f"Converted {result['images_processed']} images to LabelMe format")

# LabelMe to COCO conversion
converter = LabelMeToCocoConverter()
result = converter.convert("labels/", "classes.names", "output.json")
print(f"Converted {result['annotations_processed']} annotations to COCO format")

# LabelMe to YOLO conversion
converter = LabelMeToYoloConverter()
result = converter.convert("labels/", "output_dir/")
print(f"Converted {result['images_processed']} images to YOLO format")

# YOLO to LabelMe conversion
converter = YoloToLabelMeConverter()
result = converter.convert("images/", "labels/", "classes.names", "output_dir/")
print(f"Converted {result['images_processed']} images to LabelMe format")

# Visualize YOLO annotations (save_dir is optional)
result = dataflow.visualize_yolo("images/", "labels/", "classes.names")
result = dataflow.visualize_yolo("images/", "labels/", "classes.names", save_dir="output_dir/")
print(f"Visualized {result['images_processed']} images")

# Visualize COCO annotations (save_dir is optional)
result = dataflow.visualize_coco("images/", "annotations.json")
result = dataflow.visualize_coco("images/", "annotations.json", save_dir="output_dir/")
print(f"Visualized {result['images_processed']} images")

# Visualize LabelMe annotations (save_dir is optional)
result = dataflow.visualize_labelme("images/", "labels/")
result = dataflow.visualize_labelme("images/", "labels/", save_dir="output_dir/")
print(f"Visualized {result['images_processed']} images")
print(f"Classes found: {result['classes_found']}")

CLI Reference

The CLI follows a hierarchical structure: dataflow <main-task> <sub-task> [arguments]. Global options can be placed before the main task.

Global Options

  • --verbose, -v: Enable verbose output (progress information)
  • --overwrite: Overwrite existing files
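
When the CLI is driven from a script, the argument order described above can be assembled programmatically. The following is a minimal sketch, not part of DataFlow-CV: build_dataflow_command is a hypothetical helper that only builds the argument list, placing the global options before the main task.

```python
import shlex
from typing import List, Sequence


def build_dataflow_command(
    main_task: str,
    sub_task: str,
    arguments: Sequence[str],
    verbose: bool = False,
    overwrite: bool = False,
) -> List[str]:
    """Assemble an argv list with global options ahead of the main task.

    Hypothetical helper for illustration; it does not run anything.
    """
    command = ["dataflow"]
    if verbose:
        command.append("--verbose")
    if overwrite:
        command.append("--overwrite")
    command.extend([main_task, sub_task])
    command.extend(arguments)
    return command


command = build_dataflow_command(
    "convert",
    "coco2yolo",
    ["annotations.json", "output_dir/", "--segmentation"],
    verbose=True,
    overwrite=True,
)
# Print a shell-quoted form of the command for inspection
print(shlex.join(command))
```

The resulting list can be handed to subprocess.run(command, check=True) once the dataflow command is installed.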

Conversion Commands

COCO to YOLO

dataflow convert coco2yolo COCO_JSON_PATH OUTPUT_DIR [--segmentation]
  • COCO_JSON_PATH: Path to COCO JSON annotation file
  • OUTPUT_DIR: Directory where labels/ and class.names will be created
  • --segmentation, -s: Handle segmentation annotations (polygon format)

YOLO to COCO

dataflow convert yolo2coco IMAGE_DIR YOLO_LABELS_DIR YOLO_CLASS_PATH COCO_JSON_PATH
  • IMAGE_DIR: Directory containing image files
  • YOLO_LABELS_DIR: Directory containing YOLO label files (.txt)
  • YOLO_CLASS_PATH: Path to YOLO class names file (e.g., class.names)
  • COCO_JSON_PATH: Path to save COCO JSON file
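
To illustrate what a YOLO to COCO conversion involves, here is a rough sketch of two of its steps. It is not the library's implementation. It assumes that the class names file holds one name per line, that COCO category ids simply mirror the YOLO class indices, and that COCO boxes are stored as [x_min, y_min, width, height] in pixels; both function names are hypothetical.

```python
from typing import Dict, List, Sequence


def class_names_to_categories(names: Sequence[str]) -> List[Dict[str, object]]:
    """Build a COCO-style categories list from YOLO class names.

    Assumption: category ids reuse the 0-based YOLO class indices.
    """
    return [{"id": index, "name": name} for index, name in enumerate(names)]


def yolo_bbox_to_coco(
    x_center: float,
    y_center: float,
    width: float,
    height: float,
    image_width: int,
    image_height: int,
) -> List[float]:
    """Convert a normalized YOLO box to pixel [x_min, y_min, width, height]."""
    box_width = width * image_width
    box_height = height * image_height
    x_min = x_center * image_width - box_width / 2
    y_min = y_center * image_height - box_height / 2
    return [x_min, y_min, box_width, box_height]


# Contents of a class names file, assumed to hold one name per line
names = "cat\ndog\n".splitlines()
categories = class_names_to_categories(names)

# A centered box covering half of a 640x480 image in each dimension
coco_bbox = yolo_bbox_to_coco(0.5, 0.5, 0.5, 0.5, 640, 480)
print(categories)
print(coco_bbox)  # [160.0, 120.0, 320.0, 240.0]
```

The image directory is needed for this direction because the pixel dimensions are not stored in YOLO label files.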

COCO to LabelMe

dataflow convert coco2labelme COCO_JSON_PATH OUTPUT_DIR [--segmentation]
  • COCO_JSON_PATH: Path to COCO JSON annotation file
  • OUTPUT_DIR: Directory where LabelMe JSON files will be created
  • --segmentation, -s: Handle segmentation annotations (polygon format)

LabelMe to COCO

dataflow convert labelme2coco LABEL_DIR CLASSES_PATH OUTPUT_JSON_PATH [--segmentation]
  • LABEL_DIR: Directory containing LabelMe JSON files
  • CLASSES_PATH: Path to class names file (e.g., class.names)
  • OUTPUT_JSON_PATH: Path to save COCO JSON file
  • --segmentation, -s: Handle segmentation annotations (polygon format)

LabelMe to YOLO

dataflow convert labelme2yolo LABEL_DIR OUTPUT_DIR [--segmentation]
  • LABEL_DIR: Directory containing LabelMe JSON files
  • OUTPUT_DIR: Directory where labels/ and class.names will be created
  • --segmentation, -s: Handle segmentation annotations (polygon format)

YOLO to LabelMe

dataflow convert yolo2labelme IMAGE_DIR LABEL_DIR CLASSES_PATH OUTPUT_DIR [--segmentation]
  • IMAGE_DIR: Directory containing image files
  • LABEL_DIR: Directory containing YOLO label files (.txt)
  • CLASSES_PATH: Path to YOLO class names file (e.g., class.names)
  • OUTPUT_DIR: Directory where LabelMe JSON files will be created
  • --segmentation, -s: Handle segmentation annotations (polygon format)

Visualization Commands

Visualize YOLO annotations

dataflow visualize yolo IMAGE_DIR LABEL_DIR CLASS_PATH [--save SAVE_DIR]
  • IMAGE_DIR: Directory containing image files
  • LABEL_DIR: Directory containing YOLO label files (.txt)
  • CLASS_PATH: Path to class names file (e.g., class.names)
  • --save SAVE_DIR: Optional directory to save visualized images

Visualize COCO annotations

dataflow visualize coco IMAGE_DIR ANNOTATION_JSON [--save SAVE_DIR]
  • IMAGE_DIR: Directory containing image files
  • ANNOTATION_JSON: Path to COCO JSON annotation file
  • --save SAVE_DIR: Optional directory to save visualized images

Visualize LabelMe annotations

dataflow visualize labelme IMAGE_DIR LABEL_DIR [--save SAVE_DIR]
  • IMAGE_DIR: Directory containing image files
  • LABEL_DIR: Directory containing LabelMe JSON files
  • --save SAVE_DIR: Optional directory to save visualized images

Configuration Command

dataflow config

Shows the current configuration (file extensions, default values, CLI context).

Getting Help

dataflow --help
dataflow convert --help
dataflow convert coco2yolo --help
dataflow convert yolo2coco --help
dataflow visualize --help
dataflow visualize yolo --help
dataflow visualize coco --help
dataflow visualize labelme --help

Segmentation Support

DataFlow-CV supports both bounding box and polygon segmentation annotations across all formats:

YOLO Segmentation Format

  • Detection format: class_id x_center y_center width height (normalized coordinates)
  • Segmentation format: class_id x1 y1 x2 y2 ... (polygon vertices, normalized)
  • YOLO segmentation files have the same .txt extension as detection files
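
Because both variants share the .txt extension, a reader has to tell them apart by the number of values on each line. The sketch below shows one way to do that under the rules listed above; parse_yolo_line and the dictionary layout it returns are hypothetical and not part of the DataFlow-CV API.

```python
from typing import Dict


def parse_yolo_line(line: str) -> Dict[str, object]:
    """Classify one YOLO label line as a bounding box or a polygon."""
    parts = line.split()
    if not parts:
        raise ValueError("empty label line")
    class_id = int(parts[0])
    values = [float(value) for value in parts[1:]]

    if len(values) == 4:
        # Detection: x_center y_center width height (normalized)
        return {"class_id": class_id, "kind": "bbox", "bbox": tuple(values)}

    if len(values) >= 6 and len(values) % 2 == 0:
        # Segmentation: pair up x/y values into polygon vertices (normalized)
        points = list(zip(values[0::2], values[1::2]))
        return {"class_id": class_id, "kind": "polygon", "points": points}

    raise ValueError(f"unexpected number of coordinates: {len(values)}")


box = parse_yolo_line("0 0.5 0.5 0.2 0.4")
polygon = parse_yolo_line("1 0.1 0.1 0.9 0.1 0.5 0.9")
print(box["kind"], polygon["kind"], len(polygon["points"]))
```

A line with exactly four values is always read as a box here, which is unambiguous only because a polygon needs at least three points.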

COCO Segmentation Format

  • Polygon coordinates in segmentation field (list of [x1, y1, x2, y2, ...])
  • Both single-polygon and multi-polygon annotations are supported
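
As an illustration of how COCO coordinates map onto YOLO's normalized form, the following sketch converts one box and one polygon. It assumes the usual COCO conventions (pixel coordinates, boxes stored as [x_min, y_min, width, height]) and handles a single polygon only; the function names are hypothetical and this is not the converter shipped with the library.

```python
from typing import List, Sequence, Tuple


def coco_bbox_to_yolo(
    bbox: Sequence[float], image_width: int, image_height: int
) -> Tuple[float, float, float, float]:
    """Convert pixel [x_min, y_min, width, height] to normalized center form."""
    x_min, y_min, box_width, box_height = bbox
    return (
        (x_min + box_width / 2) / image_width,
        (y_min + box_height / 2) / image_height,
        box_width / image_width,
        box_height / image_height,
    )


def coco_polygon_to_yolo(
    polygon: Sequence[float], image_width: int, image_height: int
) -> List[float]:
    """Normalize a flat [x1, y1, x2, y2, ...] pixel polygon."""
    if len(polygon) < 6 or len(polygon) % 2 != 0:
        raise ValueError("a polygon needs at least 3 points (6 coordinates)")
    normalized: List[float] = []
    for x, y in zip(polygon[0::2], polygon[1::2]):
        normalized.extend([x / image_width, y / image_height])
    return normalized


yolo_bbox = coco_bbox_to_yolo([160, 120, 320, 240], 640, 480)
yolo_polygon = coco_polygon_to_yolo([0, 0, 640, 0, 320, 480], 640, 480)
print(yolo_bbox)     # (0.5, 0.5, 0.5, 0.5)
print(yolo_polygon)  # [0.0, 0.0, 1.0, 0.0, 0.5, 1.0]
```

For a multi-polygon annotation, the same function could be applied to each inner list in turn.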

LabelMe Segmentation Format

  • Rectangle shapes (shape_type: "rectangle") for bounding box annotations
  • Polygon shapes (shape_type: "polygon") for segmentation annotations
  • Each JSON file contains shapes array with annotation data
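
The sketch below shows how the entries of a shapes array could be turned into YOLO label lines. It assumes the common LabelMe layout in which a rectangle stores two opposite corners and a polygon stores a list of [x, y] vertices, with image size given by imageWidth and imageHeight. The helper name and the label-to-id mapping are hypothetical, and the six-decimal formatting is an arbitrary choice.

```python
from typing import Dict, List


def labelme_shape_to_yolo(
    shape: Dict[str, object], class_id: int, image_width: int, image_height: int
) -> str:
    """Format one LabelMe shape as a YOLO label line (normalized values)."""
    points = shape["points"]
    if shape["shape_type"] == "rectangle":
        (x1, y1), (x2, y2) = points
        x_min, x_max = sorted((x1, x2))
        y_min, y_max = sorted((y1, y2))
        values = [
            (x_min + x_max) / 2 / image_width,
            (y_min + y_max) / 2 / image_height,
            (x_max - x_min) / image_width,
            (y_max - y_min) / image_height,
        ]
    elif shape["shape_type"] == "polygon":
        if len(points) < 3:
            raise ValueError("a polygon needs at least 3 points")
        values = []
        for x, y in points:
            values.extend([x / image_width, y / image_height])
    else:
        raise ValueError(f"unsupported shape_type: {shape['shape_type']}")
    return " ".join([str(class_id)] + [f"{value:.6f}" for value in values])


# In-memory stand-in for the contents of one LabelMe JSON file
document = {
    "imagePath": "sample.jpg",
    "imageWidth": 640,
    "imageHeight": 480,
    "shapes": [
        {"label": "cat", "shape_type": "rectangle", "points": [[160, 120], [480, 360]]},
        {"label": "dog", "shape_type": "polygon", "points": [[0, 0], [640, 0], [320, 480]]},
    ],
}
class_ids = {"cat": 0, "dog": 1}  # hypothetical label-to-id mapping

lines: List[str] = [
    labelme_shape_to_yolo(
        shape, class_ids[shape["label"]], document["imageWidth"], document["imageHeight"]
    )
    for shape in document["shapes"]
]
print("\n".join(lines))
```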

Usage Examples

# Convert COCO to YOLO with segmentation annotations
dataflow convert coco2yolo annotations.json output_dir/ --segmentation

# Visualize YOLO annotations in strict segmentation mode (only polygons)
dataflow visualize yolo images/ labels/ classes.names --segmentation

# Visualize COCO annotations in strict segmentation mode
dataflow visualize coco images/ annotations.json --segmentation

# Visualize LabelMe annotations in strict segmentation mode (only polygons)
dataflow visualize labelme images/ labels/ --segmentation

Python API

# Convert COCO to YOLO with segmentation
result = dataflow.coco_to_yolo("annotations.json", "output_dir", segmentation=True)

# Visualize in strict segmentation mode
result = dataflow.visualize_yolo("images/", "labels/", "classes.names", segmentation=True)
result = dataflow.visualize_labelme("images/", "labels/", segmentation=True)

Notes

  • Without the --segmentation flag, both bounding boxes and polygons are processed automatically
  • With the --segmentation flag, only valid polygon annotations are processed (strict mode)
  • YOLO segmentation format requires at least 3 points (6 coordinates)
  • COCO segmentation polygons are automatically converted to YOLO normalized coordinates
  • LabelMe format supports both rectangle (shape_type: "rectangle") and polygon (shape_type: "polygon") shapes
  • In segmentation mode, LabelMe visualizer rejects rectangle shapes and only accepts polygon shapes

Running Tests

# Run all tests
python tests/run_tests.py

# Run a specific test
python tests/run_tests.py --test TestCocoToYoloConverter

# With verbose output
python tests/run_tests.py -v

Examples

Check the samples/ directory for detailed usage examples:

  • samples/cli/convert/ - CLI conversion examples
  • samples/cli/visualize/ - CLI visualization examples
  • samples/api/convert/ - Python API conversion examples
  • samples/api/visualize/ - Python API visualization examples

Documentation

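Detailed data format specifications are available in the docs/ directory:

  • docs/README.md - Documentation index
  • docs/yolo.md - YOLO format specification
  • docs/labelme.md - LabelMe format specification
  • docs/coco.md - COCO format specification

These documents describe the annotation formats supported by DataFlow-CV; they do not cover tool usage.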

License

MIT License © 2026 zjykzj
