Refinements

daniil-lyakhov · daniil-lyakhov · commit d88ef3d26ae3 · 2025-04-14T22:18:11.000+02:00
diff --git a/backends/openvino/README.md b/backends/openvino/README.md
@@ -43,8 +43,8 @@ executorch
 Before you begin, ensure you have openvino installed and configured on your system:
 
 ```bash
-git clone https://github.com/openvinotoolkit/openvino.git
-cd openvino && git checkout releases/2025/1
+git clone https://github.com/daniil-lyakhov/openvino.git
+cd openvino && git checkout dl/executorch/yolo12
 git submodule update --init --recursive
 sudo ./install_build_dependencies.sh
 mkdir build && cd build
diff --git a/backends/openvino/scripts/openvino_build.sh b/backends/openvino/scripts/openvino_build.sh
@@ -33,8 +33,8 @@ main() {
               -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
               -DEXECUTORCH_BUILD_OPENVINO_EXECUTOR_RUNNER=ON \
               -DPYTHON_EXECUTABLE=python \
+              -DEXECUTORCH_LOG_LEVEL=Debug \
               -B"${build_dir}"
-              #-DEXECUTORCH_LOG_LEVEL=Debug \
 
 
         # Build the project
diff --git a/examples/models/yolo12/CMakeLists.txt b/examples/models/yolo12/CMakeLists.txt
@@ -61,13 +61,6 @@ list(APPEND _common_include_directories
      ${XNNPACK_ROOT}/third-party/pthreadpool/include
 )
 
-list(APPEND link_libraries extension_threadpool cpuinfo)
-list(APPEND _common_include_directories
-     ${XNNPACK_ROOT}/third-party/cpuinfo/include
-)
-
-message(STATUS ${link_libraries})
-
 set(PROJECT_SOURCES
     main.cpp
     inference.h
diff --git a/examples/models/yolo12/README.md b/examples/models/yolo12/README.md
@@ -1,87 +1,90 @@
-# YOLOv8/YOLOv5 C++ Inference with OpenCV DNN
+# YOLO12 Detection C++ Inference with ExecuTorch
 
-This example demonstrates how to perform inference using Ultralytics YOLO12 models in C++ leveraging the Executorch backends:
-- OpenVINO
-- XNNPACK.
+<p align="center">
+      <br>
+      <img src="./yolo12s_demo.gif" width=300>
+      <br>
+</p>
 
-## 🛠️ Usage
+This example demonstrates how to run inference with the Ultralytics YOLO12 family of detection models in C++ using the following ExecuTorch backends:
+- [OpenVINO](../../../backends/openvino/README.md)
+- [XNNPACK](../../../backends/xnnpack/README.md)
 
-Follow these steps to set up and run the C++ inference example:
 
-```bash
-# 1. Clone the Ultralytics repository
-git clone https://github.com/ultralytics/ultralytics
-cd ultralytics
+# Instructions
 
-# 2. Install Ultralytics Python package (needed for exporting models)
-pip install .
+### Step 1: Install ExecuTorch
 
-# 3. Navigate to the C++ example directory
-cd examples/YOLOv8-CPP-Inference
+To install ExecuTorch, follow this [guide](https://pytorch.org/executorch/stable/getting-started-setup.html).
 
-# 4. Export Models: Add yolov8*.onnx and/or yolov5*.onnx models (see export instructions below)
-#    Place the exported ONNX models in the current directory (YOLOv8-CPP-Inference).
+### Step 2: Install the backend of your choice
 
-# 5. Update Source Code: Edit main.cpp and set the 'projectBasePath' variable
-#    to the absolute path of the 'YOLOv8-CPP-Inference' directory on your system.
-#    Example: std::string projectBasePath = "/path/to/your/ultralytics/examples/YOLOv8-CPP-Inference";
+- [OpenVINO backend installation guide](../../../backends/openvino/README.md#build-instructions)
+- [XNNPACK backend installation guide](https://pytorch.org/executorch/stable/tutorial-xnnpack-delegate-lowering.html#running-the-xnnpack-model-with-cmake)
 
-# 6. Configure OpenCV DNN Backend (Optional - CUDA):
-#    - The default CMakeLists.txt attempts to use CUDA for GPU acceleration with OpenCV DNN.
-#    - If your OpenCV build doesn't support CUDA/cuDNN, or you want CPU inference,
-#      remove the CUDA-related lines from CMakeLists.txt.
+### Step 3: Install the demo requirements
 
-# 7. Build the project
-mkdir build
-cd build
-cmake ..
-make
 
-# 8. Run the inference executable
-./Yolov8CPPInference
+Python demo requirements:
+```bash
+python -m pip install -r examples/models/yolo12/requirements.txt
 ```
 
-## ✨ Exporting YOLOv8 and YOLOv5 Models
+Demo inference dependency (OpenCV library):
+https://opencv.org/get-started/
 
-You need to export your trained PyTorch models to the [ONNX](https://onnx.ai/) format to use them with OpenCV DNN.
 
-**Exporting Ultralytics YOLOv8 Models:**
+### Step 4: Export the YOLO12 model to ExecuTorch
 
-Use the Ultralytics CLI to export. Ensure you specify the desired `imgsz` and `opset`. For compatibility with this example, `opset=12` is recommended.
 
+OpenVINO:
 ```bash
-yolo export model=yolov8s.pt imgsz=640,480 format=onnx opset=12 # Example: 640x480 resolution
+python export_and_quantize.py --model_name yolo12s --input_dims=[1920,1080]  --backend openvino --device CPU
 ```
 
-**Exporting YOLOv5 Models:**
+XNNPACK:
+```bash
+python export_and_quantize.py --model_name yolo12s --input_dims=[1920,1080] --backend xnnpack
+```
 
-Use the `export.py` script from the YOLOv5 repository structure (included within the cloned `ultralytics` repo).
+> **_NOTE:_**  Quantization is coming soon!
 
+For a full description of the parameters, run the following command:
 ```bash
-# Assuming you are in the 'ultralytics' base directory after cloning
-python export.py --weights yolov5s.pt --imgsz 640 480 --include onnx --opset 12 # Example: 640x480 resolution
+python export_and_quantize.py
 ```
+### Step 5: Build the demo project
 
-Place the generated `.onnx` files (e.g., `yolov8s.onnx`, `yolov5s.onnx`) into the `ultralytics/examples/YOLOv8-CPP-Inference/` directory.
+OpenVINO:
 
-**Example Output:**
-
-_yolov8s.onnx:_
+```bash
+cd examples/models/yolo12
+mkdir build && cd build
+cmake -DCMAKE_BUILD_TYPE=Release -DUSE_OPENVINO_BACKEND=ON ..
+make -j$(nproc)
+```
 
-![YOLOv8 ONNX Output](https://user-images.githubusercontent.com/40023722/217356132-a4cecf2e-2729-4acb-b80a-6559022d7707.png)
+XNNPACK:
 
-_yolov5s.onnx:_
+```bash
+cd examples/models/yolo12
+mkdir build && cd build
+cmake -DCMAKE_BUILD_TYPE=Release -DUSE_XNNPACK_BACKEND=ON ..
+make -j$(nproc)
+```
 
-![YOLOv5 ONNX Output](https://user-images.githubusercontent.com/40023722/217357005-07464492-d1da-42e3-98a7-fc753f87d5e6.png)
+### Step 6: Run the demo
 
-## 📝 Notes
+```bash
+./build/Yolo12DetectionDemo -model_path /path/to/exported/model -input_path /path/to/video/file -output_path /path/to/output/annotated/video
+```
 
-- This repository utilizes the [OpenCV DNN API](https://docs.opencv.org/4.x/d6/d0f/group__dnn.html) to run [ONNX](https://onnx.ai/) exported models of YOLOv5 and Ultralytics YOLOv8.
-- While not explicitly tested, it might theoretically work for other YOLO architectures like YOLOv6 and YOLOv7 if their ONNX export formats are compatible.
-- The example models are exported with a rectangular resolution (640x480), but the code should handle models exported with different resolutions. Consider using techniques like [letterboxing](https://docs.ultralytics.com/modes/predict/#letterbox) if your input images have different aspect ratios than the model's training resolution, especially for square `imgsz` exports.
-- The `main` branch version includes a simple GUI wrapper using [Qt](https://www.qt.io/). However, the core logic resides in the `Inference` class (`inference.h`, `inference.cpp`).
-- A key part of the `Inference` class demonstrates how to handle the output differences between YOLOv5 and YOLOv8 models, effectively transposing YOLOv8's output format to match the structure expected from YOLOv5 for consistent post-processing.
+For a full description of the parameters, run the following command:
+```bash
+./build/Yolo12DetectionDemo --help
+```
 
-## 🤝 Contributing
 
-Contributions are welcome! If you find any issues or have suggestions for improvement, please feel free to open an issue or submit a pull request. See our [Contributing Guide](https://docs.ultralytics.com/help/contributing/) for more details.
+# Credits:
+Ultralytics examples: https://github.com/ultralytics/ultralytics/tree/main/examples
+Sample video: https://www.pexels.com/@shanu-1040189/
diff --git a/examples/models/yolo12/build.sh b/examples/models/yolo12/build.sh
@@ -1,4 +1,4 @@
 rm -r build
 mkdir build && cd build
-cmake -DCMAKE_BUILD_TYPE=Debug -DUSE_XNNPACK_BACKEND=ON -DUSE_OPENVINO_BACKEND=ON ..
+cmake -DCMAKE_BUILD_TYPE=Release -DUSE_XNNPACK_BACKEND=ON -DUSE_OPENVINO_BACKEND=ON ..
 make -j 30
diff --git a/examples/models/yolo12/export_and_quantize.py b/examples/models/yolo12/export_and_quantize.py
@@ -13,7 +13,6 @@
 
 import cv2
 import executorch
-import nncf.torch
 import numpy as np
 import torch
 from executorch.backends.openvino.partitioner import OpenvinoPartitioner
@@ -32,7 +31,6 @@
     to_edge_transform_and_lower,
 )
 from executorch.exir.backend.backend_details import CompileSpec
-from nncf.experimental.torch.fx import quantize_pt2e
 from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
 from torch.export.exported_program import ExportedProgram
 from torch.fx.passes.graph_drawer import FxGraphDrawer
@@ -82,45 +80,48 @@ def lower_to_openvino(
     subset_size: int,
     quantize: bool,
 ) -> ExecutorchProgramManager:
-    if quantize:
-        target_input_dims = tuple(example_args[0].shape[2:])
+    import nncf.torch
+    from nncf.experimental.torch.fx import quantize_pt2e
 
-        def ext_transform_fn(sample):
-            sample = transform_fn(sample)
-            return pad_to_target(sample, target_input_dims)
+    with nncf.torch.disable_patching():
+        if quantize:
+            target_input_dims = tuple(example_args[0].shape[2:])
 
-        quantizer = OpenVINOQuantizer(mode=QuantizationMode.INT8_TRANSFORMER)
-        quantizer.set_ignored_scope(
-            types=["mul", "sub", "sigmoid", "__getitem__"],
-            subgraphs=[nncf.Subgraph(inputs=["cat_18"], outputs=["output"])]
-        )
-        quantized_model = quantize_pt2e(
-            aten_dialect.module(),
-            quantizer,
-            nncf.Dataset(calibration_dataset, ext_transform_fn),
-            subset_size=subset_size,
-            smooth_quant=True,
-            fold_quantize=False
-        )
+            def ext_transform_fn(sample):
+                sample = transform_fn(sample)
+                return pad_to_target(sample, target_input_dims)
 
-        visualize_fx_model(quantized_model, "tmp_quantized_model.svg")
-        aten_dialect = torch.export.export(quantized_model, example_args)
-        # Convert to edge dialect and lower the module to the backend with a custom partitioner
-    compile_spec = [CompileSpec("device", device.encode())]
-    lowered_module: EdgeProgramManager = to_edge_transform_and_lower(
-        aten_dialect,
-        partitioner=[
-            OpenvinoPartitioner(compile_spec),
-        ],
-        compile_config=EdgeCompileConfig(
-            _skip_dim_order=True,
-        ),
-    )
+            quantizer = OpenVINOQuantizer(mode=QuantizationMode.INT8_TRANSFORMER)
+            quantizer.set_ignored_scope(
+                types=["mul", "sub", "sigmoid", "__getitem__"],
+            )
+            quantized_model = quantize_pt2e(
+                aten_dialect.module(),
+                quantizer,
+                nncf.Dataset(calibration_dataset, ext_transform_fn),
+                subset_size=subset_size,
+                smooth_quant=True,
+                fold_quantize=False,
+            )
 
-    # Apply backend-specific passes
-    return lowered_module.to_executorch(
-        config=executorch.exir.ExecutorchBackendConfig()
-    )
+            visualize_fx_model(quantized_model, "tmp_quantized_model.svg")
+            aten_dialect = torch.export.export(quantized_model, example_args)
+            # Convert to edge dialect and lower the module to the backend with a custom partitioner
+        compile_spec = [CompileSpec("device", device.encode())]
+        lowered_module: EdgeProgramManager = to_edge_transform_and_lower(
+            aten_dialect,
+            partitioner=[
+                OpenvinoPartitioner(compile_spec),
+            ],
+            compile_config=EdgeCompileConfig(
+                _skip_dim_order=True,
+            ),
+        )
+
+        # Apply backend-specific passes
+        return lowered_module.to_executorch(
+            config=executorch.exir.ExecutorchBackendConfig()
+        )
 
 
 def lower_to_xnnpack(
@@ -217,6 +218,7 @@ def main(
     model = YOLO(model_name)
 
     if quantize:
+        raise NotImplementedError("Quantization is coming soon!")
         if video_path is None:
             raise RuntimeError(
                 "Could not quantize model without the video for the calibration."
@@ -273,7 +275,8 @@ def transform_fn(frame):
         "--model_name",
         type=str,
         default="yolo12s",
-        help="Ultralytics yolo model name.",
+        choices=["yolo12n", "yolo12s", "yolo12m", "yolo12l", "yolo12x"],
+        help="Ultralytics yolo12 model name.",
     )
     parser.add_argument(
         "--input_dims",
@@ -312,14 +315,12 @@ def transform_fn(frame):
     args = parser.parse_args()
 
     # Run the main function with parsed arguments
-    # Disable nncf patching as export of the patched model is not supported.
-    with nncf.torch.disable_patching():
-        main(
-            model_name=args.model_name,
-            input_dims=args.input_dims,
-            quantize=args.quantize,
-            video_path=args.video_path,
-            subset_size=args.subset_size,
-            backend=args.backend,
-            device=args.device,
-        )
+    main(
+        model_name=args.model_name,
+        input_dims=args.input_dims,
+        quantize=args.quantize,
+        video_path=args.video_path,
+        subset_size=args.subset_size,
+        backend=args.backend,
+        device=args.device,
+    )
diff --git a/examples/models/yolo12/inference.h b/examples/models/yolo12/inference.h
@@ -77,7 +77,10 @@ std::vector<Detection> infer_yolo_once(
       ScalarType::Float);
   const auto result = module.forward(t_input);
 
-  ET_CHECK_MSG(result.ok(), "Could not infer the model with an error");
+  ET_CHECK_MSG(
+      result.ok(),
+      "Execution of method forward failed with status 0x%" PRIx32,
+      (uint32_t)result.error());
 
   const auto t = result->at(0).toTensor(); // Using only the 0 output
   // yolov8 has an output of shape (batchSize, 84,  8400) (Num classes +
diff --git a/examples/models/yolo12/main.cpp b/examples/models/yolo12/main.cpp
diff --git a/examples/models/yolo12/requirements.txt b/examples/models/yolo12/requirements.txt
diff --git a/examples/models/yolo12/yolo12s_demo.gif b/examples/models/yolo12/yolo12s_demo.gif