UFlow Anomaly model implementation (#4251)

djdameln · web-flow · commit fd3fa0cb4201 · 2025-03-06T13:05:41.000+01:00
* uflow initial implementation

* upgrade anomalib version

* upgrade anomalib version

* add entry for uflow in template dict

* update recipe

* prevent KeyError when "__path__" is missing in config["data"]

* use override for resize transform in uflow configs

* override input size in uflow configs

* disable early stopping for uflow

* disable early stopping for uflow in template

* add model specs

* enable performance tests for uflow

* enable openvino tests for uflow

* update license headers

* use Uflow as accuracy preset

* update changelog

* remove uflow recipes for unused tasks

* change template name and task

* add UFlow description to docs

* UFlow -> U-Flow

* hide early stopping in UI for uflow

* reorder

* formatting
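
The `KeyError` fix listed above relies on the two-argument form of `dict.pop`, which returns the default instead of raising when the key is absent. A minimal sketch of that behaviour, assuming a hypothetical config dict and helper name (neither is taken from the converter itself):

```python
# Illustrative only: `remove_unused_keys` and the dicts below are hypothetical
# stand-ins, not the converter's real helper or config.
def remove_unused_keys(config: dict) -> dict:
    """Drop CLI-only keys, tolerating a missing "__path__" entry."""
    config.pop("config")  # still raises KeyError if "config" is missing
    config["data"].pop("__path__", None)  # default of None avoids KeyError
    return config


with_path = {"config": "recipe.yaml", "data": {"__path__": "anomaly.yaml", "input_size": [448, 448]}}
without_path = {"config": "recipe.yaml", "data": {"input_size": [448, 448]}}

cleaned_with = remove_unused_keys(with_path)
cleaned_without = remove_unused_keys(without_path)
```

Both calls leave only the `data` entry with its `input_size`; before the change, the second call would have raised `KeyError`.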
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -54,6 +54,8 @@ All notable changes to this project will be documented in this file.
   (<https://github.com/openvinotoolkit/training_extensions/pull/4142>)
 - Add DETR XAI Explain Mode
   (<https://github.com/openvinotoolkit/training_extensions/pull/4184>)
+- Add UFlow anomaly detection algorithm
+  (<https://github.com/openvinotoolkit/training_extensions/pull/4251>)
 
 ### Enhancements
 
diff --git a/docs/source/guide/explanation/algorithms/anomaly/index.rst b/docs/source/guide/explanation/algorithms/anomaly/index.rst
@@ -77,13 +77,15 @@ Models
 ******
 As mentioned above, the goal of visual anomaly detection is to learn a representation of normal behaviour in the data and then identify instances that deviate from this normal behaviour. OpenVINO Training Extensions supports several deep learning approaches to this task, including the following:
 
-+-------+----------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+---------------------+-----------------+
-| Name  | Classification                                                                                                                               | Detection                                                                                                                                        | Segmentation                                                                                                                               | Complexity (GFLOPs) | Model size (MB) |
-+=======+==============================================================================================================================================+==================================================================================================================================================+============================================================================================================================================+=====================+=================+
-| PADIM | `padim <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/anomaly_classification/padim.yaml>`_              | `padim <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/anomaly_detection/padim.yaml>`_                       | `padim <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/anomaly_segmentation/padim.yaml>`_              | 3.9                 | 168.4           |
-+-------+----------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+---------------------+-----------------+
-| STFPM | `stfpm <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/anomaly_classification/stfpm.yaml>`_              | `stfpm <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/anomaly_detection/stfpm.yaml>`_                       | `stfpm <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/anomaly_segmentation/stfpm.yaml>`_              | 5.6                 | 21.1            |
-+-------+----------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+---------------------+-----------------+
++--------+-------------------------------------------------------------------------------------------------------------------+----------------------+-----------------+
+| Name   | Recipe                                                                                                            | Complexity (GFLOPs)  | Model size (MB) |
++========+===================================================================================================================+======================+=================+
+| PADIM  | `padim <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/anomaly/padim.yaml>`_  | 3.9                  | 168.4           |
++--------+-------------------------------------------------------------------------------------------------------------------+----------------------+-----------------+
+| STFPM  | `stfpm <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/anomaly/stfpm.yaml>`_  | 5.6                  | 21.1            |
++--------+-------------------------------------------------------------------------------------------------------------------+----------------------+-----------------+
+| U-Flow | `uflow <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/recipe/anomaly/uflow.yaml>`_  | 59.6                 | 62.88           |
++--------+-------------------------------------------------------------------------------------------------------------------+----------------------+-----------------+
 
 
 Clustering-based Models
@@ -153,3 +155,28 @@ Since STFPM trains the student network, we use the following parameters for its
    - ``Early Stopping``: Early stopping is used to stop the training process when the validation loss stops improving. The default value of the early stopping patience is ``10``.
 
 For more information on STFPM's training. We invite you to read Anomalib's `STFPM documentation <https://anomalib.readthedocs.io/en/v1.0.0/markdown/guides/reference/models/image/stfpm.html>`_.
+
+Normalizing Flow Models
+-----------------------------------
+Normalizing Flow models use invertible neural networks to transform image features into a simpler distribution, such as a Gaussian. During inference, the flow network computes the likelihood of the input image under the learned distribution, so anomalous samples receive low probabilities. OpenVINO Training Extensions currently supports `U-Flow: A U-shaped Normalizing Flow for Anomaly Detection with Unsupervised Threshold <https://arxiv.org/pdf/2211.12353.pdf>`_.
+
+U-Flow
+^^^^^^
+
+.. figure:: ../../../../../utils/images/uflow.png
+   :width: 600
+   :align: center
+   :alt: U-Flow architecture
+
+U-Flow consists of four stages.
+
+1. **Feature Extraction**: The images are passed through a pre-trained backbone to extract feature embeddings at multiple scales.
+2. **Normalizing Flow**: The feature embeddings are passed through a U-shaped normalizing flow network to learn the distribution of normal images.
+3. **Anomaly Score Calculation**: The anomaly score is calculated as the negative log-likelihood of the feature embeddings under the learned distribution.
+4. **Anomaly Map Generation**: The anomaly score is used to generate an anomaly map, which highlights the anomalous regions in the image.
+
+Training Parameters
+~~~~~~~~~~~~~~~~~~~~
+There are currently no configurable training parameters exposed for U-Flow.
+
+For more information on U-Flow's training, we invite you to read Anomalib's `U-Flow documentation <https://anomalib.readthedocs.io/en/v1.0.0/markdown/guides/reference/models/image/uflow.html>`_.
diff --git a/pyproject.toml b/pyproject.toml
@@ -85,7 +85,7 @@ base = [
     "onnx==1.17.0",
     "onnxconverter-common==1.14.0",
     "nncf==2.14.1",
-    "anomalib[core]==1.1.0",
+    "anomalib[core]==1.1.3",
 ]
 
 ci_tox = [
diff --git a/src/otx/algo/anomaly/uflow.py b/src/otx/algo/anomaly/uflow.py
@@ -0,0 +1,62 @@
+"""OTX UFlow model."""
+
+# Copyright (C) 2025 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+# mypy: ignore-errors
+
+from __future__ import annotations
+
+from typing import TYPE_CHECKING, Literal
+
+from anomalib.models.image import Uflow as AnomalibUflow
+
+from otx.core.model.anomaly import AnomalyMixin, OTXAnomaly
+from otx.core.types.label import AnomalyLabelInfo
+from otx.core.types.task import OTXTaskType
+
+if TYPE_CHECKING:
+    from otx.core.types.label import LabelInfoTypes
+
+
+class Uflow(AnomalyMixin, AnomalibUflow, OTXAnomaly):
+    """OTX UFlow model.
+
+    Args:
+        label_info (LabelInfoTypes, optional): Label information. Defaults to AnomalyLabelInfo().
+        backbone (str, optional): Feature extractor backbone. Defaults to "resnet18".
+        flow_steps (int, optional): Number of flow steps. Defaults to 4.
+        affine_clamp (float, optional): Affine clamp. Defaults to 2.0.
+        affine_subnet_channels_ratio (float, optional): Affine subnet channels ratio. Defaults to 1.0.
+        permute_soft (bool, optional): Whether to use soft permutation. Defaults to False.
+        task (Literal[
+                OTXTaskType.ANOMALY_CLASSIFICATION, OTXTaskType.ANOMALY_DETECTION, OTXTaskType.ANOMALY_SEGMENTATION
+            ], optional): Task type of Anomaly Task. Defaults to OTXTaskType.ANOMALY_CLASSIFICATION.
+        input_size (tuple[int, int], optional):
+            Model input size in the order of height and width. Defaults to (448, 448).
+    """
+
+    def __init__(
+        self,
+        label_info: LabelInfoTypes = AnomalyLabelInfo(),
+        backbone: str = "resnet18",
+        flow_steps: int = 4,
+        affine_clamp: float = 2.0,
+        affine_subnet_channels_ratio: float = 1.0,
+        permute_soft: bool = False,
+        task: Literal[
+            OTXTaskType.ANOMALY,
+            OTXTaskType.ANOMALY_CLASSIFICATION,
+            OTXTaskType.ANOMALY_DETECTION,
+            OTXTaskType.ANOMALY_SEGMENTATION,
+        ] = OTXTaskType.ANOMALY_CLASSIFICATION,
+        input_size: tuple[int, int] = (448, 448),
+    ) -> None:
+        self.input_size = input_size
+        self.task = OTXTaskType(task)
+        super().__init__(
+            backbone=backbone,
+            flow_steps=flow_steps,
+            affine_clamp=affine_clamp,
+            affine_subnet_channels_ratio=affine_subnet_channels_ratio,
+            permute_soft=permute_soft,
+        )
diff --git a/src/otx/recipe/anomaly/uflow.yaml b/src/otx/recipe/anomaly/uflow.yaml
@@ -0,0 +1,44 @@
+model:
+  class_path: otx.algo.anomaly.uflow.Uflow
+  init_args:
+    backbone: "resnet18"
+    flow_steps: 4
+    affine_clamp: 2.0
+    affine_subnet_channels_ratio: 1.0
+    permute_soft: False
+    task: ANOMALY
+
+engine:
+  task: ANOMALY
+  device: auto
+
+callback_monitor: image_F1Score
+
+data: ../_base_/data/anomaly.yaml
+
+overrides:
+  precision: 32
+  max_epochs: 200
+  num_sanity_val_steps: 0
+  data:
+    input_size: [448, 448]
+    train_subset:
+      transforms:
+        - class_path: torchvision.transforms.v2.Resize
+          init_args:
+            size: [448, 448]
+            antialias: true
+    val_subset:
+      transforms:
+        - class_path: torchvision.transforms.v2.Resize
+          init_args:
+            size: [448, 448]
+            antialias: true
+      sampler:
+        class_path: torch.utils.data.RandomSampler
+    test_subset:
+      transforms:
+        - class_path: torchvision.transforms.v2.Resize
+          init_args:
+            size: [448, 448]
+            antialias: true
diff --git a/src/otx/tools/converter.py b/src/otx/tools/converter.py
@@ -1,4 +1,4 @@
-# Copyright (C) 2024 Intel Corporation
+# Copyright (C) 2024-2025 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0
 
 """Converter for v1 config."""
@@ -169,6 +169,10 @@
         "task": OTXTaskType.ANOMALY,
         "model_name": "stfpm",
     },
+    "ote_anomaly_uflow": {
+        "task": OTXTaskType.ANOMALY,
+        "model_name": "uflow",
+    },
     # ANOMALY CLASSIFICATION
     "ote_anomaly_classification_padim": {
         "task": OTXTaskType.ANOMALY_CLASSIFICATION,
@@ -413,7 +417,7 @@ def _remove_unused_key(config: dict) -> None:
             config (dict): The configuration dictionary.
         """
         config.pop("config")  # Remove config key that for CLI
-        config["data"].pop("__path__")  # Remove __path__ key that for CLI overriding
+        config["data"].pop("__path__", None)  # Remove __path__ key that for CLI overriding
 
     @staticmethod
     def instantiate(
diff --git a/src/otx/tools/templates/anomaly/classification/stfpm/template.yaml b/src/otx/tools/templates/anomaly/classification/stfpm/template.yaml
@@ -23,6 +23,3 @@ training_targets:
 # Computational Complexity
 gigaflops: 5.6
 size: 21.1
-
-# Model spec
-model_category: ACCURACY
diff --git a/src/otx/tools/templates/anomaly/classification/uflow/configuration.yaml b/src/otx/tools/templates/anomaly/classification/uflow/configuration.yaml
@@ -0,0 +1,182 @@
+dataset:
+  description: Dataset Parameters
+  header: Dataset Parameters
+  num_workers:
+    affects_outcome_of: NONE
+    default_value: 8
+    description:
+      Increasing this value might improve training speed however it might
+      cause out of memory errors. If the number of workers is set to zero, data loading
+      will happen in the main training thread.
+    editable: true
+    header: Number of workers
+    max_value: 36
+    min_value: 0
+    type: INTEGER
+    ui_rules:
+      action: DISABLE_EDITING
+      operator: AND
+      rules: []
+      type: UI_RULES
+    value: 8
+    visible_in_ui: true
+    warning: null
+  type: PARAMETER_GROUP
+  visible_in_ui: true
+description: Configuration for Uflow
+header: Configuration for Uflow
+id: ""
+learning_parameters:
+  enable_early_stopping:
+    affects_outcome_of: TRAINING
+    default_value: false
+    description: Early exit from training when validation accuracy isn't changed or decreased for several epochs.
+    editable: false
+    header: Enable early stopping of the training
+    type: BOOLEAN
+    ui_rules:
+      action: DISABLE_EDITING
+      operator: AND
+      rules: []
+      type: UI_RULES
+    visible_in_ui: false
+    warning: null
+  backbone:
+    affects_outcome_of: NONE
+    default_value: resnet18
+    description: Pre-trained backbone used for feature extraction
+    editable: false
+    enum_name: ModelBackbone
+    header: Model Backbone
+    options:
+      RESNET18: resnet18
+      WIDE_RESNET_50: wide_resnet50_2
+      MCAIT: mcait
+    type: SELECTABLE
+    ui_rules:
+      action: DISABLE_EDITING
+      operator: AND
+      rules: []
+      type: UI_RULES
+    value: resnet18
+    visible_in_ui: false
+    warning: null
+  description: Learning Parameters
+  header: Learning Parameters
+  train_batch_size:
+    affects_outcome_of: TRAINING
+    default_value: 32
+    description:
+      The number of training samples seen in each iteration of training.
+      Increasing this value improves training time and may make the training more
+      stable. A larger batch size has higher memory requirements.
+    editable: true
+    header: Batch size
+    max_value: 512
+    min_value: 1
+    type: INTEGER
+    ui_rules:
+      action: DISABLE_EDITING
+      operator: AND
+      rules: []
+      type: UI_RULES
+    value: 32
+    visible_in_ui: true
+    warning:
+      Increasing this value may cause the system to use more memory than available,
+      potentially causing out of memory errors, please update with caution.
+  type: PARAMETER_GROUP
+  visible_in_ui: true
+nncf_optimization:
+  description: Optimization by NNCF
+  enable_pruning:
+    affects_outcome_of: NONE
+    default_value: false
+    description: Enable filter pruning algorithm
+    editable: true
+    header: Enable filter pruning algorithm
+    type: BOOLEAN
+    ui_rules:
+      action: DISABLE_EDITING
+      operator: AND
+      rules: []
+      type: UI_RULES
+    value: false
+    visible_in_ui: true
+    warning: null
+  enable_quantization:
+    affects_outcome_of: NONE
+    default_value: true
+    description: Enable quantization algorithm
+    editable: true
+    header: Enable quantization algorithm
+    type: BOOLEAN
+    ui_rules:
+      action: DISABLE_EDITING
+      operator: AND
+      rules: []
+      type: UI_RULES
+    value: true
+    visible_in_ui: true
+    warning: null
+  header: Optimization by NNCF
+  pruning_supported:
+    affects_outcome_of: TRAINING
+    default_value: false
+    description: Whether filter pruning is supported
+    editable: false
+    header: Whether filter pruning is supported
+    type: BOOLEAN
+    ui_rules:
+      action: DISABLE_EDITING
+      operator: AND
+      rules: []
+      type: UI_RULES
+    value: false
+    visible_in_ui: false
+    warning: null
+  type: PARAMETER_GROUP
+  visible_in_ui: true
+pot_parameters:
+  description: POT Parameters
+  header: POT Parameters
+  preset:
+    affects_outcome_of: NONE
+    default_value: Performance
+    description: Quantization preset that defines quantization scheme
+    editable: true
+    enum_name: POTQuantizationPreset
+    header: Preset
+    options:
+      MIXED: Mixed
+      PERFORMANCE: Performance
+    type: SELECTABLE
+    ui_rules:
+      action: DISABLE_EDITING
+      operator: AND
+      rules: []
+      type: UI_RULES
+    value: Performance
+    visible_in_ui: true
+    warning: null
+  stat_subset_size:
+    affects_outcome_of: NONE
+    default_value: 300
+    description: Number of data samples used for post-training optimization
+    editable: true
+    header: Number of data samples
+    max_value: 1000
+    min_value: 1
+    type: INTEGER
+    ui_rules:
+      action: DISABLE_EDITING
+      operator: AND
+      rules: []
+      type: UI_RULES
+    value: 300
+    visible_in_ui: true
+    warning: null
+  type: PARAMETER_GROUP
+  visible_in_ui: false
+type: CONFIGURABLE_PARAMETERS
+visible_in_ui: true
diff --git a/src/otx/tools/templates/anomaly/classification/uflow/template.yaml b/src/otx/tools/templates/anomaly/classification/uflow/template.yaml
diff --git a/tests/perf/test_anomaly.py b/tests/perf/test_anomaly.py
diff --git a/tests/unit/algo/anomaly/test_openvino_model.py b/tests/unit/algo/anomaly/test_openvino_model.py
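
The documentation added in this change describes the U-Flow anomaly score as the negative log-likelihood of the feature embeddings under the learned distribution. The sketch below illustrates that idea under stated assumptions: a standard Gaussian base distribution, a flat list of latent values, and a hypothetical `log_det_jacobian` term supplied by the flow. It is not anomalib's implementation, and the function name is made up for illustration.

```python
import math


def gaussian_nll(z: list[float], log_det_jacobian: float = 0.0) -> float:
    """Negative log-likelihood of latent values under a standard Gaussian.

    By the change-of-variables formula, log p(x) = log p(z) + log|det J|,
    where z is the flow output for input features x. `log_det_jacobian`
    is a hypothetical stand-in for the value a real flow would provide.
    """
    d = len(z)
    log_pz = -0.5 * sum(v * v for v in z) - 0.5 * d * math.log(2.0 * math.pi)
    return -(log_pz + log_det_jacobian)


# Latents close to zero are likely under the Gaussian; large ones are not.
normal_like = [0.1, -0.2, 0.05]
anomalous_like = [3.0, -2.5, 4.0]

score_normal = gaussian_nll(normal_like)
score_anomalous = gaussian_nll(anomalous_like)
```

Here `score_anomalous` is larger than `score_normal`, which matches the description that anomalous samples are assigned low probability and therefore a high anomaly score.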
