41 changes: 41 additions & 0 deletions docs/source/python/models/detection_model.md
# Detection Model

## Description

The detection model aims to detect objects in an image. The model outputs a list of detected objects, each containing a bounding box, a confidence score, and a class label.

## OpenVINO Model Specifications

### Inputs

A single input image of shape (H, W, 3), where H and W are the height and width of the image, respectively.

### Outputs

The detection model outputs a list of detection objects (i.e. `list[Detection]`) wrapped in `DetectionResult`, each containing the following attributes:

- `score` (float) - Confidence score of the object.
- `id` (int) - Class label of the object.
- `str_label` (str) - String label of the object.
- `xmin` (int) - X-coordinate of the top-left corner of the bounding box.
- `ymin` (int) - Y-coordinate of the top-left corner of the bounding box.
- `xmax` (int) - X-coordinate of the bottom-right corner of the bounding box.
- `ymax` (int) - Y-coordinate of the bottom-right corner of the bounding box.

## Example

```python
import cv2
from model_api.models import SSD

# Load the model
model = SSD.create_model("model.xml")

# Read an input image in (H, W, 3) BGR layout ("image.jpg" is a placeholder path)
image = cv2.imread("image.jpg")

# Forward pass
predictions = model(image)

# Iterate over the detected objects
for pred_obj in predictions.objects:
    pred_score = pred_obj.score
    label_id = pred_obj.id
    bbox = [pred_obj.xmin, pred_obj.ymin, pred_obj.xmax, pred_obj.ymax]
```
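Continuing the example above, the sketch below draws the detections back onto the input image; it is only an illustration of consuming the `DetectionResult` attributes, and the 0.5 score threshold and output file name are arbitrary choices, not part of the API.

```python
# Draw detections whose confidence exceeds an arbitrary threshold (illustration only)
for pred_obj in predictions.objects:
    if pred_obj.score < 0.5:
        continue
    cv2.rectangle(image, (pred_obj.xmin, pred_obj.ymin), (pred_obj.xmax, pred_obj.ymax), (0, 255, 0), 2)
    cv2.putText(
        image,
        f"{pred_obj.str_label}: {pred_obj.score:.2f}",
        (pred_obj.xmin, max(pred_obj.ymin - 5, 0)),
        cv2.FONT_HERSHEY_SIMPLEX,
        0.5,
        (0, 255, 0),
        1,
    )

cv2.imwrite("detections.png", image)
```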

```{eval-rst}
.. automodule:: model_api.models.detection_model
   :members:
```
43 changes: 43 additions & 0 deletions docs/source/python/models/instance_segmentation.md
# Instance Segmentation

## Description

The instance segmentation model aims to detect and segment objects in an image. It is an extension of object detection in which each detected object is additionally represented by a separate mask. The model outputs a list of segmented objects, each containing a mask, a bounding box, a confidence score, and a class label.

## OpenVINO Model Specifications

### Inputs

A single input image of shape (H, W, 3), where H and W are the height and width of the image, respectively.

### Outputs

The instance segmentation model outputs a list of segmented objects (i.e. `list[SegmentedObject]`) wrapped in `InstanceSegmentationResult.segmentedObjects`, each containing the following attributes:

- `mask` (numpy.ndarray) - A binary mask of the object.
- `score` (float) - Confidence score of the object.
- `id` (int) - Class label of the object.
- `str_label` (str) - String label of the object.
- `xmin` (int) - X-coordinate of the top-left corner of the bounding box.
- `ymin` (int) - Y-coordinate of the top-left corner of the bounding box.
- `xmax` (int) - X-coordinate of the bottom-right corner of the bounding box.
- `ymax` (int) - Y-coordinate of the bottom-right corner of the bounding box.

## Example

```python
import cv2
from model_api.models import MaskRCNNModel

# Load the model
model = MaskRCNNModel.create_model("model.xml")

# Read an input image in (H, W, 3) BGR layout ("image.jpg" is a placeholder path)
image = cv2.imread("image.jpg")

# Forward pass
predictions = model(image)

# Iterate over the segmented objects
for pred_obj in predictions.segmentedObjects:
    pred_mask = pred_obj.mask
    pred_score = pred_obj.score
    label_id = pred_obj.id
    bbox = [pred_obj.xmin, pred_obj.ymin, pred_obj.xmax, pred_obj.ymax]
```
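Continuing the example above, a minimal sketch of overlaying the predicted masks on the input image; the 0.5 score threshold and the green overlay colour are arbitrary illustration choices, and the sketch assumes each `mask` is a binary array with the same height and width as the input image.

```python
overlay = image.copy()
for pred_obj in predictions.segmentedObjects:
    if pred_obj.score < 0.5:
        continue
    # Paint the pixels covered by this object's binary mask
    overlay[pred_obj.mask > 0] = (0, 255, 0)

# Blend the overlay with the original image and save the visualization
blended = cv2.addWeighted(image, 0.6, overlay, 0.4, 0.0)
cv2.imwrite("instances.png", blended)
```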

```{eval-rst}
.. automodule:: model_api.models.instance_segmentation
   :members:
```
50 changes: 50 additions & 0 deletions docs/source/python/models/keypoint_detection.md
# Keypoint Detection

## Description

The keypoint detection model aims to detect a set of pre-defined keypoints on a cropped object.
If the crop is not tight enough, the quality of the predicted keypoints degrades. Combined with an
object detector, this model can be used to detect keypoints for all objects of interest present in an image (the top-down approach).

## Models

The top-down keypoint detection pipeline combines detections coming from any suitable detector
with a keypoint regression model applied to the corresponding crops.

### Parameters

The following parameters can be provided via the Python API or the RT Info embedded into the OpenVINO model (a usage sketch follows the list):

- `labels` (`list[str]`): a list of keypoint names.
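
A minimal sketch of overriding this parameter from Python; it assumes `create_model` forwards a `configuration` dictionary of wrapper parameters, as other Model API wrappers do, and the keypoint names below are purely hypothetical.

```python
from model_api.models import KeypointDetectionModel

# Hypothetical keypoint names; values passed via `configuration` are meant to
# override the defaults read from the RT Info embedded in the OpenVINO model.
model = KeypointDetectionModel.create_model(
    "kp_model.xml",
    configuration={"labels": ["nose", "left_eye", "right_eye"]},
)
```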

## OpenVINO Model Specifications

### Inputs

A single `NCHW` tensor representing a batch of images.

### Outputs

Two vectors in Simple Coordinate Classification Perspective ([SimCC](https://arxiv.org/abs/2107.03332)) format:

- `pred_x` (B, N, D1) - the discretized `x`-coordinate representation, where `B` is the batch size, `N` is the number of keypoints, and `D1` is the number of `x` bins.
- `pred_y` (B, N, D2) - the discretized `y`-coordinate representation, where `D2` is the number of `y` bins.
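
The wrapper decodes these vectors internally, but a minimal sketch of a typical SimCC decoding step may help clarify the format; the `split_ratio` value below is an assumed bin-upsampling factor, not a documented constant of this model.

```python
import numpy as np

def decode_simcc(pred_x: np.ndarray, pred_y: np.ndarray, split_ratio: float = 2.0):
    """Turn SimCC vectors of shape (B, N, D1) and (B, N, D2) into (B, N, 2) keypoints."""
    x_bins = np.argmax(pred_x, axis=-1)  # best x bin per keypoint, shape (B, N)
    y_bins = np.argmax(pred_y, axis=-1)  # best y bin per keypoint, shape (B, N)
    # Confidence per keypoint: the weaker of the two per-axis maxima
    scores = np.minimum(np.max(pred_x, axis=-1), np.max(pred_y, axis=-1))
    keypoints = np.stack([x_bins, y_bins], axis=-1).astype(np.float32) / split_ratio
    return keypoints, scores
```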

## Example

```python
import cv2
import numpy as np
from model_api.models import TopDownKeypointDetectionPipeline, Detection, KeypointDetectionModel

model = KeypointDetectionModel.create_model("kp_model.xml")

# Read an input image in (H, W, 3) BGR layout ("image.jpg" is a placeholder path)
image = cv2.imread("image.jpg")

# A list of detections in (x_min, y_min, x_max, y_max, score, class_id) format
detections = [Detection(0, 0, 100, 100, 1.0, 0)]
top_down_pipeline = TopDownKeypointDetectionPipeline(model)
predictions = top_down_pipeline.predict(image, detections)

# Iterate over a list of DetectedKeypoints; each item corresponds to one detection
for obj_keypoints in predictions:
    for point in obj_keypoints.keypoints.astype(np.int32):
        cv2.circle(image, point, radius=0, color=(0, 255, 0), thickness=5)
```

```{eval-rst}
.. automodule:: model_api.models.keypoint_detection
   :members:
```