✨Add model descriptions and documentation for instance segmentation, detection, and keypoint detection

eugene123tw · eugene123tw · commit ad2bff36cbf0 · 2024-10-31T17:29:08.000Z
diff --git a/docs/source/index.md b/docs/source/index.md
@@ -1,5 +1,32 @@
 # InferenceSDK Documentation
 
+## Model Description
+
+::::{grid} 1 2 2 3
+:margin: 1 1 0 0
+:gutter: 1
+
+:::{grid-item-card} Instance Segmentation
+:link: ./python/descriptions/instance_segmentation
+:link-type: doc
+[TODO]
+:::
+
+:::{grid-item-card} Detection
+:link: ./python/descriptions/detection_model
+:link-type: doc
+[TODO]
+:::
+
+:::{grid-item-card} Keypoint Detection
+:link: ./python/descriptions/keypoint_detection
+:link-type: doc
+[TODO]
+:::
+
+
+::::
+
 ## Python API Reference
 
 ::::{grid} 1 2 2 3
diff --git a/docs/source/python/descriptions/detection_model.md b/docs/source/python/descriptions/detection_model.md
@@ -0,0 +1 @@
+# Detection Model
diff --git a/docs/source/python/descriptions/index.md b/docs/source/python/descriptions/index.md
@@ -0,0 +1,55 @@
+# Model Descriptions
+
+::::{grid} 1 2 2 3
+:margin: 1 1 0 0
+:gutter: 1
+
+:::{grid-item-card} Detection Model
+:link: ./detection_model
+:link-type: doc
+[todo]
+:::
+
+:::{grid-item-card} Anomaly
+:link: ./anomaly
+:link-type: doc
+[todo]
+:::
+
+:::{grid-item-card} Keypoint Detection
+:link: ./keypoint_detection
+:link-type: doc
+[todo]
+:::
+
+:::{grid-item-card} Visual Prompting
+:link: ./visual_prompting
+:link-type: doc
+[todo]
+:::
+
+:::{grid-item-card} Classification
+:link: ./classification
+:link-type: doc
+[todo]
+:::
+
+:::{grid-item-card} Segmentation
+:link: ./segmentation
+:link-type: doc
+[todo]
+:::
+
+:::{grid-item-card} Instance Segmentation
+:link: ./instance_segmentation
+:link-type: doc
+[todo]
+:::
+
+:::{grid-item-card} Action Classification
+:link: ./action_classification
+:link-type: doc
+[todo]
+:::
+
+::::
diff --git a/docs/source/python/descriptions/instance_segmentation.md b/docs/source/python/descriptions/instance_segmentation.md
@@ -0,0 +1,45 @@
+# Instance Segmentation
+
+## Description
+
+An instance segmentation model detects and segments objects in an image. It extends object detection by assigning each detected object its own mask. The model outputs a list of segmented objects, each containing a mask, bounding box, score, and class label.
+
+## OpenVINO Model Specifications
+
+### Inputs
+
+A single input image of shape (H, W, 3) where H and W are the height and width of the image, respectively.
+
+### Outputs
+
+The model outputs a list of segmented objects (i.e., `list[SegmentedObject]`), wrapped in `InstanceSegmentationResult.segmentedObjects`, each containing the following attributes:
+
+- `mask` (numpy.ndarray) - A binary mask of the object.
+- `score` (float) - Confidence score of the object.
+- `id` (int) - Class label of the object.
+- `str_label` (str) - String label of the object.
+- `xmin` (int) - X-coordinate of the top-left corner of the bounding box.
+- `ymin` (int) - Y-coordinate of the top-left corner of the bounding box.
+- `xmax` (int) - X-coordinate of the bottom-right corner of the bounding box.
+- `ymax` (int) - Y-coordinate of the bottom-right corner of the bounding box.
+
+
+## Example
+
+```python
+import cv2
+from model_api.models import MaskRCNNModel
+
+# Load the model and an input image ("image.jpg" is a placeholder path)
+model = MaskRCNNModel.create_model("model.xml")
+image = cv2.imread("image.jpg")
+# Forward pass
+predictions = model(image)
+
+# Iterate over the segmented objects
+for pred_obj in predictions.segmentedObjects:
+    pred_mask = pred_obj.mask
+    pred_score = pred_obj.score
+    label_id = pred_obj.id
+    bbox = [pred_obj.xmin, pred_obj.ymin, pred_obj.xmax, pred_obj.ymax]
+```
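Note on the `instance_segmentation.md` example above: the loop only reads the attributes of each segmented object. A minimal sketch of one way to use them is shown here, blending the masks of confident objects into the image. It assumes each `mask` has the same height and width as the input image, which the description does not state. `FakeSegmentedObject`, `overlay_masks`, `score_threshold`, `color`, and `alpha` are hypothetical names introduced for illustration and are not part of `model_api`.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class FakeSegmentedObject:
    """Hypothetical stand-in exposing the attributes listed in the doc."""

    mask: np.ndarray
    score: float
    id: int
    str_label: str
    xmin: int
    ymin: int
    xmax: int
    ymax: int


def overlay_masks(image, objects, score_threshold=0.5, color=(0, 255, 0), alpha=0.5):
    """Blend masks of confident objects into a copy of an (H, W, 3) image.

    Assumes every mask is an (H, W) array aligned with the image.
    """
    out = image.astype(np.float32)
    tint = np.array(color, dtype=np.float32)
    for obj in objects:
        if obj.score < score_threshold:
            continue  # skip low-confidence objects
        region = obj.mask.astype(bool)
        out[region] = (1.0 - alpha) * out[region] + alpha * tint
    return out.astype(np.uint8)


# Toy data standing in for a real image and real predictions
image = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1
objects = [
    FakeSegmentedObject(mask, 0.9, 1, "cat", 1, 1, 3, 3),
    FakeSegmentedObject(np.ones((4, 4), dtype=np.uint8), 0.2, 2, "dog", 0, 0, 4, 4),
]
blended = overlay_masks(image, objects)
```

With real predictions, `predictions.segmentedObjects` would take the place of the toy `objects` list, provided the mask-size assumption holds.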
diff --git a/docs/source/python/descriptions/keypoint_detection.md b/docs/source/python/descriptions/keypoint_detection.md
@@ -0,0 +1,51 @@
+# Keypoint Detection
+
+## Description
+
+A keypoint detection model detects a set of predefined keypoints on a cropped object.
+If the crop is not tight enough, keypoint quality degrades. Combined with an
+object detector, this model can detect keypoints for all objects of interest present in an image (top-down approach).
+
+## Models
+
+The top-down keypoint detection pipeline uses detections from any appropriate detector,
+and a keypoint regression model that operates on the resulting crops.
+
+### Parameters
+
+The following parameters can be provided via the Python API or RT Info embedded in the OpenVINO model:
+
+- `labels` (`list(str)`): a list of keypoint names.
+
+## OpenVINO Model Specifications
+
+### Inputs
+
+A single `NCHW` tensor representing a batch of images.
+
+### Outputs
+
+Two vectors in Simple Coordinate Classification Perspective ([SimCC](https://arxiv.org/abs/2107.03332)) format:
+
+- `pred_x` (B, N, D1) - `x` coordinate representation, where `N` is the number of keypoints.
+- `pred_y` (B, N, D2) - `y` coordinate representation, where `N` is the number of keypoints.
+
+## Example
+
+```python
+import cv2
+import numpy as np
+from model_api.models import TopDownKeypointDetectionPipeline, Detection, KeypointDetectionModel
+
+model = KeypointDetectionModel.create_model("kp_model.xml")
+image = cv2.imread("image.jpg")  # placeholder path to an input image
+# a list of detections in (x_min, y_min, x_max, y_max, score, class_id) format
+detections = [Detection(0, 0, 100, 100, 1.0, 0)]
+top_down_pipeline = TopDownKeypointDetectionPipeline(model)
+predictions = top_down_pipeline.predict(image, detections)
+
+# iterating over a list of DetectedKeypoints. Each of the items corresponds to a detection
+for obj_keypoints in predictions:
+    for point in obj_keypoints.keypoints.astype(np.int32):
+        cv2.circle(image, point, radius=0, color=(0, 255, 0), thickness=5)
+```
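Note on the SimCC outputs described in `keypoint_detection.md`: `pred_x` and `pred_y` are per-keypoint score vectors over discretized coordinate bins, not coordinates. A minimal sketch of one common way to decode them is shown here, taking the arg-max bin along each axis and dividing by a split ratio. The function name `decode_simcc`, the `split_ratio` parameter and its default of 2.0, and the use of the smaller of the two peak values as a confidence are assumptions for illustration. The library's own post-processing may differ.

```python
import numpy as np


def decode_simcc(pred_x, pred_y, split_ratio=2.0):
    """Decode SimCC outputs into (x, y) coordinates and per-keypoint scores.

    pred_x: (B, N, D1) scores over x bins; pred_y: (B, N, D2) scores over y bins.
    split_ratio is an assumed number of bins per pixel of the model input.
    """
    x_idx = np.argmax(pred_x, axis=-1)  # (B, N) index of the best x bin
    y_idx = np.argmax(pred_y, axis=-1)  # (B, N) index of the best y bin
    keypoints = np.stack([x_idx, y_idx], axis=-1).astype(np.float32) / split_ratio
    # Assumed confidence: the weaker of the two axis peaks
    scores = np.minimum(pred_x.max(axis=-1), pred_y.max(axis=-1))
    return keypoints, scores


# Toy tensors: batch of 1, two keypoints, D1 = 8 and D2 = 6 bins
pred_x = np.zeros((1, 2, 8), dtype=np.float32)
pred_y = np.zeros((1, 2, 6), dtype=np.float32)
pred_x[0, 0, 4] = 0.9
pred_y[0, 0, 2] = 0.8
pred_x[0, 1, 6] = 0.7
pred_y[0, 1, 5] = 0.6

keypoints, scores = decode_simcc(pred_x, pred_y)
```

Coordinates decoded this way are relative to the crop passed to the model, so a further mapping back to the full image is needed. In the documented example that step is presumably handled inside `TopDownKeypointDetectionPipeline`.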