com.unity.perception/Documentation~/PerceptionCamera.md
79 additions & 0 deletions
@@ -77,6 +77,85 @@ _Example rendered object info for a single object_
77
77
78
78
The RenderedObjectInfoLabeler records a list of all objects visible in the Camera image, including each object's instance ID, resolved label ID, and visible pixels. If Unity cannot resolve an object to a label in the IdLabelConfig, it does not record that object.
79
79
80
+
### KeypointLabeler
81
+
82
+
The keypoint labeler captures keypoints of a labeled GameObject. The typical use of this labeler is capturing human pose
83
+
estimation data. The labeler uses a [keypoint template](#keypoint-template) which defines the keypoints to capture for the
84
+
model and the skeletal connections between those keypoints. The positions of the keypoints are recorded in pixel coordinates
85
+
and saved to the captures JSON file.
86
+
87
+
```
88
+
keypoints {
89
+
label_id: <int> -- Integer identifier of the label
90
+
instance_id: <str> -- UUID of the instance.
91
+
template_guid: <str> -- UUID of the keypoint template
92
+
pose: <str> -- Pose ground truth information
93
+
keypoints [ -- Array of keypoint data, one entry for each keypoint defined in associated template file.
94
+
{
95
+
index: <int> -- Index of keypoint in template
96
+
x: <float> -- X pixel coordinate of keypoint
97
+
y: <float> -- Y pixel coordinate of keypoint
98
+
state: <int>            -- 0: keypoint does not exist, 1: keypoint exists
99
+
}, ...
100
+
]
101
+
}
102
+
```
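
As a rough illustration of how a consumer might read this record, the sketch below filters out keypoints that do not exist. The field names follow the format above; the sample values and the `existing_keypoints` helper are hypothetical, and any non-zero `state` is treated as "exists".

```python
# Hypothetical keypoints record shaped like the format above; all values are made up.
record = {
    "label_id": 1,
    "instance_id": "instance-0",
    "template_guid": "template-0",
    "pose": "unset",
    "keypoints": [
        {"index": 0, "x": 120.5, "y": 64.0, "state": 1},
        {"index": 1, "x": 0.0, "y": 0.0, "state": 0},
        {"index": 2, "x": 98.25, "y": 130.0, "state": 1},
    ],
}


def existing_keypoints(record):
    """Return (index, x, y) for every keypoint whose state is non-zero."""
    return [
        (kp["index"], kp["x"], kp["y"])
        for kp in record["keypoints"]
        if kp["state"] != 0
    ]


points = existing_keypoints(record)
```

Keypoints with a `state` of 0 are skipped here, on the assumption that their pixel coordinates carry no meaning.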
103
+
104
+
#### Keypoint Template
105
+
106
+
Keypoint templates define the keypoints and skeletal connections captured by the KeypointLabeler. The keypoint
107
+
template takes advantage of Unity's humanoid animation rig, and allows the user to automatically associate template keypoints
108
+
with animation rig joints. Additionally, the user can choose to ignore the rigged points, or add points not defined in the rig.
109
+
A COCO keypoint template is included in the Perception package.
110
+
111
+
##### Editor
112
+
113
+
The keypoint template editor allows the user to create/modify a keypoint template. The editor consists of the header information,
114
+
the keypoint array, and the skeleton array.
115
+
116
+

117
+
<br/>_Header section of the keypoint template_
118
+
119
+
In the header section, a user can change the name of the template and supply textures that they would like to use for the keypoint
120
+
visualization.
121
+
122
+

123
+
<br/>_Keypoint section of the keypoint template_
124
+
125
+
The keypoint section allows the user to create/edit keypoints and associate them with Unity animation rig points. Each keypoint record
126
+
has four fields: Label (the name of the keypoint), Associate To Rig (a boolean value which, if true, automatically maps the keypoint to
127
+
the GameObject defined by the rig), Rig Label (only needed if Associate To Rig is true; defines which rig component to associate with
128
+
the keypoint), and Color (RGB color value of the keypoint in the visualization).
129
+
130
+

131
+
<br/>_Skeleton section of the keypoint template_
132
+
133
+
The skeleton section allows the user to create connections between joints, which together define the skeleton of a labeled object.
134
+
135
+
##### Format
136
+
```
137
+
annotation_definition.spec {
138
+
template_id: <str> -- The UUID of the template
139
+
template_name: <str> -- Human readable name of the template
140
+
key_points [ -- Array of joints defined in this template
141
+
{
142
+
label: <str> -- The label of the joint
143
+
index: <int> -- The index of the joint
144
+
}, ...
145
+
]
146
+
skeleton [ -- Array of skeletal connections (which joints have connections between one another) defined in this template
147
+
{
148
+
joint1: <int> -- The first joint of the connection
149
+
joint2: <int> -- The second joint of the connection
150
+
}, ...
151
+
]
152
+
}
153
+
```
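
A minimal sketch of how the template spec might be consumed: it resolves each skeleton connection to a pair of joint labels. It assumes that `joint1` and `joint2` hold `index` values from `key_points`; the template contents and the `skeleton_as_labels` helper are hypothetical.

```python
# Hypothetical template shaped like the spec above; names and values are made up.
template = {
    "template_id": "template-0",
    "template_name": "ExampleTemplate",
    "key_points": [
        {"label": "nose", "index": 0},
        {"label": "neck", "index": 1},
        {"label": "left_shoulder", "index": 2},
    ],
    "skeleton": [
        {"joint1": 0, "joint2": 1},
        {"joint1": 1, "joint2": 2},
    ],
}


def skeleton_as_labels(template):
    """Translate each skeleton connection into a (label, label) pair."""
    # Assumption: joint1/joint2 refer to the `index` field of key_points.
    label_by_index = {kp["index"]: kp["label"] for kp in template["key_points"]}
    return [
        (label_by_index[bone["joint1"]], label_by_index[bone["joint2"]])
        for bone in template["skeleton"]
    ]


bones = skeleton_as_labels(template)
```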
154
+
155
+
#### Animation Pose Label
156
+
157
+
This file maps timestamps in an animation to pose labels.
158
+
80
159
## Limitations
81
160
82
161
Ground truth is not compatible with all rendering features, especially those that modify the visibility or shape of objects in the frame.
@@ -172,21 +172,21 @@ A grayscale PNG file that stores integer values (label pixel_value in [annotatio
172
172
173
173
#### capture.annotation.values
174
174
175
-
<!-- Not yet implemented annotations
176
-
##### instance segmentation - polygon
175
+
##### instance segmentation - color image
177
176
178
-
A json object that stores collections of polygons. Each polygon record maps a tuple of (instance, label) to a list of
179
-
K pixel coordinates that forms a polygon. This object can be directly stored in annotation.values
177
+
A color PNG file that stores instance IDs as a color value per pixel. The PNG files are located in the "filename" location.
180
178
181
179
```
182
-
semantic_segmentation_polygon {
183
-
label_id: <int> -- Integer identifier of the label
184
-
label_name: <str> -- String identifier of the label
185
-
instance_id: <str> -- UUID of the instance.
186
-
polygon: [<int, int>,...] -- List of points in pixel coordinates of the outer edge. Connecting these points in order should create a polygon that identifies the object.
180
+
instance_segmentation {
181
+
instance_id: <int> -- The instance ID of the labeled object
182
+
color { -- The pixel color that correlates with the instance ID
183
+
r: <int> -- The red value of the pixel between 0 and 255
184
+
g: <int> -- The green value of the pixel between 0 and 255
185
+
b: <int> -- The blue value of the pixel between 0 and 255
186
+
a: <int> -- The alpha value of the pixel between 0 and 255
187
+
}
187
188
}
188
189
```
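
One way to use these entries is to build a lookup from pixel color to instance ID. The sketch below assumes the entries have already been parsed into dictionaries; the sample values and the `build_color_lookup` helper are hypothetical, and reading the PNG itself is left out.

```python
# Hypothetical instance_segmentation entries; the colors and IDs are made up.
entries = [
    {"instance_id": 1, "color": {"r": 255, "g": 0, "b": 0, "a": 255}},
    {"instance_id": 2, "color": {"r": 0, "g": 255, "b": 0, "a": 255}},
]


def build_color_lookup(entries):
    """Map each (r, g, b, a) tuple to the instance ID it encodes."""
    lookup = {}
    for entry in entries:
        color = entry["color"]
        key = (color["r"], color["g"], color["b"], color["a"])
        lookup[key] = entry["instance_id"]
    return lookup


lookup = build_color_lookup(entries)
pixel = (0, 255, 0, 255)  # an RGBA value sampled from the PNG
instance = lookup.get(pixel)  # None when the color matches no instance
```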
189
-
-->
190
190
191
191
##### 2D bounding box
192
192
@@ -196,36 +196,77 @@ We follow the OpenCV 2D coordinate [system](https://github.com/vvvv/VL.OpenCV/wi
196
196
197
197
```
198
198
bounding_box_2d {
199
-
label_id: <int> -- Integer identifier of the label
200
-
label_name: <str> -- String identifier of the label
201
-
instance_id: <str> -- UUID of the instance.
199
+
label_id: <int> -- Integer identifier of the label
200
+
label_name: <str> -- String identifier of the label
201
+
instance_id: <str> -- UUID of the instance.
202
202
x: <float> -- x coordinate of the upper left corner.
203
203
y: <float> -- y coordinate of the upper left corner.
204
204
width: <float> -- number of pixels in the x direction
205
205
height: <float> -- number of pixels in the y direction
206
206
}
207
207
```
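
Because the record stores the upper left corner plus a width and height, many tools need it converted to corner form first. A minimal sketch, with a hypothetical `box_corners` helper and made-up sample values:

```python
def box_corners(box):
    """Return (x_min, y_min, x_max, y_max) for a bounding_box_2d record."""
    x_min, y_min = box["x"], box["y"]
    # x grows to the right and y grows downward from the upper left corner.
    return x_min, y_min, x_min + box["width"], y_min + box["height"]


# Hypothetical record; field names follow the format above.
box = {
    "label_id": 1,
    "label_name": "crate",
    "instance_id": "instance-0",
    "x": 10.0,
    "y": 20.0,
    "width": 30.0,
    "height": 40.0,
}
corners = box_corners(box)
```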
208
-
<!-- Not yet implemented annotations
209
208
210
209
##### 3D bounding box
211
210
212
-
A json file that stored collections of 3D bounding boxes.
213
-
Each bounding box record maps a tuple of (instance, label) to translation, size and rotation that draws a 3D bounding box, as well as velocity and acceleration (optional) of the 3D bounding box.
214
-
All location data is given with respect to the **sensor coordinate system**.
211
+
3D bounding box information. Unlike the 2D bounding box, 3D bounding box coordinates are captured in the **sensor coordinate system**.
212
+
Each bounding box record maps a tuple of (instance, label) to translation, size and rotation that draws a 3D bounding box, as well as velocity and acceleration (optional) of the 3D bounding box.
215
213
216
214
```
217
215
bounding_box_3d {
218
-
label_id: <int> -- Integer identifier of the label
219
-
label_name: <str> -- String identifier of the label
220
-
instance_id: <str> -- UUID of the instance.
221
-
translation: <float, float, float> -- 3d bounding box's center location in meters as center_x, center_y, center_z with respect to global coordinate system.
222
-
size: <float, float, float> -- 3d bounding box size in meters as width, length, height.
223
-
rotation: <float, float, float, float> -- 3d bounding box orientation as quaternion: w, x, y, z.
224
-
velocity: <float, float, float> -- 3d bounding box velocity in meters per second as v_x, v_y, v_z.
225
-
acceleration: <float, float, float> [optional] -- 3d bounding box acceleration in meters per second^2 as a_x, a_y, a_z.
216
+
label_id: <int> -- Integer identifier of the label
217
+
label_name: <str> -- String identifier of the label
218
+
instance_id: <str> -- UUID of the instance.
219
+
translation { -- 3d bounding box's center location in meters with respect to global coordinate system.
220
+
x: <float> -- The x coordinate
221
+
y: <float> -- The y coordinate
222
+
z: <float> -- The z coordinate
223
+
}
224
+
size { -- 3d bounding box size in meters
225
+
x: <float>                                  -- The size along the x axis
225
+
y: <float>                                  -- The size along the y axis
226
+
z: <float>                                  -- The size along the z axis
228
+
}
229
+
rotation {                                    -- 3d bounding box orientation as quaternion: x, y, z, w.
230
+
x: <float>                                  -- The x component
231
+
y: <float>                                  -- The y component
232
+
z: <float>                                  -- The z component
233
+
w: <float>                                  -- The w component
234
+
}
235
+
velocity { -- [Optional] 3d bounding box velocity in meters per second.
236
+
x: <float> -- The x coordinate
237
+
y: <float> -- The y coordinate
238
+
z: <float> -- The z coordinate
239
+
}
240
+
acceleration { -- [Optional] 3d bounding box acceleration in meters per second^2.
241
+
x: <float> -- The x coordinate
242
+
y: <float> -- The y coordinate
243
+
z: <float> -- The z coordinate
244
+
}
245
+
}
246
+
```
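
The translation, size, and rotation fields are enough to recover the eight corners of the box. The sketch below is one way to do that, under stated assumptions: `translation` is the box center, `size` holds full extents along the box's local axes, and `rotation` is a unit quaternion. The `rotate` and `box_corners_3d` helpers and the sample values are hypothetical.

```python
import itertools


def rotate(q, v):
    """Rotate vector v by the unit quaternion q (a dict with x, y, z, w)."""
    x, y, z, w = q["x"], q["y"], q["z"], q["w"]
    vx, vy, vz = v
    # t = 2 * cross(u, v), where u is the vector part of the quaternion.
    tx = 2 * (y * vz - z * vy)
    ty = 2 * (z * vx - x * vz)
    tz = 2 * (x * vy - y * vx)
    # Rotated vector: v + w * t + cross(u, t).
    return (
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    )


def box_corners_3d(box):
    """Return the eight corners of a bounding_box_3d record."""
    t, s = box["translation"], box["size"]
    half = (s["x"] / 2, s["y"] / 2, s["z"] / 2)
    corners = []
    for sx, sy, sz in itertools.product((-1, 1), repeat=3):
        local = (sx * half[0], sy * half[1], sz * half[2])
        rx, ry, rz = rotate(box["rotation"], local)
        corners.append((t["x"] + rx, t["y"] + ry, t["z"] + rz))
    return corners


# Hypothetical record with an identity rotation; values are made up.
box = {
    "translation": {"x": 1.0, "y": 2.0, "z": 3.0},
    "size": {"x": 2.0, "y": 4.0, "z": 6.0},
    "rotation": {"x": 0.0, "y": 0.0, "z": 0.0, "w": 1.0},
}
corners = box_corners_3d(box)
```

With the identity rotation used here, the corners are simply the center plus or minus half the size along each axis.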
247
+
##### Keypoints
248
+
249
+
Keypoint data, commonly used for human pose estimation. A keypoint capture is associated with a template that defines the keypoints (see the annotation.definition file).
250
+
Each keypoint record maps a tuple of (instance, label) to template, pose, and an array of keypoints. A keypoint will exist in this record for each keypoint defined in the template file.
251
+
If a given keypoint doesn't exist in the labeled GameObject, then that keypoint will have a state value of 0; if it does exist, it will have a state value of 2.
252
+
```
253
+
keypoints {
254
+
label_id: <int> -- Integer identifier of the label
255
+
instance_id: <str> -- UUID of the instance.
256
+
template_guid: <str> -- UUID of the keypoint template
257
+
pose: <str> -- Pose ground truth information
258
+
keypoints [ -- Array of keypoint data, one entry for each keypoint defined in associated template file.
259
+
{
260
+
index: <int> -- Index of keypoint in template
261
+
x: <float> -- X pixel coordinate of keypoint
262
+
y: <float> -- Y pixel coordinate of keypoint
263
+
state: <int>            -- 0: keypoint does not exist, 2: keypoint exists
264
+
}, ...
265
+
]
226
266
}
227
267
```
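
Since a record should contain one entry per keypoint defined in the template, a consumer can sanity-check a capture against its template. A minimal sketch, with a hypothetical `missing_template_indices` helper and made-up data:

```python
def missing_template_indices(template, record):
    """List template keypoint indices that have no entry in the record."""
    expected = {kp["index"] for kp in template["key_points"]}
    present = {kp["index"] for kp in record["keypoints"]}
    return sorted(expected - present)


# Hypothetical template and keypoints record; only the fields used are shown.
template = {
    "key_points": [
        {"label": "nose", "index": 0},
        {"label": "neck", "index": 1},
        {"label": "left_shoulder", "index": 2},
    ]
}
record = {
    "keypoints": [
        {"index": 0, "x": 120.5, "y": 64.0, "state": 2},
        {"index": 1, "x": 0.0, "y": 0.0, "state": 0},
    ]
}
missing = missing_template_indices(template, record)
```

An empty result means the record covers every keypoint in the template, including those with a state of 0.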
228
268
269
+
<!-- Not yet implemented annotations
229
270
230
271
#### instances (V2, WIP)
231
272
@@ -303,27 +344,52 @@ Each record describes a particular type of annotation and contains an annotation
303
344
Typically, the `spec` key describes all label_id and label_name values used by the annotation.
304
345
Some special cases like semantic segmentation might assign additional values (e.g. pixel value) to record the mapping between label_id/label_name and pixel color in the annotated PNG files.
305
346
347
+
##### annotation definition header
306
348
```
307
349
annotation_definition {
308
-
id: <int> -- Integer identifier of the annotation definition.
309
-
name: <str> -- Human readable annotation spec name (e.g. sementic_segmentation, instance_segmentation, etc.)
310
-
description: <str, optional> -- Description of this annotation specifications.
311
-
format: <str> -- The format of the annotation files. (e.g. png, json, etc.)
312
-
spec: [<obj>...] -- Format-specific specification for the annotation values (ex. label-value mappings for semantic segmentation images)
350
+
id: <int> -- Integer identifier of the annotation definition.
351
+
name: <str>                 -- Human readable annotation spec name (e.g. semantic_segmentation, instance_segmentation, etc.)
352
+
description: <str>          -- [Optional] Description of this annotation specification.
353
+
format: <str> -- The format of the annotation files. (e.g. png, json, etc.)
354
+
spec: [<obj>...] -- Format-specific specification for the annotation values (ex. label-value mappings for semantic segmentation images)
313
355
}
314
-
315
-
# semantic segmentation
356
+
```
357
+
##### semantic segmentation
358
+
Annotation spec for the [semantic segmentation labeler](#semantic-segmentation---grayscale-image).
359
+
```
316
360
annotation_definition.spec {
317
-
label_id: <int> -- Integer identifier of the label
318
-
label_name: <str> -- String identifier of the label
319
-
pixel_value: <int> -- Grayscale pixel value
320
-
color_pixel_value: <int, int, int> [optional] -- Color pixel value
361
+
label_id: <int> -- Integer identifier of the label
362
+
label_name: <str> -- String identifier of the label
363
+
pixel_value: <int> -- Grayscale pixel value
364
+
color_pixel_value: <int, int, int> -- [Optional] Color pixel value
321
365
}
322
-
323
-
# label enumeration spec, used for annotations like bounding box 2d. This might be a subset of all labels used in simulation.
366
+
```
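
To decode a grayscale annotation image, the spec entries can be turned into a lookup from pixel value to label name. The sketch below assumes the spec entries are already parsed; the sample labels and the `label_for_pixel` helper are hypothetical.

```python
# Hypothetical semantic segmentation spec entries; labels and values are made up.
specs = [
    {"label_id": 1, "label_name": "car", "pixel_value": 100},
    {"label_id": 2, "label_name": "pedestrian", "pixel_value": 200},
]


def label_for_pixel(specs, pixel_value):
    """Return the label name for a grayscale pixel value, or None if unmapped."""
    name_by_value = {spec["pixel_value"]: spec["label_name"] for spec in specs}
    return name_by_value.get(pixel_value)


label = label_for_pixel(specs, 200)
```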
367
+
##### label enumeration spec
368
+
This spec is used for annotations like [bounding box 2d](#2d-bounding-box). The enumeration might be a subset of all labels used in the simulation.
369
+
```
324
370
annotation_definition.spec {
325
-
label_id: <int> -- Integer identifier of the label
326
-
label_name: <str> -- String identifier of the label
371
+
label_id: <int> -- Integer identifier of the label
372
+
label_name: <str> -- String identifier of the label
373
+
}
374
+
```
375
+
##### keypoint template
376
+
Keypoint templates define the keypoints and skeletal connections captured by the [keypoint labeler](#keypoints).
377
+
```
378
+
annotation_definition.spec {
379
+
template_id: <str> -- The UUID of the template
380
+
template_name: <str> -- Human readable name of the template
381
+
key_points [ -- Array of joints defined in this template
382
+
{
383
+
label: <str> -- The label of the joint
384
+
index: <int> -- The index of the joint
385
+
}, ...
386
+
]
387
+
skeleton [ -- Array of skeletal connections (which joints have connections between one another) defined in this template
388
+
{
389
+
joint1: <int> -- The first joint of the connection
390
+
joint2: <int> -- The second joint of the connection