
Commit 538c7bb

Merge pull request #23 from NVIDIA-ISAAC-ROS/hotfix-release-dp3.1-1
Add DOPE Custom Model tutorial
2 parents 79edf43 + c392dbb commit 538c7bb

File tree

5 files changed: +63 additions, −12 deletions


README.md

Lines changed: 1 addition & 0 deletions

```diff
@@ -246,6 +246,7 @@ To continue your exploration, check out the following suggested examples:
 - [`DOPE` with `Triton`](docs/dope-triton.md)
 - [`Centerpose` with `Triton`](docs/centerpose.md)
 - [`DOPE` with non-standard input image sizes](docs/dope-custom-size.md)
+- [Train your own `DOPE` model](docs/dope-custom-model.md)
 
 ### Use Different Models
```
docs/dope-custom-model.md

Lines changed: 33 additions & 0 deletions (new file)

# Training your own DOPE model

## Overview

The DOPE network architecture is intended to be trained on objects of a specific class, which means that using DOPE for pose estimation of a custom object class requires training a custom model for that class.

[NVIDIA Isaac Sim](https://developer.nvidia.com/isaac-sim) offers a convenient workflow for training a custom DOPE model using synthetic data generation (SDG).

## Tutorial Walkthrough

1. Clone the [Isaac Sim DOPE Training repository](https://github.com/andrewyguo/dope_training#deep-object-pose-estimation-dope---training) and follow the training instructions to prepare a custom DOPE model.
2. Using the [Isaac Sim DOPE inference script](https://github.com/andrewyguo/dope_training/tree/master/inference), test the custom DOPE model's inference capability and ensure that the quality is acceptable for your use case.
3. Follow steps 1-5 of the main DOPE [quickstart](../README.md#quickstart).
4. At step 6, move the prepared `.pth` model output from the Isaac Sim DOPE training script into the `/tmp/models` path inside the Docker container:

    ```bash
    docker cp custom_model.pth isaac_ros_dev-x86_64-container:/tmp/models
    ```

5. At step 7, run the `dope_converter.py` script with the custom model:

    ```bash
    python3 /workspaces/isaac_ros-dev/src/isaac_ros_pose_estimation/isaac_ros_dope/scripts/dope_converter.py --format onnx --input /tmp/models/custom_model.pth
    ```

6. Proceed through steps 8-9.
7. At step 10, launch the ROS 2 launch file with the custom model:

    ```bash
    ros2 launch isaac_ros_dope isaac_ros_dope_tensor_rt.launch.py model_file_path:=/tmp/models/custom_model.onnx engine_file_path:=/tmp/models/custom_model.plan
    ```

8. Continue with the rest of the quickstart. You should now be able to detect poses of custom objects.

isaac_ros_dope/config/dope_config.yaml

Lines changed: 7 additions & 2 deletions

```diff
@@ -60,5 +60,10 @@ dope:
         "PeasAndCarrots" : [ 5.8512001037597656, 7.0636000633239746, 6.5918002128601074 ]
       }
 
-    # 9 element camera matrix (assuming 640x480 image)
-    camera_matrix: [463.51, 0.0, 321.652, 0.0, 616.44, 232.260, 0.0, 0.0, 1.0]
+    # 9 element camera matrix (using default from Ketchup demo)
+    # Taken from: https://github.com/andrewyguo/dope_training/blob/master/inference/config/camera_info.yaml
+    camera_matrix: [
+      364.16501736, 0.0, 121.36296296,
+      0.0, 364.16501736, 121.36296296,
+      0.0, 0.0, 1.0
+    ]
```
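As a sanity check on the new intrinsics, recall that a 3x3 camera matrix maps a 3D point in the camera frame to pixel coordinates via a perspective divide. A minimal NumPy sketch using the values from the config above (the `project` helper is illustrative, not part of the package):

```python
import numpy as np

# Camera intrinsics from the updated dope_config.yaml:
# fx, fy on the diagonal; cx, cy in the last column.
K = np.array([
    [364.16501736, 0.0, 121.36296296],
    [0.0, 364.16501736, 121.36296296],
    [0.0, 0.0, 1.0],
])

def project(point_cam):
    """Project a 3D point (camera frame, meters) to pixel coordinates."""
    uvw = K @ np.asarray(point_cam, dtype=float)
    return uvw[:2] / uvw[2]  # perspective divide

# A point on the optical axis projects to the principal point (cx, cy).
print(project([0.0, 0.0, 1.0]))  # → [121.36296296 121.36296296]
```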

isaac_ros_dope/gxf/dope/dope_decoder.cpp

Lines changed: 10 additions & 10 deletions

```diff
@@ -182,24 +182,24 @@ FindObjects(const std::array<cv::Mat, kInputMapsChannels> &maps) {
   for (size_t pp = 0; pp < peaks.size(); ++pp) {
     const auto peak = peaks[pp];
 
-    // Compute the weighted average for localizing the peak, using a 5x5
+    // Compute the weighted average for localizing the peak, using an 11x11
     // window
     Vector2f peak_sum(0, 0);
     float weight_sum = 0.0f;
-    for (int ii = -2; ii <= 2; ++ii) {
-      for (int jj = -2; jj <= 2; ++jj) {
-        const int row = peak[0] + ii;
-        const int col = peak[1] + jj;
+    const int WINDOW_SIZE = 11;
+    for (int ii = -(WINDOW_SIZE - 1) / 2; ii <= (WINDOW_SIZE - 1) / 2; ++ii) {
+      for (int jj = -(WINDOW_SIZE - 1) / 2; jj <= (WINDOW_SIZE - 1) / 2; ++jj) {
+        const int col = peak[0] + ii;
+        const int row = peak[1] + jj;
 
-        if (col < 0 || col >= image.size[1] || row < 0 ||
-            row >= image.size[0]) {
+        if (col < 0 || col >= image.cols || row < 0 || row >= image.rows) {
           continue;
         }
 
         const float weight = image.at<float>(row, col);
         weight_sum += weight;
-        peak_sum[0] += row * weight;
-        peak_sum[1] += col * weight;
+        peak_sum[0] += col * weight;
+        peak_sum[1] += row * weight;
       }
     }
```
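The sub-pixel peak refinement this hunk widens can be sketched in NumPy. This is a simplified stand-in for the C++ decoder, not its actual code; the `refine_peak` name, signature, and (row, col) convention are illustrative:

```python
import numpy as np

def refine_peak(belief_map, peak, window_size=11):
    """Refine an integer peak location to sub-pixel precision by taking
    the weight-averaged position over a window centered on the peak.

    belief_map : 2D float array (the network's belief map for one vertex)
    peak       : (row, col) integer coordinates of the detected maximum
    """
    half = (window_size - 1) // 2
    rows, cols = belief_map.shape
    peak_sum = np.zeros(2)
    weight_sum = 0.0
    for di in range(-half, half + 1):
        for dj in range(-half, half + 1):
            row, col = peak[0] + di, peak[1] + dj
            if not (0 <= row < rows and 0 <= col < cols):
                continue  # skip window cells that fall outside the image
            w = belief_map[row, col]
            peak_sum += w * np.array([row, col])
            weight_sum += w
    if weight_sum == 0.0:
        return np.asarray(peak, dtype=float)
    return peak_sum / weight_sum
```

A wider window averages over more of the belief blob, which makes the refined location less sensitive to per-pixel noise around the maximum.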

```diff
@@ -322,7 +322,7 @@ ExtractPose(const DopeObjectKeypoints &object,
   cv::Mat cv_keypoints_2d;
   cv::eigen2cv(object.second, cv_keypoints_2d);
   if (!cv::solvePnP(cv_keypoints_3d.t(), cv_keypoints_2d.t(), camera_matrix,
-                    dist_coeffs, rvec, tvec)) {
+                    dist_coeffs, rvec, tvec, false, cv::SOLVEPNP_EPNP)) {
     GXF_LOG_ERROR("cv::solvePnP failed");
     return nvidia::gxf::Unexpected{GXF_FAILURE};
   }
```

isaac_ros_dope/launch/isaac_ros_dope_tensor_rt.launch.py

Lines changed: 12 additions & 0 deletions

```diff
@@ -38,6 +38,14 @@ def generate_launch_description():
             'network_image_height',
             default_value='480',
             description='The input image height that the network expects'),
+        DeclareLaunchArgument(
+            'encoder_image_mean',
+            default_value='[0.5, 0.5, 0.5]',
+            description='The mean for image normalization'),
+        DeclareLaunchArgument(
+            'encoder_image_stddev',
+            default_value='[0.5, 0.5, 0.5]',
+            description='The standard deviation for image normalization'),
         DeclareLaunchArgument(
             'model_file_path',
             default_value=f'{default_model_file_path}',
@@ -87,6 +95,8 @@ def generate_launch_description():
     # DNN Image Encoder parameters
     network_image_width = LaunchConfiguration('network_image_width')
     network_image_height = LaunchConfiguration('network_image_height')
+    encoder_image_mean = LaunchConfiguration('encoder_image_mean')
+    encoder_image_stddev = LaunchConfiguration('encoder_image_stddev')
 
     # Tensor RT parameters
     model_file_path = LaunchConfiguration('model_file_path')
@@ -110,6 +120,8 @@ def generate_launch_description():
         parameters=[{
             'network_image_width': network_image_width,
             'network_image_height': network_image_height,
+            'image_mean': encoder_image_mean,
+            'image_stddev': encoder_image_stddev,
         }],
         remappings=[('encoded_tensor', 'tensor_pub')])
```
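The new `encoder_image_mean` / `encoder_image_stddev` arguments correspond to the standard per-channel normalization an image encoder applies before inference. Assuming that convention, a minimal NumPy sketch (the `normalize_image` helper is illustrative, not the encoder's actual API):

```python
import numpy as np

def normalize_image(image_uint8, mean=(0.5, 0.5, 0.5), stddev=(0.5, 0.5, 0.5)):
    """Scale an HxWx3 uint8 image to [0, 1], then normalize each channel."""
    img = image_uint8.astype(np.float32) / 255.0
    return (img - np.asarray(mean, np.float32)) / np.asarray(stddev, np.float32)

# With the default mean/stddev of 0.5, pixel values map from [0, 255]
# to the range [-1, 1] that many DOPE checkpoints expect.
```

Exposing these as launch arguments lets a custom-trained model that was trained with different normalization statistics be served without editing the launch file.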
