
Commit 64a97f2

Changes Pink IK solver and null space computation to reduce runtime (#3904)
# Description

This PR optimizes the Pink IK controller solver by changing the QP solver to **daqp** and optimizing the matrix pseudoinverse in the Null Space Posture Task. This achieves a **2x performance improvement**, reducing runtime from 1.23 ms to 0.52 ms. Perf experiments are documented on the third page here: https://docs.google.com/document/d/1E9UiYUU-oCOIetUkqAIva8oK0NvdNkMeqP2gxmeqxNA/edit?tab=t.0#heading=h.snu74q2v857w

## Key Changes

1. **Optimized Pseudoinverse Computation**: Replaced `np.linalg.pinv()` with a custom implementation using direct LAPACK/BLAS calls in the null space projector calculation. The new approach uses Cholesky factorization (`dpotrf`) and a triangular solver (`dpotrs`) for faster computation.
2. **QP Solver Update**: Changed the Pink IK solver from `osqp` to `daqp` for improved performance.
3. **New Dependency**: Added `daqp==0.7.2` to `setup.py` for Linux platforms.

## Type of change

- New feature (non-breaking change which adds functionality)

## Checklist

- [x] I have read and understood the [contribution guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there

---------

Co-authored-by: Kelly Guo <[email protected]>
1 parent 76f5b29 commit 64a97f2
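
The core of the speed-up is linear algebra, not the QP: for a wide task Jacobian (m rows ≪ n joints), the damped pseudoinverse J⁺ = Jᵀ(JJᵀ + λ²I)⁻¹ only needs a Cholesky factorization of the small m×m Gram matrix, whereas `np.linalg.pinv` runs a full SVD of the m×n matrix. A rough micro-benchmark sketch of the two projector computations is below; the Jacobian shape, damping value, and iteration count are illustrative assumptions (not the setup from the linked perf doc), and it uses SciPy's high-level Cholesky helpers rather than the raw LAPACK calls in the commit:

```python
import timeit

import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(42)
J = rng.standard_normal((12, 36))  # illustrative wide task Jacobian: m=12 rows, n=36 joints
damping = 1e-9

def projector_pinv() -> np.ndarray:
    # Baseline: SVD-based pseudoinverse of the full m x n matrix
    return np.eye(J.shape[1]) - np.linalg.pinv(J) @ J

def projector_cholesky() -> np.ndarray:
    # Optimized: factor only the small m x m Gram matrix J @ J^T
    JJT = J @ J.T + damping**2 * np.eye(J.shape[0])
    X = cho_solve(cho_factor(JJT, lower=True), J)  # X = (J J^T + lambda^2 I)^-1 J
    return np.eye(J.shape[1]) - J.T @ X

print("pinv-based:    ", timeit.timeit(projector_pinv, number=2000))
print("cholesky-based:", timeit.timeit(projector_cholesky, number=2000))
```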

File tree

6 files changed: +75 −6 lines changed

- docs/source/overview/imitation-learning/teleop_imitation.rst
- source/isaaclab/config/extension.toml
- source/isaaclab/docs/CHANGELOG.rst
- source/isaaclab/isaaclab/controllers/pink_ik/null_space_posture_task.py
- source/isaaclab/isaaclab/controllers/pink_ik/pink_ik.py
- source/isaaclab/setup.py

docs/source/overview/imitation-learning/teleop_imitation.rst

Lines changed: 33 additions & 3 deletions

```diff
@@ -292,6 +292,15 @@ Using the Mimic generated data we can now train a state-based BC agent for ``Isa
 Visualizing results
 ^^^^^^^^^^^^^^^^^^^
 
+.. tip::
+
+   **Important: Testing Multiple Checkpoint Epochs**
+
+   When evaluating policy performance, it is common for different training epochs to yield significantly different results.
+   If you don't see the expected performance, **always test policies from various epochs** (not just the final checkpoint)
+   to find the best-performing model. Model performance can vary substantially across training, and the final epoch
+   is not always optimal.
+
 By inferencing using the generated model, we can visualize the results of the policy:
 
 .. tab-set::
@@ -315,6 +324,11 @@ By inferencing using the generated model, we can visualize the results of the po
          --device cpu --enable_cameras --task Isaac-Stack-Cube-Franka-IK-Rel-Visuomotor-v0 --num_rollouts 50 \
          --checkpoint /PATH/TO/desired_model_checkpoint.pth
 
+.. tip::
+
+   **If you don't see expected performance results:** Test policies from multiple checkpoint epochs, not just the final one.
+   Policy performance can vary significantly across training epochs, and intermediate checkpoints often outperform the final model.
+
 .. note::
 
    **Expected Success Rates and Timings for Franka Cube Stack Task**
@@ -323,6 +337,7 @@ By inferencing using the generated model, we can visualize the results of the po
    * Data generation time: ~30 mins for state, ~4 hours for visuomotor (varies based on num envs the user runs)
    * BC RNN training time: 1000 epochs + ~30 mins (for state), 600 epochs + ~6 hours (for visuomotor)
    * BC RNN policy success rate: ~40-60% (for both state + visuomotor)
+   * **Recommendation:** Evaluate checkpoints from various epochs throughout training to identify the best-performing model
 
 
 Demo 1: Data Generation and Policy Training for a Humanoid Robot
@@ -513,6 +528,11 @@ Visualize the results of the trained policy by running the following command, us
 .. note::
    Change the ``NORM_FACTOR`` in the above command with the values generated in the training step.
 
+.. tip::
+
+   **If you don't see expected performance results:** It is critical to test policies from various checkpoint epochs.
+   Performance can vary significantly between epochs, and the best-performing checkpoint is often not the final one.
+
 .. figure:: https://download.isaacsim.omniverse.nvidia.com/isaaclab/images/gr-1_steering_wheel_pick_place_policy.gif
    :width: 100%
    :align: center
@@ -528,7 +548,7 @@ Visualize the results of the trained policy by running the following command, us
    * Success rate for data generation depends on the quality of human demonstrations (how well the user performs them) and dataset annotation quality. Both data generation and downstream policy success are sensitive to these factors and can show high variance. See :ref:`Common Pitfalls when Generating Data <common-pitfalls-generating-data>` for tips to improve your dataset.
    * Data generation success for this task is typically 65-80% over 1000 demonstrations, taking 18-40 minutes depending on GPU hardware and success rate (19 minutes on a RTX ADA 6000 @ 80% success rate).
    * Behavior Cloning (BC) policy success is typically 75-86% (evaluated on 50 rollouts) when trained on 1000 generated demonstrations for 2000 epochs (default), depending on demonstration quality. Training takes approximately 29 minutes on a RTX ADA 6000.
-   * Recommendation: Train for 2000 epochs with 1000 generated demonstrations, and evaluate multiple checkpoints saved between the 1500th and 2000th epochs to select the best-performing policy.
+   * **Recommendation:** Train for 2000 epochs with 1000 generated demonstrations, and **evaluate multiple checkpoints saved between the 1000th and 2000th epochs** to select the best-performing policy. Testing various epochs is essential for finding optimal performance.
 
 
 Demo 2: Data Generation and Policy Training for Humanoid Robot Locomanipulation with Unitree G1
@@ -642,6 +662,11 @@ Visualize the trained policy performance:
 .. note::
    Change the ``NORM_FACTOR`` in the above command with the values generated in the training step.
 
+.. tip::
+
+   **If you don't see expected performance results:** Always test policies from various checkpoint epochs.
+   Different epochs can produce significantly different results, so evaluate multiple checkpoints to find the optimal model.
+
 .. figure:: https://download.isaacsim.omniverse.nvidia.com/isaaclab/images/locomanipulation-g-1_steering_wheel_pick_place.gif
    :width: 100%
    :align: center
@@ -657,7 +682,7 @@ Visualize the trained policy performance:
    * Success rate for data generation depends on the quality of human demonstrations (how well the user performs them) and dataset annotation quality. Both data generation and downstream policy success are sensitive to these factors and can show high variance. See :ref:`Common Pitfalls when Generating Data <common-pitfalls-generating-data>` for tips to improve your dataset.
    * Data generation success for this task is typically 65-82% over 1000 demonstrations, taking 18-40 minutes depending on GPU hardware and success rate (18 minutes on a RTX ADA 6000 @ 82% success rate).
    * Behavior Cloning (BC) policy success is typically 75-85% (evaluated on 50 rollouts) when trained on 1000 generated demonstrations for 2000 epochs (default), depending on demonstration quality. Training takes approximately 40 minutes on a RTX ADA 6000.
-   * Recommendation: Train for 2000 epochs with 1000 generated demonstrations, and evaluate multiple checkpoints saved between the 1500th and 2000th epochs to select the best-performing policy.
+   * **Recommendation:** Train for 2000 epochs with 1000 generated demonstrations, and **evaluate multiple checkpoints saved between the 1000th and 2000th epochs** to select the best-performing policy. Testing various epochs is essential for finding optimal performance.
 
 Generate the dataset with manipulation and point-to-point navigation
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -851,6 +876,11 @@ Visualize the results of the trained policy by running the following command, us
 .. note::
    Change the ``NORM_FACTOR`` in the above command with the values generated in the training step.
 
+.. tip::
+
+   **If you don't see expected performance results:** Test policies from various checkpoint epochs, not just the final one.
+   Policy performance can vary substantially across training, and intermediate checkpoints often yield better results.
+
 .. figure:: https://download.isaacsim.omniverse.nvidia.com/isaaclab/images/gr-1_nut_pouring_policy.gif
    :width: 100%
    :align: center
@@ -866,7 +896,7 @@ Visualize the results of the trained policy by running the following command, us
    * Success rate for data generation depends on the quality of human demonstrations (how well the user performs them) and dataset annotation quality. Both data generation and downstream policy success are sensitive to these factors and can show high variance. See :ref:`Common Pitfalls when Generating Data <common-pitfalls-generating-data>` for tips to improve your dataset.
    * Data generation for 1000 demonstrations takes approximately 10 hours on a RTX ADA 6000.
    * Behavior Cloning (BC) policy success is typically 50-60% (evaluated on 50 rollouts) when trained on 1000 generated demonstrations for 600 epochs (default). Training takes approximately 15 hours on a RTX ADA 6000.
-   * Recommendation: Train for 600 epochs with 1000 generated demonstrations, and evaluate multiple checkpoints saved between the 300th and 600th epochs to select the best-performing policy.
+   * **Recommendation:** Train for 600 epochs with 1000 generated demonstrations, and **evaluate multiple checkpoints saved between the 300th and 600th epochs** to select the best-performing policy. Testing various epochs is critical for achieving optimal performance.
 
 .. _common-pitfalls-generating-data:
```
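
The tips added above amount to sweeping the saved checkpoints rather than trusting the last one. A minimal sketch of such a sweep is below; the checkpoint directory and rollout script path are hypothetical placeholders standing in for the full command shown in the docs:

```python
import glob
import subprocess

# Hypothetical locations for illustration only.
CKPT_DIR = "/PATH/TO/checkpoints"
BASE_CMD = [
    "./isaaclab.sh", "-p", "/PATH/TO/rollout_script.py",  # placeholder script path
    "--task", "Isaac-Stack-Cube-Franka-IK-Rel-Visuomotor-v0",
    "--num_rollouts", "50",
]

# Evaluate every saved epoch, then compare the reported success rates.
for ckpt in sorted(glob.glob(f"{CKPT_DIR}/*.pth")):
    print(f"--- evaluating {ckpt} ---")
    subprocess.run(BASE_CMD + ["--checkpoint", ckpt], check=True)
```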

source/isaaclab/config/extension.toml

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,7 +1,7 @@
 [package]
 
 # Note: Semantic Versioning is used: https://semver.org/
-version = "0.47.6"
+version = "0.47.7"
 
 # Description
 title = "Isaac Lab framework for Robot Learning"
```

source/isaaclab/docs/CHANGELOG.rst

Lines changed: 9 additions & 0 deletions

```diff
@@ -1,6 +1,15 @@
 Changelog
 ---------
 
+0.47.7 (2025-10-31)
+~~~~~~~~~~~~~~~~~~~
+
+Changed
+^^^^^^^
+
+* Changed the Pink IK controller QP solver from osqp to daqp.
+* Changed the null space matrix computation in Pink IK's Null Space Posture Task to a faster matrix pseudoinverse computation.
+
 
 0.47.6 (2025-11-01)
 ~~~~~~~~~~~~~~~~~~~~
```

source/isaaclab/isaaclab/controllers/pink_ik/null_space_posture_task.py

Lines changed: 30 additions & 1 deletion

```diff
@@ -4,6 +4,8 @@
 # SPDX-License-Identifier: BSD-3-Clause
 
 import numpy as np
+import scipy.linalg.blas as blas
+import scipy.linalg.lapack as lapack
 
 import pinocchio as pin
 from pink.configuration import Configuration
@@ -75,6 +77,9 @@ class NullSpacePostureTask(Task):
 
     """
 
+    # Regularization factor for pseudoinverse computation to ensure numerical stability
+    PSEUDOINVERSE_DAMPING_FACTOR: float = 1e-9
+
     def __init__(
         self,
         cost: float,
@@ -237,6 +242,30 @@ def compute_jacobian(self, configuration: Configuration) -> np.ndarray:
         J_combined = np.concatenate(J_frame_tasks, axis=0)
 
         # Compute null space projector: N = I - J^+ * J
-        N_combined = np.eye(J_combined.shape[1]) - np.linalg.pinv(J_combined) @ J_combined
+        # Use fast pseudoinverse computation with direct LAPACK/BLAS calls
+        m, n = J_combined.shape
+
+        # Wide matrix (typical for robotics): use the right pseudoinverse
+        # J^+ = J^T @ inv(J @ J^T + λ²I)
+        # This is faster because we invert an m×m matrix instead of n×n
+
+        # Compute J @ J^T using BLAS (faster than numpy)
+        JJT = blas.dgemm(1.0, J_combined, J_combined.T)
+        np.fill_diagonal(JJT, JJT.diagonal() + self.PSEUDOINVERSE_DAMPING_FACTOR**2)
+
+        # Use LAPACK's Cholesky factorization (dpotrf = Positive definite TRiangular Factorization)
+        L, info = lapack.dpotrf(JJT, lower=1, clean=False, overwrite_a=True)
+
+        if info != 0:
+            # Fallback if not positive definite: use numpy's pseudoinverse
+            J_pinv = np.linalg.pinv(J_combined)
+            return np.eye(n) - J_pinv @ J_combined
+
+        # Solve (J @ J^T + λ²I) @ X = J using LAPACK's triangular solver (dpotrs)
+        # This directly solves the system without computing the full inverse
+        X, _ = lapack.dpotrs(L, J_combined, lower=1)
+
+        # Compute null space projector: N = I - J^T @ X
+        N_combined = np.eye(n) - J_combined.T @ X
 
         return N_combined
```
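
A quick sanity check on the math (not part of the commit): with a damping factor this small and a full-rank Jacobian, the Cholesky-based projector should agree with the SVD-based one from `np.linalg.pinv` to within numerical tolerance. A minimal sketch using SciPy's high-level Cholesky helpers and a made-up Jacobian shape:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
J = rng.standard_normal((12, 36))  # illustrative wide Jacobian (m task rows x n joints)
damping = 1e-9

# Damped right-pseudoinverse route: solve (J J^T + lambda^2 I) X = J, then N = I - J^T X
JJT = J @ J.T + damping**2 * np.eye(J.shape[0])
X = cho_solve(cho_factor(JJT, lower=True), J)
N_fast = np.eye(J.shape[1]) - J.T @ X

# Reference route from the old code path
N_ref = np.eye(J.shape[1]) - np.linalg.pinv(J) @ J

assert np.allclose(N_fast, N_ref, atol=1e-6)
print("projectors agree")
```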

source/isaaclab/isaaclab/controllers/pink_ik/pink_ik.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -221,7 +221,7 @@ def compute(
             self.pink_configuration,
             self.cfg.variable_input_tasks + self.cfg.fixed_input_tasks,
             dt,
-            solver="osqp",
+            solver="daqp",
             safety_break=self.cfg.fail_on_joint_limit_violation,
         )
         joint_angle_changes = velocity * dt
```
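
Pink hands the QP off to the qpsolvers package, so `solver="daqp"` is only valid when the daqp backend is installed and registered. A pre-flight sanity check (a sketch, not part of the commit):

```python
import qpsolvers

# "daqp" must appear here for solver="daqp" to be accepted by solve_ik
print(qpsolvers.available_solvers)
assert "daqp" in qpsolvers.available_solvers, "daqp backend not installed"
```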

source/isaaclab/setup.py

Lines changed: 1 addition & 0 deletions

```diff
@@ -52,6 +52,7 @@
 INSTALL_REQUIRES += [
     # required by isaaclab.isaaclab.controllers.pink_ik
     f"pin-pink==3.1.0 ; platform_system == 'Linux' and ({SUPPORTED_ARCHS_ARM})",
+    f"daqp==0.7.2 ; platform_system == 'Linux' and ({SUPPORTED_ARCHS_ARM})",
     # required by isaaclab.devices.openxr.retargeters.humanoid.fourier.gr1_t2_dex_retargeting_utils
     f"dex-retargeting==0.4.6 ; platform_system == 'Linux' and ({SUPPORTED_ARCHS})",
 ]
```
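
A minimal way to confirm the new dependency resolved in a given environment, assuming only that the wheel installs an importable top-level `daqp` module:

```python
import importlib.util

# find_spec returns None when the module is not importable
print("daqp installed:", importlib.util.find_spec("daqp") is not None)
```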
