
Commit 317be41

Updates AutoMate with more documentation and options (#2674)
# Description

Fix the issues reported from QA.

- Change the number of trajectories for the disassembly task; the job now outputs this number while running.
- Add an explanation of the disassembly task to the environment doc (it does not involve policy training or evaluation).
- Add a flag for wandb to record learning curves for assembly tasks.
- Add a flag for max_iterations to set the number of training epochs.
- Add the Windows command line in run_w_id.py and run_disassembly_w_id.py.

## Type of change

- Bug fix (non-breaking change which fixes an issue)
- New feature (non-breaking change which adds functionality)
- This change requires a documentation update

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there
1 parent ba2a7dc commit 317be41

6 files changed: +38 -24 lines changed


docs/source/overview/environments.rst

Lines changed: 3 additions & 3 deletions
@@ -216,11 +216,11 @@ We provide environments for both disassembly and assembly.

 For addition instructions and Windows installation, please refer to the `CUDA installation page <https://developer.nvidia.com/cuda-12-8-0-download-archive>`_.

-* |disassembly-link|: The plug starts inserted in the socket. A low-level controller lifts th plug out and moves it to a random position. These trajectories serve as demonstrations for the reverse process, i.e., learning to assemble. To run disassembly for a specific task: ``./isaaclab.sh -p source/isaaclab_tasks/isaaclab_tasks/direct/automate/run_disassembly_w_id.py --assembly_id=ASSEMBLY_ID``
+* |disassembly-link|: The plug starts inserted in the socket. A low-level controller lifts the plug out and moves it to a random position. This process is purely scripted and does not involve any learned policy. Therefore, it does not require policy training or evaluation. The resulting trajectories serve as demonstrations for the reverse process, i.e., learning to assemble. To run disassembly for a specific task: ``python source/isaaclab_tasks/isaaclab_tasks/direct/automate/run_disassembly_w_id.py --assembly_id=ASSEMBLY_ID --disassembly_dir=DISASSEMBLY_DIR``. All generated trajectories are saved to a local directory ``DISASSEMBLY_DIR``.
 * |assembly-link|: The goal is to insert the plug into the socket. You can use this environment to train a policy via reinforcement learning or evaluate a pre-trained checkpoint.

-    * To train an assembly policy: ``./isaaclab.sh -p source/isaaclab_tasks/isaaclab_tasks/direct/automate/run_w_id.py --assembly_id=ASSEMBLY_ID --train``
-    * To evaluate an assembly policy: ``./isaaclab.sh -p source/isaaclab_tasks/isaaclab_tasks/direct/automate/run_w_id.py --assembly_id=ASSEMBLY_ID --checkpoint=CHECKPOINT --log_eval``
+    * To train an assembly policy, we run the command ``python source/isaaclab_tasks/isaaclab_tasks/direct/automate/run_w_id.py --assembly_id=ASSEMBLY_ID --train``. We can customize the training process using the optional flags: ``--headless`` to run without opening the GUI windows, ``--max_iterations=MAX_ITERATIONS`` to set the number of training iterations, ``--num_envs=NUM_ENVS`` to set the number of parallel environments during training, ``--seed=SEED`` to assign the random seed, ``--wandb`` to enable logging to WandB (requires a WandB account). The policy checkpoints will be saved automatically during training in the directory ``logs/rl_games/Assembly/test``.
+    * To evaluate an assembly policy, we run the command ``python source/isaaclab_tasks/isaaclab_tasks/direct/automate/run_w_id.py --assembly_id=ASSEMBLY_ID --checkpoint=CHECKPOINT --log_eval``. The evaluation results are stored in ``evaluation_{ASSEMBLY_ID}.h5``.

 .. table::
     :widths: 33 37 30
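The optional training flags documented above can be checked with a small parser. This is a minimal sketch, not the script itself: `build_parser` is a hypothetical helper name, and the flag names and defaults (`--max_iterations` 1500, `--num_envs` 128, `--seed` -1) are copied from the `run_w_id.py` diff in this commit.

```python
import argparse
import shlex


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper mirroring the flags shown in the run_w_id.py diff."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--assembly_id", type=str, help="New assembly ID to set.")
    parser.add_argument("--train", action="store_true", help="Run training mode.")
    parser.add_argument("--headless", action="store_true", help="Run in headless mode.")
    parser.add_argument("--wandb", action="store_true", help="Record learning curves.")
    parser.add_argument("--max_iterations", type=int, default=1500)
    parser.add_argument("--num_envs", type=int, default=128)
    parser.add_argument("--seed", type=int, default=-1)
    return parser


# Parse an example command line; flags that are omitted fall back to their defaults.
args = build_parser().parse_args(
    shlex.split("--assembly_id=00015 --train --headless --max_iterations=200")
)
print(args.assembly_id, args.train, args.wandb, args.max_iterations, args.num_envs, args.seed)
```

Because `--wandb` is a `store_true` flag, WandB logging stays off unless the flag is passed explicitly.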

source/isaaclab_tasks/isaaclab_tasks/direct/automate/assembly_env.py

Lines changed: 10 additions & 7 deletions
@@ -71,7 +71,8 @@ def __init__(self, cfg: AssemblyEnvCfg, render_mode: str | None = None, **kwargs
         if self.cfg_task.sample_from != "rand":
             self._init_eval_loading()

-        wandb.init(project="automate", name=self.cfg_task.assembly_id + "_" + datetime.now().strftime("%m/%d/%Y"))
+        if self.cfg_task.wandb:
+            wandb.init(project="automate", name=self.cfg_task.assembly_id + "_" + datetime.now().strftime("%m/%d/%Y"))

     def _init_eval_loading(self):
         eval_held_asset_pose, eval_fixed_asset_pose, eval_success = automate_log.load_log_from_hdf5(
@@ -553,7 +554,8 @@ def _get_rewards(self):
         rew_buf = self._update_rew_buf(curr_successes)
         self.ep_succeeded = torch.logical_or(self.ep_succeeded, curr_successes)

-        wandb.log(self.extras)
+        if self.cfg_task.wandb:
+            wandb.log(self.extras)

         # Only log episode success rates at the end of an episode.
         if torch.any(self.reset_buf):
@@ -577,11 +579,12 @@ def _get_rewards(self):
         )

         self.extras["curr_max_disp"] = self.curr_max_disp
-        wandb.log({
-            "success": torch.mean(self.ep_succeeded.float()),
-            "reward": torch.mean(rew_buf),
-            "sbc_rwd_scale": sbc_rwd_scale,
-        })
+        if self.cfg_task.wandb:
+            wandb.log({
+                "success": torch.mean(self.ep_succeeded.float()),
+                "reward": torch.mean(rew_buf),
+                "sbc_rwd_scale": sbc_rwd_scale,
+            })

         if self.cfg_task.if_logging_eval:
             self.success_log = torch.cat([self.success_log, self.ep_succeeded.reshape((self.num_envs, 1))], dim=0)
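The diff repeats the `if self.cfg_task.wandb:` guard at each logging call. One possible way to keep that check in a single place is a thin wrapper, sketched below. `TaskCfg` and `MetricLogger` are hypothetical names that do not appear in the commit, and the backend is a plain callable so the sketch runs without `wandb` installed; in the environment it would presumably be `wandb.log`.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class TaskCfg:
    """Hypothetical stand-in for the task config; only the flag matters here."""

    wandb: bool = False


class MetricLogger:
    """Forwards metrics to a backend only when the config flag is set."""

    def __init__(self, cfg: TaskCfg, backend: Callable[[dict[str, Any]], None]):
        self.enabled = cfg.wandb
        self._backend = backend

    def log(self, metrics: dict[str, Any]) -> None:
        # Mirrors the guard in the diff: do nothing unless logging is enabled.
        if self.enabled:
            self._backend(metrics)


sent: list[dict[str, Any]] = []
MetricLogger(TaskCfg(wandb=False), sent.append).log({"reward": 1.0})  # dropped
MetricLogger(TaskCfg(wandb=True), sent.append).log({"reward": 2.0})  # forwarded
print(sent)
```

With the default `wandb: bool = False`, nothing reaches the backend, which matches the opt-in behavior introduced by this commit.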

source/isaaclab_tasks/isaaclab_tasks/direct/automate/assembly_tasks_cfg.py

Lines changed: 1 addition & 0 deletions
@@ -138,6 +138,7 @@ class AssemblyTask:
     if_logging_eval: bool = False
     num_eval_trials: int = 100
     eval_filename: str = "evaluation_00015.h5"
+    wandb: bool = False

     # Fine-tuning
     sample_from: str = "rand"  # gp, gmm, idv, rand

source/isaaclab_tasks/isaaclab_tasks/direct/automate/disassembly_tasks_cfg.py

Lines changed: 2 additions & 2 deletions
@@ -115,10 +115,10 @@ class Hole8mm(FixedAssetCfg):
 class Extraction(DisassemblyTask):
     name = "extraction"

-    assembly_id = "00731"
+    assembly_id = "00015"
     assembly_dir = f"{ASSET_DIR}/{assembly_id}/"
     disassembly_dir = "disassembly_dir"
-    num_log_traj = 100
+    num_log_traj = 1000

     fixed_asset_cfg = Hole8mm()
     held_asset_cfg = Peg8mm()

source/isaaclab_tasks/isaaclab_tasks/direct/automate/run_disassembly_w_id.py

Lines changed: 7 additions & 3 deletions
@@ -7,6 +7,7 @@
 import os
 import re
 import subprocess
+import sys


 def update_task_param(task_cfg, assembly_id, disassembly_dir):
@@ -61,9 +62,12 @@ def main():
         args.disassembly_dir,
     )

-    bash_command = (
-        "./isaaclab.sh -p scripts/reinforcement_learning/rl_games/train.py --task=Isaac-AutoMate-Disassembly-Direct-v0"
-    )
+    if sys.platform.startswith("win"):
+        bash_command = "isaaclab.bat -p"
+    elif sys.platform.startswith("linux"):
+        bash_command = "./isaaclab.sh -p"
+
+    bash_command += " scripts/reinforcement_learning/rl_games/train.py --task=Isaac-AutoMate-Disassembly-Direct-v0"

     bash_command += f" --num_envs={str(args.num_envs)}"
     bash_command += f" --seed={str(args.seed)}"
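As far as the diff shows, the platform check assigns `bash_command` only for Windows and Linux, so on any other platform the following `+=` would fail with an unrelated-looking error. A minimal sketch of the same selection with an explicit fallback follows; `launcher_prefix` is a hypothetical helper that is not part of the commit, and raising `RuntimeError` is an assumption about the desired behavior.

```python
import sys


def launcher_prefix(platform: str = sys.platform) -> str:
    """Return the Isaac Lab launcher prefix for the given platform string."""
    if platform.startswith("win"):
        return "isaaclab.bat -p"
    if platform.startswith("linux"):
        return "./isaaclab.sh -p"
    # Assumption: fail early with a clear message instead of a later
    # NameError or TypeError when the command string is extended.
    raise RuntimeError(f"Unsupported platform: {platform!r}")


command = launcher_prefix("linux")
command += " scripts/reinforcement_learning/rl_games/train.py --task=Isaac-AutoMate-Disassembly-Direct-v0"
print(command)
```

Taking the platform as a parameter keeps the function easy to exercise for each branch without changing the host system.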

source/isaaclab_tasks/isaaclab_tasks/direct/automate/run_w_id.py

Lines changed: 15 additions & 9 deletions
@@ -6,9 +6,10 @@
 import argparse
 import re
 import subprocess
+import sys


-def update_task_param(task_cfg, assembly_id, if_sbc, if_log_eval):
+def update_task_param(task_cfg, assembly_id, if_sbc, if_log_eval, if_wandb):
     # Read the file lines.
     with open(task_cfg) as f:
         lines = f.readlines()
@@ -20,6 +21,7 @@ def update_task_param(task_cfg, assembly_id, if_sbc, if_log_eval):
     if_sbc_pattern = re.compile(r"^(.*if_sbc\s*:\s*bool\s*=\s*).*$")
     if_log_eval_pattern = re.compile(r"^(.*if_logging_eval\s*:\s*bool\s*=\s*).*$")
     eval_file_pattern = re.compile(r"^(.*eval_filename\s*:\s*str\s*=\s*).*$")
+    if_wandb_pattern = re.compile(r"^(.*wandb\s*:\s*bool\s*=\s*).*$")

     for line in lines:
         if "assembly_id =" in line:
@@ -30,6 +32,8 @@ def update_task_param(task_cfg, assembly_id, if_sbc, if_log_eval):
             line = if_log_eval_pattern.sub(rf"\1{str(if_log_eval)}", line)
         elif "eval_filename: str = " in line:
             line = eval_file_pattern.sub(r"\1'{}'".format(f"evaluation_{assembly_id}.h5"), line)
+        elif "wandb: bool =" in line:
+            line = if_wandb_pattern.sub(rf"\1{str(if_wandb)}", line)

         updated_lines.append(line)

@@ -47,28 +51,30 @@ def main():
         default="source/isaaclab_tasks/isaaclab_tasks/direct/automate/assembly_tasks_cfg.py",
     )
     parser.add_argument("--assembly_id", type=str, help="New assembly ID to set.")
+    parser.add_argument("--wandb", action="store_true", help="Use wandb to record learning curves")
     parser.add_argument("--checkpoint", type=str, help="Checkpoint path.")
     parser.add_argument("--num_envs", type=int, default=128, help="Number of parallel environment.")
     parser.add_argument("--seed", type=int, default=-1, help="Random seed.")
     parser.add_argument("--train", action="store_true", help="Run training mode.")
     parser.add_argument("--log_eval", action="store_true", help="Log evaluation results.")
     parser.add_argument("--headless", action="store_true", help="Run in headless mode.")
+    parser.add_argument("--max_iterations", type=int, default=1500, help="Number of iteration for policy learning.")
     args = parser.parse_args()

-    update_task_param(args.cfg_path, args.assembly_id, args.train, args.log_eval)
+    update_task_param(args.cfg_path, args.assembly_id, args.train, args.log_eval, args.wandb)

     bash_command = None
+    if sys.platform.startswith("win"):
+        bash_command = "isaaclab.bat -p"
+    elif sys.platform.startswith("linux"):
+        bash_command = "./isaaclab.sh -p"
     if args.train:
-        bash_command = (
-            "./isaaclab.sh -p scripts/reinforcement_learning/rl_games/train.py --task=Isaac-AutoMate-Assembly-Direct-v0"
-        )
-        bash_command += f" --seed={str(args.seed)}"
+        bash_command += " scripts/reinforcement_learning/rl_games/train.py --task=Isaac-AutoMate-Assembly-Direct-v0"
+        bash_command += f" --seed={str(args.seed)} --max_iterations={str(args.max_iterations)}"
     else:
         if not args.checkpoint:
             raise ValueError("No checkpoint provided for evaluation.")
-        bash_command = (
-            "./isaaclab.sh -p scripts/reinforcement_learning/rl_games/play.py --task=Isaac-AutoMate-Assembly-Direct-v0"
-        )
+        bash_command += " scripts/reinforcement_learning/rl_games/play.py --task=Isaac-AutoMate-Assembly-Direct-v0"

     bash_command += f" --num_envs={str(args.num_envs)}"
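The `--wandb` flag reaches the environment by rewriting the `wandb: bool = ...` line of the task config file with a regular expression. The sketch below isolates that substitution on an in-memory list of lines. The pattern and the substring check are copied from the diff; `set_wandb_flag` and the sample `cfg_lines` are hypothetical and only illustrate the mechanism.

```python
import re

# Same pattern as in the diff: group 1 captures everything up to and
# including "=" plus trailing whitespace; the rest of the line is replaced.
if_wandb_pattern = re.compile(r"^(.*wandb\s*:\s*bool\s*=\s*).*$")


def set_wandb_flag(lines: list[str], if_wandb: bool) -> list[str]:
    """Return a copy of the config lines with the wandb default rewritten."""
    updated = []
    for line in lines:
        if "wandb: bool =" in line:
            line = if_wandb_pattern.sub(rf"\1{str(if_wandb)}", line)
        updated.append(line)
    return updated


cfg_lines = [
    '    eval_filename: str = "evaluation_00015.h5"\n',
    "    wandb: bool = False\n",
]
print(set_wandb_flag(cfg_lines, True))
```

Note that the substring check also matches any field whose name merely ends in `wandb`, so a config with a second such field would need a stricter pattern.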
