`source/en/user_guide/internmanip/quick_start/add_benchmark.md` (11 additions & 5 deletions)
@@ -1,12 +1,15 @@
# 🥇 Add a New Benchmark
- This guide walks you through adding a custom agent and custom evaluation benchmark to the InternManip framework.
+ This guide walks you through **adding a custom benchmark** into the InternManip framework, including defining your own `Agent` and `Evaluator` classes, as well as registering and launching them.
- ### 1. Implement Your Model Agent
+ ### 1. Define a Custom Agent
- To support a new model in InternManip, define a subclass of [`BaseAgent`](../../internmanip/agent/base.py). You must implement two core methods:
+ In the updated design, an **Agent** is tied to the **benchmark (evaluation environment)** rather than to a specific policy model. It is responsible for interfacing between the environment and the control policy, handling observation preprocessing and action postprocessing, and coordinating resets.
+
+ All agents must inherit from [`BaseAgent`](../../internmanip/agent/base.py) and implement the following two methods:
-`step()`: given an observation, returns an action.
-`reset()`: resets internal states, if needed.
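For orientation, here is a minimal sketch of such a subclass. It assumes `BaseAgent` takes a config in its constructor and that observations arrive as dictionaries; the `self.policy` handle is a hypothetical stand-in for whatever control policy the agent wraps, not a verified attribute of the framework.

```python
from typing import Any, Dict

import numpy as np

from internmanip.agent.base import BaseAgent


class CustomAgent(BaseAgent):
    """Sketch only: the constructor signature and observation layout are assumptions."""

    def __init__(self, config):
        super().__init__(config)
        self._last_action = None  # episode-local state

    def step(self, observation: Dict[str, Any]) -> np.ndarray:
        # Preprocess the observation, query the wrapped policy, and
        # postprocess its output into an environment action.
        action = self.policy.predict(observation)  # `self.policy` is hypothetical
        self._last_action = action
        return action

    def reset(self) -> None:
        # Clear per-episode internal state between rollouts.
        self._last_action = None
```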
@@ -106,7 +109,10 @@ eval_cfg = EvalCfg(
    agent=AgentCfg(
        agent_type="custom_agent",  # Corresponds to the name registered in AgentRegistry
        base_model_path="path/to/model",
-       model_kwargs={...},
+       agent_settings={...},
+       model_kwargs={
+           'HF_cache_dir': None,
+       },
        server_cfg=ServerCfg(  # Optional server configuration
            server_host="localhost",
            server_port=5000,
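The `agent_type` string above must resolve through `AgentRegistry`. The registry's actual API is not shown in this diff, so the following is only a self-contained sketch of how a name-to-class registration typically works; treat the class layout and method names as assumptions, not internmanip's real interface.

```python
from typing import Dict, Type


class AgentRegistry:
    """Stand-in registry; the real class may use a decorator or enum instead."""

    _agents: Dict[str, Type] = {}

    @classmethod
    def register(cls, name: str, agent_cls: Type) -> None:
        cls._agents[name] = agent_cls

    @classmethod
    def build(cls, name: str, **kwargs):
        return cls._agents[name](**kwargs)


# "custom_agent" must match AgentCfg.agent_type in the eval config above, e.g.:
# AgentRegistry.register("custom_agent", CustomAgent)
```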
@@ -132,4 +138,4 @@ eval_cfg = EvalCfg(
python scripts/eval/start_evaluator.py \
    --config scripts/eval/configs/custom_on_custom.py
```
- Use `--distributed` for Ray-based multi-GPU, and `--server` for client-server mode.
+ > 💡 Use `--server` for client-server mode, and `--distributed` for Ray-based multi-GPU (WIP).
`source/en/user_guide/internmanip/quick_start/add_dataset.md` (1 addition & 1 deletion)
@@ -7,7 +7,7 @@ The process involves two main steps: **[ensuring the dataset format](#dataset-st
## Dataset Structure
- All datasets must follow the [LeRobotDataset Format](#https://github.com/huggingface/lerobot) to ensure compatibility with the data loaders and training pipelines.
+ All datasets must follow the [LeRobotDataset Format](https://github.com/huggingface/lerobot) to ensure compatibility with the data loaders and training pipelines.
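As a quick check that a converted dataset is readable, something like the following should work with the `lerobot` package. Note that the import path and constructor signature have shifted between `lerobot` releases, so verify this sketch against the version InternManip pins.

```python
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

# repo_id can be a HuggingFace dataset ID or, in recent versions, a local path.
dataset = LeRobotDataset(repo_id="lerobot/pusht")

print(dataset.num_episodes)  # episode count
sample = dataset[0]          # dict of tensors keyed by feature name
print(sample.keys())
```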
`source/en/user_guide/internmanip/quick_start/add_model.md` (58 additions & 24 deletions)
@@ -41,8 +41,25 @@ Finally, you need to **register** the model with the framework and you can start
## 2. Create the Model Configuration File
- The config file is used to store the architecture related hyper-parameters. Here is some basic information you need to know:
- You shall add the model configuration file in `internmanip/configs/model/{model_name}_cfg.py`, which should inherit `transformers.PretrainedConfig`.
+ Each model in our framework should define its architecture-related hyperparameters in a **configuration file**.
+ These configuration classes inherit from `transformers.PretrainedConfig`, which provides serialization, deserialization, and compatibility with HuggingFace’s model loading utilities.
+
+ You should place your model’s config file in:
+ ```bash
+ internmanip/configs/model/{model_name}_cfg.py
+ ```
+
+ **🧱 About `transformers.PretrainedConfig`**
+
+ [`PretrainedConfig`](https://huggingface.co/docs/transformers/main_classes/configuration) is the base class for all HuggingFace model configurations. It supports:
+ - Loading/saving config files via `.from_pretrained()` and `.save_pretrained()`
+ - Managing default values
+ - Providing shared arguments across training, inference, and serialization
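To make the first bullet concrete, a minimal round trip looks like this; the class and its fields are placeholders rather than InternManip's actual config, and only `save_pretrained`/`from_pretrained` are standard `transformers` API.

```python
from transformers import PretrainedConfig


class CustomPolicyConfig(PretrainedConfig):
    model_type = "custom_model"

    def __init__(self, hidden_dim=256, **kwargs):
        self.hidden_dim = hidden_dim
        super().__init__(**kwargs)


cfg = CustomPolicyConfig(hidden_dim=512)
cfg.save_pretrained("./custom_policy")                      # writes config.json
reloaded = CustomPolicyConfig.from_pretrained("./custom_policy")
assert reloaded.hidden_dim == 512
```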
+ <!-- The config file is used to store the architecture related hyper-parameters. Here is some basic information you need to know:
+ You shall add the model configuration file in `internmanip/configs/model/{model_name}_cfg.py`, which should inherit `transformers.PretrainedConfig`. -->
The following is **an example** of a model configuration file:
@@ -53,32 +70,41 @@ class CustomPolicyConfig(PretrainedConfig):
> 🔧 Important: All config classes must implement a `transform()` method that returns a 3-tuple.
As shown in the example above, the config class defines key architectural hyperparameters—such as the backbone model name, whether to freeze the backbone, the hidden/output dimensions of the action head, and more. You are free to extend this config with any additional parameters required by your custom model.
- Additionally, you can implement a **model-specific `transform` method** within the config class. This method allows you to apply custom data transformations that are *not* included in the dataset-specific transform list defined in `internmanip/configs/dataset/data_config.py`.
+ **🔧 About `transforms`**
+
+ Additionally, you can implement a **model-specific `transform` method** within the config class. This method allows you to apply custom data transformations that are ***not*** included in the dataset-specific transform list defined in `internmanip/configs/dataset/data_config.py`.
During training, the script `scripts/train/train.py` will automatically call this method and apply your custom transform alongside the default ones. Your `transform` method should follow the same input/output format as the dataset-specific transforms. For implementation guidance, refer to the examples in the `internmanip/dataset/transform` directory.
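Since the full `CustomPolicyConfig` example is elided from this hunk, here is a compressed sketch of the `transform()` hook. The doc only specifies that it returns a 3-tuple; the particular (video, state, action) split below is an assumption, so check the elided example and `internmanip/dataset/transform` for the real contract.

```python
from transformers import PretrainedConfig


class CustomPolicyConfig(PretrainedConfig):
    model_type = "custom_model"

    def __init__(self, backbone="resnet18", freeze_backbone=True,
                 action_dim=7, **kwargs):
        self.backbone = backbone
        self.freeze_backbone = freeze_backbone
        self.action_dim = action_dim
        super().__init__(**kwargs)

    def transform(self):
        # Model-specific transforms applied on top of the dataset-level
        # ones from data_config.py. The three entries are assumed.
        video_transforms = []   # e.g., resize/normalize steps
        state_transforms = []   # e.g., proprioception normalization
        action_transforms = []  # e.g., action-space rescaling
        return video_transforms, state_transforms, action_transforms
```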
@@ -98,14 +124,21 @@ import torch.nn as nn, torch.nn.functional as F, torch
from typing import Dict
from internmanip.configs.model.custom_policy_cfg import CustomPolicyConfig
Make sure the string `"custom_model"` passed to `AutoConfig.register` matches the model name used in both your `CustomPolicyModel` definition and the data collator registration.
- Don't forget to register the module in your __init__.py, so that your custom model gets imported and initialized properly during runtime. For example:
+ Don't forget to register the module in your `__init__.py`, so that your custom model gets imported and initialized properly during runtime. For example:
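The example itself is cut off in this diff. A representative `__init__.py`, using the standard `transformers` auto-class registration named above, might look like the following; the module paths are assumptions based on the file layout described earlier.

```python
# internmanip/model/__init__.py (sketch; module paths are assumed)
from transformers import AutoConfig, AutoModel

from internmanip.configs.model.custom_policy_cfg import CustomPolicyConfig
from internmanip.model.custom_policy import CustomPolicyModel  # assumed path

# "custom_model" must match CustomPolicyConfig.model_type and the name
# used in the data collator registration.
AutoConfig.register("custom_model", CustomPolicyConfig)
AutoModel.register(CustomPolicyConfig, CustomPolicyModel)
```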
We provide several built-in policies such as **GR00T-N1**, **GR00T-N1.5**, **Pi-0**, **DP-CLIP**, and **ACT-CLIP**.
- To quickly verify your setup, you can train the **DP-CLIP** model on the `genmanip-demo` dataset (300 demonstrations of the instruction *"Move the milk carton to the top of the ceramic bowl"*).
+ To quickly verify your setup, you can train the **Pi-0** model on the `genmanip-demo` dataset (300 demonstrations of the instruction *"Move the milk carton to the top of the ceramic bowl"*).
This requires **1 GPU with at least 24GB memory**:
```bash
torchrun --nnodes 1 --nproc_per_node 1 \ # number of processes per node, e.g., 1
scripts/train/train.py \
- --config run_configs/train/dp_clip_genmanip_v1.yaml # Config file that specifies which model to train on which dataset, along with hyperparameters
+ --config run_configs/train/pi0_genmanip_v1.yaml # Config file that specifies which model to train on which dataset, along with hyperparameters
```
> 😄 When you run the script, it will prompt you to log in to Weights & Biases (WandB). This integration allows you to monitor your training process in real time via the WandB dashboard.
> 💡 Tips: The recommended training setup is a global batch size of 2048 for 50,000 steps, which typically takes approximately 500 GPU hours (assuming each node has 8 GPUs).
## Customizing Training with Your Own YAML Config
@@ -148,11 +149,8 @@ base_model_path: lerobot/pi0 # (Optional) Overrides the model checkpoin
**💡 Notes:**
- `model_type`: Must match the name of a model that has already been registered within InternManip.
- `dataset_path`: Can be a HuggingFace ID (e.g., `InternRobotics/InternData-GenmanipTest`) or a local directory where the dataset is downloaded.
- `data_config`: Refers to a dataset configuration preset (e.g., for preprocessing or loading behavior), also pre-registered in the codebase.
- `base_model_path`: This is optional. If the selected `model_type` is supported and known, InternManip will automatically resolve and download the correct checkpoint from HuggingFace. If you’ve already downloaded a model locally or want to use a custom one, you can specify the path here directly.
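Pulling the four fields together, a minimal YAML config consistent with these notes might look like the following; the `data_config` value is an assumed preset name, and any keys beyond these four come from the full template.

```yaml
# Illustrative minimal training config; field names come from the notes above.
model_type: pi0
dataset_path: InternRobotics/InternData-GenmanipTest
data_config: genmanip-v1          # assumed preset name
base_model_path: lerobot/pi0      # optional; auto-resolved if omitted
```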
By editing or extending this YAML file, you can quickly try different models, datasets, or training setups — all without modifying the training script.
@@ -165,7 +163,6 @@ By editing or extending this YAML file, you can quickly try different models, da
When creating your own YAML config file for training or evaluation, you can directly refer to the following officially supported values:
- Use values from the `${model_type}` and `${base_model_path}` columns below to populate the corresponding fields in your YAML.
- Similarly, values from the `${data_config}` and `${dataset_path}` columns can be used to specify the dataset configuration and loading path.
<!-- This ensures consistency with the models and datasets that have been pre-registered within InternManip. -->
The default evaluation setup adopts a client-server architecture where the policy (model) and the environment run in separate processes. This improves compatibility and modularity for large-scale benchmarks.
+ You can evaluate `pi0` on the `genmanip` benchmark in a single process using the following command:
- By default, the inference of model will be running in the main loop sharing the same process with the `env`. You can evaluate `pi0` on the `Genmanip` benchmark in a single process using the following command:
+ **🖥 Terminal 1: Launch the Policy Server (Model Side)**
+
+ Activate the environment for the model and start the policy server:
```bash
- python scripts/eval/start_evaluator.py \
-     --config scripts/eval/config/pi0_on_genmanip.py
+ source .venv/model/bin/activate
+ python scripts/eval/start_policy_server.py
+ ```
+ This server listens for observation inputs from the environment and responds with action predictions from the model.
+
+ **🖥 Terminal 2: Launch the Evaluator (Environment Side)**
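The diff truncates before Terminal 2's command. By symmetry with Terminal 1 and the earlier `start_evaluator.py` usage, the environment side presumably looks roughly like this; the venv path is assumed, while the config path and `--server` flag appear elsewhere in this diff.

```bash
# Assumed by symmetry; the diff cuts off before the actual Terminal 2 command.
source .venv/env/bin/activate   # hypothetical environment-side venv
python scripts/eval/start_evaluator.py \
    --config scripts/eval/config/pi0_on_genmanip.py \
    --server
```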