Commit 2922cc6

[Feat] add training and evaluation docs

* add training and evaluation docs
* fix some terms
* add training tutorial

1 parent e37762c commit 2922cc6

File tree

4 files changed: +283 additions, −73 deletions

source/en/user_guide/internnav/quick_start/train_eval.md renamed to source/en/user_guide/internnav/quick_start/evaluation.md

Lines changed: 34 additions & 61 deletions
@@ -1,16 +1,16 @@
-# Training and Evaluation
+# Evaluation
 
-This document presents how to train and evaluate models for different systems with InternNav.
+This document describes how to evaluate models in **InternNav**.
 
-## Whole-system
+## InternVLA-N1 (Dual System)
 
-### Training
-The training pipeline is currently under preparation and will be open-sourced soon.
+Model weights of InternVLA-N1 (Dual System) can be downloaded from [InternVLA-N1-DualVLN](https://huggingface.co/InternRobotics/InternVLA-N1-DualVLN) and [InternVLA-N1-w-NavDP](https://huggingface.co/InternRobotics/InternVLA-N1-w-NavDP).
 
-### Evaluation
-Before evaluation, we should download the robot assets from [InternUTopiaAssets](https://huggingface.co/datasets/InternRobotics/Embodiments) and move them to the `data/` directory. Model weights of InternVLA-N1 can be downloaded from [InternVLA-N1](https://huggingface.co/InternRobotics/InternVLA-N1).
+---
+
+### Evaluation on Isaac Sim
+Before evaluation, download the robot assets from [InternUTopiaAssets](https://huggingface.co/datasets/InternRobotics/Embodiments) and move them to the `data/` directory.
 
-#### Evaluation on Isaac Sim
 [UPDATE] We now support running the local model and Isaac Sim in a single process. Evaluate on Single-GPU:
 
 ```bash
@@ -51,7 +51,7 @@ The simulation can be visualized by set `vis_output=True` in eval_cfg.
 
 <img src="../../../_static/video/nav_eval.gif" alt="My GIF">
 
-#### Evaluation on Habitat Sim
+### Evaluation on Habitat Sim
 Evaluate on Single-GPU:
 
 ```bash
@@ -74,18 +74,36 @@ For multi-gpu inference, currently we support inference on SLURM as well as envi
 --config scripts/eval/configs/habitat_dual_system_cfg.py
 ```
 
+## InternVLA-N1 (System 2)
 
-## System1
+Model weights of InternVLA-N1 (System2) can be downloaded from [InternVLA-N1-System2](https://huggingface.co/InternRobotics/InternVLA-N1-System2).
 
-### Training
+Currently we only support evaluating the standalone System2 on Habitat.
 
-Download the training data from [Hugging Face](https://huggingface.co/datasets/InternRobotics/InternData-N1/), and organize them in the form mentioned in [installation](./installation.md).
+Evaluate on Single-GPU:
 
 ```bash
-./scripts/train/start_train.sh --name "$NAME" --model-name navdp
+python scripts/eval/eval.py --config scripts/eval/configs/habitat_s2_cfg.py
+
+# set config with the following fields
+eval_cfg = EvalCfg(
+    agent=AgentCfg(
+        model_name='internvla_n1',
+        model_settings={
+            "mode": "system2",  # inference mode: dual_system or system2
+            "model_path": "checkpoints/<s2_checkpoint>",  # path to model checkpoint
+        }
+    )
+)
+```
+
+For multi-GPU inference, we currently only support SLURM:
+
+```bash
+./scripts/eval/bash/eval_system2.sh
 ```
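For illustration, the `eval_cfg` snippet above can be mirrored with stand-in dataclasses. These classes are hypothetical: the real `EvalCfg`/`AgentCfg` are defined in the InternNav codebase and may carry more fields.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins mirroring the config fields shown above;
# the actual EvalCfg/AgentCfg in InternNav may differ.
@dataclass
class AgentCfg:
    model_name: str
    model_settings: dict = field(default_factory=dict)

@dataclass
class EvalCfg:
    agent: AgentCfg

eval_cfg = EvalCfg(
    agent=AgentCfg(
        model_name="internvla_n1",
        model_settings={
            "mode": "system2",  # inference mode: dual_system or system2
            "model_path": "checkpoints/<s2_checkpoint>",  # path to model checkpoint
        },
    )
)
```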
 
-### Evaluation
+## VN Systems (System 1)
 
 We support the evaluation of diverse System-1 baselines separately in [NavDP](https://github.com/InternRobotics/NavDP/tree/navdp_benchmark) to make it easy to use and deploy.
 To install the environment, we provide a quick start below:
@@ -129,53 +147,8 @@ python navdp_server.py --port {PORT} --checkpoint {CHECKPOINT_path}
 python eval_pointgoal_wheeled.py --port {PORT} --scene_dir {SCENE_DIR}
 ```
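As a sketch, the two server/client commands above can be invoked with the placeholders filled in. The port and paths below are purely illustrative, and the commands are only echoed (dry run) rather than executed:

```shell
# Illustrative values only; substitute your own port, checkpoint, and scene paths.
PORT=8888
CHECKPOINT=checkpoints/navdp.ckpt
SCENE_DIR=data/scenes

# Dry run: print the two commands instead of executing them.
echo "python navdp_server.py --port $PORT --checkpoint $CHECKPOINT"
echo "python eval_pointgoal_wheeled.py --port $PORT --scene_dir $SCENE_DIR"
```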

-
-## System2
-
-### Training
-
-Currently, we only support training of small VLN models (CMA, RDP, Seq2Seq) in this repo. For the training of LLM-based VLN (Navid, StreamVLN, etc), please refer to [StreamVLN](https://github.com/OpenRobotLab/StreamVLN) for training details.
-
-```base
-# train cma model
-./scripts/train/start_train.sh --name cma_train --model cma
-
-# train rdp model
-./scripts/train/start_train.sh --name rdp_train --model rdp
-
-# train seq2seq model
-./scripts/train/start_train.sh --name seq2seq_train --model seq2seq
-```
-### Evaluation
-
-#### InternVLA-N1-S2
-Currently we only support evaluate single System2 on Habitat:
-
-Evaluate on Single-GPU:
-
-```bash
-python scripts/eval/eval.py --config scripts/eval/configs/habitat_s2_cfg.py
-
-# set config with the following fields
-eval_cfg = EvalCfg(
-    agent=AgentCfg(
-        model_name='internvla_n1',
-        model_settings={
-            "mode": "system2", # inference mode: dual_system or system2
-            "model_path": "checkpoints/<s2_checkpoint>", # path to model checkpoint
-        }
-    )
-)
-```
-
-For multi-gpu inference, currently we only support inference on SLURM.
-
-```bash
-./scripts/eval/bash/eval_system2.sh
-```
-
-#### Baseline Models
-We provide three small VLN baselines (Seq2Seq, CMA, RDP) for evaluation in the InterUtopia (Isaac-Sim) environment.
+## Single-System VLN Baselines
+We provide three small Single-System VLN baselines (Seq2Seq, CMA, RDP) for evaluation in the InternUtopia (Isaac-Sim) environment.
 
 Download the baseline models:
 ```bash

source/en/user_guide/internnav/quick_start/index.md

Lines changed: 2 additions & 1 deletion
@@ -15,5 +15,6 @@ myst:
 installation
 simulation
 interndata
-train_eval
+training
+evaluation
 ```
Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
# Training

This document provides instructions for training models in **InternNav**.

## Overview

InternNav supports training models under three system paradigms:

- **Dual-System VLN Models**: integrated System2 + System1 architectures
- **Single-System VLN Models**: end-to-end vision-and-language navigation models
- **VN System (System1) Models**: low-level visual navigation and control models

Each paradigm follows a different training protocol, as detailed below.

## Dual-System VLN Models

Dual-System VLN models integrate **System2** (high-level reasoning and planning) with **System1** (low-level action control), supporting both modular integration and joint training.

### Supported Systems

- **InternVLA-N1 (System2)**
- **InternVLA-N1 (Dual System) w/ NavDP\*** (\* indicates joint tuning with System2)
- **InternVLA-N1 (Dual System) DualVLN**

### 1. Training for InternVLA-N1 (System2)

**InternVLA-N1 (System2)** is trained independently to predict 2D pixel goals for navigation.

It can be used with any compatible System1 model capable of executing 2D pixel goals or point goals (given depth and pose).
Alternatively, it can be trained jointly with a System1 model for end-to-end multi-system optimization.

#### Training Command

```bash
# train System2 separately
sbatch ./scripts/train/base_train/qwenvl_train/train_system2.sh
```

---

### 2. Joint Training for InternVLA-N1 (Dual System)

After training **InternVLA-N1 (System2)**, joint training is supported with a pixel-goal navigation System1 using either the **NavDP** or **NextDiT** architecture.

- **InternVLA-N1 (Dual System) w/ NavDP**: preserves **NavDP**'s model design and uses **RGB-D** input.
- **InternVLA-N1 (Dual System) DualVLN**: uses only **RGB** input, resulting in a smaller model footprint.

#### Training Command

```bash
# train System1 jointly with the trained System2
sbatch ./scripts/train/base_train/qwenvl_train/train_dual_system.sh
```

- For the **w/ NavDP** variant, set `system1=navdp_async`. Optimal performance is typically observed after **30,000 iterations**.
- For the **DualVLN** variant, set `system1=nextdit_async`. Optimal performance is typically observed after **15,000 iterations**.
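Selecting between the two variants might look like the sketch below. Whether `system1=...` is a command-line override or a config-file field is an assumption here; check the training script for the exact mechanism. The loop only prints the launch commands (dry run):

```shell
# Dry run: print the joint-training launch command for each System1 variant.
# system1=navdp_async   -> w/ NavDP variant (RGB-D input)
# system1=nextdit_async -> DualVLN variant (RGB-only input)
for variant in navdp_async nextdit_async; do
  echo "sbatch ./scripts/train/base_train/qwenvl_train/train_dual_system.sh system1=$variant"
done
```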

## Single-System VLN Models

Single-System VLN models directly map **visual observations and language instructions** to navigation actions in an end-to-end manner.

### Supported Models

The following Single-System VLN models are currently supported:

- Seq2Seq
- CMA
- RDP

For our VLM-based VLN model **StreamVLN**, please refer to [StreamVLN](https://github.com/InternRobotics/StreamVLN) for training details.

Support for StreamVLN within InternNav is planned for future releases.

### Training Command

Training is performed through a unified training entry script.
Below are example commands for each supported model.

**Seq2Seq**
```bash
./scripts/train/base_train/start_train.sh --name seq2seq_train --model seq2seq
```

**CMA**
```bash
./scripts/train/base_train/start_train.sh --name cma_train --model cma
```

**RDP**
```bash
./scripts/train/base_train/start_train.sh --name rdp_train --model rdp
```
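The three commands above differ only in the model name, so a small wrapper can launch them in sequence. This is a convenience sketch, not part of the repo, and it only echoes the commands (dry run):

```shell
# Dry run: print the launch command for each baseline instead of executing it.
for model in seq2seq cma rdp; do
  echo "./scripts/train/base_train/start_train.sh --name ${model}_train --model ${model}"
done
```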

## VN System (System1) Models

VN System (System1) focuses on **low-level visual navigation and motion control**.

### Supported Methods

The following visual navigation methods are included in the System1 benchmark:

- DD-PPO
- iPlanner
- ViPlanner
- GNM
- ViNT
- NoMaD
- NavDP (**InternVLA-N1 System1**)

Among them, **only NavDP is currently supported for training** in InternNav.
All other methods are provided for **evaluation and comparison purposes only**.

### Training Command

**NavDP**

```bash
./scripts/train/base_train/start_train.sh --name navdp_train --model-name navdp
```
