Commit f8331e8

[Fix] Fix issues noticed during Regression benchmark (#217)
* fix vlnpe result.json save path
* update vlln dataset path
* fix evaluator bug for rxr ndtw result
* update habitat extensions readme
* add community tutorials
* align eval configs to default path
* fix ne and spl contain NaN issue
* update links for community work
* update readme
* update checkpoint path; isolate transformer dependency
* Update readme IROS
1 parent 977c934 commit f8331e8

16 files changed: +83 -808 lines changed

README.md

Lines changed: 19 additions & 1 deletion
@@ -49,9 +49,10 @@ The toolbox supports the most advanced high-quality navigation dataset, InternDa
 - [🏠 Introduction](#-introduction)
 - [🔥 News](#-news)
 - [📚 Getting Started](#-getting-started)
-- [📦 Overview of Benchmark \& Model Zoo](#-overview-of-benchmark-and-model-zoo)
+- [📦 Overview of Benchmark \& Model Zoo](#-overview)
 - [🔧 Customization](#-customization)
 - [👥 Contribute](#-contribute)
+- [🚀 Community Deployment & Best Practices](#-community-deployment--best-practices)
 - [🔗 Citation](#-citation)
 - [📄 License](#-license)
 - [👏 Acknowledgements](#-acknowledgements)
@@ -213,6 +214,23 @@ For example, raising issues, fixing bugs in the framework, and adapting or addin

 **Note:** We welcome the feedback of the model's zero-shot performance when deploying in your own environment. Please show us your results and offer us your future demands regarding the model's capability. We will select the most valuable ones and collaborate with users together to solve them in the next few months :)

+## 🚀 Community Deployment & Best Practices
+
+We are excited to see InternNav being deployed and extended by the community across different robots and real-world scenarios.
+Below are selected community-driven deployment guides and solution write-ups, which may serve as practical references for advanced users.
+
+- **IROS Challenge Nav Track: Champion Solution (2025)**
+A complete system-level solution and design analysis for Vision-and-Language Navigation in Physical Environments.
+🔗 https://zhuanlan.zhihu.com/p/1969046543286907790
+
+- **Go2 Series Deployment Tutorial (ShanghaiTech University)**
+Step-by-step edge deployment guide for InternNav-based perception and navigation.
+🔗 https://github.com/cmjang/InternNav-deploy
+
+- **G1 Series Deployment Tutorial (Wuhan University)**
+Detailed educational materials on vision-language navigation deployment.
+🔗 [*Chapter 5: Vision-Language Navigation (Part II)*](https://mp.weixin.qq.com/s/p3cJzbRvecMajiTh9mXoAw)
+
 ## 🔗 Citation

 If you find our work helpful, please cite:

internnav/agent/dialog_agent.py

Lines changed: 5 additions & 10 deletions
@@ -11,21 +11,16 @@
 import quaternion
 import torch
 from PIL import Image, ImageDraw
-from transformers import (
-    AutoProcessor,
-    AutoTokenizer,
-    Qwen2_5_VLForConditionalGeneration,
-)

 from internnav.agent import Agent
 from internnav.configs.agent import AgentCfg

 try:
-    pass
-except Exception as e:
-    print(f"Warning: ({e}), Ignore this if not using dual_system.")
-
-try:
+    from transformers import (
+        AutoProcessor,
+        AutoTokenizer,
+        Qwen2_5_VLForConditionalGeneration,
+    )
     from depth_camera_filtering import filter_depth
     from habitat.tasks.nav.shortest_path_follower import ShortestPathFollower
 except Exception as e:
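
The hunk above moves the `transformers` import inside the guarded `try` block, so the module can still be imported when that dependency is absent. A minimal, generic sketch of the same optional-import pattern follows; the helper name `optional_import` and the probed module names are hypothetical and not part of InternNav. Unlike the diff, which catches `Exception`, this sketch catches only `ImportError`.

```python
import importlib


def optional_import(module_name: str):
    """Return the imported module, or None if it cannot be imported.

    Hypothetical helper for illustration; the commit itself uses a plain
    try/except around the import statements instead.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        print(f"Warning: ({e}), ignore this if the feature is not used.")
        return None


# A stdlib module is always importable; the second name is deliberately bogus.
json_mod = optional_import("json")
missing_mod = optional_import("surely_not_installed_pkg_xyz")
```

Code that depends on an optional module can then check for `None` before use instead of failing at import time.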

internnav/dataset/vlln_lerobot_dataset.py

Lines changed: 3 additions & 3 deletions
@@ -22,21 +22,21 @@

 # Define placeholders for dataset paths
 IION_split1 = {
-    "data_path": "traj_data/mp3d_split1",
+    "data_path": "projects/VL-LN-Bench/traj_data/mp3d_split1",
     "height": 125,
     "pitch_1": 0,
     "pitch_2": 30,
 }

 IION_split2 = {
-    "data_path": "traj_data/mp3d_split2",
+    "data_path": "projects/VL-LN-Bench/traj_data/mp3d_split2",
     "height": 125,
     "pitch_1": 0,
     "pitch_2": 30,
 }

 IION_split3 = {
-    "data_path": "traj_data/mp3d_split3",
+    "data_path": "projects/VL-LN-Bench/traj_data/mp3d_split3",
     "height": 125,
     "pitch_1": 0,
     "pitch_2": 30,
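
The new `data_path` values are relative, so they presumably resolve against whatever directory training is launched from (most likely the repository root, though the diff does not say). A small sketch of a pre-flight check is shown below; `missing_data_paths` is a hypothetical helper, and the demo builds a throwaway directory rather than touching a real checkout.

```python
import tempfile
from pathlib import Path


def missing_data_paths(configs: dict, root: Path) -> list:
    """Return the names of configs whose data_path is not a directory under root.

    Hypothetical helper, not part of InternNav.
    """
    return [
        name
        for name, cfg in configs.items()
        if not (root / cfg["data_path"]).is_dir()
    ]


# Two entries copied from the diff; only the first is created in the demo root.
configs = {
    "IION_split1": {"data_path": "projects/VL-LN-Bench/traj_data/mp3d_split1"},
    "IION_split2": {"data_path": "projects/VL-LN-Bench/traj_data/mp3d_split2"},
}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / configs["IION_split1"]["data_path"]).mkdir(parents=True)
    missing = missing_data_paths(configs, root)
```

Running such a check before training would surface a misplaced download early, rather than partway through data loading.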

internnav/evaluator/utils/result_logger.py

Lines changed: 1 addition & 1 deletion
@@ -319,5 +319,5 @@ def finalize_all_results(self, rank, world_size):
         }

         # write log content to file
-        with open(f"{self.name}_result.json", "w") as f:
+        with open(f"{PROJECT_ROOT_PATH}/logs/{self.name}/result.json", "w") as f:
             json.dump(json_data, f, indent=2, ensure_ascii=False)
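
The new save path nests the result under `logs/{self.name}/`, and `open(..., "w")` does not create missing parent directories. The diff does not show whether that directory is created elsewhere, so the following is only a sketch of a defensive variant; `write_result` and its arguments are hypothetical, with `root` standing in for `PROJECT_ROOT_PATH`.

```python
import json
import os
import tempfile


def write_result(root: str, name: str, data: dict) -> str:
    """Write data to <root>/logs/<name>/result.json, creating directories as needed.

    Hypothetical helper; `root` plays the role of PROJECT_ROOT_PATH in the diff.
    """
    out_dir = os.path.join(root, "logs", name)
    os.makedirs(out_dir, exist_ok=True)  # no error if the directory already exists
    out_path = os.path.join(out_dir, "result.json")
    with open(out_path, "w") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)
    return out_path


# Demo in a temporary directory so nothing is written into a real project tree.
with tempfile.TemporaryDirectory() as tmp:
    path = write_result(tmp, "demo_eval", {"sucs_all": 0.5})
    with open(path) as f:
        loaded = json.load(f)
    rel = os.path.relpath(path, tmp)
```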

internnav/habitat_extensions/vlln/README.md

Lines changed: 25 additions & 7 deletions
@@ -3,16 +3,34 @@
 Vision-Language-and-Language Navigation (VL-LN) is a new [benchmark](https://0309hws.github.io/VL-LN.github.io/) built upon VLN in Habitat, which refers to the setting that models take the vision and language as input and output language and navigation actions. In contrast to VLN, where agents only take navigation actions, agents in VL-LN could ask questions and engage in dialogue with users to complete tasks better with language interaction.
 This package adapts [Meta AI Habitat](https://aihabitat.org) for VL-LN within InternNav. It wraps Habitat environments that expose semantic masks, registers dialog-aware datasets and metrics, and provides evaluators that coordinate agent actions, NPC interactions, and logging.

+Install our benchmark [dataset](https://huggingface.co/datasets/InternRobotics/VL-LN-Bench) and the [latest checkpoints](https://huggingface.co/InternRobotics/VL-LN-Bench-basemodel) from HuggingFace.
+Place the downloaded benchmark under `InternNav/projects/VL-LN-Bench` to match the default path expected by the code.
+
 ## Package structure

 ```
-habitat_vlln_extensions/
-├── __init__.py
-├── habitat_dialog_evaluator.py
-├── habitat_vlln_env.py
-├── measures.py
-├── simple_npc/
-└── utils/
+InternNav
+├── assets/
+├── internnav/
+│   ├── habitat_vlln_extensions
+│   │   ├── simple_npc
+│   │   │   ├── api_key.txt
+│   │   ├── measures.py
+│   │   ├── habitat_dialog_evaluator.py
+│   │   ├── habitat_vlln_env.py
+│   ... ... ...
+...
+├── projects
+│   ├── VL-LN-Bench/
+│   │   ├── base_model/
+│   │   ├── raw_data/
+│   │   ├── scene_datasets/
+│   │   │   └── mp3d/
+│   │   │       └── 17DRP5sb8fy/
+│   │   │       ├── 1LXtFkjw3qL/
+│   │   │       ...
+│   │   ├── traj_data/
+...
 ```

 * `__init__.py` re-exports the public entry points so callers can import

internnav/habitat_extensions/vlln/habitat_dialog_evaluator.py

Lines changed: 7 additions & 0 deletions
@@ -257,6 +257,13 @@ def calc_metrics(self, global_metrics: dict) -> dict:
         # avoid /0 if no episodes
         denom = max(len(sucs_all), 1)

+        # clean NaN in spls, treat as 0.0
+        torch.nan_to_num(spls_all, nan=0.0, posinf=0.0, neginf=0.0, out=spls_all)
+
+        # clean inf in nes, only finite nes are counted
+        nes_finite_mask = torch.isfinite(nes_all)
+        nes_all = nes_all[nes_finite_mask]
+
         return {
             "sucs_all": float(sucs_all.mean().item()) if denom > 0 else 0.0,
             "spls_all": float(spls_all.mean().item()) if denom > 0 else 0.0,
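
The cleanup added in this hunk treats the two metrics differently: non-finite SPL values are replaced with 0.0 and still count toward the mean, while non-finite NE values are dropped before averaging. The sketch below reproduces that behaviour on plain Python lists so it runs without torch; the function names are hypothetical. It also adds an empty-input fallback, because averaging an empty collection is undefined (an empty tensor's `mean()` yields NaN); whether the real evaluator needs that guard is an assumption.

```python
import math


def clean_spls(spls):
    """Replace NaN and +/-inf with 0.0, like nan_to_num(nan=0, posinf=0, neginf=0)."""
    return [s if math.isfinite(s) else 0.0 for s in spls]


def finite_only(nes):
    """Keep only finite values, like indexing with an isfinite mask."""
    return [n for n in nes if math.isfinite(n)]


def safe_mean(values, default=0.0):
    """Mean of values, or `default` when there is nothing to average."""
    return sum(values) / len(values) if values else default


spls = [0.8, float("nan"), 0.4]
nes = [1.5, float("inf"), 2.5]

spl_mean = safe_mean(clean_spls(spls))  # NaN episode counts as 0.0 in the mean
ne_mean = safe_mean(finite_only(nes))   # inf episode is excluded from the mean
all_inf_mean = safe_mean(finite_only([float("inf")]))  # falls back to default
```

Note the asymmetry: the SPL mean is still divided by all three episodes, whereas the NE mean is divided only by the two finite ones.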

internnav/habitat_extensions/vln/README.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ utilities.
 ## Package structure

 ```
-habitat_extensions/
+habitat_extensions/vln/
 ├── __init__.py
 ├── habitat_env.py
 ├── habitat_default_evaluator.py

internnav/habitat_extensions/vln/habitat_vln_evaluator.py

Lines changed: 10 additions & 3 deletions
@@ -192,7 +192,7 @@ def eval_action(self):
             "nes": nes,  # shape [N_local]
         }

-        if ndtws:
+        if ndtws is not None:
             result["ndtws"] = ndtws  # shape [N_local]
         return result

@@ -207,6 +207,13 @@ def calc_metrics(self, global_metrics: dict) -> dict:

         # avoid /0 if no episodes
         denom = max(len(sucs_all), 1)
+
+        # clean NaN in spls, treat as 0.0
+        torch.nan_to_num(spls_all, nan=0.0, posinf=0.0, neginf=0.0, out=spls_all)
+
+        # clean inf in nes, only finite nes are counted
+        nes_finite_mask = torch.isfinite(nes_all)
+        nes_all = nes_all[nes_finite_mask]

         result_all = {
             "sucs_all": float(sucs_all.mean().item()) if denom > 0 else 0.0,
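
For context on why `spls` can contain NaN at all, recall the standard definition of Success weighted by Path Length. The diff does not show how the per-episode values are computed here, so the explanation that follows is an assumption rather than a confirmed cause.

```latex
\mathrm{SPL} = \frac{1}{N} \sum_{i=1}^{N} S_i \, \frac{\ell_i}{\max(p_i, \ell_i)}
```

Here $S_i \in \{0, 1\}$ is the success indicator for episode $i$, $\ell_i$ is the shortest-path length from start to goal, and $p_i$ is the length of the path the agent actually took. If an episode has $\ell_i = p_i = 0$, the ratio becomes $0/0$, which floating-point arithmetic can report as NaN; replacing such entries with 0.0, as the hunk above does, scores those episodes as contributing nothing.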
@@ -587,7 +594,7 @@ def _run_eval_dual_system(self) -> tuple:
             torch.tensor(spls).to(self.device),
             torch.tensor(oss).to(self.device),
             torch.tensor(nes).to(self.device),
-            torch.tensor(ndtw).to(self.device) if 'ndtw' in metrics else None,
+            torch.tensor(ndtw).to(self.device) if ndtw else None,
         )

     def _run_eval_system2(self) -> tuple:
@@ -876,5 +883,5 @@ def _run_eval_system2(self) -> tuple:
             torch.tensor(spls).to(self.device),
             torch.tensor(oss).to(self.device),
             torch.tensor(nes).to(self.device),
-            torch.tensor(ndtw).to(self.device) if 'ndtw' in metrics else None,
+            torch.tensor(ndtw).to(self.device) if ndtw else None,
         )
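
The two `ndtw` fixes in this file rely on different checks, and the distinction matters. In `eval_action`, `ndtws` is a tensor: truth-testing a tensor with more than one element raises a `RuntimeError`, so an explicit `is not None` test is needed. In the `_run_eval_*` methods, `ndtw` appears to be a plain Python list, where `if ndtw` is well defined and simply maps an empty list to `None`. The sketch below illustrates the first case without requiring torch; `ArrayLike` is a hypothetical stand-in that only mimics how tensors reject ambiguous truth tests.

```python
class ArrayLike:
    """Hypothetical stand-in mimicking a tensor's behaviour under bool()."""

    def __init__(self, values):
        self.values = list(values)

    def __bool__(self):
        # Like a tensor, only a single-element container has a truth value.
        if len(self.values) != 1:
            raise RuntimeError("truth value of a multi-element array is ambiguous")
        return bool(self.values[0])


def attach_optional(result: dict, key: str, value) -> dict:
    """Add value under key only when it was actually provided."""
    if value is not None:  # explicit None check, never a truth test
        result[key] = value
    return result


ndtws = ArrayLike([0.7, 0.0, 0.9])

# The old-style truth test fails on a multi-element container.
try:
    truthiness_ok = bool(ndtws)
except RuntimeError:
    truthiness_ok = False

result = attach_optional({}, "ndtws", ndtws)
skipped = attach_optional({}, "ndtws", None)
```

The `is not None` form also avoids silently dropping a single-element result whose only value happens to be 0.0.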
