Commit 60eaacd

committed: update
Signed-off-by: yaoyu-33 <[email protected]>
1 parent 28a4d39 commit 60eaacd

File tree

2 files changed: +38, -27 lines changed


examples/models/vlm/gemma3_vl/README.md

Lines changed: 32 additions & 21 deletions
@@ -21,6 +21,36 @@ See the [conversion.sh](conversion.sh) script for commands to:
 - Export Megatron checkpoints back to Hugging Face format
 - Run multi-GPU round-trip validation between formats
 
+
+## Inference
+
+See the [inference.sh](inference.sh) script for commands to:
+- Run inference with Hugging Face checkpoints
+- Run inference with imported Megatron checkpoints
+- Run inference with exported Hugging Face checkpoints
+
+**Expected output:**
+```
+...
+Generation step 46
+Generation step 47
+Generation step 48
+Generation step 49
+======== GENERATED TEXT OUTPUT ========
+Image: https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16/resolve/main/images/table.png
+Prompt: Describe this image.
+Generated: <bos><bos><start_of_turn>user
+...
+Describe this image.<end_of_turn>
+<start_of_turn>model
+Here's a description of the image you sent, breaking down the technical specifications of the H100 SXM and H100 NVL server cards:
+
+**Overall:**
+
+The image is a table comparing the technical specifications of two
+=======================================
+```
+
 ## Pretrain
 
 Pretraining is not verified for this model.
@@ -37,25 +67,6 @@ See the [peft.sh](peft.sh) script for LoRA fine-tuning with configurable tensor
 
 [W&B Report](TODO)
 
-## Inference
-
-See the [inference.sh](inference.sh) script for commands to:
-- Run inference with Hugging Face checkpoints
-- Run inference with imported Megatron checkpoints
-- Run inference with exported Hugging Face checkpoints
-
-**Example output:**
-```
-Describe this image.<end_of_turn>
-<start_of_turn>model
-Here's a description of the image you sent, breaking down the technical specifications of the H100 SXM and H100 NVL server cards:
-
-**Overall:**
+## Evaluation
 
-The image is a table comparing the technical specifications of two NVIDIA server cards: the H100 SXM and the H100 NVL. It's designed to highlight the performance differences between the two cards, particularly in terms of compute power and memory.
-
-**Column Breakdown:**
-
-*
-=======================================
-```
+TBD
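The `<start_of_turn>`/`<end_of_turn>` markers in the expected output above come from Gemma's chat turn format, which the generation script applies to the prompt before decoding. A minimal hand-rolled sketch of that single-turn layout (an illustration only, not the tokenizer's actual chat template):

```python
def gemma_single_turn(user_text: str) -> str:
    """Lay out one user turn in the Gemma chat format visible in the
    expected output: BOS, the user turn, then the model-turn opener."""
    return (
        "<bos><start_of_turn>user\n"
        f"{user_text}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

print(gemma_single_turn("Describe this image."))
```

In practice the tokenizer's chat template builds this string; the sketch only shows why the transcript ends with `<start_of_turn>model` right before the generated description begins.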

examples/models/vlm/gemma3_vl/inference.sh

Lines changed: 6 additions & 6 deletions
@@ -2,29 +2,29 @@
 WORKSPACE=${WORKSPACE:-/workspace}
 
 # Inference with Hugging Face checkpoints
-uv run torchrun --nproc_per_node=4 examples/conversion/hf_to_megatron_generate_vlm.py \
+uv run python -m torch.distributed.run --nproc_per_node=4 examples/conversion/hf_to_megatron_generate_vlm.py \
   --hf_model_path google/gemma-3-4b-it \
   --image_path "https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16/resolve/main/images/table.png" \
   --prompt "Describe this image." \
-  --max_new_tokens 100 \
+  --max_new_tokens 50 \
   --tp 2 \
   --pp 2
 
 # Inference with imported Megatron checkpoints
-uv run torchrun --nproc_per_node=4 examples/conversion/hf_to_megatron_generate_vlm.py \
+uv run python -m torch.distributed.run --nproc_per_node=4 examples/conversion/hf_to_megatron_generate_vlm.py \
   --hf_model_path google/gemma-3-4b-it \
   --megatron_model_path ${WORKSPACE}/models/gemma-3-4b-it/iter_0000000 \
   --image_path "https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16/resolve/main/images/table.png" \
   --prompt "Describe this image." \
-  --max_new_tokens 100 \
+  --max_new_tokens 50 \
   --tp 2 \
   --pp 2
 
 # Inference with exported HF checkpoints
-uv run torchrun --nproc_per_node=4 examples/conversion/hf_to_megatron_generate_vlm.py \
+uv run python -m torch.distributed.run --nproc_per_node=4 examples/conversion/hf_to_megatron_generate_vlm.py \
   --hf_model_path ${WORKSPACE}/models/gemma-3-4b-it-hf-export \
   --image_path "https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16/resolve/main/images/table.png" \
   --prompt "Describe this image." \
-  --max_new_tokens 100 \
+  --max_new_tokens 50 \
   --tp 2 \
   --pp 2
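Two details of the script above are worth unpacking: `WORKSPACE=${WORKSPACE:-/workspace}` uses POSIX default expansion, so an existing value wins over the `/workspace` fallback, and the launcher change is cosmetic, since `torchrun` is the console-script entry point for `torch.distributed.run`. A small sketch of the expansion behavior (with a placeholder `train.py` in the commented launcher lines):

```shell
#!/bin/sh
# Default expansion: the fallback applies only when the variable is unset or empty.
unset WORKSPACE
echo "${WORKSPACE:-/workspace}"   # -> /workspace

WORKSPACE=/scratch
echo "${WORKSPACE:-/workspace}"   # -> /scratch

# Equivalent launcher forms (not executed here; both start the same
# elastic runner with 4 processes on this node):
# torchrun --nproc_per_node=4 train.py
# python -m torch.distributed.run --nproc_per_node=4 train.py
```

Note that `tp 2 × pp 2 = 4` model-parallel ranks, which is why every command pairs `--tp 2 --pp 2` with `--nproc_per_node=4`.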
