Commit a106dd9

fix test
Signed-off-by: Jennifer Chen <[email protected]>
1 parent 291cfa3 · commit a106dd9

File tree

3 files changed: +12 −8 lines changed

- examples/nemo_run/qat/README.md
- tests/_test_utils/torch_dist/plugins/megatron_common.py
- tests/gpu/torch/quantization/plugins/test_apex.py


examples/nemo_run/qat/README.md

Lines changed: 1 addition & 2 deletions
@@ -59,13 +59,12 @@ You can run the example either locally or on a [Slurm cluster](ADVANCED.md).
 To run the example locally, launch a [NeMo container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo) with version 25.09 or higher. Clone the `TensorRT-Model-Optimizer` repository and `NeMo` repository (checkout a specific commit for NeMo), then mount it onto your docker container.

 - `git clone https://github.com/NVIDIA/TensorRT-Model-Optimizer.git`
-- `git clone https://github.com/NVIDIA-NeMo/NeMo.git`
 - `git clone https://github.com/NVIDIA/Megatron-LM.git`

 Example docker command:

 ```bash
-docker run -v /home/user/:/home/user/ -v /home/user/NeMo:/opt/NeMo -v /home/user/TensorRT-Model-Optimizer/modelopt/:/usr/local/lib/python3.12/dist-packages/modelopt -v /home/user/Megatron-LM:/opt/megatron-lm --gpus all -it --shm-size 20g --rm nvcr.io/nvidia/nemo:25.09 bash
+docker run -v /home/user/:/home/user/ -v /home/user/TensorRT-Model-Optimizer/modelopt/:/usr/local/lib/python3.12/dist-packages/modelopt -v /home/user/Megatron-LM:/opt/megatron-lm --gpus all -it --shm-size 20g --rm nvcr.io/nvidia/nemo:25.09 bash
 ```

 You will also need to set your Huggingface token with `export HF_TOKEN=<your-token>`. You may also need to enable write access to the docker container to the `examples/nemo_run` folder by doing `chmod 777 nemo_run` so that logs can be written.

tests/_test_utils/torch_dist/plugins/megatron_common.py

Lines changed: 0 additions & 2 deletions
@@ -379,9 +379,7 @@ def run_mcore_inference(
     )

     # Note: This is returned in all TP ranks or last PP stage in PP models
-    print("inference_input size", inference_input["tokens"].shape)
     logits = wrapped_model.run_one_forward_step(inference_input)
-    print("logits size", logits.shape)
     logits = broadcast_from_last_pipeline_stage(
         [batch_size, model.max_sequence_length, model.vocab_size],
         dtype=torch.bfloat16 if model.config.bf16 else torch.float32,

tests/gpu/torch/quantization/plugins/test_apex.py

Lines changed: 11 additions & 4 deletions
@@ -23,7 +23,7 @@
 from _test_utils.torch_quantization.models import RegularQuantModelForTP
 from _test_utils.torch_quantization.quantize_common import (
     auto_quantize_helper,
-    tensor_parallel_test_helper,
+    data_tensor_context_parallel_test_helper,
 )

 import modelopt.torch.quantization as mtq
@@ -58,7 +58,11 @@ def forward(self, x):
         x = x[0]
         return x

-    def get_dummy_input(self):
+    def get_dummy_input(self, seed: int | None = None):
+        if seed is not None:
+            gen = torch.Generator()
+            gen.manual_seed(seed)
+            return torch.randn(1, 4, 32, generator=gen)
         return torch.randn(1, 4, 32)

@@ -106,8 +110,11 @@ def _test_tensor_parallel_helper(config, rank, size):
     model_parallel_cuda_manual_seed(SEED)
     model = ApexModel().cuda()

-    tensor_parallel_test_helper(
-        model, config, get_tensor_model_parallel_group(), get_data_parallel_group()
+    data_tensor_context_parallel_test_helper(
+        model,
+        config,
+        tp_group=get_tensor_model_parallel_group(),
+        dp_group=get_data_parallel_group(),
     )

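For context, the seeded `get_dummy_input` introduced in this commit makes the dummy activations reproducible whenever the same seed is passed. Below is a minimal standalone sketch of that pattern; the free function and seed value are illustrative only, not part of the commit:

```python
import torch

def get_dummy_input(seed=None):
    # Same pattern as the method added above: a seeded torch.Generator gives
    # reproducible inputs; without a seed the call stays non-deterministic.
    if seed is not None:
        gen = torch.Generator()
        gen.manual_seed(seed)
        return torch.randn(1, 4, 32, generator=gen)
    return torch.randn(1, 4, 32)

a = get_dummy_input(seed=1234)
b = get_dummy_input(seed=1234)
assert torch.equal(a, b)  # identical tensors for the same seed
```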
