Commit 255eb1a

fix: Fix auto quantize multi-gpu tests (NVIDIA#629)
## What does this PR do?

**Type of change:** Bug fix

**Overview:** Move tensor to CPU before gathering.

## Usage

```python
# Add a code snippet demonstrating how to use this
```

## Testing

```
pytest tests/gpu/torch/quantization/plugins/test_megatron.py::test_auto_quantize
```

## Before your PR is "*Ready for review*"

- **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: Yes/No
- **Did you write any new necessary tests?**: Yes/No
- **Did you add or update any necessary documentation?**: Yes/No
- **Did you update [Changelog](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CHANGELOG.rst)?**: Yes/No

## Additional Information

Signed-off-by: Fridah-nv <[email protected]>
1 parent 5ade7b0 commit 255eb1a

1 file changed: +2 -1 lines changed

modelopt/torch/quantization/algorithms.py

Lines changed: 2 additions & 1 deletion
@@ -273,12 +273,13 @@ def get_score(self, recipe: QuantRecipe) -> float:
             if parallel_state.expert_model_parallel_group.is_initialized():
                 # TODO: Support expert model parallelism for score estimation
                 warnings.warn("AutoQuantize does not support expert model parallelism yet.")
+            importance = importance.cpu()
             importance = DistributedProcessGroup.get_dist_syncd_obj(
                 importance,
                 [parallel_state.tensor_parallel_group, parallel_state.data_parallel_group],
                 sum,
             )
-            total_score += importance.cpu().item()
+            total_score += importance.item()
         return total_score
 
     def get_cost(self, recipe: QuantRecipe) -> float:
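
The ordering this change introduces (move the tensor to the CPU, then gather and reduce, then call `.item()`) can be illustrated with a minimal single-process sketch. It is not the modelopt implementation: `sync_and_sum` and `score_from_importance` are hypothetical helpers, and the "other ranks" are simulated with a plain list rather than a real process group.

```python
import torch


def sync_and_sum(per_rank_values):
    """Hypothetical stand-in for a gather-then-reduce step across ranks.

    `per_rank_values` plays the role of the list a gather would return,
    one entry per rank. This is NOT the modelopt API.
    """
    return sum(per_rank_values)


def score_from_importance(importance, gathered_from_other_ranks=()):
    # Move to CPU first, so the value handed to the gather step no longer
    # depends on which device this rank computed it on.
    importance = importance.detach().cpu()
    synced = sync_and_sum([importance, *gathered_from_other_ranks])
    # The reduced value is already a CPU tensor, so no extra .cpu() is needed.
    return synced.item()


# Use a GPU if one is present; the sketch also runs on a CPU-only machine.
device = "cuda" if torch.cuda.is_available() else "cpu"
local = torch.tensor(1.5, device=device)
others = [torch.tensor(2.0), torch.tensor(0.5)]  # simulated values from other ranks
total = score_from_importance(local, others)
print(total)
```

In a real multi-GPU run the simulated list would be replaced by an actual collective call, and the test command listed in the commit message is the way to exercise the fixed code path.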
