Conversation

@cjluo-nv cjluo-nv commented Sep 24, 2025

What does this PR do?

Type of change: Refactor

Overview: ?

We no longer use TRT-LLM engines in our examples. All references to the keyword engine are changed to checkpoint.

Summary by CodeRabbit

  • Documentation

    • Expanded deprecation note for the TensorRT backend and updated README headings; generalized VLM quantization description to reference supported multi‑modal models.
    • Updated example commands and docs to use checkpoint_dir instead of engine_dir.
  • Chores

    • Renamed CLI flag engine_dir → checkpoint_dir across evaluation, PTQ, and helper scripts.
    • Added new quantization-related script options and clarified usage text and log messages.
  • Refactor

    • Removed a deprecated helper and a bundled VLM CLI-argument utility to simplify interfaces.

@cjluo-nv cjluo-nv requested review from a team as code owners September 24, 2025 22:25
@cjluo-nv cjluo-nv requested a review from Edwardf0t1 September 24, 2025 22:25
coderabbitai bot commented Sep 24, 2025

Walkthrough

Renames TensorRT‑LLM parameter/flag from engine_dir to checkpoint_dir across LLM eval and PTQ examples and scripts, updates docs and logs, adjusts tokenizer fallback to checkpoint_dir, removes two helper functions, expands CHANGELOG deprecation note, generalizes a VLM README paragraph, and adds quant-related flags to run_fastchat.sh.

Changes

Cohort / File(s) Summary of changes
Deprecation note
CHANGELOG.rst
Expanded TRT backend deprecation note to include that engine_dir is replaced with checkpoint_dir in examples/llm_ptq and examples/vlm_ptq; other bullets unchanged.
LLM eval docs & script
examples/llm_eval/README.md, examples/llm_eval/run_fastchat.sh
Replaced --engine_dir with --checkpoint_dir and updated headings; run_fastchat.sh also adds optional quant flags (--answer_file, --quant_cfg, --auto_quantize_bits, --calib_size, --calib_batch_size) and forwards them to the Python invocation.
LLM eval Python
examples/llm_eval/gen_model_answer.py, examples/llm_eval/lm_eval_tensorrt_llm.py, examples/llm_eval/mmlu.py
Renamed parameters, CLI options, variables, and call sites from engine_dir → checkpoint_dir; updated function signatures and argument parsing; model type inference now uses basename of checkpoint_dir; log message adjusted.
LLM PTQ flow
examples/llm_ptq/run_tensorrt_llm.py, examples/llm_ptq/scripts/huggingface_example.sh
Switched CLI flag and internal references from engine_dir → checkpoint_dir; tokenizer fallback now sources files from checkpoint_dir; updated example invocations and minor docstring wording.
Removed helpers
examples/llm_ptq/example_utils.py, examples/vlm_ptq/utils.py
Deleted get_mode_type_from_engine_dir(...) and add_common_args(parser) functions, removing those public utilities/CLI configuration helpers.
VLM PTQ docs
examples/vlm_ptq/README.md
Generalized description to reference “supported Hugging Face multi‑modal models” and removed explicit default parameter details.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I hop through dirs where engines slept,
Now checkpoints guide the paths we've kept.
Flags reborn, the scripts align—
Quant carrots glint, a tidy line.
Old helpers burrowed, waved goodbye,
I nibble bugs and watch tests fly. 🥕

Pre-merge checks

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient; the required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check | ✅ Passed | The title correctly highlights the engine_dir to checkpoint_dir refactor but restricts it to PTQ examples even though the pull request also renames parameters and documentation across evaluation scripts and the CHANGELOG, making it only partially descriptive of the full changeset.


@cjluo-nv cjluo-nv changed the title from "Refactor engine_dir to checkpoint_dir in PTQ examples." to "[NVBUG: 5535437] Refactor engine_dir to checkpoint_dir in PTQ examples." Sep 24, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
examples/llm_eval/README.md (1)

164-167: Restore -e flag in MT-Bench command.

run_fastchat.sh still requires -e/--checkpoint_dir. The updated README example drops that flag, so anyone following the docs will hit the usage error and the workflow stops. Please fix the command.

-bash run_fastchat.sh -h <HF model folder or model card> <Quantized checkpoint dir>
+bash run_fastchat.sh -h <HF model folder or model card> -e <Quantized checkpoint dir>
examples/llm_eval/gen_model_answer.py (3)

139-156: Avoid NameError under Ray: pass args into get_model_answers

get_model_answers references args.* but args isn’t passed; under Ray workers this will raise NameError. Pass args explicitly.

Apply this diff:

         get_answers_func(
             model_path,
             model_id,
             questions[i : i + chunk_size],
             answer_file,
             max_new_token,
             num_choices,
             num_gpus_per_model,
             max_gpu_memory,
             dtype=dtype,
             revision=revision,
             top_p=top_p,
             temperature=temperature,
             checkpoint_dir=checkpoint_dir,
             nim_model=nim_model,
+            args=args,
         )
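
For readers unfamiliar with the failure mode, here is a minimal standalone sketch (hypothetical, not code from this PR) of why module state set in the driver's __main__ block is invisible to Ray workers:

# Ray workers re-import this module in a fresh process, so the
# module-level `args` stays None there -- the driver's __main__
# block never runs inside the worker.
import ray

args = None  # only the driver's __main__ block replaces this

@ray.remote
def worker():
    # In the worker process `args` is still None, so this raises
    # AttributeError (the analogue of the NameError flagged above).
    return args.quant_cfg

if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--quant-cfg", default=None)
    args = parser.parse_args()

    ray.init()
    ray.get(worker.remote())  # fails inside the worker

Passing everything the function needs as explicit arguments, as the diff above does, sidesteps this entirely.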

164-179: Accept args in get_model_answers; stop relying on module-global args

Make args an explicit parameter to be Ray‑safe.

Apply this diff:

 @torch.inference_mode()
 def get_model_answers(
     model_path,
     model_id,
     questions,
     answer_file,
     max_new_token,
     num_choices,
     num_gpus_per_model,
     max_gpu_memory,
     dtype,
     revision,
     top_p=None,
     temperature=None,
-    checkpoint_dir=None,
-    nim_model=None,
+    checkpoint_dir=None,
+    nim_model=None,
+    args=None,
 ):

208-219: Guard optional quantization args to avoid crashes under Ray

Use getattr on args to avoid AttributeError/NameError if args is None or missing fields.

Apply this diff:

-        tokenizer = get_tokenizer(model_path, trust_remote_code=args.trust_remote_code)
-        if args.quant_cfg:
+        trust_remote_code = getattr(args, "trust_remote_code", False)
+        tokenizer = get_tokenizer(model_path, trust_remote_code=trust_remote_code)
+        if getattr(args, "quant_cfg", None):
             quantize_model(
                 model,
-                args.quant_cfg,
+                args.quant_cfg,
                 tokenizer,
-                args.calib_batch_size,
-                args.calib_size,
-                args.auto_quantize_bits,
+                getattr(args, "calib_batch_size", 4),
+                getattr(args, "calib_size", 512),
+                getattr(args, "auto_quantize_bits", None),
                 test_generated=False,
             )
examples/llm_eval/mmlu.py (1)

256-270: Fix Qwen pad/eos detection after checkpoint_dir rename.

When the checkpoint directory comes from scripts/huggingface_example.sh its basename looks like saved_models_<model>_...; your new parsing now sets model_type to "saved", so the Qwen-specific branch never runs. That regresses Qwen runs (pad/eos stay unset and generation breaks). Please restore the previous behavior by stripping the prefix and matching on lower-cased names (or reusing the old helper).

-        last_part = os.path.basename(kwargs["checkpoint_dir"])
-        model_type = last_part.split("_")[0]
+        last_part = os.path.basename(os.path.normpath(kwargs["checkpoint_dir"]))
+        if last_part.startswith("saved_models_"):
+            last_part = last_part[len("saved_models_") :]
+        model_type = last_part.split("_")[0].lower()
         # Some models require to set pad_token and eos_token based on external config (e.g., qwen)
-        if model_type == "qwen":
+        if model_type.startswith("qwen"):
             tokenizer.pad_token = tokenizer.convert_ids_to_tokens(151643)
             tokenizer.eos_token = tokenizer.convert_ids_to_tokens(151643)
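
A quick standalone illustration of the regression (the directory name below is a made-up example following the saved_models_<model>_... convention from huggingface_example.sh):

import os

checkpoint_dir = "/work/saved_models_qwen2_int4_awq"  # hypothetical layout

# Parsing as written in the PR: the first token is "saved", so the
# Qwen-specific branch is never taken.
print(os.path.basename(checkpoint_dir).split("_")[0])  # -> "saved"

# With the suggested prefix stripping:
last = os.path.basename(os.path.normpath(checkpoint_dir))
if last.startswith("saved_models_"):
    last = last[len("saved_models_"):]
print(last.split("_")[0].lower())  # -> "qwen2", which startswith("qwen")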
🧹 Nitpick comments (2)
examples/llm_eval/gen_model_answer.py (1)

430-439: CLI polish: mutually exclusive flags and deprecation alias

To avoid user confusion, make --checkpoint-dir and --nim-model mutually exclusive, and optionally accept deprecated --engine-dir as an alias that maps to checkpoint_dir.

Apply this diff:

-    parser.add_argument(
-        "--checkpoint-dir",
-        type=str,
-        help="The path to the model checkpoint directory.",
-    )
-    parser.add_argument(
-        "--nim-model",
-        type=str,
-        help="The NIM model handle to use",
-    )
+    group = parser.add_mutually_exclusive_group()
+    group.add_argument(
+        "--checkpoint-dir",
+        type=str,
+        help="Path to the model checkpoint directory (contains config.json).",
+    )
+    group.add_argument(
+        "--nim-model",
+        type=str,
+        help="The NIM model handle to use.",
+    )
+    # Deprecated alias for backward compatibility
+    parser.add_argument(
+        "--engine-dir",
+        dest="checkpoint_dir",
+        type=str,
+        help="DEPRECATED. Use --checkpoint-dir instead.",
+    )
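
As a usage note, the conflict handling comes for free from argparse; a tiny self-contained check (flag names mirror the suggestion; the prog name and model handle are illustrative):

import argparse

parser = argparse.ArgumentParser(prog="gen_model_answer.py")
group = parser.add_mutually_exclusive_group()
group.add_argument("--checkpoint-dir", type=str)
group.add_argument("--nim-model", type=str)

# Supplying both flags triggers argparse's built-in error and exits:
#   gen_model_answer.py: error: argument --nim-model: not allowed
#   with argument --checkpoint-dir
parser.parse_args(["--checkpoint-dir", "ckpt/", "--nim-model", "some/model"])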
examples/llm_ptq/run_tensorrt_llm.py (1)

31-31: Require --checkpoint_dir to fail fast on misconfiguration.

Without the flag, we cascade into harder-to-debug None-type errors when attempting to load the checkpoint. Marking it as required keeps the CLI experience sharp.

-    parser.add_argument("--checkpoint_dir", type=str)
+    parser.add_argument("--checkpoint_dir", type=str, required=True)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 26c203a and 8845aba.

📒 Files selected for processing (11)
  • CHANGELOG.rst (1 hunks)
  • examples/llm_eval/README.md (3 hunks)
  • examples/llm_eval/gen_model_answer.py (6 hunks)
  • examples/llm_eval/lm_eval_tensorrt_llm.py (2 hunks)
  • examples/llm_eval/mmlu.py (2 hunks)
  • examples/llm_eval/run_fastchat.sh (5 hunks)
  • examples/llm_ptq/example_utils.py (0 hunks)
  • examples/llm_ptq/run_tensorrt_llm.py (4 hunks)
  • examples/llm_ptq/scripts/huggingface_example.sh (4 hunks)
  • examples/vlm_ptq/README.md (1 hunks)
  • examples/vlm_ptq/utils.py (0 hunks)
💤 Files with no reviewable changes (2)
  • examples/vlm_ptq/utils.py
  • examples/llm_ptq/example_utils.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-16T20:14:34.768Z
Learnt from: realAsma
PR: NVIDIA/TensorRT-Model-Optimizer#318
File: modelopt/torch/quantization/plugins/transformers_trainer.py:191-191
Timestamp: 2025-09-16T20:14:34.768Z
Learning: The TensorRT-Model-Optimizer project only supports PyTorch >= 2.6, so using the `weights_only` parameter in torch.load calls is acceptable and doesn't require backward compatibility handling.

Applied to files:

  • CHANGELOG.rst
🧬 Code graph analysis (3)
examples/llm_ptq/run_tensorrt_llm.py (1)
modelopt/deploy/llm/generate.py (1)
  • LLM (53-291)
examples/llm_eval/gen_model_answer.py (2)
examples/llm_ptq/example_utils.py (1)
  • get_tokenizer (39-61)
modelopt/deploy/llm/generate.py (1)
  • LLM (53-291)
examples/llm_eval/lm_eval_tensorrt_llm.py (1)
modelopt/deploy/llm/generate.py (2)
  • LLM (53-291)
  • max_seq_len (154-156)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: linux
  • GitHub Check: wait-checks / wait
  • GitHub Check: build-docs
  • GitHub Check: code-quality
🔇 Additional comments (16)
examples/llm_eval/gen_model_answer.py (3)

121-124: Param rename to checkpoint_dir looks good

Signature change is consistent with the repo-wide refactor.


262-281: Branching on checkpoint_dir for generation is correct

The fallback to model.generate vs. LLM.generate_tokens aligns with the refactor.


505-508: LGTM: call site updated to pass checkpoint_dir

Consistent with the new argument name.

CHANGELOG.rst (1)

11-11: Changelog entry reflects the checkpoint_dir refactor cleanly.

The note aligns with the code updates I’ve seen—thanks for keeping the release notes in sync.

examples/vlm_ptq/README.md (1)

59-59: README wording matches the newly generalized workflow.

Appreciate the neutral phrasing; it now reads correctly for any supported Hugging Face multi-modal checkpoint.

examples/llm_ptq/run_tensorrt_llm.py (2)

52-54: Tokenizer fallback matches the renamed artifact layout.

Good to see the comment updated alongside the directory rename; behavior stays consistent.


69-70: LLM instantiation now targets the checkpoint directory.

This is the key behavioral switch, and it lines up with the upstream LLM API surface.

examples/llm_ptq/scripts/huggingface_example.sh (4)

161-164: Enc-dec guidance mirrors the checkpoint export wording.

Nice touch keeping the messaging clear for users hitting that branch.


190-190: Run step now hands the checkpoint directory to the eval script.

Matches the refactor plan—no issues spotted.


232-232: lm_eval invocation tracks the new flag name.

checkpoint_dir is propagated into the harness exactly where it needs to be.


262-263: MMLU call is consistent with the renamed flag.

Thanks for catching this path as well; keeps post-quant eval frictionless.

examples/llm_eval/lm_eval_tensorrt_llm.py (1)

45-64: Constructor rename lines up with the shared LLM API.

The assertions and log message now reflect the checkpoint-based workflow; looks solid.

examples/llm_eval/run_fastchat.sh (4)

23-35: Usage text now points readers at the checkpoint directory.

The example command updates are clear and match the rest of the refactor.


44-61: CLI parsing adopts checkpoint_dir cleanly.

The rename is threaded through without introducing regressions.


99-124: Optional quant args are folded in neatly.

Good balance of flexibility without breaking the existing call sites.


142-149: Final command forwards the new flag to FastChat.

Output generation should continue to work with the checkpointed artifacts.

codecov bot commented Sep 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.46%. Comparing base (26c203a) to head (dcf507e).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #365   +/-   ##
=======================================
  Coverage   73.46%   73.46%           
=======================================
  Files         172      172           
  Lines       17640    17640           
=======================================
  Hits        12959    12959           
  Misses       4681     4681           

☔ View full report in Codecov by Sentry.

Signed-off-by: Chenjie Luo <[email protected]>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
examples/llm_eval/gen_model_answer.py (1)

426-429: CLI rename accepted; consider a hidden alias for backward compatibility

Optional: add a suppressed --engine-dir alias that maps to checkpoint_dir to avoid breaking existing scripts.

 parser.add_argument(
-    "--checkpoint-dir",
+    "--checkpoint-dir",
     type=str,
     help="The path to the model checkpoint directory.",
 )
+parser.add_argument(
+    "--engine-dir",
+    dest="checkpoint_dir",
+    type=str,
+    help=argparse.SUPPRESS,
+)
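
For reference, argparse.SUPPRESS keeps the alias out of the --help listing while the flag is still accepted; a minimal sketch of the pattern (standalone, not the PR's actual parser):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--checkpoint-dir", type=str,
                    help="The path to the model checkpoint directory.")
# Deprecated alias: writes to the same dest but is hidden from --help.
parser.add_argument("--engine-dir", dest="checkpoint_dir", type=str,
                    help=argparse.SUPPRESS)

args = parser.parse_args(["--engine-dir", "/tmp/ckpt"])
assert args.checkpoint_dir == "/tmp/ckpt"  # old flag maps to the new name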
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8845aba and dcf507e.

📒 Files selected for processing (1)
  • examples/llm_eval/gen_model_answer.py (6 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
examples/llm_eval/gen_model_answer.py (2)
examples/llm_ptq/example_utils.py (1)
  • get_tokenizer (39-61)
modelopt/deploy/llm/generate.py (1)
  • LLM (53-291)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: linux
  • GitHub Check: wait-checks / wait
  • GitHub Check: build-docs
  • GitHub Check: code-quality
🔇 Additional comments (3)
examples/llm_eval/gen_model_answer.py (3)

258-277: Branch on checkpoint_dir for TRT-LLM vs HF: OK

Logic is clear; uses generate_tokens in TRT-LLM path and decodes correctly.


501-504: Main call updated to checkpoint_dir: good

Callsite matches the renamed argument.


177-193: Fix NameError and Ray-safety; improve Qwen detection; pass trust_remote_code to LLM

get_model_answers references args but doesn’t receive it. This will break in Ray workers and is fragile even locally. Also, Qwen detection via split is brittle.

Apply:

 def get_model_answers(
@@
-    top_p=None,
-    temperature=None,
-    checkpoint_dir=None,
-    nim_model=None,
+    top_p=None,
+    temperature=None,
+    checkpoint_dir=None,
+    nim_model=None,
+    trust_remote_code=False,
+    quant_cfg=None,
+    calib_batch_size=4,
+    calib_size=512,
+    auto_quantize_bits=None,
 ):
@@
-    tokenizer = get_tokenizer(model_path, trust_remote_code=args.trust_remote_code)
+    tokenizer = get_tokenizer(model_path, trust_remote_code=trust_remote_code)
     if checkpoint_dir:
-        # get model type
-        last_part = os.path.basename(checkpoint_dir)
-        model_type = last_part.split("_")[0]
-        # Some models require to set pad_token and eos_token based on external config (e.g., qwen)
-        if model_type == "qwen":
+        # Some models require pad/eos tokens (e.g., Qwen)
+        last_part = os.path.basename(checkpoint_dir)
+        if "qwen" in last_part.lower():
             tokenizer.pad_token = tokenizer.convert_ids_to_tokens(151643)
             tokenizer.eos_token = tokenizer.convert_ids_to_tokens(151643)
 
         assert LLM is not None, "tensorrt_llm APIs could not be imported."
-        model = LLM(checkpoint_dir, tokenizer=tokenizer)
+        model = LLM(checkpoint_dir, tokenizer=tokenizer, trust_remote_code=trust_remote_code)

Also update the quantization block (outside the diff range) to use the newly passed params instead of args:

# replace:
# if args.quant_cfg:
#     quantize_model(model, args.quant_cfg, tokenizer, args.calib_batch_size, args.calib_size, args.auto_quantize_bits, test_generated=False)

if quant_cfg:
    quantize_model(
        model,
        quant_cfg,
        tokenizer,
        calib_batch_size,
        calib_size,
        auto_quantize_bits,
        test_generated=False,
    )

Comment on lines +121 to 124
checkpoint_dir,
nim_model,
args,
):

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Rename to checkpoint_dir in run_eval: looks good

Parameter rename is consistent.

Run to confirm no stale uses remain:


🏁 Script executed:

#!/bin/bash
# Find lingering engine_dir references (flags, kwargs, vars)
rg -n -C2 -g '!**/venv/**' -g '!**/build/**' -g '!**/dist/**' -e '\bengine_dir\b' -e '--engine-dir'

Length of output: 148


🏁 Script executed:

#!/bin/bash
# Re-run search including all files and excluding venv, build, dist
rg -n -C2 -g '*' -g '!venv/**' -g '!build/**' -g '!dist/**' -e '\bengine_dir\b' -e '--engine-dir'

Length of output: 5278


Stale engine_dir references in Windows accuracy benchmark examples

  • examples/windows/accuracy_benchmark/README.md (lines 165, 177): replace --engine_dir with --checkpoint-dir
  • examples/windows/accuracy_benchmark/mmlu_benchmark.py (line 523): rename CLI arg and variable from engine_dir to checkpoint_dir
  • examples/windows/accuracy_benchmark/trtllm_utils.py (around line 317): update parser.add_argument("--engine_dir",…) and all helper functions to use checkpoint_dir instead
🤖 Prompt for AI Agents
In examples/llm_eval/gen_model_answer.py around lines 121-124, ensure any
remaining references to the old CLI/variable name engine_dir are replaced with
checkpoint_dir; across the Windows accuracy benchmark examples update the README
(lines 165,177) to use --checkpoint-dir, rename CLI args and variables in
examples/windows/accuracy_benchmark/mmlu_benchmark.py (line ~523) from
engine_dir to checkpoint_dir, and in
examples/windows/accuracy_benchmark/trtllm_utils.py update
parser.add_argument("--engine_dir", ...) to --checkpoint-dir and rename all
helper function parameters and internal variables from engine_dir to
checkpoint_dir so names and docs are consistent.

Comment on lines +153 to 155
checkpoint_dir=checkpoint_dir,
nim_model=nim_model,
)

🛠️ Refactor suggestion | 🟠 Major

Pass explicit flags to remote worker; avoid relying on globals inside Ray

When use_ray is true, get_model_answers runs out-of-process. Pass required options explicitly to avoid NameError on globals.

Apply:

             checkpoint_dir=checkpoint_dir,
+            trust_remote_code=args.trust_remote_code,
+            quant_cfg=args.quant_cfg,
+            calib_batch_size=args.calib_batch_size,
+            calib_size=args.calib_size,
+            auto_quantize_bits=args.auto_quantize_bits,
             nim_model=nim_model,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-            checkpoint_dir=checkpoint_dir,
-            nim_model=nim_model,
-        )
+            checkpoint_dir=checkpoint_dir,
+            trust_remote_code=args.trust_remote_code,
+            quant_cfg=args.quant_cfg,
+            calib_batch_size=args.calib_batch_size,
+            calib_size=args.calib_size,
+            auto_quantize_bits=args.auto_quantize_bits,
+            nim_model=nim_model,
+        )
🤖 Prompt for AI Agents
In examples/llm_eval/gen_model_answer.py around lines 153-155, the call that
runs get_model_answers out-of-process with use_ray=True relies on module globals
and will raise NameError inside the remote worker; update the remote invocation
to pass all required options explicitly (e.g., flags/variables previously
referenced as globals such as
model/config/checkpoint_dir/nim_model/tokenizer/device/seed/other runtime
options) and update get_model_answers' remote-compatible signature to accept
those parameters so the worker receives everything it needs; ensure only
serializable types are passed and remove reliance on globals inside the function
body.

@cjluo-nv cjluo-nv merged commit 598b9ce into main Sep 25, 2025
27 checks passed
@cjluo-nv cjluo-nv deleted the chenjiel/remove_engines branch September 25, 2025 19:07
kevalmorabia97 pushed a commit that referenced this pull request Sep 25, 2025
yeyu-nvidia pushed a commit that referenced this pull request Oct 1, 2025
…s. (#365)

Signed-off-by: Chenjie Luo <[email protected]>
Signed-off-by: Chenjie Luo <[email protected]>
Signed-off-by: Ye Yu <[email protected]>