Update onnx ptq test to be single threaded and make it faster #415
Conversation
Signed-off-by: ajrasane <[email protected]>
Walkthrough: Sequentializes the ONNX PTQ example and evaluation; adds a timing cache path that is threaded through evaluation and engine building.
Sequence Diagram(s)

sequenceDiagram
autonumber
participant Script as tests/examples/test_onnx_ptq.sh
participant Quant as quantize CLI
participant Eval as evaluate CLI
Note over Script,Quant: For each model -> quant_mode runs strictly sequentially
loop For each quant_mode (sequential)
Script->>Quant: run quantize(model, quant_mode)
Quant-->>Script: quant artifacts
end
Note over Script,Eval: Evaluation gated by eval_mode
loop For each quant_mode (sequential, if eval_mode)
Script->>Eval: run evaluate(model, quant_mode, --timing_cache_path)
Eval-->>Script: metrics
end
Script->>Script: aggregate_results.py (only if eval_mode)
sequenceDiagram
autonumber
participant Eval as evaluate.py
participant TRTClient as trt_client.ir_to_compiled
participant Builder as engine_builder.build_engine
participant Trtexec as trtexec (external)
Eval->>TRTClient: ir_to_compiled(..., timing_cache_path=path)
TRTClient->>Builder: build_engine(..., timing_cache_path=path)
Builder->>Trtexec: run trtexec --timing-cache=path
Trtexec-->>Builder: engine artifact
Builder-->>TRTClient: compiled engine
TRTClient-->>Eval: compiled result
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Actionable comments posted: 1
🧹 Nitpick comments (2)
tests/examples/test_onnx_ptq.sh (2)
147-153: Quote paths and make calibration device configurable. The foreground quantization change looks good. Quote path/vars to avoid word splitting and allow overriding the calibration device without editing the script.
Apply:
- python -m modelopt.onnx.quantization \
-     --onnx_path=$model_dir/fp16/model.onnx \
-     --quantize_mode=$quant_mode \
-     --calibration_data=$calib_data_path \
-     --output_path=$model_dir/$quant_mode/model.quant.onnx \
-     --calibration_eps=cuda:0
+ python -m modelopt.onnx.quantization \
+     --onnx_path="$model_dir/fp16/model.onnx" \
+     --quantize_mode="$quant_mode" \
+     --calibration_data="$calib_data_path" \
+     --output_path="$model_dir/$quant_mode/model.quant.onnx" \
+     --calibration_eps="${CALIB_DEVICE:-cuda:0}"
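If the `${CALIB_DEVICE:-cuda:0}` default from the suggestion above were adopted, the calibration device could be overridden per run without editing the script. A minimal sketch of such an invocation (hypothetical usage; the script currently does not read this variable):

```bash
# Override the calibration execution provider for one run of the PTQ test script.
CALIB_DEVICE=cuda:1 bash tests/examples/test_onnx_ptq.sh
```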
32-32: Remove unused variable. nvidia_gpu_count is no longer used after making the script single-threaded. Safe to delete.
-nvidia_gpu_count=$(nvidia-smi --query-gpu=count --format=csv,noheader,nounits | head -n 1)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/examples/test_onnx_ptq.sh (3 hunks)
🧰 Additional context used
🪛 Shellcheck (0.11.0)
tests/examples/test_onnx_ptq.sh
[error] 181-181: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
[warning] 181-181: Remove quotes from right-hand side of =~ to match as a regex rather than literally.
(SC2076)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: linux
- GitHub Check: build-docs
- GitHub Check: code-quality
tests/examples/test_onnx_ptq.sh
Outdated
| echo "Starting evaluation of $model_name for mode: $quant_mode" | ||
| if [[ " ${latency_models[@]} " =~ " $model_name " ]]; then | ||
| CUDA_VISIBLE_DEVICES=$gpu_id python evaluate.py \ | ||
| python evaluate.py \ | ||
| --onnx_path=$eval_model_path \ | ||
| --engine_path=$engine_path \ | ||
| --model_name="${timm_model_name[$model_name]}" \ | ||
| --engine_precision=$precision \ | ||
| --results_path=$model_dir/$quant_mode/${model_name}_${quant_mode}.csv & | ||
| --results_path=$model_dir/$quant_mode/${model_name}_${quant_mode}.csv | ||
| else | ||
| CUDA_VISIBLE_DEVICES=$gpu_id python evaluate.py \ | ||
| python evaluate.py \ | ||
| --onnx_path=$eval_model_path \ | ||
| --engine_path=$engine_path \ | ||
| --imagenet_path=$imagenet_path \ | ||
| --eval_data_size=$calib_size \ | ||
| --batch_size $batch_size \ | ||
| --model_name="${timm_model_name[$model_name]}" \ | ||
| --engine_precision=$precision \ | ||
| --results_path=$model_dir/$quant_mode/${model_name}_${quant_mode}.csv & | ||
| fi | ||
| pids+=($!) | ||
| done | ||
|
|
||
| # Wait for all evaluation processes to complete for this model | ||
| error_occurred=false | ||
| for pid in "${pids[@]}"; do | ||
| if ! wait $pid; then | ||
| echo "ERROR: Evaluation process (PID: $pid) failed" | ||
| error_occurred=true | ||
| --results_path=$model_dir/$quant_mode/${model_name}_${quant_mode}.csv | ||
| fi |
Fix array membership test (shellcheck) and quote all args
- Use glob match with ${array[*]} for membership; satisfies SC2199/SC2076 and avoids regex pitfalls.
- Quote all path/vars in evaluate.py calls to prevent word splitting.
Apply:
- echo "Starting evaluation of $model_name for mode: $quant_mode"
- if [[ " ${latency_models[@]} " =~ " $model_name " ]]; then
- python evaluate.py \
- --onnx_path=$eval_model_path \
- --engine_path=$engine_path \
- --model_name="${timm_model_name[$model_name]}" \
- --engine_precision=$precision \
- --results_path=$model_dir/$quant_mode/${model_name}_${quant_mode}.csv
+ echo "Starting evaluation of $model_name for mode: $quant_mode"
+ if [[ " ${latency_models[*]} " == *" $model_name "* ]]; then
+ python evaluate.py \
+ --onnx_path="$eval_model_path" \
+ --engine_path="$engine_path" \
+ --model_name="${timm_model_name[$model_name]}" \
+ --engine_precision="$precision" \
+ --results_path="$model_dir/$quant_mode/${model_name}_${quant_mode}.csv"
else
- python evaluate.py \
- --onnx_path=$eval_model_path \
- --engine_path=$engine_path \
- --imagenet_path=$imagenet_path \
- --eval_data_size=$calib_size \
- --batch_size $batch_size \
- --model_name="${timm_model_name[$model_name]}" \
- --engine_precision=$precision \
- --results_path=$model_dir/$quant_mode/${model_name}_${quant_mode}.csv
+ python evaluate.py \
+ --onnx_path="$eval_model_path" \
+ --engine_path="$engine_path" \
+ --imagenet_path="$imagenet_path" \
+ --eval_data_size="$calib_size" \
+ --batch_size "$batch_size" \
+ --model_name="${timm_model_name[$model_name]}" \
+ --engine_precision="$precision" \
+            --results_path="$model_dir/$quant_mode/${model_name}_${quant_mode}.csv"

Based on Shellcheck hints.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| echo "Starting evaluation of $model_name for mode: $quant_mode" | |
| if [[ " ${latency_models[@]} " =~ " $model_name " ]]; then | |
| CUDA_VISIBLE_DEVICES=$gpu_id python evaluate.py \ | |
| python evaluate.py \ | |
| --onnx_path=$eval_model_path \ | |
| --engine_path=$engine_path \ | |
| --model_name="${timm_model_name[$model_name]}" \ | |
| --engine_precision=$precision \ | |
| --results_path=$model_dir/$quant_mode/${model_name}_${quant_mode}.csv & | |
| --results_path=$model_dir/$quant_mode/${model_name}_${quant_mode}.csv | |
| else | |
| CUDA_VISIBLE_DEVICES=$gpu_id python evaluate.py \ | |
| python evaluate.py \ | |
| --onnx_path=$eval_model_path \ | |
| --engine_path=$engine_path \ | |
| --imagenet_path=$imagenet_path \ | |
| --eval_data_size=$calib_size \ | |
| --batch_size $batch_size \ | |
| --model_name="${timm_model_name[$model_name]}" \ | |
| --engine_precision=$precision \ | |
| --results_path=$model_dir/$quant_mode/${model_name}_${quant_mode}.csv & | |
| fi | |
| pids+=($!) | |
| done | |
| # Wait for all evaluation processes to complete for this model | |
| error_occurred=false | |
| for pid in "${pids[@]}"; do | |
| if ! wait $pid; then | |
| echo "ERROR: Evaluation process (PID: $pid) failed" | |
| error_occurred=true | |
| --results_path=$model_dir/$quant_mode/${model_name}_${quant_mode}.csv | |
| fi | |
| echo "Starting evaluation of $model_name for mode: $quant_mode" | |
| if [[ " ${latency_models[*]} " == *" $model_name "* ]]; then | |
| python evaluate.py \ | |
| --onnx_path="$eval_model_path" \ | |
| --engine_path="$engine_path" \ | |
| --model_name="${timm_model_name[$model_name]}" \ | |
| --engine_precision="$precision" \ | |
| --results_path="$model_dir/$quant_mode/${model_name}_${quant_mode}.csv" | |
| else | |
| python evaluate.py \ | |
| --onnx_path="$eval_model_path" \ | |
| --engine_path="$engine_path" \ | |
| --imagenet_path="$imagenet_path" \ | |
| --eval_data_size="$calib_size" \ | |
| --batch_size "$batch_size" \ | |
| --model_name="${timm_model_name[$model_name]}" \ | |
| --engine_precision="$precision" \ | |
| --results_path="$model_dir/$quant_mode/${model_name}_${quant_mode}.csv" | |
| fi |
🧰 Tools
🪛 Shellcheck (0.11.0)
[error] 181-181: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
[warning] 181-181: Remove quotes from right-hand side of =~ to match as a regex rather than literally.
(SC2076)
🤖 Prompt for AI Agents
In tests/examples/test_onnx_ptq.sh around lines 180-198, the array membership
test and unquoted arguments are unsafe; replace the current if-condition with a
glob-style check using the array expansion e.g. [[ " ${latency_models[*]} " ==
*" $model_name "* ]] to satisfy ShellCheck (SC2199/SC2076), and quote all
variables passed to python evaluate.py (e.g. "$eval_model_path", "$engine_path",
"$imagenet_path", "$calib_size", "$batch_size",
"${timm_model_name[$model_name]}", "$precision",
"$model_dir/$quant_mode/${model_name}_${quant_mode}.csv") so none are word-split
or subject to globbing.
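As a side note on the recommended membership test, here is a minimal, standalone sketch of the idiom; the array contents and model name are made up for illustration and are not taken from the script:

```bash
#!/usr/bin/env bash
latency_models=("vit_base_patch16_224" "convnext_base")
model_name="vit_base_patch16_224"

# Glob match against the space-joined array; avoids the SC2199/SC2076 regex pitfalls.
if [[ " ${latency_models[*]} " == *" $model_name "* ]]; then
    echo "latency-only evaluation for $model_name"
else
    echo "full accuracy evaluation for $model_name"
fi
```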
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

@@            Coverage Diff             @@
##             main     #415      +/-   ##
==========================================
- Coverage   73.79%   73.37%   -0.42%
==========================================
  Files         171      180       +9
  Lines       17591    17937     +346
==========================================
+ Hits        12981    13162     +181
- Misses       4610     4775     +165

☔ View full report in Codecov by Sentry.
| echo "Starting quantization of $model_name for mode: $quant_mode on GPU $gpu_id" | ||
| CUDA_VISIBLE_DEVICES=$gpu_id python -m modelopt.onnx.quantization \ | ||
| echo "Starting quantization of $model_name for mode: $quant_mode" | ||
| python -m modelopt.onnx.quantization \ |
Does this support multi-gpu calibration to use all available GPUs instead of cuda:0?
Yes, we should be able to control which GPU gets used via CUDA_VISIBLE_DEVICES. However, for now I have disabled GPU parallelism in the test until I figure out the root cause.
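For illustration of the point above, pinning one quantization run to a single device would look roughly like this; the paths and quantize mode are placeholders, and only flags already used by this PR's script appear:

```bash
# Expose only physical GPU 1 to the process; inside the process it shows up as cuda:0.
CUDA_VISIBLE_DEVICES=1 python -m modelopt.onnx.quantization \
    --onnx_path=build/model/fp16/model.onnx \
    --quantize_mode=int8 \
    --calibration_data=build/calib_data.npy \
    --output_path=build/model/int8/model.quant.onnx \
    --calibration_eps=cuda:0
```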
Signed-off-by: ajrasane <[email protected]>
Force-pushed 67305e4 to 0951c2c (Compare)
Signed-off-by: ajrasane <[email protected]>
Tests passing in 13 mins: https://gitlab-master.nvidia.com/omniml/modelopt/-/jobs/218842251
tests/examples/test_onnx_ptq.sh
Outdated
    --model_name="${timm_model_name[$model_name]}" \
    --engine_precision=$precision \
    --results_path=$model_dir/$quant_mode/${model_name}_${quant_mode}.csv \
    --timing_cache_path=build/timing.cache
Is this build dir available across multiple CI/CD pipelines? Is there a way to use persistent storage in CI/CD?
Currently not. But we can use $model_dir/build, which points to /home/scratch.omniml_data_1/models_ci/onnx/build. Test models are also read from /home/scratch.omniml_data_1/models_ci/onnx ($model_dir).
I think that would be the better thing to do, otherwise using this flag has no benefit for this script.
@kevalmorabia97 are you referring to models_folder?
model_dir will be build/model_name, so it won't be correct to save the timing cache in the individual model_dir, as the other builds won't be accessing it.
I think the build directory is created under modelopt/examples/onnx_ptq. It is cleared at the end of each test unless we pass the --no-clean option. This is shared by the individual models.
@i-riyad, the CI/CD will run on a different machine every time. Won't it be better to create the timing cache fresh for every CI run? Once it is built for the first model, it will be reused by the successive models in their builds.
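A rough sketch of the shared-cache idea being discussed; the TIMING_CACHE_DIR variable and its default are assumptions for illustration, while the evaluate.py flags and script variables are the ones used in this PR:

```bash
# One timing cache shared by every engine build in a single CI run, so tactic
# timings recorded for the first model's engine are reused by the later ones.
timing_cache_path="${TIMING_CACHE_DIR:-build}/timing.cache"
mkdir -p "$(dirname "$timing_cache_path")"

python evaluate.py \
    --onnx_path="$eval_model_path" \
    --engine_path="$engine_path" \
    --model_name="${timm_model_name[$model_name]}" \
    --engine_precision="$precision" \
    --results_path="$model_dir/$quant_mode/${model_name}_${quant_mode}.csv" \
    --timing_cache_path="$timing_cache_path"
```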
Signed-off-by: ajrasane <[email protected]>
Force-pushed ad47d8a to 7be261c (Compare)
Actionable comments posted: 0
♻️ Duplicate comments (2)
tests/examples/test_onnx_ptq.sh (2)
192-192: Fix array membership test and use glob-style pattern matching. ShellCheck (SC2199/SC2076) flags the regex-based array concatenation and the quoted right-hand-side pattern. Use a glob-style match with ${latency_models[*]} for safer, simpler membership testing.
Apply this diff:
- if [[ " ${latency_models[@]} " =~ " $model_name " ]]; then
+ if [[ " ${latency_models[*]} " == *" $model_name "* ]]; then
194-210: Quote all shell variables in python invocations. Unquoted variables in evaluate.py calls risk word-splitting and globbing if paths or values contain spaces. Also, line 206 has a syntax error: --batch_size $batch_size is missing the = separator (should be --batch_size=$batch_size or --batch_size "$batch_size").
Apply this diff to quote all arguments and fix the batch_size syntax:
  python evaluate.py \
-     --onnx_path=$eval_model_path \
-     --engine_path=$engine_path \
+     --onnx_path="$eval_model_path" \
+     --engine_path="$engine_path" \
      --model_name="${timm_model_name[$model_name]}" \
-     --engine_precision=$precision \
-     --results_path=$model_dir/$quant_mode/${model_name}_${quant_mode}.csv \
-     --timing_cache_path=$timing_cache_path
+     --engine_precision="$precision" \
+     --results_path="$model_dir/$quant_mode/${model_name}_${quant_mode}.csv" \
+     --timing_cache_path="$timing_cache_path"
  else
      python evaluate.py \
-         --onnx_path=$eval_model_path \
-         --engine_path=$engine_path \
-         --imagenet_path=$imagenet_path \
-         --eval_data_size=$eval_size \
-         --batch_size $batch_size \
+         --onnx_path="$eval_model_path" \
+         --engine_path="$engine_path" \
+         --imagenet_path="$imagenet_path" \
+         --eval_data_size="$eval_size" \
+         --batch_size="$batch_size" \
          --model_name="${timm_model_name[$model_name]}" \
-         --engine_precision=$precision \
-         --results_path=$model_dir/$quant_mode/${model_name}_${quant_mode}.csv \
-         --timing_cache_path=$timing_cache_path
+         --engine_precision="$precision" \
+         --results_path="$model_dir/$quant_mode/${model_name}_${quant_mode}.csv" \
+         --timing_cache_path="$timing_cache_path"
🧹 Nitpick comments (2)
tests/examples/test_onnx_ptq.sh (2)
157-161: Quote variables in quantization command for consistency and safety. While less critical (paths are locally controlled), unquoted variables in the quantization invocation should be quoted to follow shell best practices and prevent potential issues if paths contain spaces or special characters.
Apply this diff:
  python -m modelopt.onnx.quantization \
-     --onnx_path=$model_dir/fp16/model.onnx \
+     --onnx_path="$model_dir/fp16/model.onnx" \
      --quantize_mode=$quant_mode \
-     --calibration_data=$calib_data_path \
-     --output_path=$model_dir/$quant_mode/model.quant.onnx \
+     --calibration_data="$calib_data_path" \
+     --output_path="$model_dir/$quant_mode/model.quant.onnx" \
      --calibration_eps=cuda
142-142: Quote variables in the image_prep.py invocation. For consistency with shell best practices, quote the variables in the image_prep.py call.
Apply this diff:
-python image_prep.py --output_path=$calib_data_path --calibration_data_size=$calib_size
+python image_prep.py --output_path="$calib_data_path" --calibration_data_size="$calib_size"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/examples/test_onnx_ptq.sh (4 hunks)
🧰 Additional context used
🪛 Shellcheck (0.11.0)
tests/examples/test_onnx_ptq.sh
[error] 192-192: Arrays implicitly concatenate in [[ ]]. Use a loop (or explicit * instead of @).
(SC2199)
[warning] 192-192: Remove quotes from right-hand side of =~ to match as a regex rather than literally.
(SC2076)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: wait-checks / wait
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: code-quality
- GitHub Check: build-docs
🔇 Additional comments (2)
tests/examples/test_onnx_ptq.sh (2)
170-218: Evaluation block structure and timing cache integration are sound. The eval_mode gating and timing_cache_path propagation are correctly implemented. The sequential evaluation design aligns well with the PR objective to make the test single-threaded and efficient.
229-230: Wall-time formatting is correct. The HH:MM:SS formatting with printf is properly implemented and clearly communicates total runtime.
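For context on the wall-time comment, a minimal sketch of HH:MM:SS formatting with printf; the variable names are illustrative rather than the script's actual ones:

```bash
start_time=$(date +%s)
# ... quantization and evaluation run here ...
elapsed=$(( $(date +%s) - start_time ))
# Format elapsed seconds as zero-padded hours:minutes:seconds.
printf "Total wall time: %02d:%02d:%02d\n" \
    "$((elapsed / 3600))" "$(((elapsed % 3600) / 60))" "$((elapsed % 60))"
```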
Signed-off-by: ajrasane <[email protected]>
What does this PR do?
Type of change: Bug fix
Overview:
Updated test_onnx_ptq.sh to be single threaded.

Testing
Before your PR is "Ready for review"
Summary by CodeRabbit
Tests
New Features
Bug Fixes / Improvements