Pull requests: NVIDIA/TensorRT-Model-Optimizer
[NVBUG: 5619158] Enforce high precision model dtype for diffusion TRT (#526, opened Nov 7, 2025 by ajrasane)
[BUG FIX 5616904] Add transformers version restoration after PTQ for VILA (#525, opened Nov 7, 2025 by yueshen2016)
Update custom file name patterns when copying files and remove problematic parameters in export (#520, opened Nov 6, 2025 by Edwardf0t1)
[1/n] Reorganize sparsity module to separate weight and attention sparsity (#517, opened Nov 6, 2025 by kaix-nv)
Update changelog to include SGLang/vLLM related updates (#516, opened Nov 6, 2025 by Edwardf0t1)
Fix DQ1 output type error in DQ1->DQ2 for FP4 weights in NVFP4 model (#513, opened Nov 5, 2025 by vishalpandya1990)
[OMNIML-2917] Handle lm_head and other unquantized modules correctly (#504, opened Nov 4, 2025 by shengliangxu)
Fix QDQ utils issues and remove global cast replacements (#489, opened Oct 31, 2025 by nvluxiaoz)
[Draft] [5526696] Add KV cache quantization support for ONNX quantization (#486, opened Oct 31, 2025 by zhanghaoc)
Add MoE (e.g. Qwen3-30B-A3B, Mamba hybrid) pruning support in Minitron (#467, opened Oct 27, 2025 by JRD971000)