Pull requests: NVIDIA/TensorRT-Model-Optimizer
[NVBUG: 5619158] Enforce high precision model dtype for diffusion TRT (#526, opened Nov 7, 2025 by ajrasane)
[BUG FIX 5616904] Add transformers version restoration after PTQ for VILA (#525, opened Nov 7, 2025 by yueshen2016)
Update custom file name patterns when copying files and remove problematic parameters in export (#520, opened Nov 6, 2025 by Edwardf0t1)
[1/n] Reorganize sparsity module to separate weight and attention sparsity (#517, opened Nov 6, 2025 by kaix-nv)
Update changelog to include SGLang/vLLM related updates (#516, opened Nov 6, 2025 by Edwardf0t1)
Fix DQ1 output type error in DQ1->DQ2 for FP4 weights in NVFP4 model (#513, opened Nov 5, 2025 by vishalpandya1990)
[OMNIML-2917] Handle lm_head and other unquantized modules correctly (#504, opened Nov 4, 2025 by shengliangxu)
Fix QDQ utils issues and remove global cast replacements (#489, opened Oct 31, 2025 by nvluxiaoz)
[Draft] [5526696] Add KV cache quantization support for ONNX quantization (#486, opened Oct 31, 2025 by zhanghaoc)
Add MoE (e.g. Qwen3-30B-A3B, Mamba hybrid) pruning support in Minitron (#467, opened Oct 27, 2025 by JRD971000)