Description
The generic post-training quantization script, `examples/llm_ptq/hf_ptq.py`, fails when attempting to process and export a checkpoint for the meta-llama/Llama-4-Scout-17B-16E-Instruct model. The script works correctly for previous-generation models such as Llama 3, but it encounters a `TypeError` with Llama 4.
Root cause: the script expects the model's `config.json` to have an `architectures` key, which it uses to identify the model type. The Llama 4 Scout model's config file does not contain this key, so the script crashes when it tries to access `model.config.architectures[0]`.
This is a blocker, as it prevents the standard post-training quantization workflow from being used for a major new model release.
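For reference, the failing condition can be checked directly from the checkpoint, independent of the quantization flow. This is a minimal sketch assuming Transformers is installed and the model has been downloaded to the path used in the repro steps below:
```python
# Minimal check of the condition that crashes hf_ptq.py (assumes transformers
# is available and the checkpoint is at the path below).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("/path/to/Llama-4-Scout-17B-16E-Instruct")
print(config.architectures)  # None for this checkpoint, so architectures[0] raises TypeError
```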
Error Log:
```
Traceback (most recent call last):
File "/mnt/TensorRT-Model-Optimizer/examples/llm_ptq/hf_ptq.py", line 772, in
main(args)
File "/mnt/TensorRT-Model-Optimizer/examples/llm_ptq/hf_ptq.py", line 616, in main
export_tensorrt_llm_checkpoint(
File "/mnt/TensorRT-Model-Optimizer/modelopt/torch/export/model_config_export.py", line 553, in export_tensorrt_llm_checkpoint
raise e
File "/mnt/TensorRT-Model-Optimizer/modelopt/torch/export/model_config_export.py", line 487, in export_tensorrt_llm_checkpoint
for (
File "/mnt/TensorRT-Model-Optimizer/modelopt/torch/export/model_config_export.py", line 154, in torch_to_tensorrt_llm_checkpoint
architecture = model.config.architectures[0]
~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
TypeError: 'NoneType' object is not subscriptable
```
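A possible stopgap (not an official ModelOpt fix) is to write the missing key back into `config.json` before running the script. The architecture class name below is an assumption and should be verified against the Transformers modeling code for Llama 4:
```python
# Hypothetical workaround, not an official fix: add the missing "architectures"
# entry to config.json so model.config.architectures[0] resolves during export.
import json

config_path = "/path/to/Llama-4-Scout-17B-16E-Instruct/config.json"
with open(config_path) as f:
    cfg = json.load(f)

if "architectures" not in cfg:
    # Assumed class name for Llama 4 Scout; verify against the installed Transformers version.
    cfg["architectures"] = ["Llama4ForConditionalGeneration"]
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=2)
```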
Steps/Code to reproduce bug
- Set up an environment using an NVIDIA TensorRT-LLM container.
- Download the Llama 4 Scout model from Hugging Face:
```bash
huggingface-cli download meta-llama/Llama-4-Scout-17B-16E-Instruct \
  --local-dir /path/to/Llama-4-Scout-17B-16E-Instruct
```
- Run the PTQ script:
```bash
python3 examples/llm_ptq/hf_ptq.py \
  --model_dir /path/to/Llama-4-Scout-17B-16E-Instruct \
  --output_dir /tmp/quantized_llama4 \
  --calib_dataset cnn_dailymail \
  --num_calib_size 32 \
  --dtype float16 \
  --qformat fp8
```
Expected behavior
The `hf_ptq.py` script should successfully quantize the Llama 4 model and export a TensorRT-LLM-compatible checkpoint, just as it does for Llama 3 models.
System information
- Container used (if applicable): Official NVIDIA TensorRT-LLM container
- OS (e.g., Ubuntu 22.04, CentOS 7, Windows 10): Ubuntu 24.04
- CPU architecture (x86_64, aarch64): x86_64
- GPU name (e.g. H100, A100, L40S): H100
- GPU memory size: 80 GB
- Number of GPUs: 8
- Library versions (if applicable):
- Python: 3.12
- ModelOpt version or commit hash: latest inside container
- CUDA: 12.9
- PyTorch: 2.8.0a0+5228986c39.nv25.5
- Transformers: (container version)
- TensorRT-LLM: 0.21.0
- ONNXRuntime: (container version)
- TensorRT: (container version)
- Any other details that may help: Bug only occurs with Llama 4 Scout models (`config.json` is missing the `architectures` key).