You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
@@ -25,12 +27,23 @@ See the LLaMA example [`examples/llama`](../llama) for details.
25
27
* INT8 SmoothQuant
26
28
* INT4 AWQ & W4A8 AWQ
27
29
28
-
## Download model checkpoints
30
+
## Supported Models
31
+
### EXAONE-3.0
29
32
30
-
First, download the HuggingFace FP32 checkpoints of EXAONE model.
33
+
Download the HuggingFace FP32 checkpoints of EXAONE-3.0 model. We support EXAONE-3.0 families but here, we only use the `EXAONE-3.0-7.8B-Instruct` model for the example.
Download the HuggingFace BF16 checkpoints of EXAONE-Deep model. Here, we only use the `EXAONE-Deep-2.4B` model for the example. We can use the same procedure as EXAONE-3.0 to convert the weights and build the TensorRT engine.
full:B200_PCIe/examples/test_enc_dec.py::test_llm_enc_dec_general[compare_hf-bart-large-cnn-float16-disable_gemm_plugin-enable_attention_plugin-enable_paged_kv_cache-tp:1-pp:1-nb:1] SKIP (Disable for Blackwell)
112
112
full:B200_PCIe/examples/test_enc_dec.py::test_llm_enc_dec_general[compare_hf-bart-large-cnn-float16-disable_gemm_plugin-enable_attention_plugin-disable_paged_kv_cache-tp:1-pp:1-nb:1] SKIP (Disable for Blackwell)
113
113
full:B200_PCIe/examples/test_enc_dec.py::test_llm_enc_dec_general[compare_hf-mbart-large-50-many-to-one-mmt-float16-enable_gemm_plugin-enable_attention_plugin-disable_paged_kv_cache-tp:1-pp:1-nb:1] SKIP (Disable for Blackwell)
114
-
full:B200_PCIe/examples/test_exaone.py::test_llm_exaone_1gpu[enable_weight_only-exaone-float16-nb:1] SKIP (Disable for Blackwell)
114
+
full:B200_PCIe/examples/test_exaone.py::test_llm_exaone_1gpu[enable_weight_only-exaone_3.0_7.8b_instruct-float16-nb:1] SKIP (Disable for Blackwell)
115
115
full:B200_PCIe/examples/test_gpt.py::test_llm_gpt2_next_prompt_tuning_1gpu SKIP (Disable for Blackwell)
116
116
full:B200_PCIe/examples/test_gpt.py::test_llm_gpt_starcoder_lora_1gpu[peft-lora-starcoder2-15b-unity-copilot-starcoder2-lora_fp16-base_fp16] SKIP (Disable for Blackwell)
117
117
full:B200_PCIe/examples/test_gpt.py::test_llm_gpt_starcoder_lora_1gpu[peft-lora-starcoder2-15b-unity-copilot-starcoder2-lora_fp16-base_fp8] SKIP (Disable for Blackwell)
full:B200/examples/test_enc_dec.py::test_llm_enc_dec_general[compare_hf-bart-large-cnn-float16-disable_gemm_plugin-enable_attention_plugin-enable_paged_kv_cache-tp:1-pp:1-nb:1] SKIP (Disable for Blackwell)
225
225
full:B200/examples/test_enc_dec.py::test_llm_enc_dec_general[compare_hf-bart-large-cnn-float16-disable_gemm_plugin-enable_attention_plugin-disable_paged_kv_cache-tp:1-pp:1-nb:1] SKIP (Disable for Blackwell)
226
226
full:B200/examples/test_enc_dec.py::test_llm_enc_dec_general[compare_hf-mbart-large-50-many-to-one-mmt-float16-enable_gemm_plugin-enable_attention_plugin-disable_paged_kv_cache-tp:1-pp:1-nb:1] SKIP (Disable for Blackwell)
227
-
full:B200/examples/test_exaone.py::test_llm_exaone_1gpu[enable_weight_only-exaone-float16-nb:1] SKIP (Disable for Blackwell)
227
+
full:B200/examples/test_exaone.py::test_llm_exaone_1gpu[enable_weight_only-exaone_3.0_7.8b_instruct-float16-nb:1] SKIP (Disable for Blackwell)
228
228
full:B200/examples/test_gpt.py::test_llm_gpt2_next_prompt_tuning_1gpu SKIP (Disable for Blackwell)
229
229
full:B200/examples/test_gpt.py::test_llm_gpt_starcoder_lora_1gpu[peft-lora-starcoder2-15b-unity-copilot-starcoder2-lora_fp16-base_fp16] SKIP (Disable for Blackwell)
230
230
full:B200/examples/test_gpt.py::test_llm_gpt_starcoder_lora_1gpu[peft-lora-starcoder2-15b-unity-copilot-starcoder2-lora_fp16-base_fp8] SKIP (Disable for Blackwell)
0 commit comments