> **EAGLE3 compatibility note.** TensorRT-LLM implements the EAGLE3 draft head, which expects the pretrained config to expose a `draft_vocab_size` attribute. Draft heads built for other runtimes (for example, community EAGLE heads that work in vLLM) may omit this field; older versions of TensorRT-LLM would then fail with an error such as `AttributeError: 'LlamaConfig' object has no attribute 'draft_vocab_size'`. When the field is missing, TensorRT-LLM now assumes the draft vocabulary matches the target vocabulary (the value of `vocab_size`) and emits a warning, mirroring the implicit behavior of runtimes such as vLLM that reuse the target vocabulary for the draft head. If your draft head was trained with a different vocabulary, set `draft_vocab_size` explicitly in the config before exporting so the converter can build the correct tokenizer table.
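As a sketch of the workaround described above, the snippet below patches a draft head's `config.json` before export. The helper name and the fallback-to-`vocab_size` default are illustrative assumptions, not part of the TensorRT-LLM API:

```python
import json
from pathlib import Path

def ensure_draft_vocab_size(config_path, draft_vocab_size=None):
    """Add a draft_vocab_size entry to a draft-head config.json if absent.

    Hypothetical helper: if draft_vocab_size is None, fall back to the
    target vocab_size, matching the default assumption described above.
    Returns the effective draft vocabulary size.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    if "draft_vocab_size" not in config:
        # Assume the draft head shares the target vocabulary unless told otherwise.
        config["draft_vocab_size"] = draft_vocab_size or config["vocab_size"]
        path.write_text(json.dumps(config, indent=2))
    return config["draft_vocab_size"]
```

If the draft head was trained with its own vocabulary, pass that size explicitly instead of relying on the fallback.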