
Conversation

@xenova (Collaborator) commented on Jul 4, 2025

@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@xenova merged commit 4b7a3aa into main on Jul 4, 2025
4 checks passed
@xenova deleted the add-ernie4_5 branch on July 4, 2025 at 17:07
kunal-vaishnavi added a commit to microsoft/onnxruntime-genai that referenced this pull request on Jul 7, 2025:
Enables exporting the new Ernie 4.5 models via onnxruntime-genai:
https://huggingface.co/baidu/ERNIE-4.5-0.3B-PT
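
For reference, the export goes through onnxruntime-genai's model builder. A minimal sketch of the invocation (the flag values here are assumptions; check the model builder docs for the precision and execution-provider options you need):

```sh
# Export the HF checkpoint to ONNX via the onnxruntime-genai model builder
python -m onnxruntime_genai.models.builder \
  -m baidu/ERNIE-4.5-0.3B-PT \
  -o ./ERNIE-4.5-0.3B-ONNX \
  -p fp32 \
  -e cpu
```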

I've uploaded the converted model to
https://huggingface.co/onnx-community/ERNIE-4.5-0.3B-ONNX.
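
To reproduce the validation below, one way to fetch the converted model locally is via `huggingface_hub` (a sketch; the repo's exact file layout is an assumption, so adjust the ONNX filename or subfolder as needed):

```py
from huggingface_hub import snapshot_download

# Download the converted repo; point `path_to_model` in the script below at it
path_to_model = snapshot_download("onnx-community/ERNIE-4.5-0.3B-ONNX")
```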

Currently, only the non-MoE version is supported, but perhaps someone can help with the MoE version:
https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-PT

---

Models were tested and validated with Python onnxruntime and
[transformers.js](huggingface/transformers.js#1354):

```py
from transformers import AutoConfig, AutoTokenizer
import onnxruntime
import numpy as np

# 1. Load config, tokenizer, and model
path_to_model = "./path/to/model"
config = AutoConfig.from_pretrained("baidu/ERNIE-4.5-0.3B-PT", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("baidu/ERNIE-4.5-0.3B-PT", trust_remote_code=True)
decoder_session = onnxruntime.InferenceSession(f"{path_to_model}/model.onnx")

## Set config values
num_key_value_heads = config.num_key_value_heads
head_dim = config.head_dim
num_hidden_layers = config.num_hidden_layers
eos_token_id = config.eos_token_id

# 2. Prepare inputs
## Create input messages
messages = [
  { "role": "system", "content": "You are a helpful assistant." },
  { "role": "user", "content": "Write me a poem about Machine Learning." },
]

## Apply tokenizer
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="np")

## Prepare decoder inputs
batch_size = inputs['input_ids'].shape[0]
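## Start with an empty KV cache (sequence length 0); each session run returns
## the updated cache as `present_key_values`, which is fed back in below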
past_key_values = {
    f'past_key_values.{layer}.{kv}': np.zeros([batch_size, num_key_value_heads, 0, head_dim], dtype=np.float32)
    for layer in range(num_hidden_layers)
    for kv in ('key', 'value')
}
input_ids = inputs['input_ids']
position_ids = np.tile(np.arange(1, input_ids.shape[-1] + 1), (batch_size, 1))
attention_mask = np.ones_like(input_ids, dtype=np.int64)

# 3. Generation loop
max_new_tokens = 1024
generated_tokens = np.array([[]], dtype=np.int64)
for i in range(max_new_tokens):
  logits, *present_key_values = decoder_session.run(None, dict(
      input_ids=input_ids,
      attention_mask=attention_mask,
      position_ids=position_ids,
      **past_key_values,
  ))

  ## Update values for next generation loop
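  ## Greedy decoding: take the arg max over the last position's logits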
  input_ids = logits[:, -1].argmax(-1, keepdims=True)
  attention_mask = np.concatenate([attention_mask, np.ones_like(input_ids, dtype=np.int64)], axis=-1)
  position_ids = position_ids[:, -1:] + 1
  for j, key in enumerate(past_key_values):
    past_key_values[key] = present_key_values[j]

  generated_tokens = np.concatenate([generated_tokens, input_ids], axis=-1)
  if (input_ids == eos_token_id).all():
    break

  ## (Optional) Streaming
  print(tokenizer.decode(input_ids[0]), end='', flush=True)
print()

# 4. Output result
print(tokenizer.batch_decode(generated_tokens))
```
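
Since this lands in onnxruntime-genai, the exported model can also be driven through its own Python API instead of raw onnxruntime. A minimal sketch (the API has shifted between releases, so details like `append_tokens` are assumptions based on recent versions; adjust for yours):

```py
import onnxruntime_genai as og

# Folder produced by the model builder (config + ONNX weights)
model = og.Model("./path/to/model")
tokenizer = og.Tokenizer(model)

params = og.GeneratorParams(model)
params.set_search_options(max_length=1024)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("Write me a poem about Machine Learning."))

# Generate token-by-token until the search hits EOS or max_length
while not generator.is_done():
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))
```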

---------

Co-authored-by: kunal-vaishnavi <[email protected]>