Commit af54694

feat: Add ModernBERT model family
Add support for ModernBERT, a modern encoder model with architectural improvements over BERT:

- Rotary position embeddings (RoPE) instead of absolute position embeddings
- Alternating local and global attention layers for efficiency
- Gated linear units (GeGLU) in feed-forward blocks
- Pre-normalization with LayerNorm (no bias)
- First layer reuses embedding norm for attention

Supported architectures:

- :base
- :for_masked_language_modeling
- :for_sequence_classification
- :for_token_classification

Reference: https://arxiv.org/abs/2412.13663
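To illustrate the "alternating local and global attention layers" item in the commit message, here is a minimal sketch of how a per-layer attention pattern could be derived. The module `AttentionPatternSketch`, its functions, and the `period` parameter are hypothetical and are not taken from this commit. The period of 3 used in the call is an assumption; the commit message does not state the actual interval.

```elixir
defmodule AttentionPatternSketch do
  # Hypothetical helper, not part of Bumblebee.
  # Returns :global for every `period`-th layer (zero-based) and :local otherwise.
  def attention_kind(layer_idx, period) when layer_idx >= 0 and period > 0 do
    if rem(layer_idx, period) == 0, do: :global, else: :local
  end

  # Builds the attention pattern for a stack of `num_layers` layers.
  def pattern(num_layers, period) when num_layers > 0 do
    for idx <- 0..(num_layers - 1), do: attention_kind(idx, period)
  end
end

# Assumed period of 3: layers 0 and 3 attend globally, the rest locally.
IO.inspect(AttentionPatternSketch.pattern(6, 3))
# prints [:global, :local, :local, :global, :local, :local]
```

Local layers restrict each token to a sliding window of neighbours, which is where the efficiency gain comes from. The occasional global layer lets information travel across the whole sequence.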
1 parent a8caabd commit af54694

4 files changed: 707 additions, 0 deletions

lib/bumblebee.ex

Lines changed: 6 additions & 0 deletions
@@ -170,6 +170,11 @@ defmodule Bumblebee do
     "MistralModel" => {Bumblebee.Text.Mistral, :base},
     "MistralForCausalLM" => {Bumblebee.Text.Mistral, :for_causal_language_modeling},
     "MistralForSequenceClassification" => {Bumblebee.Text.Mistral, :for_sequence_classification},
+    "ModernBertModel" => {Bumblebee.Text.ModernBert, :base},
+    "ModernBertForMaskedLM" => {Bumblebee.Text.ModernBert, :for_masked_language_modeling},
+    "ModernBertForSequenceClassification" =>
+      {Bumblebee.Text.ModernBert, :for_sequence_classification},
+    "ModernBertForTokenClassification" => {Bumblebee.Text.ModernBert, :for_token_classification},
     "PhiModel" => {Bumblebee.Text.Phi, :base},
     "PhiForCausalLM" => {Bumblebee.Text.Phi, :for_causal_language_modeling},
     "PhiForSequenceClassification" => {Bumblebee.Text.Phi, :for_sequence_classification},
@@ -259,6 +264,7 @@ defmodule Bumblebee do
     "llama" => :llama,
     "mistral" => :llama,
     "mbart" => :mbart,
+    "modernbert" => :modernbert,
     "phi" => :code_gen,
     "phi3" => :llama,
     "qwen3" => :qwen2,

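The first hunk adds entries to a map from a checkpoint's class name to a `{module, architecture}` tuple. As a rough sketch of how such a map can be consulted, the hypothetical module below copies only the four ModernBERT entries from the diff. `ArchitectureLookupSketch` and `resolve/1` are illustrative names, not Bumblebee's actual internals.

```elixir
defmodule ArchitectureLookupSketch do
  # The four entries added in this commit; the real map in lib/bumblebee.ex
  # covers many more model families.
  @model_mapping %{
    "ModernBertModel" => {Bumblebee.Text.ModernBert, :base},
    "ModernBertForMaskedLM" => {Bumblebee.Text.ModernBert, :for_masked_language_modeling},
    "ModernBertForSequenceClassification" =>
      {Bumblebee.Text.ModernBert, :for_sequence_classification},
    "ModernBertForTokenClassification" => {Bumblebee.Text.ModernBert, :for_token_classification}
  }

  # Hypothetical lookup: resolves a class name to its module and architecture.
  def resolve(class_name) when is_binary(class_name) do
    case Map.fetch(@model_mapping, class_name) do
      {:ok, {module, architecture}} -> {:ok, module, architecture}
      :error -> {:error, "unsupported model class: " <> class_name}
    end
  end
end

IO.inspect(ArchitectureLookupSketch.resolve("ModernBertForMaskedLM"))
IO.inspect(ArchitectureLookupSketch.resolve("UnknownModel"))
```

Module aliases such as `Bumblebee.Text.ModernBert` are plain atoms in Elixir, so the sketch compiles without Bumblebee installed. The second hunk follows the same pattern for tokenizers, mapping the `"modernbert"` type string to the `:modernbert` tokenizer type.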