Commit 6904c32

feat: Add ModernBERT model family
Add support for ModernBERT, a modern encoder model with architectural improvements over BERT:

- Rotary position embeddings (RoPE) instead of absolute position embeddings
- Alternating local and global attention layers for efficiency
- Gated linear units (GeGLU) in feed-forward blocks
- Pre-normalization with LayerNorm (no bias)
- First layer reuses embedding norm for attention

Supported architectures:

- :base
- :for_masked_language_modeling
- :for_sequence_classification
- :for_token_classification

Reference: https://arxiv.org/abs/2412.13663
1 parent a8caabd commit 6904c32

7 files changed, +1231 -0 lines changed

lib/bumblebee.ex

Lines changed: 10 additions & 0 deletions
@@ -170,6 +170,14 @@ defmodule Bumblebee do
     "MistralModel" => {Bumblebee.Text.Mistral, :base},
     "MistralForCausalLM" => {Bumblebee.Text.Mistral, :for_causal_language_modeling},
     "MistralForSequenceClassification" => {Bumblebee.Text.Mistral, :for_sequence_classification},
+    "ModernBertModel" => {Bumblebee.Text.ModernBert, :base},
+    "ModernBertForMaskedLM" => {Bumblebee.Text.ModernBert, :for_masked_language_modeling},
+    "ModernBertForSequenceClassification" =>
+      {Bumblebee.Text.ModernBert, :for_sequence_classification},
+    "ModernBertForTokenClassification" => {Bumblebee.Text.ModernBert, :for_token_classification},
+    "ModernBertDecoderModel" => {Bumblebee.Text.ModernBertDecoder, :base},
+    "ModernBertDecoderForCausalLM" =>
+      {Bumblebee.Text.ModernBertDecoder, :for_causal_language_modeling},
     "PhiModel" => {Bumblebee.Text.Phi, :base},
     "PhiForCausalLM" => {Bumblebee.Text.Phi, :for_causal_language_modeling},
     "PhiForSequenceClassification" => {Bumblebee.Text.Phi, :for_sequence_classification},
@@ -259,6 +267,8 @@ defmodule Bumblebee do
     "llama" => :llama,
     "mistral" => :llama,
     "mbart" => :mbart,
+    "modernbert" => :modernbert,
+    "modernbert-decoder" => :modernbert,
     "phi" => :code_gen,
     "phi3" => :llama,
     "qwen3" => :qwen2,
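
The two hunks above add entries to string-keyed lookup maps. As a rough illustration of how such maps can be consulted, here is a minimal, self-contained sketch. The module name `ArchitectureLookupSketch`, the attribute names, and the functions `resolve_model/1` and `resolve_alias/1` are hypothetical and are not taken from Bumblebee. Reading the second map as a table of type aliases is also an assumption, since the diff does not show the attribute it belongs to.

```elixir
defmodule ArchitectureLookupSketch do
  # Hypothetical subset of the first map in the diff: architecture name
  # => {module, architecture atom}. The Bumblebee modules are referenced
  # only as atoms, so this compiles without Bumblebee installed.
  @model_classes %{
    "ModernBertModel" => {Bumblebee.Text.ModernBert, :base},
    "ModernBertForMaskedLM" => {Bumblebee.Text.ModernBert, :for_masked_language_modeling},
    "ModernBertDecoderForCausalLM" =>
      {Bumblebee.Text.ModernBertDecoder, :for_causal_language_modeling}
  }

  # Hypothetical subset of the second map: both model types resolve to
  # the same :modernbert atom (assumed here to act as a shared alias).
  @type_aliases %{
    "modernbert" => :modernbert,
    "modernbert-decoder" => :modernbert
  }

  # Returns {:ok, {module, architecture}} or :error for unknown names.
  def resolve_model(name) when is_binary(name), do: Map.fetch(@model_classes, name)

  # Returns {:ok, alias_atom} or :error for unknown model types.
  def resolve_alias(model_type) when is_binary(model_type),
    do: Map.fetch(@type_aliases, model_type)
end
```

Returning `:error` for an unknown key, rather than raising, lets a caller decide how to report an unsupported architecture.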
