Commit af54694
feat: Add ModernBERT model family
Add support for ModernBERT, a modern encoder model with architectural improvements over BERT:
- Rotary position embeddings (RoPE) instead of absolute position embeddings
- Alternating local and global attention layers for efficiency
- Gated linear units (GeGLU) in feed-forward blocks
- Pre-normalization with LayerNorm (no bias)
- First layer skips its own attention pre-norm and reuses the embedding norm instead
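
To make two of the points above concrete, here is a minimal, illustrative sketch in plain Python. It is not the code from this commit. The "global attention every third layer" schedule, the symmetric window definition, and all function names are assumptions chosen for illustration; the actual values come from the model configuration.

```python
import math


def attention_kind(layer_idx, global_every=3):
    """Return "global" for every `global_every`-th layer, else "local".

    The default of 3 is an assumed schedule, not read from this commit.
    """
    return "global" if layer_idx % global_every == 0 else "local"


def local_mask(seq_len, window):
    """Sliding-window mask: position i may attend to j if |i - j| <= window // 2."""
    half = window // 2
    return [[abs(i - j) <= half for j in range(seq_len)] for i in range(seq_len)]


def gelu(x):
    """Exact GELU, written with the Gaussian error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def geglu(gate, value):
    """GeGLU gating: elementwise GELU(gate) * value over two equal-length halves."""
    return [gelu(g) * v for g, v in zip(gate, value)]


# Attention type per layer for a hypothetical 6-layer stack.
layer_kinds = [attention_kind(i) for i in range(6)]
```

In a feed-forward block, `gate` and `value` would be the two halves of a single input projection; local layers apply a mask like `local_mask` while global layers attend over the full sequence.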
Supported architectures:
- :base
- :for_masked_language_modeling
- :for_sequence_classification
- :for_token_classification
Reference: https://arxiv.org/abs/2412.13663
1 parent a8caabd commit af54694
File tree
4 files changed (+707, -0 lines)
- lib
- bumblebee/text
- test/bumblebee/text