Commit 6904c32
feat: Add ModernBERT model family
Add support for ModernBERT, a modern encoder model with architectural improvements over BERT:
- Rotary position embeddings (RoPE) instead of absolute position embeddings
- Alternating local and global attention layers for efficiency
- Gated linear units (GeGLU) in feed-forward blocks
- Pre-normalization with LayerNorm (no bias)
- First layer reuses embedding norm for attention
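
As a rough illustration of two items in the list above, the following stdlib-only Python sketch shows GeGLU gating and the alternation between local and global attention masks. It is not the code added by this commit. The function names are hypothetical, and the defaults (a global layer every third layer, a 128-token local window split evenly on both sides) are assumptions taken from the published ModernBERT configuration rather than from this commit message.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def geglu(gate: list[float], value: list[float]) -> list[float]:
    """GeGLU gating: GELU(gate) * value, elementwise.

    In a real feed-forward block both inputs come from one linear
    projection split in half; here they are passed in directly.
    """
    if len(gate) != len(value):
        raise ValueError("gate and value must have the same length")
    return [gelu(g) * v for g, v in zip(gate, value)]


def is_global_layer(layer_idx: int, global_every: int = 3) -> bool:
    """Assumed schedule: layers 0, 3, 6, ... use global attention."""
    return layer_idx % global_every == 0


def attention_mask(
    seq_len: int, layer_idx: int, window: int = 128, global_every: int = 3
) -> list[list[bool]]:
    """Bidirectional mask: entry [i][j] is True if query i may attend to key j."""
    if is_global_layer(layer_idx, global_every):
        # Global layers attend over the full sequence.
        return [[True] * seq_len for _ in range(seq_len)]
    # Local layers attend within window // 2 positions on each side.
    half = window // 2
    return [[abs(i - j) <= half for j in range(seq_len)] for i in range(seq_len)]
```

With a toy window of 2, `attention_mask(5, 1, window=2)` lets each position see only itself and its immediate neighbours, while `attention_mask(5, 0, window=2)` is fully dense because layer 0 is treated as global under the assumed schedule.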
Supported architectures:
- :base
- :for_masked_language_modeling
- :for_sequence_classification
- :for_token_classification
Reference: https://arxiv.org/abs/2412.13663
1 parent a8caabd commit 6904c32
File tree
7 files changed (+1231, -0)
- lib
- bumblebee/text
- test/bumblebee/text
| Original file lines | Diff lines | Diff line change |
|---|---|---|
| 170-172 | 170-172 | |
| | 173-180 | + |
| 173-175 | 181-183 | |
| 259-261 | 267-269 | |
| | 270-271 | + |
| 262-264 | 272-274 | |