Commit 9b9bd21

Add support for NeoBERT (#1350)

1 parent aabb290

File tree

3 files changed: +61 -0 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -385,6 +385,7 @@ You can refine your search by selecting the task you're interested in (e.g., [te
 1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://huggingface.co/papers/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
 1. **[MPT](https://huggingface.co/docs/transformers/model_doc/mpt)** (from MosaicML) released with the repository [llm-foundry](https://github.com/mosaicml/llm-foundry/) by the MosaicML NLP Team.
 1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://huggingface.co/papers/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
+1. **NeoBERT** (from Chandar Research Lab) released with the paper [NeoBERT: A Next-Generation BERT](https://huggingface.co/papers/2502.19587) by Lola Le Breton, Quentin Fournier, Mariam El Mezouar, John X. Morris, Sarath Chandar.
 1. **[NLLB](https://huggingface.co/docs/transformers/model_doc/nllb)** (from Meta) released with the paper [No Language Left Behind: Scaling Human-Centered Machine Translation](https://huggingface.co/papers/2207.04672) by the NLLB team.
 1. **[Nougat](https://huggingface.co/docs/transformers/model_doc/nougat)** (from Meta AI) released with the paper [Nougat: Neural Optical Understanding for Academic Documents](https://huggingface.co/papers/2308.13418) by Lukas Blecher, Guillem Cucurull, Thomas Scialom, Robert Stojnic.
 1. **[OLMo](https://huggingface.co/docs/transformers/master/model_doc/olmo)** (from Ai2) released with the paper [OLMo: Accelerating the Science of Language Models](https://huggingface.co/papers/2402.00838) by Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi.

docs/snippets/6_supported-models.snippet

Lines changed: 1 addition & 0 deletions
@@ -99,6 +99,7 @@
 1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://huggingface.co/papers/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
 1. **[MPT](https://huggingface.co/docs/transformers/model_doc/mpt)** (from MosaicML) released with the repository [llm-foundry](https://github.com/mosaicml/llm-foundry/) by the MosaicML NLP Team.
 1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://huggingface.co/papers/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
+1. **NeoBERT** (from Chandar Research Lab) released with the paper [NeoBERT: A Next-Generation BERT](https://huggingface.co/papers/2502.19587) by Lola Le Breton, Quentin Fournier, Mariam El Mezouar, John X. Morris, Sarath Chandar.
 1. **[NLLB](https://huggingface.co/docs/transformers/model_doc/nllb)** (from Meta) released with the paper [No Language Left Behind: Scaling Human-Centered Machine Translation](https://huggingface.co/papers/2207.04672) by the NLLB team.
 1. **[Nougat](https://huggingface.co/docs/transformers/model_doc/nougat)** (from Meta AI) released with the paper [Nougat: Neural Optical Understanding for Academic Documents](https://huggingface.co/papers/2308.13418) by Lukas Blecher, Guillem Cucurull, Thomas Scialom, Robert Stojnic.
 1. **[OLMo](https://huggingface.co/docs/transformers/master/model_doc/olmo)** (from Ai2) released with the paper [OLMo: Accelerating the Science of Language Models](https://huggingface.co/papers/2402.00838) by Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi.

src/models.js

Lines changed: 59 additions & 0 deletions
@@ -2132,6 +2132,60 @@ export class BertForQuestionAnswering extends BertPreTrainedModel {
 }
 //////////////////////////////////////////////////
 
+//////////////////////////////////////////////////
+// NeoBert models
+export class NeoBertPreTrainedModel extends PreTrainedModel { }
+export class NeoBertModel extends NeoBertPreTrainedModel { }
+
+export class NeoBertForMaskedLM extends NeoBertPreTrainedModel {
+    /**
+     * Calls the model on new inputs.
+     *
+     * @param {Object} model_inputs The inputs to the model.
+     * @returns {Promise<MaskedLMOutput>} An object containing the model's output logits for masked language modeling.
+     */
+    async _call(model_inputs) {
+        return new MaskedLMOutput(await super._call(model_inputs));
+    }
+}
+
+export class NeoBertForSequenceClassification extends NeoBertPreTrainedModel {
+    /**
+     * Calls the model on new inputs.
+     *
+     * @param {Object} model_inputs The inputs to the model.
+     * @returns {Promise<SequenceClassifierOutput>} An object containing the model's output logits for sequence classification.
+     */
+    async _call(model_inputs) {
+        return new SequenceClassifierOutput(await super._call(model_inputs));
+    }
+}
+
+export class NeoBertForTokenClassification extends NeoBertPreTrainedModel {
+    /**
+     * Calls the model on new inputs.
+     *
+     * @param {Object} model_inputs The inputs to the model.
+     * @returns {Promise<TokenClassifierOutput>} An object containing the model's output logits for token classification.
+     */
+    async _call(model_inputs) {
+        return new TokenClassifierOutput(await super._call(model_inputs));
+    }
+}
+
+export class NeoBertForQuestionAnswering extends NeoBertPreTrainedModel {
+    /**
+     * Calls the model on new inputs.
+     *
+     * @param {Object} model_inputs The inputs to the model.
+     * @returns {Promise<QuestionAnsweringModelOutput>} An object containing the model's output logits for question answering.
+     */
+    async _call(model_inputs) {
+        return new QuestionAnsweringModelOutput(await super._call(model_inputs));
+    }
+}
+//////////////////////////////////////////////////
+
 //////////////////////////////////////////////////
 // ModernBert models
 export class ModernBertPreTrainedModel extends PreTrainedModel { }
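Every task-specific class above follows one pattern: subclass the base pretrained model and override `_call` so the raw forward-pass result is wrapped in a typed output object. A minimal self-contained sketch of that wrapper pattern (the `Fake*` classes here are illustrative stand-ins, not the actual transformers.js classes):

```javascript
// Stand-in for a base pretrained model whose _call runs inference
// and returns a plain object (illustrative only).
class FakePreTrainedModel {
    async _call(model_inputs) {
        // Pretend inference: one zero "logit" per input token.
        return { logits: model_inputs.input_ids.map(() => 0) };
    }
}

// Stand-in for a typed output wrapper like MaskedLMOutput.
class FakeMaskedLMOutput {
    constructor({ logits }) {
        this.logits = logits;
    }
}

// Same shape as NeoBertForMaskedLM: override _call and wrap
// super._call's result in the task-specific output class.
class FakeNeoBertForMaskedLM extends FakePreTrainedModel {
    async _call(model_inputs) {
        return new FakeMaskedLMOutput(await super._call(model_inputs));
    }
}

const model = new FakeNeoBertForMaskedLM();
model._call({ input_ids: [101, 103, 102] }).then((out) => {
    console.log(out instanceof FakeMaskedLMOutput); // true
    console.log(out.logits.length); // 3
});
```

Because the wrapping happens in `_call`, downstream code (e.g. pipelines) can rely on getting a `MaskedLMOutput`, `SequenceClassifierOutput`, etc., regardless of which encoder architecture produced the logits.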
@@ -7619,6 +7673,7 @@ export class PretrainedMixin {
 
 const MODEL_MAPPING_NAMES_ENCODER_ONLY = new Map([
     ['bert', ['BertModel', BertModel]],
+    ['neobert', ['NeoBertModel', NeoBertModel]],
     ['modernbert', ['ModernBertModel', ModernBertModel]],
     ['nomic_bert', ['NomicBertModel', NomicBertModel]],
     ['roformer', ['RoFormerModel', RoFormerModel]],

@@ -7774,6 +7829,7 @@ const MODEL_FOR_TEXT_TO_WAVEFORM_MAPPING_NAMES = new Map([
 
 const MODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING_NAMES = new Map([
     ['bert', ['BertForSequenceClassification', BertForSequenceClassification]],
+    ['neobert', ['NeoBertForSequenceClassification', NeoBertForSequenceClassification]],
     ['modernbert', ['ModernBertForSequenceClassification', ModernBertForSequenceClassification]],
     ['roformer', ['RoFormerForSequenceClassification', RoFormerForSequenceClassification]],
     ['electra', ['ElectraForSequenceClassification', ElectraForSequenceClassification]],

@@ -7796,6 +7852,7 @@ const MODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING_NAMES = new Map([
 
 const MODEL_FOR_TOKEN_CLASSIFICATION_MAPPING_NAMES = new Map([
     ['bert', ['BertForTokenClassification', BertForTokenClassification]],
+    ['neobert', ['NeoBertForTokenClassification', NeoBertForTokenClassification]],
     ['modernbert', ['ModernBertForTokenClassification', ModernBertForTokenClassification]],
     ['roformer', ['RoFormerForTokenClassification', RoFormerForTokenClassification]],
     ['electra', ['ElectraForTokenClassification', ElectraForTokenClassification]],

@@ -7869,6 +7926,7 @@ const MODEL_FOR_MULTIMODALITY_MAPPING_NAMES = new Map([
 
 const MODEL_FOR_MASKED_LM_MAPPING_NAMES = new Map([
     ['bert', ['BertForMaskedLM', BertForMaskedLM]],
+    ['neobert', ['NeoBertForMaskedLM', NeoBertForMaskedLM]],
     ['modernbert', ['ModernBertForMaskedLM', ModernBertForMaskedLM]],
     ['roformer', ['RoFormerForMaskedLM', RoFormerForMaskedLM]],
     ['electra', ['ElectraForMaskedLM', ElectraForMaskedLM]],

@@ -7889,6 +7947,7 @@ const MODEL_FOR_MASKED_LM_MAPPING_NAMES = new Map([
 
 const MODEL_FOR_QUESTION_ANSWERING_MAPPING_NAMES = new Map([
     ['bert', ['BertForQuestionAnswering', BertForQuestionAnswering]],
+    ['neobert', ['NeoBertForQuestionAnswering', NeoBertForQuestionAnswering]],
     ['roformer', ['RoFormerForQuestionAnswering', RoFormerForQuestionAnswering]],
     ['electra', ['ElectraForQuestionAnswering', ElectraForQuestionAnswering]],
     ['convbert', ['ConvBertForQuestionAnswering', ConvBertForQuestionAnswering]],
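The mapping hunks are what make the new classes reachable: each per-task `Map` keys a `model_type` string (as found in a model's `config.json`) to a `[className, class]` pair, so registering `'neobert'` in a map is what lets the loader pick the right class for that task. A small sketch of that lookup scheme, using placeholder classes rather than the real transformers.js ones:

```javascript
// Placeholder classes standing in for the real model classes.
class BertModelStub { }
class NeoBertModelStub { }

// Same structure as MODEL_MAPPING_NAMES_ENCODER_ONLY in src/models.js:
// model_type string -> [class name, class constructor].
const ENCODER_ONLY_MAPPING = new Map([
    ['bert', ['BertModel', BertModelStub]],
    ['neobert', ['NeoBertModel', NeoBertModelStub]],
]);

// Hypothetical resolver: given a parsed config, return the registered class.
function resolveModelClass(config, mapping) {
    const entry = mapping.get(config.model_type);
    if (!entry) {
        throw new Error(`Unsupported model type: ${config.model_type}`);
    }
    const [className, cls] = entry;
    return cls;
}

const cls = resolveModelClass({ model_type: 'neobert' }, ENCODER_ONLY_MAPPING);
console.log(cls === NeoBertModelStub); // true
```

This is why the diff touches one map per task head: a model type absent from, say, `MODEL_FOR_QUESTION_ANSWERING_MAPPING_NAMES` would still fail to load for question answering even if its classes exist.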
