Commit 85b8eb2

Add nanochat (#1441)
* Add support for NanoChat (huggingface/transformers#41634)
* Add nanochat to supported models list
1 parent fcf2ec9 commit 85b8eb2

5 files changed: +17 −1 lines

README.md

Lines changed: 1 addition & 0 deletions
@@ -393,6 +393,7 @@ You can refine your search by selecting the task you're interested in (e.g., [te
 1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://huggingface.co/papers/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
 1. **[MPT](https://huggingface.co/docs/transformers/model_doc/mpt)** (from MosaicML) released with the repository [llm-foundry](https://github.com/mosaicml/llm-foundry/) by the MosaicML NLP Team.
 1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://huggingface.co/papers/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
+1. **[NanoChat](https://huggingface.co/docs/transformers/model_doc/nanochat)** released with the repository [nanochat: The best ChatGPT that $100 can buy](https://github.com/karpathy/nanochat) by Andrej Karpathy.
 1. **NeoBERT** (from Chandar Research Lab) released with the paper [NeoBERT: A Next-Generation BERT](https://huggingface.co/papers/2502.19587) by Lola Le Breton, Quentin Fournier, Mariam El Mezouar, John X. Morris, Sarath Chandar.
 1. **[NLLB](https://huggingface.co/docs/transformers/model_doc/nllb)** (from Meta) released with the paper [No Language Left Behind: Scaling Human-Centered Machine Translation](https://huggingface.co/papers/2207.04672) by the NLLB team.
 1. **[Nougat](https://huggingface.co/docs/transformers/model_doc/nougat)** (from Meta AI) released with the paper [Nougat: Neural Optical Understanding for Academic Documents](https://huggingface.co/papers/2308.13418) by Lukas Blecher, Guillem Cucurull, Thomas Scialom, Robert Stojnic.

docs/snippets/6_supported-models.snippet

Lines changed: 1 addition & 0 deletions
@@ -107,6 +107,7 @@
 1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://huggingface.co/papers/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
 1. **[MPT](https://huggingface.co/docs/transformers/model_doc/mpt)** (from MosaicML) released with the repository [llm-foundry](https://github.com/mosaicml/llm-foundry/) by the MosaicML NLP Team.
 1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://huggingface.co/papers/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
+1. **[NanoChat](https://huggingface.co/docs/transformers/model_doc/nanochat)** released with the repository [nanochat: The best ChatGPT that $100 can buy](https://github.com/karpathy/nanochat) by Andrej Karpathy.
 1. **NeoBERT** (from Chandar Research Lab) released with the paper [NeoBERT: A Next-Generation BERT](https://huggingface.co/papers/2502.19587) by Lola Le Breton, Quentin Fournier, Mariam El Mezouar, John X. Morris, Sarath Chandar.
 1. **[NLLB](https://huggingface.co/docs/transformers/model_doc/nllb)** (from Meta) released with the paper [No Language Left Behind: Scaling Human-Centered Machine Translation](https://huggingface.co/papers/2207.04672) by the NLLB team.
 1. **[Nougat](https://huggingface.co/docs/transformers/model_doc/nougat)** (from Meta AI) released with the paper [Nougat: Neural Optical Understanding for Academic Documents](https://huggingface.co/papers/2308.13418) by Lukas Blecher, Guillem Cucurull, Thomas Scialom, Robert Stojnic.

src/configs.js

Lines changed: 1 addition & 0 deletions
@@ -112,6 +112,7 @@ function getNormalizedConfig(config) {
             break;
         case 'llama':
         case 'llama4_text':
+        case 'nanochat':
         case 'arcee':
         case 'lfm2':
         case 'smollm3':
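
The configs.js change is just a fall-through `case`, so `'nanochat'` gets normalized exactly like the llama-family model types around it. A rough sketch of what that branch does is below; the mapped field names are assumptions for illustration only, since the body of the branch is not part of this diff:

```js
// Illustrative only: the real getNormalizedConfig lives in src/configs.js and
// the exact keys it maps are not shown in this commit's diff.
function getNormalizedConfigSketch(config) {
    const mapping = {};
    switch (config.model_type) {
        case 'llama':
        case 'llama4_text':
        case 'nanochat': // new: reuses the llama-style branch
        case 'arcee':
        case 'lfm2':
        case 'smollm3':
            // Hypothetical llama-style field names, used only to show the fall-through idea.
            mapping['num_heads'] = 'num_key_value_heads';
            mapping['num_layers'] = 'num_hidden_layers';
            break;
    }
    return mapping;
}
```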

src/models.js

Lines changed: 8 additions & 0 deletions
@@ -4585,6 +4585,12 @@ export class Llama4PreTrainedModel extends PreTrainedModel { }
 export class Llama4ForCausalLM extends Llama4PreTrainedModel { }
 //////////////////////////////////////////////////
 
+//////////////////////////////////////////////////
+// NanoChat models
+export class NanoChatPreTrainedModel extends PreTrainedModel { }
+export class NanoChatModel extends NanoChatPreTrainedModel { }
+export class NanoChatForCausalLM extends NanoChatPreTrainedModel { }
+//////////////////////////////////////////////////
 
 //////////////////////////////////////////////////
 // Arcee models
@@ -7845,6 +7851,7 @@ const MODEL_MAPPING_NAMES_DECODER_ONLY = new Map([
     ['gpt_neox', ['GPTNeoXModel', GPTNeoXModel]],
     ['codegen', ['CodeGenModel', CodeGenModel]],
     ['llama', ['LlamaModel', LlamaModel]],
+    ['nanochat', ['NanoChatModel', NanoChatModel]],
     ['arcee', ['ArceeModel', ArceeModel]],
     ['lfm2', ['Lfm2Model', Lfm2Model]],
     ['smollm3', ['SmolLM3Model', SmolLM3Model]],
@@ -7955,6 +7962,7 @@ const MODEL_FOR_CAUSAL_LM_MAPPING_NAMES = new Map([
     ['gpt_neox', ['GPTNeoXForCausalLM', GPTNeoXForCausalLM]],
     ['codegen', ['CodeGenForCausalLM', CodeGenForCausalLM]],
     ['llama', ['LlamaForCausalLM', LlamaForCausalLM]],
+    ['nanochat', ['NanoChatForCausalLM', NanoChatForCausalLM]],
    ['llama4_text', ['Llama4ForCausalLM', Llama4ForCausalLM]],
     ['arcee', ['ArceeForCausalLM', ArceeForCausalLM]],
     ['lfm2', ['Lfm2ForCausalLM', Lfm2ForCausalLM]],
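
The new classes plus the two map entries are what let the existing auto-class and pipeline machinery resolve `model_type: "nanochat"`. A minimal usage sketch, assuming an ONNX-converted nanochat checkpoint is available on the Hub (the model id below is a placeholder, not something this commit confirms):

```js
import { pipeline } from '@huggingface/transformers';

// Placeholder model id: substitute an actual ONNX-converted nanochat checkpoint.
const generator = await pipeline('text-generation', 'onnx-community/nanochat-ONNX');

const messages = [
    { role: 'user', content: 'Explain what a byte-pair encoding tokenizer does.' },
];

// Chat-formatted input: the tokenizer's chat template is applied before generation.
const output = await generator(messages, { max_new_tokens: 128 });
console.log(output[0].generated_text.at(-1).content);
```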

src/tokenizers.js

Lines changed: 6 additions & 1 deletion
@@ -278,9 +278,14 @@ const BLOOM_SPLIT_CHARS = '.,!?\u2026\u3002\uff0c\u3001\u0964\u06d4\u060c';
 
 // A mapping of regex patterns to their equivalent (but possibly longer) JS-compatible versions.
 const PROBLEMATIC_REGEX_MAP = new Map([
-    // This uses the case insensitive group modifier, which is not supported in JavaScript.
+    // These use the case insensitive group modifier, which is not supported in JavaScript.
     // When parsing the regex, an "Invalid group" error is thrown.
     ["(?i:'s|'t|'re|'ve|'m|'ll|'d)", "(?:'([sS]|[tT]|[rR][eE]|[vV][eE]|[mM]|[lL][lL]|[dD]))"],
+    ["(?i:[sdmt]|ll|ve|re)", "(?:[sS]|[dD]|[mM]|[tT]|[lL][lL]|[vV][eE]|[rR][eE])"],
+
+    // JS doesn't support possessive quantifiers (these are used in recent OpenAI tokenizers).
+    ["[^\\r\\n\\p{L}\\p{N}]?+", "[^\\r\\n\\p{L}\\p{N}]?"],
+    ["[^\\s\\p{L}\\p{N}]++", "[^\\s\\p{L}\\p{N}]+"],
 
     // Used to override the default (invalid) regex of the bloom pretokenizer.
     // For more information, see https://github.com/huggingface/transformers.js/issues/94
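
The tokenizer change is needed because the nanochat pre-tokenization regex uses PCRE-style constructs that JavaScript's `RegExp` rejects; the map above swaps them for JS-compatible equivalents at load time. A quick illustration of the failure mode and the substituted pattern (this shows standard `RegExp` behaviour, not library-specific code):

```js
// The inline case-insensitive group and the possessive quantifier below are valid in
// PCRE/Rust regex syntax but throw when handed to JavaScript's RegExp constructor.
for (const pattern of ["(?i:[sdmt]|ll|ve|re)", "[^\\s\\p{L}\\p{N}]++"]) {
    try {
        new RegExp(pattern, "gu");
    } catch (e) {
        console.log(`${pattern} -> ${e.message}`); // "Invalid group" / "Nothing to repeat"
    }
}

// The mapped greedy replacement compiles and splits punctuation runs as expected.
const fixed = new RegExp("[^\\s\\p{L}\\p{N}]+", "gu");
console.log("Hello, world!!".match(fixed)); // [ ',', '!!' ]
```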
