* initializing branch and draft PR
* updated model card .md file
* minor
* minor
* Update docs/source/en/model_doc/nllb.md
Co-authored-by: Steven Liu <[email protected]>
* Update docs/source/en/model_doc/nllb.md
suggestion
Co-authored-by: Steven Liu <[email protected]>
* Update docs/source/en/model_doc/nllb.md
Co-authored-by: Steven Liu <[email protected]>
* Update docs/source/en/model_doc/nllb.md
suggestion
Co-authored-by: Steven Liu <[email protected]>
* Update docs/source/en/model_doc/nllb.md
suggestion
Co-authored-by: Steven Liu <[email protected]>
* Update docs/source/en/model_doc/nllb.md
suggestion
Co-authored-by: Steven Liu <[email protected]>
* Update docs/source/en/model_doc/nllb.md
suggestion
Co-authored-by: Steven Liu <[email protected]>
* resolving comments + adding visuals
* Update docs/source/en/model_doc/nllb.md
suggestion
Co-authored-by: Steven Liu <[email protected]>
* Update docs/source/en/model_doc/nllb.md
suggestion
Co-authored-by: Steven Liu <[email protected]>
* Update docs/source/en/model_doc/nllb.md
suggestion
Co-authored-by: Steven Liu <[email protected]>
* Update docs/source/en/model_doc/nllb.md
Co-authored-by: Steven Liu <[email protected]>
* Update docs/source/en/model_doc/nllb.md
suggestion
Co-authored-by: Steven Liu <[email protected]>
* Update docs/source/en/model_doc/nllb.md
Co-authored-by: Steven Liu <[email protected]>
* Update docs/source/en/model_doc/nllb.md
Co-authored-by: Steven Liu <[email protected]>
* NllbTokenizerFast and NllbTokenizer added
* endline
* minor
* Update nllb.md
---------
Co-authored-by: Sahil Kabir <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
+*This model was released on 2022-07-11 and added to Hugging Face Transformers on 2022-07-18.*

-**DISCLAIMER:** The default behaviour for the tokenizer was fixed and thus changed in April 2023.
-The previous version adds `[self.eos_token_id, self.cur_lang_code]` at the end of the token sequence for both target and source tokenization. This is wrong, as the NLLB paper notes (page 48, 6.1.1. Model Architecture):

-*Note that we prefix the source sequence with the source language, as opposed to the target
-language as previously done in several works (Arivazhagan et al., 2019; Johnson et al.,
-2017). This is primarily because we prioritize optimizing zero-shot performance of our
-model on any pair of 200 languages at a minor cost to supervised performance.*
+# NLLB

-Previous behaviour:
+[NLLB: No Language Left Behind](https://huggingface.co/papers/2207.04672) is a multilingual translation model. It's trained on data obtained with data mining techniques tailored for low-resource languages and supports over 200 languages. NLLB features a conditional compute architecture using a Sparsely Gated Mixture of Experts.
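
A minimal sketch of the fixed tokenization order described in the removed disclaimer above, assuming the `facebook/nllb-200-distilled-600M` checkpoint; the token lists in the comments are abridged and illustrative:

```py
# Sketch of the fixed (post-April 2023) tokenization order for the source text.
from transformers import NllbTokenizer

tokenizer = NllbTokenizer.from_pretrained("facebook/nllb-200-distilled-600M", src_lang="eng_Latn")
ids = tokenizer("UN Chief says there is no military solution in Syria").input_ids
print(tokenizer.convert_ids_to_tokens(ids))
# fixed order:  ['eng_Latn', '▁UN', '▁Chief', ..., '</s>']   (source language code first)
# legacy order: ['▁UN', '▁Chief', ..., '</s>', 'eng_Latn']   (eos + language code appended)
```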
pipeline("UN Chief says there is no military solution in Syria")
63
50
```
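
Only the final call of the added `Pipeline` example is visible above; a minimal sketch of the full call it implies, with illustrative dtype/device settings and the English-to-French pair used elsewhere on this page:

```py
# Sketch of the translation pipeline; torch.float16 and device=0 are illustrative settings.
import torch
from transformers import pipeline

pipeline = pipeline(
    task="translation",
    model="facebook/nllb-200-distilled-600M",
    torch_dtype=torch.float16,
    device=0,
    src_lang="eng_Latn",
    tgt_lang="fra_Latn",
)
pipeline("UN Chief says there is no military solution in Syria")
```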

-For more details, feel free to check the linked [PR](https://github.com/huggingface/transformers/pull/22313) and [Issue](https://github.com/huggingface/transformers/issues/19943).
-
-## Overview
+</hfoption>
+<hfoption id="AutoModel">

-The NLLB model was presented in [No Language Left Behind: Scaling Human-Centered Machine Translation](https://huggingface.co/papers/2207.04672) by Marta R. Costa-jussà, James Cross, Onur Çelebi,
-Maha Elbayad, Kenneth Heafield, Kevin Heffernan, Elahe Kalbassi, Janice Lam, Daniel Licht, Jean Maillard, Anna Sun, Skyler Wang, Guillaume Wenzek, Al Youngblood, Bapi Akula,
-Loic Barrault, Gabriel Mejia Gonzalez, Prangthip Hansanti, John Hoffman, Semarley Jarrett, Kaushik Ram Sadagopan, Dirk Rowe, Shannon Spruit, Chau Tran, Pierre Andrews,
+model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M", torch_dtype="auto", attn_implementation="sdpa")

-*Driven by the goal of eradicating language barriers on a global scale, machine translation has solidified itself as a key focus of artificial intelligence research today.
-However, such efforts have coalesced around a small subset of languages, leaving behind the vast majority of mostly low-resource languages. What does it take to break the
-200 language barrier while ensuring safe, high quality results, all while keeping ethical considerations in mind? In No Language Left Behind, we took on this challenge by
-first contextualizing the need for low-resource language translation support through exploratory interviews with native speakers. Then, we created datasets and models aimed
-at narrowing the performance gap between low and high-resource languages. More specifically, we developed a conditional compute model based on Sparsely Gated Mixture of
-Experts that is trained on data obtained with novel and effective data mining techniques tailored for low-resource languages. We propose multiple architectural and training
-improvements to counteract overfitting while training on thousands of tasks. Critically, we evaluated the performance of over 40,000 different translation directions using
-a human-translated benchmark, Flores-200, and combined human evaluation with a novel toxicity benchmark covering all languages in Flores-200 to assess translation safety.
-Our model achieves an improvement of 44% BLEU relative to the previous state-of-the-art, laying important groundwork towards realizing a universal translation system.*
+article = "UN Chief says there is no military solution in Syria"
+inputs = tokenizer(article, return_tensors="pt")

-This implementation contains the dense models available on release.
-**The sparse model NLLB-MoE (Mixture of Experts) is now available! More details [here](nllb-moe)**
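
Only three lines of the added AutoModel example survive in this capture; a minimal end-to-end sketch of the same flow, forcing the target language with `forced_bos_token_id` (the French target and `max_new_tokens` value are illustrative choices):

```py
# Sketch completing the AutoModel fragment above: tokenize, generate, decode.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M", src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(
    "facebook/nllb-200-distilled-600M", torch_dtype="auto", attn_implementation="sdpa"
)

article = "UN Chief says there is no military solution in Syria"
inputs = tokenizer(article, return_tensors="pt")

# The first generated token is forced to the target language code (French here).
translated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_new_tokens=50,
)
print(tokenizer.batch_decode(translated, skip_special_tokens=True)[0])
```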
+</hfoption>
+<hfoption id="transformers CLI">

-This model was contributed by [Lysandre](https://huggingface.co/lysandre). The authors' code can be found [here](https://github.com/facebookresearch/fairseq/tree/nllb).
+```bash
+echo -e "UN Chief says there is no military solution in Syria" | transformers run --task "translation_en_to_fr" --model facebook/nllb-200-distilled-600M --device 0
+```

-## Generating with NLLB
+</hfoption>
+</hfoptions>

-While generating the target text, set the `forced_bos_token_id` to the target language id. The following
-example shows how to translate English to French using the *facebook/nllb-200-distilled-600M* model.
+Quantization reduces the memory burden of large models by representing the weights in a lower precision. Refer to the [Quantization](../quantization/overview) overview for more available quantization backends.

-Note that we're using the BCP-47 code for French `fra_Latn`. See [here](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200)
-for the list of all BCP-47 codes in the Flores-200 dataset.
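
A quick way to check the available codes without opening the FLORES-200 list, assuming (as the NLLB tokenizers do) that the language codes are registered as additional special tokens:

```py
# Inspect the FLORES-200 BCP-47 codes known to the tokenizer (e.g. "fra_Latn" for French).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
print(len(tokenizer.additional_special_tokens))           # number of registered language codes
print("fra_Latn" in tokenizer.additional_special_tokens)  # True
```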
+The example below uses [bitsandbytes](../quantization/bitsandbytes) to quantize the weights to 8-bits.
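
The 8-bit example itself is not captured here; a minimal sketch of what the sentence above describes, assuming `bitsandbytes` is installed and a CUDA device is available:

```py
# Sketch of loading NLLB with 8-bit bitsandbytes quantization.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(load_in_8bit=True)
tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M", src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(
    "facebook/nllb-200-distilled-600M",
    quantization_config=quantization_config,
    device_map="auto",
)
```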
-### Generating from any other language than English
-
-English (`eng_Latn`) is set as the default language from which to translate. In order to specify that you'd like to translate from a different language,
-you should specify the BCP-47 code in the `src_lang` keyword argument of the tokenizer initialization.
-
-See example below for a translation from Romanian to German:
+Use the [AttentionMaskVisualizer](https://github.com/huggingface/transformers/blob/main/src/transformers/utils/attention_visualizer.py#L139) to better understand what tokens the model can and cannot attend to.
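
A short sketch of how the visualizer is typically invoked in the Transformers model docs (the constructor and call signature here are assumed from those docs rather than shown in this diff):

```py
# Sketch of visualizing which tokens the model can attend to.
from transformers.utils.attention_visualizer import AttentionMaskVisualizer

visualizer = AttentionMaskVisualizer("facebook/nllb-200-distilled-600M")
visualizer("UN Chief says there is no military solution in Syria")
```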
+- The tokenizer was updated in April 2023 to prefix the source sequence with the source language rather than the target language. This prioritizes zero-shot performance at a minor cost to supervised performance.
+- For non-English languages, specify the language's [BCP-47](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200) code with the `src_lang` keyword as shown below.
+
+- See example below for a translation from Romanian to German.
+Le chef de l'ONU dit qu'il n'y a pas de solution militaire en Syrie
+```
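
The Romanian-to-German example referenced above is not captured in this diff; a minimal sketch of the described `src_lang` usage with the FLORES-200 codes `ron_Latn` and `deu_Latn` (the input sentence is illustrative):

```py
# Sketch of translating from a non-English source: set src_lang at tokenizer
# initialization and force the target language code at generation time.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M", src_lang="ron_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")

article = "Şeful ONU spune că nu există o soluţie militară în Siria"
inputs = tokenizer(article, return_tensors="pt")

translated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("deu_Latn"),
    max_new_tokens=50,
)
print(tokenizer.batch_decode(translated, skip_special_tokens=True)[0])
# UN-Chef sagt, es gibt keine militärische Lösung in Syrien
```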

 ## NllbTokenizer
@@ -152,64 +156,3 @@ UN-Chef sagt, es gibt keine militärische Lösung in Syrien

 ## NllbTokenizerFast

 [[autodoc]] NllbTokenizerFast
-## Using Flash Attention 2
-
-Flash Attention 2 is a faster, optimized version of the attention scores computation which relies on `cuda` kernels.
-
-### Installation
-
-First, check whether your hardware is compatible with Flash Attention 2. The latest list of compatible hardware can be found in the [official documentation](https://github.com/Dao-AILab/flash-attention#installation-and-features).
-
-Next, [install](https://github.com/Dao-AILab/flash-attention#installation-and-features) the latest version of Flash Attention 2:
-
-```bash
-pip install -U flash-attn --no-build-isolation
-```
-
-### Usage
-
-To load a model using Flash Attention 2, we can pass the argument `attn_implementation="flash_attention_2"` to [`.from_pretrained`](https://huggingface.co/docs/transformers/main/en/main_classes/model#transformers.PreTrainedModel.from_pretrained). You can use either `torch.float16` or `torch.bfloat16` precision.
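
The usage snippet that followed this removed section is not captured; a minimal sketch of the load call it describes, assuming a CUDA device and an installed `flash-attn`:

```py
# Sketch of loading NLLB with Flash Attention 2 in half precision.
import torch
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained(
    "facebook/nllb-200-distilled-600M",
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
).to("cuda")
```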