Commit 5528afb

Added link to XLM-RoBERTa pre-trained model
1 parent c9c0d3f commit 5528afb

File tree

1 file changed (+1 −0 lines)

README.md

Lines changed: 1 addition & 0 deletions
@@ -179,3 +179,4 @@ This resource was created in a semi-automatic way, by extracting the words and t
 - [Multilingual BERT](https://github.com/google-research/bert/blob/master/multilingual.md) - BERT (Bidirectional Encoder Representations from Transformers) is a model for generating contextual word representations. The multilingual cased model provided by Google supports 104 languages, including Polish.
 - [Universal Sentence Encoder](https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/1) - USE (Universal Sentence Encoder) generates sentence-level language representations. The pre-trained multilingual model supports 16 languages (Arabic, Chinese-simplified, Chinese-traditional, English, French, German, Italian, Japanese, Korean, Dutch, Polish, Portuguese, Spanish, Thai, Turkish, Russian).
 - [LASER Language-Agnostic SEntence Representations](https://github.com/facebookresearch/LASER) - A multilingual sentence encoder by Facebook Research, supporting 93 languages.
+- [XLM-RoBERTa](https://github.com/pytorch/fairseq/tree/master/examples/xlmr) - Cross-lingual sentence encoder trained on 2.5 terabytes of data from CommonCrawl and Wikipedia. Supports 100 languages, including Polish. See [Unsupervised Cross-lingual Representation Learning at Scale](https://arxiv.org/pdf/1911.02116.pdf) for details.
