Russian model proposal #6628
-
@honnibal @adrianeboyd
-
This looks really cool! The results are certainly impressive. I do think we'll be able to integrate it as an official model, especially since it's MIT licensed.
The main requirement we have for distributing the models is that we need to add them to our automation systems, so that we can train the model ourselves. It looks like it'll be pretty easy to do that, the main annoyance being how bad the JSON format is in v2, which makes the training corpus hard to work with.
For spaCy v3, the model training process is now much simpler: we use the
The thing that would be most helpful from your side is to make sure it still works with spaCy v3, using the current
Quick question about your evaluation: is the evaluation also on silver-standard data, or are the development and test data gold-standard?
-
Sure thing, I will check it. How to reproduce
We use only standard SpaCy commands
Yep, the evaluation is also silver-standard. But we also check the final model on a number of Russian gold-standard datasets; see the morphology, syntax, and NER sections of our repo dedicated to public model evaluations.
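To make the silver versus gold distinction concrete, here is a minimal sketch of scoring the same predictions against both kinds of reference labels. The function name `tag_accuracy` and the toy tag sequences are hypothetical and are not taken from the natasha-spacy repo; the point is only that a score against silver labels measures agreement with whatever produced those labels, while a score against gold labels measures agreement with human annotation.

```python
from typing import Sequence


def tag_accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Return the fraction of positions where predicted and reference tags match."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical toy data: four tokens tagged by the model under evaluation.
predicted = ["NOUN", "VERB", "ADJ", "NOUN"]
# Silver labels come from an automatic annotator, so they can share its mistakes.
silver = ["NOUN", "VERB", "ADJ", "NOUN"]
# Gold labels come from human annotation.
gold = ["NOUN", "VERB", "ADV", "NOUN"]

silver_acc = tag_accuracy(predicted, silver)
gold_acc = tag_accuracy(predicted, gold)
print(f"silver: {silver_acc:.2f}, gold: {gold_acc:.2f}")
```

In this toy case the model agrees with the silver labels on every token but differs from the gold labels on one, which is why reporting both numbers is useful.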
-
Added SpaCy v3 support using SpaCy Projects.
I think the problem was in the config. By default
-
This is GREAT news! Thanks to all involved indeed, and we can't wait to see it up there with the other languages...
-
The proposal was accepted in v3.0.0rc3: https://github.com/explosion/spaCy/releases/tag/v3.0.0rc3. See https://nightly.spacy.io/models/ru for the official Russian model pretrained by the spaCy team.
-
cc @buriy
We trained a high-quality Russian model for SpaCy and would like to propose it for the official SpaCy models registry.
All training sources are available under the MIT license. We adapted the datasets and the embeddings table to fit the SpaCy utilities. The training procedure uses the standard spacy convert, spacy init-model, and spacy train commands.
Could you please take a look at https://github.com/natasha/natasha-spacy? Is it possible to add such a model to the official SpaCy models registry? Is there anything we can do to help you with this process? For example, we could contribute tests to the spacy-models repo or the main spacy repo.
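For readers unfamiliar with the spaCy v2 command line, the three commands named above can be chained roughly as in the sketch below. It only builds and prints the command lines; it does not run them. All file and directory names are hypothetical placeholders, and the exact flags used in the natasha-spacy repo may differ; the `--vectors-loc` and `--vectors` options are assumed here as one way to pass the embeddings table through to training.

```python
import shlex
from typing import List


def build_v2_training_commands(
    lang: str,
    raw_corpus: str,
    corpus_dir: str,
    vectors_file: str,
    vectors_model_dir: str,
    train_json: str,
    dev_json: str,
    output_dir: str,
) -> List[List[str]]:
    """Build the spaCy v2 CLI calls for convert, init-model and train (not executed)."""
    spacy_cli = ["python", "-m", "spacy"]
    return [
        # 1. Convert the annotated corpus into spaCy's v2 JSON training format.
        spacy_cli + ["convert", raw_corpus, corpus_dir],
        # 2. Create a blank model directory that carries the word vectors.
        spacy_cli + ["init-model", lang, vectors_model_dir, "--vectors-loc", vectors_file],
        # 3. Train the pipeline, loading vectors from the model created in step 2.
        spacy_cli + ["train", lang, output_dir, train_json, dev_json,
                     "--vectors", vectors_model_dir],
    ]


# Hypothetical paths, for illustration only.
commands = build_v2_training_commands(
    lang="ru",
    raw_corpus="data/train.conllu",
    corpus_dir="corpus",
    vectors_file="data/vectors.txt",
    vectors_model_dir="models/ru_vectors",
    train_json="corpus/train.json",
    dev_json="corpus/dev.json",
    output_dir="models/ru_trained",
)

for command in commands:
    print(shlex.join(command))
```

Each printed line can be pasted into a shell once the placeholder paths point at real files.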