explosion spaCy Language-support Discussions
🌍 Language Support Discussions
Discuss the language data and training models for new languages
Pinned to Language Support
🌍 Adding models for new languages master thread
enhancement (Feature requests and improvements), lang / all (Global language data), new language (Adding support for new languages to spaCy)
Discussions
🌍 What is [initialize] vector='model' and what are the differences between stock models?
feat / vectors (Feature: Word vectors and similarity)
🌍 Japanese Training data (as used in the model ja_core_news_lg for example)
lang / ja (Japanese language data and models)
🌍 create new pipeline for Persian
lang / fa (Persian language data and models)
🌍 Characterization of PoS accuracy
feat / tagger (Feature: Part-of-speech tagger), perf / accuracy (Performance: accuracy)
🌍 Using Spacy V2 en_core_web_lg-2.3.1 model in Spacy V3
feat / tagger (Feature: Part-of-speech tagger), perf / accuracy (Performance: accuracy)
🌍 zh_core_web_lg static embedding come from where?
lang / zh (Chinese language data and models), feat / vectors (Feature: Word vectors and similarity)
🌍 Datasets used for the French pretrained pipelines
lang / fr (French language data and models)
🌍 Improved Italian lemmatizer: ongoing work or plans?
enhancement (Feature requests and improvements), lang / it (Italian language data and models), feat / lemmatizer (Feature: Rule-based and lookup lemmatization)
🌍 With which corpora is the French accurate pipeline (fr_dep_new_trf) trained ?
lang / fr (French language data and models), feat / training (Feature: Training utils, Example, Corpus and converters)
🌍 Incorrect lemmas for Italian language
lang / it (Italian language data and models), feat / lemmatizer (Feature: Rule-based and lookup lemmatization)
🌍 Training a lemmatizer on Universal Dependencies
feat / lemmatizer (Feature: Rule-based and lookup lemmatization)
🌍 Bug: The different punctuation at the end of a sentense lead analysis results wrong.
feat / tagger (Feature: Part-of-speech tagger), feat / parser (Feature: Dependency Parser)
🌍 Part of Speech Tagger: "this/that/these/those" Pronouns / Determiners distinction not made
feat / tagger (Feature: Part-of-speech tagger), feat / training (Feature: Training utils, Example, Corpus and converters)
🌍 Sesotho Model development
enhancement (Feature requests and improvements)
🌍 gpt-neo with spacy?
feat / transformer (Feature: Transformer)
🌍 Logging scores on the training set
lang / hu (Hungarian language data and models), feat / scorer (Feature: Scorer)
🌍 Term extraction of medical guidelines in German?
lang / de (German language data and models), models (Issues related to the statistical models)
🌍 Adding lemmatizer and ner to pipeline
lang / sv (Swedish language data and models), feat / config (Feature: Training config)
🌍 Spacy 3.0 - specify my own candidate generator to use custom UMLS path
third-party (Third-party packages and services)
🌍 Tokenizer exceptions for Sentencizer
feat / sentencizer (Feature: Sentencizer (rule-based sentence segmenter))
🌍 Performance of transformer model with and without NER
feat / ner (Feature: Named Entity Recognizer), perf / speed (Performance: speed)
🌍 'en_core_web_trf' optimal optimizer's learning rate and number of training epochs?
feat / config (Feature: Training config)
🌍 Running a language model for spaCy 0.101
v1 (spaCy v1.x)
🌍 Amharic: What do I need to do to create am_core_web_sm
lang / am (Amharic language data and models)
🌍 Using the ICU library and CLDR data for tokenization?
enhancement (Feature requests and improvements)