The particular reference to "noise" looks like a holdover from v2.

In v3 you can apply data augmentation through the corpus reader, which accepts an augmenter. spaCy itself includes a few simple augmenters (which we use for the pretrained pipelines): https://spacy.io/api/top-level#augmenters, https://spacy.io/api/top-level#corpus
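
As a rough sketch of what that can look like in a v3 training config: the block below attaches the built-in lowercasing augmenter to the training corpus reader. It assumes the standard `spacy.Corpus.v1` reader and a `paths.train` variable defined elsewhere in your config. The `level` of 0.3 is just an illustrative value, not a recommendation.

```ini
[corpora.train]
@readers = "spacy.Corpus.v1"
path = ${paths.train}

# Assumption: the 0.3 here is an arbitrary example value.
# Roughly this fraction of training examples is replaced by a lowercased variant.
[corpora.train.augmenter]
@augmenters = "spacy.lower_case.v1"
level = 0.3
```

The augmenter is only attached to the training corpus here, so the dev corpus stays unmodified and evaluation scores remain comparable across runs.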

Also check out the new augmenty package, which has many more augmenters: https://github.com/kennethenevoldsen/augmenty

Answer selected by svlandeg
Labels: feat / ner (Feature: Named Entity Recognizer), feat / cli (Feature: Command-line interface), feat / training (Feature: Training utils, Example, Corpus and converters)