It is definitely possible to use the Matcher to create training data for a textcat model. That's a form of "weak supervision", where you train a statistical model using the output of a rule-based model.
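For readers who want to see the general shape of this, here is a minimal sketch, not the original poster's code. It assumes spaCy v3 (for the `matcher.add(key, patterns)` signature) and a blank English pipeline. The `REFUND` and `OTHER` labels, the pattern, the sample texts, and the `weak_label` helper are all made up for illustration.

```python
import spacy
from spacy.matcher import Matcher

# Blank English pipeline: tokenizer only, so no trained model is required.
nlp = spacy.blank("en")

matcher = Matcher(nlp.vocab)
# Hypothetical rule: any text mentioning a refund gets the "REFUND" category.
matcher.add("REFUND", [[{"LOWER": {"IN": ["refund", "refunds", "refunded"]}}]])

# Made-up sample texts, for illustration only.
texts = [
    "I would like a refund for my order.",
    "The package arrived on time, thanks!",
]


def weak_label(doc):
    """Turn matcher hits into textcat-style category scores (hypothetical helper)."""
    has_match = len(matcher(doc)) > 0
    return {"REFUND": float(has_match), "OTHER": float(not has_match)}


# (text, annotations) pairs, the shape that spacy.training.Example.from_dict accepts.
train_data = [(text, {"cats": weak_label(nlp.make_doc(text))}) for text in texts]
```

The resulting `train_data` pairs can then be converted to `Example` objects or serialized for training a textcat component. Keep in mind the labels are only as good as the rules, so it is worth spot-checking a sample by hand.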

The code you have works. I assume it's example code, but just in case, I will note that "names" is kind of a weird category for a document. Also note that if you actually need to match on names, you can use the named entities predicted by an existing trained pipeline instead of writing your own patterns.

We recently released a weak supervision tutorial project that you might find useful.

Answer selected by info2000
Labels: feat / textcat (Feature: Text Classifier), feat / matcher (Feature: Token, phrase and dependency matcher)
2 participants