Text Classification Model over hundreds of labels #10204
LeoAlvesRodrigues
started this conversation in
Help: Best practices
Hey @LeoAlvesRodrigues, thanks for the question. The textcat component has no arbitrary limit on the number of labels, but memory use scales with the number of labels, so splitting the labels across several components might make sense if you have unrelated groups of categories. We have plans to make a hierarchical textcat component, which might be suitable depending on how your categories are structured, but we are not working on it now. Without knowing more about your specific data, I would say prioritize the labels that you have the most examples for. If you could give us more information about the data and the problem you're trying to solve, we could give some more specific recommendations.
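As a rough illustration of the "prioritize labels that you have the most examples for" suggestion, here is a minimal sketch in plain Python. It assumes the training data is available as `(text, label)` pairs. The function name `prioritize_labels`, the `min_examples` threshold, and the toy data are hypothetical and not part of spaCy.

```python
from collections import Counter


def prioritize_labels(examples, min_examples=2):
    """Return labels ordered by descending example count.

    `examples` is assumed to be an iterable of (text, label) pairs.
    Labels with fewer than `min_examples` examples are dropped.
    """
    counts = Counter(label for _text, label in examples)
    # most_common() yields (label, count) pairs, most frequent first.
    return [label for label, n in counts.most_common() if n >= min_examples]


# Hypothetical toy data, for illustration only.
examples = [
    ("refund not received", "billing"),
    ("charged twice", "billing"),
    ("card declined", "billing"),
    ("app crashes on start", "bug"),
    ("login button broken", "bug"),
    ("love the new design", "praise"),
]

selected = prioritize_labels(examples, min_examples=2)
print(selected)
```

The resulting list could then be used to decide which labels to add to a first textcat component, with rarer labels added later as more annotated examples become available.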
Hello, I am working on a project in which we are trying to build a model that classifies text into roughly 700 labels. We are starting small with only a few labels, and I want to know how useful other pipeline components are for a project of this type and scale. We are also wondering whether creating several models with only a few labels each is better than creating a single model with all the labels.
Any information will be appreciated,
Thank you in advance.
Have a good day!