Contributing to enhance text data pipeline - normalization & filtering ideas #2
codewithEshaYoutube started this conversation in Ideas
Replies: 0 comments
Hi gloss-api maintainers,
I’d like to contribute to the text data pipeline standards by adding a
modern, AI-enhanced workflow for automated text preprocessing.
The proposed pipeline covers:
Additionally, AI models can be integrated for:
The goal is to create a reproducible, modular, and automated pipeline
that can handle large-scale datasets efficiently.
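To make "modular" a bit more concrete, here is a rough Python sketch of how normalization and filtering steps could be chained. It is only an illustration, not the actual gloss-api code: the step names (`normalize_unicode`, `collapse_whitespace`, `drop_short`), the `run_pipeline` helper, and the choice of NFC normalization are all assumptions on my side.

```python
import re
import unicodedata
from typing import Callable, Iterable, Iterator, Optional, Sequence

# Hypothetical convention: a step returns the transformed text,
# or None to signal that the document should be filtered out.
Step = Callable[[str], Optional[str]]


def normalize_unicode(text: str) -> Optional[str]:
    # NFC is an assumed choice; another normalization form may fit better.
    return unicodedata.normalize("NFC", text)


def collapse_whitespace(text: str) -> Optional[str]:
    # Replace any run of whitespace with a single space and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


def drop_short(min_chars: int) -> Step:
    # Build a filter step that drops texts shorter than min_chars.
    def step(text: str) -> Optional[str]:
        return text if len(text) >= min_chars else None

    return step


def run_pipeline(texts: Iterable[str], steps: Sequence[Step]) -> Iterator[str]:
    # Stream texts through the steps in order; stop early once one is dropped.
    for text in texts:
        current: Optional[str] = text
        for step in steps:
            current = step(current)
            if current is None:
                break
        if current is not None:
            yield current
```

For example, `list(run_pipeline(docs, [normalize_unicode, collapse_whitespace, drop_short(5)]))` would clean each document and drop the very short ones. Because the steps are plain functions and the runner is a generator, steps can be reordered or swapped out, and large datasets can be processed as a stream instead of being loaded all at once.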
I’ve also prepared a simple diagram to illustrate the workflow.
Looking forward to your feedback and suggestions.