@Brijeshthummar02

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update

Description

The output will now consist of clean, readable chunks:

  • Cleaner output
  • More structured paragraphs
  • Better input for downstream NLP or embedding pipelines

Related Tickets & Documents

QA Instructions, Screenshots, Recordings

You can verify this in the video attached below.

Added/updated tests?

We encourage keeping test coverage at 80% or above.

  • Yes
  • No, and this is why: please explain why tests are not included

Community Support

Screen.Recording.2025-06-09.140840.mp4

The video is above; the PDF I used for it is below.

DevOps Guide 2025.pdf

@Brijeshthummar02
Author

@mubashir-oss please go through the PR and let me know if any further changes are required.

Development

Successfully merging this pull request may close these issues.

Feature: Text Cleaning Pipeline for Chunking and Extraction
