Releases: opendatahub-io/data-processing
Release candidate 1
This is Release Candidate 1 (RC1) for the v0.0.1 release. This build represents a stable and tested snapshot of the codebase as of this tag.
This RC is intended for QA and deployment validation.
Key Changes & Highlights:
- KFP Pipelines
- Notebooks showcasing standard use cases and tutorials
What's Changed
- Continue repo initialization by @alimaredia in #1
- Add Docling KFP vlm pipeline by @alimaredia in #2
- Add Docling KFP standard pipeline by @fabianofranz in #3
- Use Kube secrets to configure S3 endpoint, access key, secret key, bucket, and prefix by @fabianofranz in #5
- Use Kube secrets to configure remote model endpoint, model name, and token by @fabianofranz in #4
- Change default OCR engine to Tesseract by @alinaryan in #9
- Fixes Docling tesseract by @fabianofranz in #11
- Update CODEOWNERS to reference team instead of ind by @alinaryan in #14
- README with instructions about how to run and customize pipelines by @fabianofranz in #7
- Explicitly set Smoldocling in VLM pipeline by @fabianofranz in #13
- Refactored Common Components into a shared module by @RobuRishabh in #12
- Fixed container connection error by @RobuRishabh in #17
- Fixes links to YAML pipelines in docs by @fabianofranz in #18
- Add pre-commit hooks and a CI workflow by @alinaryan in #10
- Add CI workflow to compile KFP pipelines by @alinaryan in #15
- Add files in kubeflow-pipelines/common to compile-kfp workflow by @alimaredia in #24
- [RHAIENG-1045] Add a notebook for standard conversion by @fabianofranz in #19
- [RHAIENG-1048] Notebook to demonstrate hybrid chunking by @shruthis4 in #25
- Update package versions in requirements and re-compile the kfp by @shruthis4 in #31
- [RHAIENG-1047] Add a notebook for VLM conversion by @fabianofranz in #29
- [RHAIENG-1115] Smoke tests for notebooks by @fabianofranz in #20
- [RHAIENG-1096] Data Preparation for RAG Notebook by @RobuRishabh in #27
- Apply lint job on entire repo by @alinaryan in #21
- Add lint job and README instructions by @alinaryan in #22
- [RHAIENG-1535] Add release strategy documentation for ODH Data Processing repository by @shruthis4 in #32
- [RHAIENG-1043] Add instructions on creating a custom workbench image by @alimaredia in #33
- [RHAIENG-1567] Add Mergify configuration for auto-merge and conflict handling by @shruthis4 in #34
New Contributors
- @alimaredia made their first contribution in #1
- @fabianofranz made their first contribution in #3
- @alinaryan made their first contribution in #9
- @RobuRishabh made their first contribution in #12
- @shruthis4 made their first contribution in #25
Full Changelog: https://github.com/opendatahub-io/odh-data-processing/commits/v0.0.1