chore : add pipelines, components in library, remove pyproject.toml#96
chore : add pipelines, components in library, remove pyproject.toml#96EncHawk wants to merge 1 commit intokubeflow:mainfrom
Conversation
…oml to integrate to sdk Signed-off-by: D.Leap <dilipkumar2000.r@gmail.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
Migrates this repository’s pipelines/components into a shared library/ namespace to support inclusion in the Kubeflow SDK (per issue #80 / kubeflow/sdk#125), and removes standalone packaging configuration.
Changes:
- Removed
pyproject.toml(previous packaging/build configuration). - Added
library/components/*andlibrary/pipelines/*package scaffolding (__init__.py) for category-based imports. - Added an initial example component (
yoda_data_processor) with metadata, owners, and unit/local-runner tests, plus a sample deployment component metadata/owners.
Reviewed changes
Copilot reviewed 1 out of 22 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| pyproject.toml | Removes project packaging/build metadata previously used for distribution. |
| library/pipelines/init.py | Adds top-level pipelines package initializer for the new library/ layout. |
| library/pipelines/training/init.py | Adds training pipelines namespace initializer. |
| library/pipelines/evaluation/init.py | Adds evaluation pipelines namespace initializer. |
| library/pipelines/data_processing/init.py | Adds data-processing pipelines namespace initializer. |
| library/pipelines/deployment/init.py | Adds deployment pipelines namespace initializer. |
| library/components/init.py | Adds top-level components package initializer for the new library/ layout. |
| library/components/training/init.py | Adds training components namespace initializer. |
| library/components/evaluation/init.py | Adds evaluation components namespace initializer. |
| library/components/data_processing/init.py | Adds data-processing components namespace initializer. |
| library/components/data_processing/README.md | Adds category-level README linking to the new component. |
| library/components/deployment/init.py | Adds deployment components namespace initializer. |
| library/components/deployment/component_valid/metadata.yaml | Adds sample component metadata for deployment category. |
| library/components/deployment/component_valid/OWNERS | Adds sample OWNERS for the deployment sample component. |
| library/components/data_processing/yoda_data_processor/component.py | Introduces the prepare_yoda_dataset KFP component implementation. |
| library/components/data_processing/yoda_data_processor/init.py | Re-exports the new component entrypoint. |
| library/components/data_processing/yoda_data_processor/metadata.yaml | Adds metadata for yoda_data_processor. |
| library/components/data_processing/yoda_data_processor/OWNERS | Adds owners for yoda_data_processor. |
| library/components/data_processing/yoda_data_processor/README.md | Adds component documentation for yoda_data_processor. |
| library/components/data_processing/yoda_data_processor/tests/init.py | Declares the tests package for the component. |
| library/components/data_processing/yoda_data_processor/tests/test_component_unit.py | Adds unit tests that mock HuggingFace datasets interactions. |
| library/components/data_processing/yoda_data_processor/tests/test_component_local.py | Adds a LocalRunner execution test for the component. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@mprahl to review this |
|
Hi @mprahl, could take a look at this |
|
@mprahl could you review this? If there are no issue can this be merged? |
manaswinidas
left a comment
There was a problem hiding this comment.
Thanks for the restructuring work. The renames look correct, but there are a few issues worth addressing before merge — mainly around test quality, a missing library/__init__.py, stale docstrings referencing the now-deleted pyproject.toml packaging, and test fixture data mixed into the production tree. See inline comments.
There was a problem hiding this comment.
Tautological test: test_component_function_exists is redundant — if the top-level import from ..component import prepare_yoda_dataset fails, the entire module fails to load and this test never runs. It's asserting something Python's import system already guarantees. Remove it.
Mock-mirror anti-pattern: test_component_with_default_parameters reproduces the component's internal call sequence step-by-step via mocks. This makes the test brittle to any refactor (e.g., combining rename_column calls) even if output is unchanged. The actual transformation logic (add_yoda_prefix) is never tested against real data — .map.assert_called_once() only checks that .map() was invoked, not that it does the right thing.
Consider testing the add_yoda_prefix function directly with a small fixture dataset instead.
There was a problem hiding this comment.
Two issues:
- Zero assertions. This calls the component but never checks outputs — not even that the output datasets exist or have rows. "Didn't crash" is not a meaningful test contract.
- Undeclared network dependency. This hits the real HuggingFace endpoint (
dvgodoy/yoda_sentences), making it an integration test that will fail in air-gapped CI. Either mock the download or move this to a separate integration test suite with appropriate markers.
There was a problem hiding this comment.
import kfp.compiler is only used inside the if __name__ == "__main__" guard. Move it inside that block to avoid loading the compiler module during normal component usage and test execution.
There was a problem hiding this comment.
The docstring still references from kfp_components.components import training, but pyproject.toml (which defined the kfp_components package mapping) is deleted in this PR. These import paths are now dangling. Update the docstrings to reflect the new package structure, or remove them until the SDK integration defines the final import paths.
There was a problem hiding this comment.
This file has name: invalid-dependency-semantic-versioning with fields like whatever: foo — clearly test fixture data. It shouldn't live in the production component tree under deployment/. Move it to a test_data/ or fixtures/ directory to avoid confusion.
There was a problem hiding this comment.
The "Metadata" section manually duplicates everything from metadata.yaml (tags, dependencies, owners, last-verified date). This creates a maintenance burden — any metadata.yaml change requires a synchronized README update. Consider auto-generating this section or just linking to the YAML.
| @@ -1,123 +0,0 @@ | |||
| [build-system] | |||
There was a problem hiding this comment.
Deleting the entire build/packaging config (dependencies, ruff/pytest config, setuptools mappings) without any replacement leaves the repo un-buildable and un-lintable. If the intent is to absorb this into kubeflow-sdk, the PR should at minimum document what replaces pyproject.toml or land the SDK-side config first.
There was a problem hiding this comment.
Same stale-docstring issue as library/components/__init__.py — references kfp_components.pipelines which no longer resolves to anything after pyproject.toml removal.
Also: library/ itself has no __init__.py, so it's not a Python package. import library.pipelines will fail unless namespace packages are intentionally being used (which isn't documented).
|
Thank you for the review @manaswinidas will get to fixing it! |
pyproject.tomlwas removed to include the repository into kubeflow-sdk as part of pipelines. it is followed herethe
pipelines,componentsdirectories are included under a common library/ namespace, this is to ensure that import command contians library in the sdk as followed here.issues : #80
Pre-Submission Checklist
Learn more about the pull request title convention used in this repository.