
Commit 6c32514

chore: Updating KFP docs (kubeflow#11927)
* chore: Updating KFP docs
* remove html reference
* updating python version
* Update index.rst
* Update quickstart.rst
* Update quickstart.rst
* rebasing
* removed overview file
* updated to make things look better
* Update docs/source/overview.rst
* Update docs/source/overview.rst
* Update docs/source/installation.rst
* updating conf.py

---------

Signed-off-by: Francisco Javier Arceo <[email protected]>
Signed-off-by: Francisco Arceo <[email protected]>
Co-authored-by: Matt Prahl <[email protected]>
1 parent 8329e64 commit 6c32514

File tree

7 files changed: +282 -2 lines changed

docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -70,7 +70,7 @@
  }

  html_theme = 'sphinx_immaterial'
- html_title = 'KFP SDK API Reference'
+ html_title = 'Kubeflow Pipelines (KFP)'
  html_static_path = ['_static']
  html_css_files = ['custom.css']
  html_logo = '_static/kubeflow.png'

docs/index.rst

Lines changed: 8 additions & 1 deletion
@@ -1,14 +1,21 @@
- Kubeflow Pipelines SDK API Reference
+ Kubeflow Pipelines (KFP)
  ====================================

  .. mdinclude:: ../sdk/python/README.md

+ .. mdinclude:: Architecture.md
+
  .. toctree::
     :caption: Contents
     :hidden:

     Home <self>
+    Quickstart <source/quickstart>
+    GenAI <source/genai>
+    Overview <source/overview>
+    Installation <source/installation>
     API Reference <source/kfp>
     Command Line Interface <source/cli>
+
     Usage Docs (kubeflow.org) <https://kubeflow.org/docs/pipelines/>
     Source Code <https://github.com/kubeflow/pipelines/>

docs/source/genai.rst

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
GenAI Use Cases
===============

Generative AI (GenAI) workflows typically span multiple stages—from **data preparation** to **model fine-tuning**, **prompt engineering**, **evaluation**, and **deployment**. Kubeflow Pipelines provides a flexible and scalable orchestration engine to support these end-to-end workflows in a reproducible, modular way.
Data Preparation
----------------
Effective GenAI starts with high-quality, well-structured data. Use Kubeflow Pipelines to:

- Ingest and preprocess unstructured data such as PDFs, HTML, images, or audio.
- Convert raw documents into structured formats and chunk them for tokenization.
- Clean, normalize, and deduplicate datasets for training and evaluation.
- Generate embeddings using models like SentenceTransformers or CLIP (see the sketch below).
- Create and store metadata-rich artifacts for traceability and downstream reuse.
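For example, a lightweight component along these lines could chunk a document and emit embeddings as a reusable artifact. This is a minimal sketch; the ``sentence-transformers`` dependency and the ``all-MiniLM-L6-v2`` model are illustrative choices, not project defaults:

.. code-block:: python

   from kfp import dsl

   @dsl.component(packages_to_install=['sentence-transformers'])
   def embed_chunks(text: str, chunk_size: int,
                    embeddings: dsl.Output[dsl.Artifact]):
       # Imports live inside the function so the component is self-contained.
       import json
       from sentence_transformers import SentenceTransformer

       # Naive fixed-width chunking; real pipelines often chunk by tokens.
       chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
       model = SentenceTransformer('all-MiniLM-L6-v2')  # illustrative model
       vectors = model.encode(chunks).tolist()

       with open(embeddings.path, 'w') as f:
           json.dump(vectors, f)
       embeddings.metadata['num_chunks'] = len(chunks)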
Fine-tuning & Training
----------------------
Once data is prepared, Kubeflow Pipelines can orchestrate training jobs at scale:

- Automate tokenization and model fine-tuning (e.g., LoRA, full fine-tuning).
- Parallelize hyperparameter sweeps (e.g., learning rate, batch size) using conditional and parallel components, as sketched below.
- Leverage GPUs, TPUs, or managed training backends across environments.
- Use pipeline components to separate data prep, training, and checkpoint saving.
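A fan-out over learning rates can be expressed with ``dsl.ParallelFor``. In this sketch the ``train`` component is a stand-in for your actual training logic:

.. code-block:: python

   from kfp import dsl

   @dsl.component
   def train(learning_rate: float) -> float:
       # Placeholder training step; returns a dummy metric.
       print(f'training with lr={learning_rate}')
       return 0.0

   @dsl.pipeline
   def sweep_pipeline():
       # Each learning rate becomes an independent, parallel task.
       with dsl.ParallelFor(items=[1e-4, 3e-4, 1e-3]) as lr:
           train(learning_rate=lr)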
Prompt Engineering Experiments
------------------------------
Experiment with prompt templates using parameterized pipelines:

- Evaluate prompt effectiveness at scale using batch scoring jobs.
- Log and compare model outputs with evaluation metrics and annotations.
- Enable iterative prompt design with easy-to-swap text templates (see the sketch below).
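Because pipeline parameters are ordinary function arguments, a prompt template can be swapped per run. A minimal sketch, where ``score_prompt`` is a hypothetical stand-in for a call to your model endpoint:

.. code-block:: python

   from kfp import dsl

   @dsl.component
   def score_prompt(template: str, question: str) -> str:
       # Stand-in for a model call; just renders the template.
       prompt = template.format(question=question)
       print(prompt)
       return prompt

   @dsl.pipeline
   def prompt_experiment(template: str = 'Answer concisely: {question}'):
       score_prompt(template=template, question='What is KFP?')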
Evaluation & Monitoring
-----------------------
Build pipelines to evaluate and monitor model outputs:

- Compare generations against reference outputs using BLEU, ROUGE, or custom metrics (sketched below).
- Integrate human-in-the-loop review and scoring.
- Run periodic evaluation pipelines to detect degradation or drift in output quality.
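A metric component can wrap an off-the-shelf scorer. This sketch assumes the ``rouge-score`` package:

.. code-block:: python

   from kfp import dsl

   @dsl.component(packages_to_install=['rouge-score'])
   def rouge_l(generated: str, reference: str) -> float:
       from rouge_score import rouge_scorer

       # ROUGE-L F-measure between a generation and its reference.
       scorer = rouge_scorer.RougeScorer(['rougeL'])
       return scorer.score(reference, generated)['rougeL'].fmeasure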
Inference & Deployment
----------------------
Turn generative models into production services with reproducible deployment steps:

- Package and deploy models as containerized services using KServe or custom backends.
- Use CI/CD pipelines to roll out new versions with A/B testing or canary releases.
- Scale endpoints dynamically based on request volume and latency metrics.
Multimodal Generative Workflows
-------------------------------
Design rich pipelines that support multiple input/output modalities:

- Combine text, image, and audio generation into a unified DAG.
- Orchestrate complex workflows involving model chaining and data routing.
- Use custom components to process modality-specific inputs and outputs.


See Also
--------
- :doc:`dsl`
- :doc:`components`
- :doc:`compiler`

docs/source/installation.rst

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
.. _open-source-deployment:

Deploying Kubeflow Pipelines
============================

As an alternative to deploying Kubeflow Pipelines (KFP) as part of the
`Kubeflow deployment <https://www.kubeflow.org/docs/started/installing-kubeflow/>`_,
you also have the option to deploy only Kubeflow Pipelines.

Follow the instructions below to deploy Kubeflow Pipelines standalone using the supplied Kustomize manifests.

You should be familiar with the following tools:

- `Kubernetes <https://kubernetes.io/docs/home/>`_
- `kubectl <https://kubernetes.io/docs/reference/kubectl/overview/>`_
- `kustomize <https://kustomize.io/>`_

Deployment steps
----------------

1. Deploy Kubeflow Pipelines:

   .. code-block:: bash

      export PIPELINE_VERSION={{% pipelines/latest-version %}}
      kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
      kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
      kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/dev?ref=$PIPELINE_VERSION"

   The Kubeflow Pipelines deployment takes approximately 3 minutes to complete. During this time, it is normal for pods to crash in the ``kubeflow`` namespace until the deployment completes.
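   To watch the rollout while you wait, you can list the pods in the namespace (an optional check):

   .. code-block:: bash

      kubectl get pods -n kubeflow --watch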
2. Port-forward the Kubeflow Pipelines UI:

   .. code-block:: bash

      kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80

3. Open the following URL in your browser to access the UI:

   `http://localhost:8080 <http://localhost:8080>`_

docs/source/kfp.rst

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
+ .. _kfp-python-sdk:
+
  API Reference
  ==========================

docs/source/overview.rst

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
Overview
========

What is Kubeflow Pipelines?
----------------------------

Kubeflow Pipelines (KFP) is a platform for building and deploying portable and scalable machine learning (ML) workflows using containers on Kubernetes-based systems.
With KFP you can author :ref:`components <what-is-a-component>` and :ref:`pipelines <what-is-a-pipeline>` using the :ref:`KFP Python SDK <kfp-python-sdk>`, compile pipelines
to an :ref:`intermediate representation YAML <what-is-a-compiled-pipeline>`, and submit the pipeline to run on a KFP-conformant backend such as the :ref:`open source KFP backend <open-source-deployment>`, `Google Cloud Vertex AI Pipelines <https://cloud.google.com/vertex-ai/docs/pipelines/introduction>`_, or KFP local.

The open source KFP backend is available as a core component of Kubeflow or as a standalone installation.
Why Kubeflow Pipelines?
-----------------------

KFP enables data scientists and machine learning engineers to:

* Author end-to-end ML workflows natively in Python
* Create fully custom ML components or leverage an ecosystem of existing components
* Easily pass parameters and ML artifacts between pipeline components
* Easily manage, track, and visualize pipeline definitions, runs, experiments, and ML artifacts
* Efficiently use compute resources through parallel task execution and through caching to eliminate redundant executions
* Keep experimentation and iteration light and Python-centric, minimizing the need to (re)build and maintain containers
* Maintain cross-platform pipeline portability through a platform-neutral IR YAML pipeline definition
* Abstract Kubernetes complexity while running pipelines on your organization's existing infrastructure investments (on-prem, cloud, or hybrid)
.. _what-is-a-pipeline:

What is a pipeline?
-------------------

A `pipeline` is a definition of a workflow that composes one or more `components` together to form a computational directed acyclic graph (DAG). At runtime, each component execution corresponds to a single container execution, which may create ML artifacts. Pipelines may also feature `control flow`.
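For illustration, here is a minimal sketch of a pipeline whose second task runs conditionally. The component bodies are toy examples, and ``dsl.If`` assumes a recent KFP SDK:

.. code-block:: python

   from kfp import dsl

   @dsl.component
   def flip_coin() -> str:
       import random
       return random.choice(['heads', 'tails'])

   @dsl.component
   def announce(result: str):
       print(f'Got {result}!')

   @dsl.pipeline
   def coin_pipeline():
       flip_task = flip_coin()
       # Control flow: only announce when the coin lands heads.
       with dsl.If(flip_task.output == 'heads'):
           announce(result=flip_task.output)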
.. _what-is-a-component:

What is a component?
--------------------
Components are the building blocks of KFP pipelines. A component is a remote function definition; it specifies inputs, has user-defined logic in its body, and can create outputs. When the component template is instantiated with input parameters, we call it a task.

KFP provides two high-level ways to author components: Python Components and Container Components.

Python Components are a convenient way to author components implemented in pure Python. There are two specific types of Python Components: Lightweight Python Components and Containerized Python Components.

Container Components expose a more flexible, advanced authoring approach by allowing you to define a component using an arbitrary container definition. This is the recommended approach for components that are not implemented in pure Python.
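For instance, a Container Component that wraps a shell command looks roughly like this (a minimal sketch; the ``alpine`` image is an illustrative choice):

.. code-block:: python

   from kfp import dsl

   @dsl.container_component
   def say_hello_container():
       # The component is defined entirely by its container spec.
       return dsl.ContainerSpec(
           image='alpine',
           command=['echo'],
           args=['Hello from a Container Component'])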
Importer Components are a special "pre-baked" component provided by KFP which allows you to import an artifact into your pipeline when that artifact was not created by tasks within the pipeline.
.. _what-is-a-compiled-pipeline:

What is a compiled pipeline?
----------------------------
When you compile a pipeline or component, the result is an intermediate representation (IR) of it, serialized as YAML. The IR YAML is not intended to be written directly.

While IR YAML is not intended to be easily human-readable, you can still inspect it if you know a bit about its contents. You produce it with the compiler:
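A minimal sketch of compiling a toy pipeline to IR YAML:

.. code-block:: python

   from kfp import compiler, dsl

   @dsl.component
   def say_hello() -> str:
       return 'hello'

   @dsl.pipeline
   def hello_pipeline() -> str:
       return say_hello().output

   # Writes the intermediate representation to pipeline.yaml.
   compiler.Compiler().compile(hello_pipeline, 'pipeline.yaml')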
.. _pipelines: #what-is-a-pipeline
.. _components: #what-is-a-component
.. _compiled-pipeline: #what-is-a-compiled-pipeline

docs/source/quickstart.rst

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
Quickstart
==========

This guide shows how to create, compile, and run a simple pipeline with Kubeflow Pipelines (KFP).

Prerequisites
-------------
- Python 3.9+
- A running Kubeflow Pipelines deployment (local or remote).

Installation
------------
Install the Kubeflow Pipelines SDK:

.. code-block:: bash

   pip install kfp
Local Initialization
--------------------
Use the ``SubprocessRunner`` for local execution without Docker:

.. code-block:: python

   from kfp import local

   local.init(runner=local.SubprocessRunner())
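If a local Docker daemon is available, you can instead run components in containers for stronger isolation (this assumes Docker is installed and running):

.. code-block:: python

   local.init(runner=local.DockerRunner())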
Writing a Simple Component
--------------------------

Define a lightweight component using the ``@dsl.component`` decorator:

.. code-block:: python

   from kfp import dsl

   @dsl.component
   def say_hello(name: str) -> str:
       message = f"Hello, {name}!"
       print(message)
       return message

You can run this component directly like a Python function:

.. code-block:: python

   task = say_hello(name="World")
   assert task.output == "Hello, World!"
Writing and Running a Pipeline
------------------------------

Define a pipeline using the ``@dsl.pipeline`` decorator:

.. code-block:: python

   @dsl.pipeline
   def hello_pipeline(recipient: str) -> str:
       hello_task = say_hello(name=recipient)
       return hello_task.output

Run the pipeline locally as a regular function:

.. code-block:: python

   pipeline_task = hello_pipeline(recipient="Local Dev")
   assert pipeline_task.output == "Hello, Local Dev!"

The ``@dsl.component`` and ``@dsl.pipeline`` decorators turn type-annotated Python functions into reusable pipeline components and workflows.
Working with Artifacts
----------------------

You can also write artifacts to disk and read them locally:

.. code-block:: python

   import json

   from kfp.dsl import Artifact, Output

   @dsl.component
   def add(a: int, b: int, out_artifact: Output[Artifact]):
       # Modules used inside a lightweight component must be
       # imported within its body.
       import json

       result = a + b
       with open(out_artifact.path, 'w') as f:
           f.write(json.dumps(result))
       out_artifact.metadata['operation'] = 'addition'

   task = add(a=1, b=2)

   with open(task.outputs['out_artifact'].path) as f:
       result = json.loads(f.read())

   assert result == 3
   assert task.outputs['out_artifact'].metadata['operation'] == 'addition'
Running the pipeline
--------------------
You can run the pipeline locally with Python:

.. code-block:: bash

   python my_pipeline.py
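To run against a full KFP backend instead, compile the pipeline to IR YAML and submit it with the client. This is a minimal sketch, assuming the backend's UI is port-forwarded to ``http://localhost:8080`` as in the installation guide:

.. code-block:: python

   import kfp
   from kfp import compiler

   # Compile the pipeline defined above to IR YAML.
   compiler.Compiler().compile(hello_pipeline, 'pipeline.yaml')

   # Adjust the host for your cluster.
   client = kfp.Client(host='http://localhost:8080')
   client.create_run_from_pipeline_package(
       'pipeline.yaml', arguments={'recipient': 'KFP'})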
Next steps
----------
- Explore the DSL: :doc:`dsl`
- Learn about Components: :doc:`components`
- See the CLI reference: :doc:`cli`
