
Commit f504789

author
arpechenin
committed
- fix scheme
- add description how to mount sa to global driver
- add test plan

Signed-off-by: arpechenin <[email protected]>
1 parent d7dbc76 commit f504789

4 files changed

+40
-58
lines changed

proposals/separate-standalone-driver/README.md

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -196,6 +196,21 @@ The sample of the Argo Workflow system-container-driver template based on plugin
196196
jsonPath: $.condition
197197
```
198198
199+
## Test Plan
200+
[x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.
201+
202+
Unit Tests
203+
Unit tests will primarily validate the compilation from KFP pipelines to Argo Workflow specs, while most other logic will be covered by integration tests.
204+
205+
Integration Tests
206+
Add an additional E2E test to verify the behavior of the global driver server.
207+
208+
Additionally, end-to-end (E2E) tests should verify basic functionality, reusing existing tests where available. The E2E tests should cover at least the following scenarios:
209+
- A simple pipeline with a single component, waiting for successful completion of the run.
210+
- A pipeline with a chain of components passing inputs and outputs between them, waiting for successful completion of the run.
211+
- A pipeline designed to fail, waiting for the run to end with an error.
212+
- A pipeline that fails but has retries enabled (at the pipeline and/or component level), waiting for the run to complete successfully.
213+
199214
## Conclusion
200215
This proposal introduces an optimization for Kubeflow Pipelines (KFP) that replaces per-task driver pods with a lightweight standalone service based on Argo Workflows’ Executor Plugin mechanism. It significantly reduces pipeline task startup time by eliminating the overhead of scheduling a separate driver pod for each task — particularly beneficial for large pipelines with multiple steps and caching enabled.
201216
Instead of launching a new driver pod per task, the driver logic is offloaded to a shared agent pod that is scheduled per workflow, and completes once the workflow ends. This reduces latency in cache lookups and metadata initialization.
-25.1 KB

proposals/separate-standalone-driver/plugin.md

Lines changed: 8 additions & 58 deletions
Original file line numberDiff line numberDiff line change
@@ -15,9 +15,10 @@ As a result, driver-plugin implementations should merely act as a proxy between
1515

1616
1. Implement the driver plugin that simply proxies requests from the workflow controller to the kfp-driver-server and back.
1717
2. Build the image for the driver plugin.
18-
3. Create the [yaml description](src/driver-plugin/plugin.yaml) of the plugin
18+
3. Create the [yaml description](plugin.yaml) of the plugin
1919
4. [Create](https://argo-workflows.readthedocs.io/en/latest/cli/argo_executor-plugin_build/) the ConfigMap by running `argo executor-plugin build .` in the folder that contains the YAML description from step 3.
2020
5. Apply the created ConfigMap to the workflow-controller Kubernetes namespace.
21+
6. Create the service account `driver-plugin-executor-plugin` and set `sidecar.automountServiceAccountToken: "true"` in the plugin ConfigMap (required for Kubernetes API access; see details below).
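
For orientation, the YAML description from step 3 that `argo executor-plugin build .` consumes might look roughly like the sketch below. It is not the proposal's actual file: the image reference and the resource values are placeholders, and the port is an assumption that should match whatever port the plugin server listens on.

```yaml
# Sketch of an ExecutorPlugin manifest (input to `argo executor-plugin build .`).
# Assumptions: the image reference is a placeholder, and the port and resource
# values are illustrative only.
apiVersion: argoproj.io/v1alpha1
kind: ExecutorPlugin
metadata:
  name: driver-plugin
spec:
  sidecar:
    # Mounts the token of the <plugin-name>-executor-plugin service account.
    automountServiceAccountToken: true
    container:
      name: driver-plugin
      image: example.com/kfp-driver-plugin:latest  # placeholder image
      ports:
        - containerPort: 2948  # assumed plugin server port
      resources:
        requests:
          cpu: 250m
          memory: 512Mi
        limits:
          cpu: "1"
          memory: 1Gi
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
```

The build command turns this manifest into a ConfigMap labelled `workflows.argoproj.io/configmap-type: ExecutorPlugin`, which is the object applied in step 5.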
2122

2223
After that, you will be able to reference the corresponding driver plugin in your Argo Workflow using:
2324
```yaml
@@ -26,61 +27,10 @@ plugin:
2627
...
2728
```
2829

29-
### Problem: Interaction With the Kubernetes API From a Sidecar Container
30+
### Interaction With the Kubernetes API From a Sidecar Container
3031
The driver [requires](https://github.com/kubeflow/pipelines/blob/master/backend/src/v2/driver/k8s.go#L68) access to the k8s API.
31-
However, the required volume with the service account secret (/var/run/secrets/kubernetes.io/serviceaccount) is mounted only into the main (driver) container, but not into the sidecar container.
32-
Below is a sample YAML snippet showing the container definitions in the agent pod.
33-
```yaml
34-
Containers:
35-
driver-plugin:
36-
Image: .../kfp-driver-agent:2.4.1-63
37-
Port: 2948/TCP
38-
Host Port: 0/TCP
39-
Restart Count: 0
40-
Limits:
41-
cpu: 1
42-
memory: 1Gi
43-
Requests:
44-
cpu: 250m
45-
memory: 512Mi
46-
Environment:
47-
DRIVER_HOST: http://ml-pipeline-kfp-driver.kubeflow.svc
48-
DRIVER_PORT: 2948
49-
SERVER_PORT: 2948
50-
TIMEOUT_SECONDS: 120
51-
Mounts:
52-
/etc/gitconfig from gitconfig (ro,path="gitconfig")
53-
/var/run/argo from var-run-argo (ro,path="driver-plugin")
54-
main:
55-
Image: .../ml-platform/argoexec:v3.6.7
56-
Command:
57-
argoexec
58-
Args:
59-
agent
60-
main
61-
--loglevel
62-
info
63-
--log-format
64-
text
65-
--gloglevel
66-
0
67-
Ready: True
68-
Restart Count: 2
69-
Limits:
70-
cpu: 100m
71-
memory: 256M
72-
Requests:
73-
cpu: 10m
74-
memory: 64M
75-
Environment:
76-
ARGO_WORKFLOW_NAME: debug-component-pipeline-7bgps
77-
ARGO_WORKFLOW_UID: 03caea4e-70c1-4113-b700-b7183271f3b6
78-
ARGO_AGENT_PATCH_RATE: 10s
79-
ARGO_PLUGIN_ADDRESSES: ["http://localhost:2948"]
80-
ARGO_PLUGIN_NAMES: ["driver-plugin"]
81-
Mounts:
82-
/etc/gitconfig from gitconfig (ro,path="gitconfig")
83-
/var/run/argo from var-run-argo (rw)
84-
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-z9nt6 (ro)
85-
```
86-
As a workaround, it makes sense to use the agent pod only as a proxy to the kfp-driver-server.
32+
However, by default, the Argo Workflow Controller does not mount the service account token into the executor plugin's sidecar container. Moreover, Argo Workflows [disabled](https://github.com/argoproj/argo-workflows/pull/8028) the ability to mount the Workflow's service account into the executor plugin.
33+
As a result, to enable access to the Kubernetes API:
34+
1. Create a ServiceAccount named `driver-plugin-executor-plugin` in each profile namespace. Argo Workflows [expects](https://github.com/argoproj/argo-workflows/blob/main/workflow/controller/agent.go#L285) the name format `<plugin-name>-executor-plugin`.
35+
2. Add a Role with appropriate Kubernetes API access and bind it to the service account.
36+
3. Set `sidecar.automountServiceAccountToken` to `"true"` in the plugin ConfigMap; see the [example](plugin.yaml).
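
Steps 1 and 2 above could be expressed with manifests along the following lines. This is a minimal sketch, not part of the proposal: the namespace `my-profile`, the Role and RoleBinding names, and the rules are hypothetical, and the actual permissions should be derived from what the driver calls in the Kubernetes API.

```yaml
# Sketch only. Hypothetical: namespace "my-profile", the Role/RoleBinding names,
# and the rules below (illustrative, not the driver's verified requirements).
apiVersion: v1
kind: ServiceAccount
metadata:
  name: driver-plugin-executor-plugin  # must follow <plugin-name>-executor-plugin
  namespace: my-profile
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: driver-plugin-role
  namespace: my-profile
rules:
  - apiGroups: [""]
    resources: ["configmaps", "secrets"]  # assumed resources; adjust as needed
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: driver-plugin-role-binding
  namespace: my-profile
subjects:
  - kind: ServiceAccount
    name: driver-plugin-executor-plugin
    namespace: my-profile
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: driver-plugin-role
```

Because the ServiceAccount is looked up per namespace, the same three objects would need to be created in every profile namespace where workflows run.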
Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
# Sample of the config map for mounting the executor plugin
2+
apiVersion: v1
3+
kind: ConfigMap
4+
metadata:
5+
labels:
6+
workflows.argoproj.io/configmap-type: ExecutorPlugin # Workflow Controller applies the plugin configuration based on this label
7+
name: driver-plugin
8+
data:
9+
sidecar.automountServiceAccountToken: "true" # Enables automatic mounting of the service account token in the sidecar
10+
sidecar.container: |
11+
image: ...
12+
name: driver-plugin
13+
resources:
14+
...
15+
securityContext:
16+
runAsNonRoot: true
17+
runAsUser: 65534
