Commit 792029b

Merge pull request opendatahub-io#161 from gmfrasca/add-mlmd-component
Update Documentation and Unit Tests for MLMD Component
2 parents e66c464 + b1dcd59 commit 792029b

13 files changed: +901 -1 lines changed

README.md

Lines changed: 69 additions & 1 deletion
@@ -197,15 +197,83 @@ When a `DataSciencePipelinesApplication` is deployed, the following components a
197197
* APIServer
198198
* Persistence Agent
199199
* Scheduled Workflow controller
200-
* MLPipelines UI
201200

202201
If specified in the `DataSciencePipelinesApplication` resource, the following components may also be deployed:
203202
* MariaDB
204203
* Minio
204+
* MLPipelines UI
205+
* MLMD (ML Metadata)
205206

206207
To understand how these components interact with each other please refer to the upstream
207208
[Kubeflow Pipelines Architectural Overview] documentation.
208209

210+
## Deploying Optional Components
211+
212+
### MariaDB
213+
To deploy a standalone MariaDB metadata database (rather than providing your own database connection details), add a `mariaDB` item under `spec.database` in your DSPA definition with a `deploy` key set to `true`. All other fields are optional and have defaults; see [All Fields DSPA Example](./config/samples/dspa_all_fields.yaml) for full details. Note that this component is mutually exclusive with externally provided databases (defined by `spec.database.externalDB`).
214+
215+
```
apiVersion: datasciencepipelinesapplications.opendatahub.io/v1alpha1
kind: DataSciencePipelinesApplication
metadata:
  name: sample
spec:
  ...
  database:
    mariaDB: # mutually exclusive with externalDB
      deploy: true

```
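
The mutual-exclusivity rule described above can be checked before applying a manifest. The following is a minimal sketch, assuming the DSPA `spec` has already been loaded into a plain Python dict; `check_database_spec` is a hypothetical helper written for illustration and is not part of the operator, whose own validation may differ.

```python
def check_database_spec(spec: dict) -> list[str]:
    """Return a list of problems found under spec['database'] (empty if none).

    Hypothetical pre-flight check: it only mirrors the README statement that
    `mariaDB` and `externalDB` are mutually exclusive.
    """
    database = spec.get("database") or {}
    problems = []
    # Both keys present at once contradicts the documented rule.
    if "mariaDB" in database and "externalDB" in database:
        problems.append(
            "spec.database.mariaDB and spec.database.externalDB "
            "are mutually exclusive"
        )
    return problems


if __name__ == "__main__":
    sample_spec = {"database": {"mariaDB": {"deploy": True}}}
    print(check_database_spec(sample_spec))
```

An empty list means the `database` section is consistent with the rule; any returned string names the conflicting fields.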
227+
228+
### Minio
229+
To deploy a Minio Object Storage component (rather than providing your own object storage connection details), add a `minio` item under `spec.objectStorage` in your DSPA definition with an `image` key set to a valid Minio container image. All other fields are optional and have defaults; see [All Fields DSPA Example](./config/samples/dspa_all_fields.yaml) for full details. Note that this component is mutually exclusive with externally provided object stores (defined by `spec.objectStorage.externalStorage`).
230+
231+
```
apiVersion: datasciencepipelinesapplications.opendatahub.io/v1alpha1
kind: DataSciencePipelinesApplication
metadata:
  name: sample
spec:
  ...
  objectStorage:
    minio: # mutually exclusive with externalStorage
      deploy: true
      # Image field is required
      image: 'quay.io/opendatahub/minio:RELEASE.2019-08-14T20-37-41Z-license-compliance'
```
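
The two constraints stated for Minio (a required `image` and exclusivity with `externalStorage`) can be sketched as a small pre-flight check. This assumes the DSPA `spec` is available as a Python dict; `check_object_storage_spec` is a hypothetical helper for illustration only and does not reproduce the operator's actual validation.

```python
def check_object_storage_spec(spec: dict) -> list[str]:
    """Return a list of problems found under spec['objectStorage'].

    Hypothetical check based only on the README text: `minio` needs an
    `image`, and `minio` cannot be combined with `externalStorage`.
    """
    storage = spec.get("objectStorage") or {}
    problems = []
    if "minio" in storage and "externalStorage" in storage:
        problems.append(
            "spec.objectStorage.minio and spec.objectStorage.externalStorage "
            "are mutually exclusive"
        )
    if "minio" in storage:
        minio = storage.get("minio") or {}
        # The README marks the image field as required for Minio.
        if not minio.get("image"):
            problems.append("spec.objectStorage.minio.image is required")
    return problems


if __name__ == "__main__":
    sample_spec = {"objectStorage": {"minio": {"deploy": True}}}
    for problem in check_object_storage_spec(sample_spec):
        print(problem)
```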
244+
245+
### ML Pipelines UI
246+
To deploy the standalone DS Pipelines UI component, add a `spec.mlpipelineUI` item to your DSPA with an `image` key set to a valid UI container image. All other fields are optional and have defaults; see [All Fields DSPA Example](./config/samples/dspa_all_fields.yaml) for full details.
247+
248+
```
apiVersion: datasciencepipelinesapplications.opendatahub.io/v1alpha1
kind: DataSciencePipelinesApplication
metadata:
  name: sample
spec:
  ...
  mlpipelineUI:
    deploy: true
    # Image field is required
    image: 'quay.io/opendatahub/odh-ml-pipelines-frontend-container:beta-ui'
```
260+
261+
262+
### ML Metadata
263+
To deploy the ML Metadata artifact lineage/metadata component, add a `spec.mlmd` item to your DSPA with `deploy` set to `true`. All other fields are optional and have defaults; see [All Fields DSPA Example](./config/samples/dspa_all_fields.yaml) for full details.
264+
265+
```
apiVersion: datasciencepipelinesapplications.opendatahub.io/v1alpha1
kind: DataSciencePipelinesApplication
metadata:
  name: sample
spec:
  ...
  mlmd:
    deploy: true
```
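
The optional components above can be combined in a single DSPA definition. The sketch below assembles such a manifest as a Python dict and prints it as JSON, which `kubectl` also accepts as a manifest format. `build_dspa` and its parameters are hypothetical names introduced for illustration; only the field names and the `apiVersion` are taken from the examples above.

```python
import json

API_VERSION = "datasciencepipelinesapplications.opendatahub.io/v1alpha1"


def build_dspa(name, minio_image, ui_image=None, deploy_mlmd=False):
    """Assemble a DSPA manifest dict with the optional components toggled.

    Hypothetical helper: MariaDB and Minio are always included here, while
    the UI and MLMD are added only when requested.
    """
    spec = {
        "database": {"mariaDB": {"deploy": True}},
        "objectStorage": {"minio": {"deploy": True, "image": minio_image}},
    }
    if ui_image is not None:
        # The README marks the image field as required for the UI.
        spec["mlpipelineUI"] = {"deploy": True, "image": ui_image}
    if deploy_mlmd:
        spec["mlmd"] = {"deploy": True}
    return {
        "apiVersion": API_VERSION,
        "kind": "DataSciencePipelinesApplication",
        "metadata": {"name": name},
        "spec": spec,
    }


if __name__ == "__main__":
    manifest = build_dspa("sample", "minio:test5", deploy_mlmd=True)
    print(json.dumps(manifest, indent=2))
```

Leaving `ui_image` unset and `deploy_mlmd` false yields a manifest with only the database and object storage sections.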
275+
276+
209277
# Using a DataSciencePipelinesApplication
210278

211279
When a `DataSciencePipelinesApplication` is deployed, use the MLPipelines UI endpoint to interact with DSP, either via a GUI or via API calls.

config/samples/dspa_all_fields.yaml

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -128,6 +128,36 @@ spec:
128128
# secretName: somesecret-db-sample
129129
# accessKey: somekey
130130
# secretKey: somekey
131+
mlmd: # Deploys an optional ML-Metadata Component
132+
deploy: true
133+
envoy:
134+
image: quay.io/opendatahub/ds-pipelines-metadata-envoy:1.7.0
135+
resources:
136+
limits:
137+
cpu: 100m
138+
memory: 256Mi
139+
requests:
140+
cpu: 100m
141+
memory: 256Mi
142+
grpc:
143+
image: quay.io/opendatahub/ds-pipelines-metadata-grpc:1.0.0
144+
port: "8080"
145+
resources:
146+
limits:
147+
cpu: 100m
148+
memory: 256Mi
149+
requests:
150+
cpu: 100m
151+
memory: 256Mi
152+
writer:
153+
image: quay.io/opendatahub/ds-pipelines-metadata-writer:1.1.0
154+
resources:
155+
limits:
156+
cpu: 100m
157+
memory: 256Mi
158+
requests:
159+
cpu: 100m
160+
memory: 256Mi
131161
status:
132162
# Reports True iff:
133163
# * ApiServerReady, PersistenceAgentReady, ScheduledWorkflowReady, DatabaseReady, ObjectStorageReady report True
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
1+
# When a minimal DSPA is deployed
2+
Images:
3+
ApiServer: api-server:test5
4+
Artifact: artifact-manager:test5
5+
PersistentAgent: persistenceagent:test5
6+
ScheduledWorkflow: scheduledworkflow:test5
7+
Cache: ubi-minimal:test5
8+
MoveResultsImage: busybox:test5
9+
MlPipelineUI: frontend:test5
10+
MariaDB: mariadb:test5
11+
Minio: minio:test5
12+
OAuthProxy: oauth-proxy:test5
13+
MlmdEnvoy: metadata-envoy:changeme
14+
MlmdGrpc: metadata-grpc:changeme
15+
MlmdWriter: metadata-grpc:changeme
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
1+
apiVersion: datasciencepipelinesapplications.opendatahub.io/v1alpha1
2+
kind: DataSciencePipelinesApplication
3+
metadata:
4+
name: testdsp5
5+
spec:
6+
objectStorage:
7+
minio:
8+
image: minio:test5
9+
mlpipelineUI:
10+
image: frontend:test5
11+
mlmd:
12+
deploy: true
13+
envoy:
14+
image: metadata-envoy:test5
15+
grpc:
16+
image: metadata-grpc:test5
17+
port: "1337"
18+
writer:
19+
image: metadata-writer:test5
Lines changed: 202 additions & 0 deletions
@@ -0,0 +1,202 @@
1+
apiVersion: apps/v1
2+
kind: Deployment
3+
metadata:
4+
name: ds-pipeline-testdsp5
5+
namespace: default
6+
labels:
7+
app: ds-pipeline-testdsp5
8+
component: data-science-pipelines
9+
spec:
10+
selector:
11+
matchLabels:
12+
app: ds-pipeline-testdsp5
13+
component: data-science-pipelines
14+
template:
15+
metadata:
16+
labels:
17+
app: ds-pipeline-testdsp5
18+
component: data-science-pipelines
19+
spec:
20+
containers:
21+
- env:
22+
- name: POD_NAMESPACE
23+
value: "default"
24+
- name: DBCONFIG_USER
25+
value: "mlpipeline"
26+
- name: DBCONFIG_PASSWORD
27+
valueFrom:
28+
secretKeyRef:
29+
key: "password"
30+
name: "ds-pipeline-db-testdsp5"
31+
- name: DBCONFIG_DBNAME
32+
value: "mlpipeline"
33+
- name: DBCONFIG_HOST
34+
value: "mariadb-testdsp5.default.svc.cluster.local"
35+
- name: DBCONFIG_PORT
36+
value: "3306"
37+
- name: ARTIFACT_BUCKET
38+
value: "mlpipeline"
39+
- name: ARTIFACT_ENDPOINT
40+
value: "http://minio-testdsp5.default.svc.cluster.local:9000"
41+
- name: ARTIFACT_SCRIPT
42+
valueFrom:
43+
configMapKeyRef:
44+
key: "artifact_script"
45+
name: "ds-pipeline-artifact-script-testdsp5"
46+
- name: ARTIFACT_IMAGE
47+
value: "artifact-manager:test5"
48+
- name: ARCHIVE_LOGS
49+
value: "false"
50+
- name: TRACK_ARTIFACTS
51+
value: "true"
52+
- name: STRIP_EOF
53+
value: "true"
54+
- name: PIPELINE_RUNTIME
55+
value: "tekton"
56+
- name: DEFAULTPIPELINERUNNERSERVICEACCOUNT
57+
value: "pipeline-runner-testdsp5"
58+
- name: INJECT_DEFAULT_SCRIPT
59+
value: "true"
60+
- name: APPLY_TEKTON_CUSTOM_RESOURCE
61+
value: "true"
62+
- name: TERMINATE_STATUS
63+
value: "Cancelled"
64+
- name: AUTO_UPDATE_PIPELINE_DEFAULT_VERSION
65+
value: "true"
66+
- name: DBCONFIG_CONMAXLIFETIMESEC
67+
value: "120"
68+
- name: ML_PIPELINE_VISUALIZATIONSERVER_SERVICE_HOST
69+
value: "ds-pipeline-visualizationserver"
70+
- name: ML_PIPELINE_VISUALIZATIONSERVER_SERVICE_PORT
71+
value: "8888"
72+
- name: OBJECTSTORECONFIG_BUCKETNAME
73+
value: "mlpipeline"
74+
- name: OBJECTSTORECONFIG_ACCESSKEY
75+
valueFrom:
76+
secretKeyRef:
77+
key: "accesskey"
78+
name: "mlpipeline-minio-artifact"
79+
- name: OBJECTSTORECONFIG_SECRETACCESSKEY
80+
valueFrom:
81+
secretKeyRef:
82+
key: "secretkey"
83+
name: "mlpipeline-minio-artifact"
84+
- name: OBJECTSTORECONFIG_SECURE
85+
value: "false"
86+
- name: MINIO_SERVICE_SERVICE_HOST
87+
value: "minio-testdsp5.default.svc.cluster.local"
88+
- name: MINIO_SERVICE_SERVICE_PORT
89+
value: "9000"
90+
- name: CACHE_IMAGE
91+
value: "ubi-minimal:test5"
92+
- name: MOVERESULTS_IMAGE
93+
value: "busybox:test5"
94+
image: api-server:test5
95+
imagePullPolicy: Always
96+
name: ds-pipeline-api-server
97+
ports:
98+
- containerPort: 8888
99+
name: http
100+
protocol: TCP
101+
- containerPort: 8887
102+
name: grpc
103+
protocol: TCP
104+
livenessProbe:
105+
exec:
106+
command:
107+
- wget
108+
- -q
109+
- -S
110+
- -O
111+
- '-'
112+
- http://localhost:8888/apis/v1beta1/healthz
113+
initialDelaySeconds: 3
114+
periodSeconds: 5
115+
timeoutSeconds: 2
116+
readinessProbe:
117+
exec:
118+
command:
119+
- wget
120+
- -q
121+
- -S
122+
- -O
123+
- '-'
124+
- http://localhost:8888/apis/v1beta1/healthz
125+
initialDelaySeconds: 3
126+
periodSeconds: 5
127+
timeoutSeconds: 2
128+
resources:
129+
requests:
130+
cpu: 250m
131+
memory: 500Mi
132+
limits:
133+
cpu: 500m
134+
memory: 1Gi
135+
volumeMounts:
136+
- mountPath: /config/sample_config.json
137+
name: sample-config
138+
subPath: sample_config.json
139+
- mountPath: /samples/
140+
name: sample-pipeline
141+
- name: oauth-proxy
142+
args:
143+
- --https-address=:8443
144+
- --provider=openshift
145+
- --openshift-service-account=ds-pipeline-testdsp5
146+
- --upstream=http://localhost:8888
147+
- --tls-cert=/etc/tls/private/tls.crt
148+
- --tls-key=/etc/tls/private/tls.key
149+
- --cookie-secret=SECRET
150+
- '--openshift-delegate-urls={"/": {"group":"route.openshift.io","resource":"routes","verb":"get","name":"ds-pipeline-testdsp5","namespace":"default"}}'
151+
- '--openshift-sar={"namespace":"default","resource":"routes","resourceName":"ds-pipeline-testdsp5","verb":"get","resourceAPIGroup":"route.openshift.io"}'
152+
- --skip-auth-regex='(^/metrics|^/apis/v1beta1/healthz)'
153+
image: oauth-proxy:test5
154+
ports:
155+
- containerPort: 8443
156+
name: oauth
157+
protocol: TCP
158+
livenessProbe:
159+
httpGet:
160+
path: /oauth/healthz
161+
port: oauth
162+
scheme: HTTPS
163+
initialDelaySeconds: 30
164+
timeoutSeconds: 1
165+
periodSeconds: 5
166+
successThreshold: 1
167+
failureThreshold: 3
168+
readinessProbe:
169+
httpGet:
170+
path: /oauth/healthz
171+
port: oauth
172+
scheme: HTTPS
173+
initialDelaySeconds: 5
174+
timeoutSeconds: 1
175+
periodSeconds: 5
176+
successThreshold: 1
177+
failureThreshold: 3
178+
resources:
179+
limits:
180+
cpu: 100m
181+
memory: 256Mi
182+
requests:
183+
cpu: 100m
184+
memory: 256Mi
185+
volumeMounts:
186+
- mountPath: /etc/tls/private
187+
name: proxy-tls
188+
volumes:
189+
- name: proxy-tls
190+
secret:
191+
secretName: ds-pipelines-proxy-tls-testdsp5
192+
defaultMode: 420
193+
- configMap:
194+
defaultMode: 420
195+
name: sample-config-testdsp5
196+
name: sample-config
197+
- configMap:
198+
defaultMode: 420
199+
name: sample-pipeline-testdsp5
200+
name: sample-pipeline
201+
202+
serviceAccountName: ds-pipeline-testdsp5
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
apiVersion: v1
2+
data:
3+
artifact_script: |-
4+
#!/usr/bin/env sh
5+
push_artifact() {
6+
if [ -f "$2" ]; then
7+
tar -cvzf $1.tgz $2
8+
aws s3 --endpoint http://minio-testdsp5.default.svc.cluster.local:9000 cp $1.tgz s3://mlpipeline/artifacts/$PIPELINERUN/$PIPELINETASK/$1.tgz
9+
else
10+
echo "$2 file does not exist. Skip artifact tracking for $1"
11+
fi
12+
}
13+
push_log() {
14+
cat /var/log/containers/$PODNAME*$NAMESPACE*step-main*.log > step-main.log
15+
push_artifact main-log step-main.log
16+
}
17+
strip_eof() {
18+
if [ -f "$2" ]; then
19+
awk 'NF' $2 | head -c -1 > $1_temp_save && cp $1_temp_save $2
20+
fi
21+
}
22+
kind: ConfigMap
23+
metadata:
24+
name: ds-pipeline-artifact-script-testdsp5
25+
namespace: default
26+
labels:
27+
app: ds-pipeline-testdsp5
28+
component: data-science-pipelines
