Commit 655a9d5

dhirajsb, Al-Pragliola, and pboyd authored
Add support for Experiment tracking in Model Registry, fixes kubeflow#1224 (kubeflow#1318)
* feat: initial version of experiments and runs API
  Signed-off-by: Dhiraj Bokde <[email protected]>
* feat: experiments and runs initial implementation (wip)
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: fixed failing unit tests for experiments and runs
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: added experiment and experimentrun tests
  Signed-off-by: Dhiraj Bokde <[email protected]>
* feat: added DataSet, Metric, and Parameter types
  Signed-off-by: Dhiraj Bokde <[email protected]>
* feat: added implementation of DataSet, Metric, and Param, including service tests
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: replace int properties for timestamps with string because mlmd type properties only support int32, not int64
  Signed-off-by: Dhiraj Bokde <[email protected]>
* feat: add support for artifactType query param to filter artifact types in artifact queries
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: add metrics history endpoint and metric history storage for experiment run metrics
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: fix artifactType query param type in generated service
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: fix go lint error in unit test
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: filter out metric history from artifacts endpoints
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: fix metric history name to use last update time to avoid name conflicts
  Signed-off-by: Dhiraj Bokde <[email protected]>
* feat: add filterQuery param on all context types to search by properties and custom properties
  Signed-off-by: Dhiraj Bokde <[email protected]>
* feat: initial version of experiment tracking implemented on embedmd, rebased on main
  Signed-off-by: Dhiraj Bokde <[email protected]>
* feat: add support for filterQuery parameter for all ListResponse endpoints for embedmd datastore
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: add support for stepIds query parameter in embedmd datastore
  Signed-off-by: Dhiraj Bokde <[email protected]>
* feat: refactor embedmd db service to use generic repository implementation to reduce code duplication
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: add support for artifactType query parameter for embedmd datastore
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: use mysql 8.3 in unit tests
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: refactor name mapping and default name handling in embedmd datastore
  Signed-off-by: Dhiraj Bokde <[email protected]>
* feat: support updating metrics and parameters by name, fix ignoring metric history when retrieving all artifacts for runs and versions
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: add missing generated openapi python client files for PR github action check
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: fix failing shared db tests
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: add support for metric and parameter description, add missing type property migration
  Signed-off-by: Dhiraj Bokde <[email protected]>
* chore: update files from main
  Signed-off-by: Alessio Pragliola <[email protected]>
* fix: added missing godoc comments in pkg/api/api.go
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: replace ambiguous ArtifactListReponse return type from GetExperimentRunMetricHistory with MetricListResponse
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: fixed incorrect artifactType in dataset response, added tests to verify all artifact types
  Signed-off-by: Dhiraj Bokde <[email protected]>
* feat: add validation for endTimeSinceEpoch property on experiment run updates
  Signed-off-by: Dhiraj Bokde <[email protected]>
* Replace value type validation map with a switch in query_translator.go
  Co-authored-by: Paul Boyd <[email protected]>
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: add service e2e tests for filterQuery, fix name query param handling, fix DB tests that didn't use parent id prefix
  Signed-off-by: Dhiraj Bokde <[email protected]>
* chore: code cleanup, replace interface{} with any, added vetting for internal/db/filter
  Signed-off-by: Dhiraj Bokde <[email protected]>
* chore: added flag vF for fixed string grep exclude
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: copied orderby and parameters back to registry and catalog to have different values
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: fixed mlmd query translator handling of escaped backslashes
  Signed-off-by: Dhiraj Bokde <[email protected]>
* chore: add test to verify parseCustomPropertyField won't panic with a property name ending in dot
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: sync generated python client code
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: readiness probe tests and new types
  Signed-off-by: Alessio Pragliola <[email protected]>
* chore: refactor readiness_test
  Signed-off-by: Alessio Pragliola <[email protected]>
* fix: ensure parentResourceId is used to filter resource lookup by params, add unit tests for duplicate child resource lookups
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: throw an error if a metric value is missing, add test to validate
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: fix http status error code for invalid ids
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: more id validation, fixed filterQuery passing to DB layer
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: fix failing unit test
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: validate experiment id when listing runs
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: fix failing validation test after fixing http status codes
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: avoid duplicate key errors if externalid is set in metric when creating metric history entries
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: add fuzzer tests for experiment runs and new artifact types
  Signed-off-by: Dhiraj Bokde <[email protected]>
* chore: code cleanup and format fuzzer tests
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: log error in fuzzer test
  Signed-off-by: Dhiraj Bokde <[email protected]>
* fix: handle null artifact names correctly on create
  Signed-off-by: Dhiraj Bokde <[email protected]>

---------

Signed-off-by: Dhiraj Bokde <[email protected]>
Signed-off-by: Alessio Pragliola <[email protected]>
Signed-off-by: Alessio Pragliola <[email protected]>
Co-authored-by: Alessio Pragliola <[email protected]>
Co-authored-by: Alessio Pragliola <[email protected]>
Co-authored-by: Paul Boyd <[email protected]>
1 parent 60ab78a commit 655a9d5

226 files changed, +49784 -5813 lines changed

Makefile

Lines changed: 5 additions & 2 deletions
@@ -176,7 +176,10 @@ endif
 
 .PHONY: vet
 vet:
-	${GO} vet ./...
+	@echo "Running go vet on all packages..."
+	@${GO} vet $$(${GO} list ./... | grep -vF github.com/kubeflow/model-registry/internal/db/filter) && \
+	echo "Checking filter package (parser.go excluded due to participle struct tags)..." && \
+	cd internal/db/filter && ${GO} build -o /dev/null . 2>&1 | grep -E "vet:|error:" || echo "✓ Filter package builds successfully"
 
 .PHONY: clean/csi
 clean/csi:
@@ -397,7 +400,7 @@ controller/vet: ## Run go vet against code.
 
 .PHONY: controller/test
 controller/test: controller/manifests controller/generate controller/fmt controller/vet bin/envtest ## Run tests.
-	KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(PROJECT_BIN) -p path)" go test $$(go list ./internal/controller/... | grep -v /e2e) -coverprofile cover.out
+	KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(PROJECT_BIN) -p path)" go test $$(go list ./internal/controller/... | grep -vF /e2e) -coverprofile cover.out
 
 ##@ Build

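Both Makefile hunks switch the exclusion filter from `grep -v` to `grep -vF`. With `-F` the pattern is a fixed string, so the dots in `github.com/...` are no longer regex wildcards. The sketch below is illustrative only and not part of the commit: it mimics the two modes in Python, and the look-alike path and the `service` package path are hypothetical inputs chosen to show the difference.

```python
import re


def grep_v(lines, pattern):
    """Mimic `grep -v PATTERN`: drop lines where PATTERN matches as a regex."""
    return [line for line in lines if not re.search(pattern, line)]


def grep_v_fixed(lines, needle):
    """Mimic `grep -vF STRING`: drop lines that contain STRING literally."""
    return [line for line in lines if needle not in line]


# Package path taken from the Makefile diff; the other two entries are hypothetical.
target = "github.com/kubeflow/model-registry/internal/db/filter"
packages = [
    target,
    "githubXcom/kubeflow/model-registry/internal/db/filter",   # hypothetical look-alike
    "github.com/kubeflow/model-registry/internal/db/service",  # hypothetical sibling
]

# As a regex, "." matches any character, so the look-alike is excluded too.
regex_kept = grep_v(packages, target)

# As a fixed string, only paths containing the exact text are excluded.
fixed_kept = grep_v_fixed(packages, target)

print(regex_kept)
print(fixed_kept)
```

Note that `-F` still matches substrings, so any subpackage whose import path contains the excluded string would be filtered out as well.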
api/openapi/catalog.yaml

Lines changed: 70 additions & 0 deletions
@@ -137,6 +137,15 @@ paths:
         required: true
 components:
   schemas:
+    ArtifactTypeQueryParam:
+      description: Supported artifact types for querying.
+      enum:
+        - model-artifact
+        - doc-artifact
+        - dataset-artifact
+        - metric
+        - parameter
+      type: string
     BaseModel:
       type: object
       properties:
@@ -569,6 +578,43 @@ components:
         type: string
       in: query
       required: false
+    filterQuery:
+      examples:
+        filterQuery:
+          value: "name='my-model' AND state='LIVE'"
+      name: filterQuery
+      description: |
+        A SQL-like query string to filter the list of entities. The query supports rich filtering capabilities with automatic type inference.
+
+        **Supported Operators:**
+        - Comparison: `=`, `!=`, `<>`, `>`, `<`, `>=`, `<=`
+        - Pattern matching: `LIKE`, `ILIKE` (case-insensitive)
+        - Set membership: `IN`
+        - Logical: `AND`, `OR`
+        - Grouping: `()` for complex expressions
+
+        **Data Types:**
+        - Strings: `"value"` or `'value'`
+        - Numbers: `42`, `3.14`, `1e-5`
+        - Booleans: `true`, `false` (case-insensitive)
+
+        **Property Access:**
+        - Standard properties: `name`, `id`, `state`, `createTimeSinceEpoch`
+        - Custom properties: Any user-defined property name
+        - Escaped properties: Use backticks for special characters: `` `custom-property` ``
+        - Type-specific access: `property.string_value`, `property.double_value`, `property.int_value`, `property.bool_value`
+
+        **Examples:**
+        - Basic: `name = "my-model"`
+        - Comparison: `accuracy > 0.95`
+        - Pattern: `name LIKE "%tensorflow%"`
+        - Complex: `(name = "model-a" OR name = "model-b") AND state = "LIVE"`
+        - Custom property: `framework.string_value = "pytorch"`
+        - Escaped property: `` `mlflow.source.type` = "notebook" ``
+      schema:
+        type: string
+      in: query
+      required: false
     pageSize:
       examples:
         pageSize:
@@ -598,6 +644,30 @@ components:
         $ref: "#/components/schemas/SortOrder"
       in: query
       required: false
+    artifactType:
+      style: form
+      explode: true
+      examples:
+        artifactType:
+          value: model-artifact
+      name: artifactType
+      description: "Specifies the artifact type for listing artifacts."
+      schema:
+        $ref: "#/components/schemas/ArtifactTypeQueryParam"
+      in: query
+      required: false
+    stepIds:
+      style: form
+      explode: true
+      examples:
+        stepIds:
+          value: "1,2,3"
+      name: stepIds
+      description: "Comma-separated list of step IDs to filter metrics by."
+      schema:
+        type: string
+      in: query
+      required: false
   securitySchemes:
     Bearer:
       scheme: bearer

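The three query parameters added to `catalog.yaml` (`filterQuery`, `artifactType`, `stepIds`) are plain strings on the wire, so a client mostly needs to percent-encode them correctly. The following is a minimal client-side sketch, not code from this commit: the base URL and the `build_list_url` helper are hypothetical, and only the parameter names, the enum values, and the example filter expression come from the diff above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Enum values copied from the ArtifactTypeQueryParam schema in the diff.
ARTIFACT_TYPES = {
    "model-artifact",
    "doc-artifact",
    "dataset-artifact",
    "metric",
    "parameter",
}


def build_list_url(base_url, filter_query=None, artifact_type=None, step_ids=None):
    """Hypothetical helper: append the new query parameters to a list endpoint URL."""
    params = {}
    if filter_query:
        params["filterQuery"] = filter_query
    if artifact_type is not None:
        if artifact_type not in ARTIFACT_TYPES:
            raise ValueError(f"unsupported artifactType: {artifact_type!r}")
        params["artifactType"] = artifact_type
    if step_ids:
        # The spec describes stepIds as one comma-separated string.
        params["stepIds"] = ",".join(str(step) for step in step_ids)
    query = urlencode(params)  # percent-encodes quotes, parentheses, '=' and ','
    return f"{base_url}?{query}" if query else base_url


# Filter expression taken from the "Complex" example in the filterQuery description.
expression = '(name = "model-a" OR name = "model-b") AND state = "LIVE"'

# The host and path are placeholders, not a documented endpoint.
url = build_list_url(
    "https://registry.example.com/artifacts",
    filter_query=expression,
    artifact_type="metric",
    step_ids=[1, 2, 3],
)
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded)
```

Validating `artifactType` on the client is optional; it only surfaces a typo before the request is sent.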