Commit 769906d

author
Akshay Chitneni
committed
GSOC: OptimizationJob CRD project
Signed-off-by: Akshay Chitneni <[email protected]>
1 parent b4eacc0 commit 769906d

1 file changed, +32 -0 lines changed

content/en/events/upcoming-events/gsoc-2026.md

Lines changed: 32 additions & 0 deletions
@@ -104,3 +104,35 @@ To participate in GSoC with Kubeflow, you **must** meet the GSoC [eligibility re

---

## Project 3: OptimizationJob CRD for Hyperparameter Optimization

**Components:** [kubeflow/katib](https://www.github.com/kubeflow/katib), [kubeflow/sdk](https://www.github.com/kubeflow/sdk), [kubeflow/trainer](https://www.github.com/kubeflow/trainer)

**Mentors:** [@akshaychitneni](https://github.com/akshaychitneni), [@andreyvelich](https://github.com/andreyvelich)

**Contributor:**

**Details:**

Hyperparameter optimization (HPO) is critical for maximizing model performance in machine learning workflows. Katib currently provides HPO capabilities through the `Experiment` CRD, but that CRD was designed for broad use cases, including Neural Architecture Search (NAS) and arbitrary workloads.

This project aims to design and implement a new **OptimizationJob CRD** (`optimizer.kubeflow.org/v1alpha1`) focused specifically on hyperparameter optimization for TrainJobs. The new CRD will provide:

- **Tighter TrainJob Integration**: Replace unstructured trial specifications with typed TrainJob templates, enabling strong validation
- **Shared Initialization**: Implement a common initializer pattern that runs once and shares model/dataset artifacts across all trials, reducing trial startup time and storage costs
- **Simplified API**: Focus exclusively on HPO use cases
- **Modern Metrics Collection**: Support push-based metrics reporting via the Kubeflow SDK
- **SDK Alignment**: Integrate with the `OptimizerClient` API from [KEP-46: Hyperparameter Optimization in Kubeflow SDK](https://github.com/kubeflow/sdk/blob/main/docs/proposals/46-hyperparameter-optimization/README.md)

Tracking issue: [kubeflow/katib#2605](https://github.com/kubeflow/katib/issues/2605)

**Difficulty:** Hard

**Size:** 350 hours (Large)

**Skills Required/Preferred:**
* Go
* Python
* Familiarity with Kubernetes controllers and CRDs
* Basic understanding of machine learning training workflows
* Experience with HPO frameworks
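
As a rough illustration of the "Tighter TrainJob Integration" goal listed under Details, the sketch below contrasts a typed trial template with an unstructured one by validating fields before a manifest is built. Only the API group/version `optimizer.kubeflow.org/v1alpha1` and the kind name `OptimizationJob` come from the project description. Every field name (`objective_metric`, `max_trials`, `parameters`, `trial_template`, `runtime_ref`, `num_nodes`) is a hypothetical placeholder, since the actual schema is what this project would design.

```python
from dataclasses import asdict, dataclass


@dataclass
class SearchParameter:
    """One hyperparameter range to explore (hypothetical shape)."""
    name: str
    low: float
    high: float

    def validate(self) -> list[str]:
        errors = []
        if not self.name:
            errors.append("parameter name must not be empty")
        if self.low >= self.high:
            errors.append(f"parameter {self.name!r}: low must be less than high")
        return errors


@dataclass
class TrainJobTemplate:
    """Typed stand-in for a TrainJob template (fields are illustrative only)."""
    runtime_ref: str
    num_nodes: int = 1

    def validate(self) -> list[str]:
        errors = []
        if not self.runtime_ref:
            errors.append("trial_template.runtime_ref must not be empty")
        if self.num_nodes < 1:
            errors.append("trial_template.num_nodes must be at least 1")
        return errors


@dataclass
class OptimizationJobSpec:
    """Hypothetical spec; the real field set is not defined in the source."""
    objective_metric: str
    max_trials: int
    parameters: list[SearchParameter]
    trial_template: TrainJobTemplate

    def validate(self) -> list[str]:
        errors = []
        if not self.objective_metric:
            errors.append("objective_metric must not be empty")
        if self.max_trials < 1:
            errors.append("max_trials must be at least 1")
        if not self.parameters:
            errors.append("at least one search parameter is required")
        for param in self.parameters:
            errors.extend(param.validate())
        errors.extend(self.trial_template.validate())
        return errors


def to_manifest(name: str, spec: OptimizationJobSpec) -> dict:
    """Build a manifest dict, rejecting invalid specs before submission."""
    errors = spec.validate()
    if errors:
        raise ValueError("; ".join(errors))
    return {
        # Group/version and kind are taken from the project description.
        "apiVersion": "optimizer.kubeflow.org/v1alpha1",
        "kind": "OptimizationJob",
        "metadata": {"name": name},
        # Everything under "spec" is an assumed layout for illustration.
        "spec": asdict(spec),
    }
```

Calling `to_manifest("demo", spec)` with a valid spec returns a plain dictionary that could be serialized to YAML, while an invalid one raises `ValueError` listing every problem at once. In a real implementation this kind of checking would more likely live in the CRD's OpenAPI schema or an admission webhook than in client code.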

0 commit comments
