|
| 1 | +# KEP-nnn: Renaming "Model Registry" to reflect Registry and Catalog use-cases |
| 2 | + |
| 3 | +Status: `provisional`. |
| 4 | + |
| 5 | +## Summary |
| 6 | + |
| 7 | +This KEP proposes renaming the current "Model Registry" Kubeflow project to better reflect its evolution, which now encompasses both _model registry_ capabilities (for tracking model evolution during development) and _model catalog_ capabilities (for showcasing organization-approved models). The current name might undersell the project's capabilities and goals, as the "Kubeflow Model Registry" project has grown well beyond its original scope to include GenAI/LLM showcasing, enterprise-wide model sharing, and more than a single use-case pattern (i.e., multi-tenant Registries and a cluster-wide Catalog). |
| 8 | + |
| 9 | +## Motivation |
| 10 | + |
| 11 | +The Kubeflow Model Registry project, originally proposed in [this Model Registry Platform proposal](../model-registry-proposal.md), has evolved significantly since its original onboarding. The project now serves two distinct but complementary use cases that are not clearly reflected in its current naming ("Model Registry"): |
| 12 | + |
| 13 | +1. **Model Registry**: Tenant-scoped model tracking during development lifecycle |
| 14 | +2. **Model Catalog**: Cluster-scoped showcase of organization-approved models, including GenAI/LLM models |
| 15 | + |
| 16 | +According to the [Kubeflow 2023 survey](https://blog.kubeflow.org/kubeflow-user-survey-2023/), 44% of users identified Model Registry as a critical gap in the ML lifecycle. Meanwhile, the current implementation addresses a broader model management need that extends beyond traditional registry concepts, especially into newer GenAI and LLM use-cases. |
| 17 | + |
| 18 | +This proposal captures a [community discussion](https://github.com/kubeflow/community/pull/892#discussion_r2263804358) in proper KEP form. |
| 19 | + |
| 20 | +### Goals |
| 21 | + |
| 22 | +<!-- these are the Goals of the KEP --> |
| 23 | + |
| 24 | +- Clarify the project's ability to cover the 2 use-cases through improved naming |
| 25 | +- Better reflect the project's evolution to support GenAI/LLM model showcasing |
| 26 | +- Align terminology with Industry standards and User expectations |
| 27 | +- Facilitate better understanding for new Users and Enterprise adoption |
| 28 | + |
| 29 | +### Non-Goals |
| 30 | + |
| 31 | +<!-- these are the non-Goals of this KEP --> |
| 32 | + |
| 33 | +- Change the underlying technical architecture or functionality, API, integration patterns, etc. |
| 34 | +- Impact current deployment or operational procedures |
| 35 | +- Modify, replace, or deprecate any existing WG |
| 36 | + |
| 37 | +## Proposal |
| 38 | + |
| 39 | +We propose renaming the "Model Registry" Kubeflow project to better reflect its current, more comprehensive model management capabilities. This KEP presents the analysis and considerations for potential naming options (below) while keeping the discussion open for community input. |
| 40 | + |
| 41 | +### Current Project Analysis |
| 42 | + |
| 43 | +The existing "Model Registry" project actually encompasses 2 distinct use-cases and patterns: |
| 44 | + |
| 45 | +#### Model Registry (Tenant-Scoped) |
| 46 | +- Purpose: track model evolution during development lifecycle |
| 47 | +- Scope: per-tenant/per-namespace deployment |
| 48 | +- Use Cases: |
| 49 | + - Training experiments and iterations |
| 50 | + - Model fine-tuning and alignment tracking |
| 51 | + - Version management during development |
| 52 | + - Metadata for training runs, parameters, metrics |
| 53 | + - Model lineage from data --> training --> checkpoints |
| 54 | +- Users: Data Scientists, ML engineers within specific teams/projects |
| 55 | +- Deployment: 0..N instances per cluster (1 per Tenant) |
| 56 | + |
| 57 | +#### Model Catalog (Cluster-Scoped, Company-scoped) |
| 58 | +- Purpose: showcase "blessed" models for organization-wide consumption |
| 59 | +- Scope: single instance per cluster/Kubeflow installation |
| 60 | +- Use Cases: |
| 61 | + - GenAI/LLM model showcase |
| 62 | +  - Organization-approved models (internal and/or external) |
| 63 | + - Model discovery and sharing across teams |
| 64 | + - Corporate model governance and compliance |
| 65 | + - Integration with external model sources (HuggingFace, etc.) |
| 66 | +- Users: all organization members, model consumers |
| 67 | +- Deployment: 1 singleton instance per cluster |
| 68 | + |
| 69 | +#### Summarizing the 2 use-cases |
| 70 | + |
| 71 | +The following table summarizes the key differences between Registry and Catalog concepts. |
| 72 | + |
| 73 | +> [!NOTE] |
| 74 | +> These distinctions represent common usage patterns and blueprints. Organizations may _still_ choose to adopt a single Model Registry approach if it better fits their needs! This table is provided only for illustrative purposes and to highlight the different use cases the current Kubeflow project supports. |
| 75 | + |
| 76 | +| Aspect | Model Registry pattern | Model Catalog pattern | |
| 77 | +|--------|------------------------|------------------------| |
| 78 | +| **Lifecycle Stage** | Development & Training | Production & Sharing | |
| 79 | +| **Audience** | Team/Project members | Company-/Organization-wide | |
| 80 | +| **Model Types** | Work-in-progress models | Blessed/approved models for the whole Company | |
| 81 | +| **Versioning** | More fine-grained iterations | Major releases | |
| 82 | +| **Governance** | Team-level (MLOps) | Enterprise-level (Admin + Stakeholders) | |
| 83 | +| **Source** | Typically only internal training/tuning models | Internal + External models | |
| 84 | +| **Deployment** | Multi-tenant (1 instance per Tenant) | Single instance (cluster-wide) | |
| 85 | +| **Discovery** | Project-scoped search | Organization-wide catalog | |
| 86 | +| **Compliance** | Development standards | Enterprise policies enforced by Stakeholders | |
| 87 | + |
| 88 | +#### Naming Patterns in AI/ML Ecosystem |
| 89 | + |
| 90 | +Existing and common terminology patterns in other projects/products: |
| 91 | + |
| 92 | +- **"Registry"**: current name, MLflow, Azure, AWS |
| 93 | +- **"Hub"**: Docker Hub, HuggingFace Hub |
| 94 | +- **"Model Garden"**: Vertex AI |
| 95 | + |
| 96 | +### User Stories |
| 97 | + |
| 98 | +#### Story 1: Data Scientist workflow |
| 99 | +As a Data Scientist, I want to track my model experiments and iterations within my team's workspace (_Model Registry_ pattern) while also being able to discover and use approved models from our organization's catalog (_Model Catalog_ pattern). |
| 100 | + |
| 101 | +#### Story 2: Organization-wide model governance |
| 102 | +As a Stakeholder, I want to distinguish between development model tracking (per-team registries) and organization-wide model sharing (centralized catalog) to implement appropriate governance policies. |
| 103 | + |
| 104 | +#### Story 3: the Admin |
| 105 | +As a platform Administrator, I need to showcase _the_ approved GenAI/LLM models Organization-wide through a centralized catalog, while maintaining separate registries for individual team development needs. |
| 106 | + |
| 107 | +### Naming Considerations |
| 108 | + |
| 109 | +#### Evaluation Criteria |
| 110 | + |
| 111 | +As we ask for Community input, the following criteria could be used later to evaluate the final selected name. |
| 112 | + |
| 113 | +- **Scope Clarity**: does the name clearly indicate the 2 model management (registry, catalog) capabilities? |
| 114 | +- **Industry Alignment**: is the name consistent with established ML/AI terminology? Do we actually need to be aligned, or can we introduce a "creative" name? |
| 115 | +- **Future-Proofing**: can it accommodate GenAI/LLM evolution and Organization needs? |
| 116 | +- **Kubeflow Integration**: does it fit within the Kubeflow ecosystem naming? |
| 117 | + |
| 118 | +#### Candidate Names for Discussion |
| 119 | + |
| 120 | +Each candidate below was captured from the previous [community discussion](https://github.com/kubeflow/community/pull/892#discussion_r2263804358); the Community can add further proposals directly in the markdown list below: |
| 121 | + |
| 122 | +- "AI Asset Registry" |
| 123 | +- "AI Asset" |
| 124 | +- "Kubeflow Registry" (simplified by dropping "Model") |
| 125 | +- "Kubeflow Tracker" |
| 126 | +- "Kubeflow Tracking" |
| 127 | +- "AI Hub" |
| 128 | +- `<add your proposal name in this list>` |
| 129 | + |
| 130 | +### Risks and Mitigations |
| 131 | + |
| 132 | +- **Risk**: Community confusion for those familiar with "Kubeflow Model Registry" |
| 133 | +- **Mitigation**: phased communication plan, clear documentation updates |
| 134 | + |
| 135 | +## Design Details |
| 136 | + |
| 137 | +### Migration Strategy |
| 138 | + |
| 139 | +1. **Phase 1**: Community discussion and name selection |
| 140 | +2. **Phase 2**: Implement any needed Repository renaming and documentation updates |
| 141 | +3. **Phase 3**: Update naming across Kubeflow and external communications |
| 142 | + |
| 143 | +## Implementation History |
| 144 | + |
| 145 | +- **KEP Creation**: 2025-09-29 |
| 146 | +- **Community Discussion**: WIP |
| 147 | +- **Name Selection**: WIP |
| 148 | +- **Implementation Start**: WIP |
| 149 | + |
| 150 | +## Drawbacks |
| 151 | + |
| 152 | +- **Transition Overhead**: Requires some coordination and documentation updates |
| 153 | +- **Community Learning Curve**: Kubeflow Users must adapt to new terminology |
| 154 | +- **External Impact**: Existing external references and materials might need updating |
| 155 | + |
| 156 | +## Alternatives |
| 157 | + |
| 158 | +### Alternative 0: Maintain Current Naming (the "do nothing" alternative) |
| 159 | +- **Pros**: no transition overhead, existing name is already familiar |
| 160 | +- **Cons**: continued confusion about, and underestimation of, the project scope; misalignment with evolved capabilities |
| 161 | + |
| 162 | +### Alternative 1: Add Descriptive Suffixes (e.g., "Kubeflow Model Registry and Catalog") |
| 163 | +- **Pros**: maintains the base naming while adding clarity through documentation updates only |
| 164 | +- **Cons**: verbose naming, inconsistent with Kubeflow patterns |
| 165 | + |
| 166 | +## References |
| 167 | +- https://github.com/kubeflow/community/pull/892#discussion_r2263804358 |