feat(poc): catalog-gen, a cli tool to generate new entity catalogs [V2] #2219

Draft
Al-Pragliola wants to merge 8 commits into kubeflow:main from Al-Pragliola:al-pragliola-poc-catalog-gen-v2
Al-Pragliola (Contributor) commented Feb 11, 2026

Description

This PR introduces a plugin-based catalog architecture that evolves the Model Catalog from a single-purpose service into a generic, extensible platform for managing diverse AI assets:
models, MCP servers, datasets, and more.

The existing Model Catalog is wrapped as a plugin with zero breaking changes: all current API paths, schemas, and behaviors are preserved. New asset types are added as independent
plugins, each with its own API surface, database tables, and data providers, all running under a unified catalog-server.

A companion CLI tool, catalog-gen, scaffolds complete catalog plugins from a declarative YAML configuration, generating all boilerplate deterministically. Post-boilerplate steps (business
logic, provider implementation, testing) are supported by generated AI agent workflows.

For a detailed architecture description, see #2220

Key Features

Plugin System

  • Unified catalog server orchestrating multiple catalog plugins in a single process
  • CatalogPlugin interface with lifecycle management: Init, Start, Stop, Healthy, RegisterRoutes, Migrations
  • Compile-time plugin selection: each plugin self-registers in a Go init() function, so adding a plugin is a single blank import
  • Runtime configuration via sources.yaml — each plugin reads its own data source definitions
  • Shared database (SQLite/MySQL/PostgreSQL) with per-plugin migration support
  • Health endpoints: /healthz, /readyz, /api/plugins
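
The lifecycle and registration bullets above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: the method names come from the list above, but every signature, the Name() method, the Register, Names, StartAll, and runDemo helpers, and the demoPlugin type are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"sort"
)

// CatalogPlugin lists the lifecycle methods named in the PR description.
// All signatures here are assumptions; Name() is a hypothetical addition
// used only to key the registry.
type CatalogPlugin interface {
	Name() string
	Init(ctx context.Context) error
	Start(ctx context.Context) error
	Stop(ctx context.Context) error
	Healthy() bool
	RegisterRoutes(mux *http.ServeMux)
	Migrations() []string
}

// registry holds every plugin that registered itself during package init.
var registry = map[string]CatalogPlugin{}

// Register is meant to be called from a plugin package's init() function.
func Register(p CatalogPlugin) {
	if _, dup := registry[p.Name()]; dup {
		panic("plugin registered twice: " + p.Name())
	}
	registry[p.Name()] = p
}

// Names returns the registered plugin names in sorted order so that
// startup order does not depend on map iteration.
func Names() []string {
	names := make([]string, 0, len(registry))
	for n := range registry {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// StartAll initializes each plugin, mounts its routes, and starts it.
func StartAll(ctx context.Context, mux *http.ServeMux) error {
	for _, name := range Names() {
		p := registry[name]
		if err := p.Init(ctx); err != nil {
			return fmt.Errorf("init %s: %w", name, err)
		}
		p.RegisterRoutes(mux)
		if err := p.Start(ctx); err != nil {
			return fmt.Errorf("start %s: %w", name, err)
		}
	}
	return nil
}

// demoPlugin is a stand-in plugin used only for this sketch.
type demoPlugin struct{ running bool }

func (d *demoPlugin) Name() string                    { return "demo" }
func (d *demoPlugin) Init(ctx context.Context) error  { return nil }
func (d *demoPlugin) Start(ctx context.Context) error { d.running = true; return nil }
func (d *demoPlugin) Stop(ctx context.Context) error  { d.running = false; return nil }
func (d *demoPlugin) Healthy() bool                   { return d.running }
func (d *demoPlugin) Migrations() []string            { return nil }
func (d *demoPlugin) RegisterRoutes(mux *http.ServeMux) {
	// Hypothetical path, loosely modeled on /api/model_catalog/v1alpha1/.
	mux.HandleFunc("/api/demo_catalog/v1alpha1/ping", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, "pong")
	})
}

// In a real plugin this init() would live in the plugin's own package,
// and the server would pull it in with a blank import.
func init() { Register(&demoPlugin{}) }

// runDemo starts every registered plugin against a fresh mux.
func runDemo() error {
	return StartAll(context.Background(), http.NewServeMux())
}

func main() {
	if err := runDemo(); err != nil {
		panic(err)
	}
	for _, name := range Names() {
		fmt.Println(name, "healthy:", registry[name].Healthy())
	}
}
```

With this shape, the server's main.go would only need a blank import of each plugin package; the package's init() does the rest.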

Model Catalog as a Plugin

  • Existing model catalog wrapped into catalog/plugins/model/ with no API changes
  • All paths preserved under /api/model_catalog/v1alpha1/
  • Legacy infrastructure (loader, providers, DB service) reused as-is
  • No client-side changes required

Generic Catalog Framework

  • Type-parameterized building blocks: Loader[E, A], ProviderRegistry[E, A], ProviderFunc[E, A]
  • Shared source configuration, hot-reload, and file watching
  • filterQuery parameter across all list endpoints — SQL-like syntax with comparison, pattern matching, set membership, and logical operators
  • Consistent pagination, ordering, and response envelopes
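
One way the type-parameterized building blocks could fit together is sketched below. Only the names Loader[E, A], ProviderRegistry[E, A], and ProviderFunc[E, A] come from the description; reading E as the entity type and A as the artifact type is an assumption, and the Source, Record, McpServer, and Tool types, along with all method signatures, are hypothetical.

```go
package main

import "fmt"

// Source is a hypothetical, simplified entry from sources.yaml.
type Source struct {
	ID   string
	Type string
}

// Record pairs one entity with its artifacts (assumed meaning of E and A).
type Record[E, A any] struct {
	Entity    E
	Artifacts []A
}

// ProviderFunc loads records from a single source.
type ProviderFunc[E, A any] func(src Source) ([]Record[E, A], error)

// ProviderRegistry maps a source type to the function that can read it.
type ProviderRegistry[E, A any] struct {
	providers map[string]ProviderFunc[E, A]
}

func NewProviderRegistry[E, A any]() *ProviderRegistry[E, A] {
	return &ProviderRegistry[E, A]{providers: map[string]ProviderFunc[E, A]{}}
}

func (r *ProviderRegistry[E, A]) Register(sourceType string, fn ProviderFunc[E, A]) {
	r.providers[sourceType] = fn
}

// Loader walks the configured sources and dispatches to the right provider.
type Loader[E, A any] struct {
	registry *ProviderRegistry[E, A]
}

func (l *Loader[E, A]) Load(sources []Source) ([]Record[E, A], error) {
	var out []Record[E, A]
	for _, src := range sources {
		fn, ok := l.registry.providers[src.Type]
		if !ok {
			return nil, fmt.Errorf("source %q: no provider for type %q", src.ID, src.Type)
		}
		recs, err := fn(src)
		if err != nil {
			return nil, fmt.Errorf("source %q: %w", src.ID, err)
		}
		out = append(out, recs...)
	}
	return out, nil
}

// McpServer and Tool are placeholder entity and artifact types.
type McpServer struct{ Name string }
type Tool struct{ Name string }

// staticProvider returns one hard-coded record; a real provider would
// read a file or call a remote API.
func staticProvider(src Source) ([]Record[McpServer, Tool], error) {
	return []Record[McpServer, Tool]{{
		Entity:    McpServer{Name: "example"},
		Artifacts: []Tool{{Name: "search"}},
	}}, nil
}

func main() {
	reg := NewProviderRegistry[McpServer, Tool]()
	reg.Register("static", staticProvider)
	loader := &Loader[McpServer, Tool]{registry: reg}

	recs, err := loader.Load([]Source{{ID: "local", Type: "static"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(len(recs), recs[0].Entity.Name)
}
```

Because the loader is generic over both types, each plugin can reuse the same dispatch logic and supply only its own entity, artifact, and provider functions.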

Plugin API Schema Strategy

  • Each plugin owns its OpenAPI spec under catalog/plugins/<name>/api/openapi/
  • Shared base schemas (BaseResource, BaseResourceList, MetadataValue, etc.) in api/openapi/src/lib/common.yaml
  • Plugin schemas reference shared schemas via symlink — single source of truth
  • Automated merge process (scripts/merge_catalog_specs.sh) combines all plugin specs into a unified catalog-spec.yaml with schema prefixing, path absolutization, and conflict avoidance
  • CI validation via make openapi/validate
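
The schema-prefixing step of the merge can be illustrated in Go, although the PR implements the merge as a shell script whose details are not shown here. The sketch below assumes schemas are already decoded into generic maps, that plugin-owned schemas get a plugin prefix, and that shared base schemas keep their names; prefixSchemas, rewriteRefs, and the "Mcp" prefix are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

const refPrefix = "#/components/schemas/"

// prefixSchemas returns a copy of schemas in which every plugin-owned
// schema is renamed to prefix+name and local $ref values are rewritten
// to match. Names listed in shared are left untouched.
func prefixSchemas(prefix string, shared map[string]bool, schemas map[string]any) map[string]any {
	out := make(map[string]any, len(schemas))
	for name, body := range schemas {
		newName := name
		if !shared[name] {
			newName = prefix + name
		}
		out[newName] = rewriteRefs(prefix, shared, body)
	}
	return out
}

// rewriteRefs walks a decoded schema and prefixes local $ref targets.
func rewriteRefs(prefix string, shared map[string]bool, node any) any {
	switch v := node.(type) {
	case map[string]any:
		m := make(map[string]any, len(v))
		for k, child := range v {
			if s, ok := child.(string); ok && k == "$ref" && strings.HasPrefix(s, refPrefix) {
				target := strings.TrimPrefix(s, refPrefix)
				if !shared[target] {
					target = prefix + target
				}
				m[k] = refPrefix + target
				continue
			}
			m[k] = rewriteRefs(prefix, shared, child)
		}
		return m
	case []any:
		s := make([]any, len(v))
		for i, child := range v {
			s[i] = rewriteRefs(prefix, shared, child)
		}
		return s
	default:
		return v
	}
}

func main() {
	shared := map[string]bool{"BaseResource": true}
	schemas := map[string]any{
		"Server": map[string]any{
			"allOf": []any{
				map[string]any{"$ref": refPrefix + "BaseResource"},
				map[string]any{"$ref": refPrefix + "Tool"},
			},
		},
		"Tool": map[string]any{"type": "object"},
	}
	merged := prefixSchemas("Mcp", shared, schemas)
	fmt.Println(merged["McpServer"])
}
```

Prefixing plugin-owned names while leaving shared ones alone is one way to get conflict avoidance without duplicating the base schemas.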

catalog-gen — Deterministic Scaffolding

  • CLI tool inspired by kubebuilder: catalog-gen init <name> --entity=<Entity> --package=<pkg>
  • Generates complete plugin scaffold from a declarative catalog.yaml: models, repositories, OpenAPI specs, providers, filter mappings, Makefile
  • catalog-gen generate regenerates non-editable files after schema changes; editable files (service impl, providers) are never overwritten
  • Additional commands: add-property, add-artifact, add-artifact-property, gen-testdata
  • Deterministic output: same input always produces same output
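
Two of the properties above, deterministic output and never overwriting editable files, can be sketched as follows. This is not catalog-gen's implementation: the template text, the genFile type, and the render, writeFile, and writeTwice helpers are all hypothetical, and the sketch assumes determinism is achieved by sorting map keys before rendering.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
	"text/template"
)

type prop struct{ Name, Type string }

// modelTmpl is a toy stand-in for one of the generator's Go templates.
const modelTmpl = "type {{.Entity}} struct {\n{{range .Props}}\t{{.Name}} {{.Type}}\n{{end}}}\n"

// render sorts property names first so Go's randomized map iteration
// order can never leak into the generated text.
func render(entity string, props map[string]string) (string, error) {
	names := make([]string, 0, len(props))
	for n := range props {
		names = append(names, n)
	}
	sort.Strings(names)
	sorted := make([]prop, 0, len(names))
	for _, n := range names {
		sorted = append(sorted, prop{Name: n, Type: props[n]})
	}
	t, err := template.New("model").Parse(modelTmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	data := struct {
		Entity string
		Props  []prop
	}{entity, sorted}
	if err := t.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// genFile describes one output file; Editable files are written once
// and then left alone on later runs.
type genFile struct {
	Path     string
	Editable bool
}

// writeFile reports whether it wrote the file. An existing editable
// file is skipped rather than overwritten.
func writeFile(dir string, f genFile, content string) (bool, error) {
	path := filepath.Join(dir, f.Path)
	if f.Editable {
		if _, err := os.Stat(path); err == nil {
			return false, nil
		} else if !errors.Is(err, fs.ErrNotExist) {
			return false, err
		}
	}
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return false, err
	}
	return true, os.WriteFile(path, []byte(content), 0o644)
}

// writeTwice writes the same editable file twice into a temp directory
// and reports whether each attempt actually wrote.
func writeTwice() (first, second bool, err error) {
	dir, err := os.MkdirTemp("", "catalog-gen-sketch")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	f := genFile{Path: "internal/service_impl.go", Editable: true}
	if first, err = writeFile(dir, f, "// first run\n"); err != nil {
		return first, false, err
	}
	second, err = writeFile(dir, f, "// second run\n")
	return first, second, err
}

func main() {
	out, err := render("McpServer", map[string]string{"Transport": "string", "Name": "string"})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
	first, second, err := writeTwice()
	fmt.Println(first, second, err)
}
```

Skipping existing editable files is what lets a regenerate step run safely after hand-written service or provider code has been added.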

Agentic Workflows

  • catalog-gen generates .claude/commands/ and .claude/skills/ per plugin
  • Slash commands: /add-property, /add-artifact, /regenerate, /fix-build, /gen-testdata
  • CLAUDE.md generated per plugin with architecture summary, type mappings, and workflow guide
  • Separates deterministic work (schema → boilerplate) from judgment-dependent work (business logic, providers, tests)

Project Structure

  cmd/
  ├── catalog-server/          # Unified server entry point
  │   └── main.go              # Plugin imports, config loading, server startup
  └── catalog-gen/             # Code generation CLI
      ├── main.go              # CLI entry point (init, generate, add-*)
      ├── gen_plugin.go        # Plugin scaffold generation
      ├── gen_api.go           # OpenAPI generation
      ├── gen_models.go        # Entity & artifact model generation
      ├── gen_service.go       # Repository & spec generation
      ├── gen_providers.go     # Provider generation
      └── templates/           # 45+ Go templates organized by domain

  pkg/catalog/
  ├── plugin/
  │   ├── plugin.go            # CatalogPlugin interface
  │   ├── registry.go          # Plugin registry (init()-based)
  │   ├── server.go            # Unified server orchestration
  │   └── config.go            # Configuration types (sources.yaml)
  ├── loader.go                # Generic Loader[E, A]
  ├── provider.go              # Generic ProviderRegistry[E, A]
  ├── source.go                # Source definitions & filtering
  └── watcher.go               # File watching for hot-reload

  catalog/plugins/
  ├── model/                   # Model catalog plugin (wraps legacy)
  │   ├── plugin.go
  │   └── register.go
  └── mcp/                     # MCP plugin (generated by catalog-gen)
      ├── catalog.yaml          # Plugin schema definition
      ├── plugin.go             # Generated plugin lifecycle
      ├── internal/             # Models, repositories, providers, API handlers
      └── api/openapi/          # Plugin OpenAPI spec

  api/openapi/
  ├── src/lib/common.yaml      # Shared base schemas (BaseResource, etc.)
  ├── catalog.yaml              # Main catalog spec (model paths)
  └── catalog-spec.yaml         # Merged unified spec (all plugins)

  scripts/
  ├── merge_openapi.sh          # Merges OpenAPI source files
  └── merge_catalog_specs.sh    # Merges plugin specs into unified spec

How Has This Been Tested?

Merge criteria:

  • All commits have been signed off (to pass the DCO check)
  • The commits have meaningful messages
  • Automated tests are provided as part of the PR for major new functionality; testing instructions have been added to the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that they work.
  • Code changes follow the Kubeflow contribution guidelines.
  • For first-time contributors: please reach out to the Reviewers to ensure all tests are being run and that the ok-to-test label has been added to the PR.

Signed-off-by: Alessio Pragliola <seth.pro@gmail.com>
@google-oss-prow

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from al-pragliola. For more information see the Kubernetes Code Review Process.

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Al-Pragliola force-pushed the al-pragliola-poc-catalog-gen-v2 branch from 486c71e to f984f2a on February 11, 2026 17:22