Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions .claude/settings.local.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
{
"permissions": {
"allow": [
"Bash(chmod:*)"
],
"deny": [],
"ask": []
}
}
296 changes: 296 additions & 0 deletions LLMD_INTEGRATION_SUMMARY.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,296 @@
# LLM-D Integration Summary

This document provides a summary of the llm-d controller integration into the opendatahub-operator.

## What Was Implemented

A complete POC integration of llm-d as a component in the OpenDataHub operator using an operator SDK + Helm-based approach.

## Components Created

### 1. API Types
**Location**: `api/components/v1alpha1/llmd_types.go`

Defines the CRD for llm-d component including:
- `Llmd`: Main CRD
- `LlmdSpec`: Configuration for three Helm charts (ModelService, Infra, Gateway API)
- `LlmdStatus`: Status information
- `DSCLlmd`: Integration with DataScienceCluster

### 2. Component Handler
**Location**: `internal/controller/components/llmd/`

Files created:
- `llmd.go`: Component handler implementing registry interface
- `llmd_controller.go`: Reconciliation controller with resource watching
- `llmd_controller_actions.go`: Custom reconciliation actions
- `llmd_support.go`: Helper functions for manifest paths
- `monitoring/llmd-prometheusrules.tmpl.yaml`: Prometheus monitoring rules

### 3. Helm Chart Integration
**Location**: `hack/fetch-helm-charts.sh`

Script that:
- Downloads three Helm charts:
- llm-d-modelservice (v0.2.11)
- llm-d-infra (v1.3.3)
- Gateway API Inference Extension (v1.0.1)
- Templates them with default values
- Creates Kustomize-compatible manifests in `opt/manifests/llmd/`

### 4. Integration Points

#### DataScienceCluster Integration
**Modified**: `api/datasciencecluster/v2/datasciencecluster_types.go`
- Added `Llmd` field to `Components` struct
- Added `Llmd` field to `ComponentsStatus` struct

#### Main Controller Registration
**Modified**: `cmd/main.go`
- Added import for llmd controller

#### Project Configuration
**Modified**: `PROJECT`
- Added Llmd resource definition

#### Manifest Fetching
**Modified**: `get_all_manifests.sh`
- Added call to `hack/fetch-helm-charts.sh`

### 5. Documentation & Examples

Created:
- `docs/llmd-integration.md`: Comprehensive integration guide
- `internal/controller/components/llmd/README.md`: Component-specific README
- `config/samples/components_v1alpha1_llmd.yaml`: Example Llmd CR
- `config/samples/datasciencecluster_v2_llmd_example.yaml`: Example DSC with llm-d

## How It Works

### Build-Time
1. `make get-manifests` runs `hack/fetch-helm-charts.sh`
2. Script downloads and templates Helm charts
3. Manifests are stored in `opt/manifests/llmd/`
4. Operator image includes these pre-rendered manifests

### Runtime
1. User creates DataScienceCluster or Llmd CR
2. Component handler creates/updates Llmd CR
3. Controller reconciles Llmd CR
4. Deploys manifests from `opt/manifests/llmd/overlays/default`
5. Updates status in Llmd and DataScienceCluster

## Deployment Example

### Using DataScienceCluster

```yaml
apiVersion: datasciencecluster.opendatahub.io/v2
kind: DataScienceCluster
metadata:
name: default-dsc
spec:
components:
llmd:
managementState: Managed
modelService:
enabled: true
version: "0.2.11"
infra:
enabled: true
version: "1.3.3"
gatewayAPI:
enabled: true
version: "1.0.1"
```

### Using Llmd CR Directly

```yaml
apiVersion: components.platform.opendatahub.io/v1alpha1
kind: Llmd
metadata:
name: default-llm-d
spec:
modelService:
enabled: true
version: "0.2.11"
infra:
enabled: true
version: "1.3.3"
gatewayAPI:
enabled: true
version: "1.0.1"
```

## Testing the Integration

### 1. Generate Manifests
```bash
make get-manifests
```

This will:
- Fetch all component manifests
- Download and template llm-d Helm charts

### 2. Generate CRDs
```bash
make manifests
```

This generates the Kubernetes CRDs including the new Llmd CRD.

### 3. Build Operator
```bash
make image IMG=quay.io/<username>/opendatahub-operator:<tag>
```

### 4. Deploy Operator
```bash
make deploy IMG=quay.io/<username>/opendatahub-operator:<tag>
```

### 5. Create Test Instance
```bash
kubectl apply -f config/samples/datasciencecluster_v2_llmd_example.yaml
```

### 6. Verify Deployment
```bash
# Check Llmd CR status
kubectl get llmd default-llm-d -o yaml

# Check DataScienceCluster status
kubectl get datasciencecluster default-dsc -o jsonpath='{.status.components.llmd}'

# Check deployed resources
kubectl get all -n llm-d -l app.kubernetes.io/part-of=llm-d
```

## Files Modified/Created

### Created Files (22 total)
1. `api/components/v1alpha1/llmd_types.go`
2. `internal/controller/components/llmd/llmd.go`
3. `internal/controller/components/llmd/llmd_controller.go`
4. `internal/controller/components/llmd/llmd_controller_actions.go`
5. `internal/controller/components/llmd/llmd_support.go`
6. `internal/controller/components/llmd/monitoring/llmd-prometheusrules.tmpl.yaml`
7. `internal/controller/components/llmd/README.md`
8. `hack/fetch-helm-charts.sh`
9. `config/samples/components_v1alpha1_llmd.yaml`
10. `config/samples/datasciencecluster_v2_llmd_example.yaml`
11. `docs/llmd-integration.md`
12. `LLMD_INTEGRATION_SUMMARY.md` (this file)

### Modified Files (4 total)
1. `api/datasciencecluster/v2/datasciencecluster_types.go`
2. `cmd/main.go`
3. `PROJECT`
4. `get_all_manifests.sh`

## Architecture Diagram

```
┌─────────────────────────────────────────────────────────────┐
│ DataScienceCluster CR │
│ spec.components.llmd.managementState: Managed │
└─────────────────────┬───────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Component Handler (llmd.go) │
│ - Registered in component registry │
│ - Creates/Updates Llmd CR from DSC spec │
└─────────────────────┬───────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Llmd CR (CRD) │
│ apiVersion: components.platform.opendatahub.io/v1alpha1 │
│ kind: Llmd │
└─────────────────────┬───────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Controller (llmd_controller.go) │
│ - Watches Llmd CR │
│ - Renders Kustomize manifests │
│ - Deploys resources to cluster │
│ - Updates status │
└─────────────────────┬───────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Deployed Resources in llm-d namespace │
│ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ ModelService │ │ Infra │ │
│ │ (v0.2.11) │ │ (v1.3.3) │ │
│ └──────────────────┘ └──────────────────┘ │
│ │
│ ┌──────────────────────────────────────────┐ │
│ │ Gateway API Inference Extension │ │
│ │ (v1.0.1) │ │
│ └──────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
```

## Next Steps

To use this integration:

1. **Build Prerequisites**:
- Install Helm CLI on build machine
- Ensure network access to Helm chart repositories

2. **Generate Manifests**:
```bash
make get-manifests
```

3. **Generate CRDs**:
```bash
make manifests
```

4. **Build & Deploy**:
```bash
make image-build
make deploy
```

5. **Create Instance**:
```bash
kubectl apply -f config/samples/datasciencecluster_v2_llmd_example.yaml
```

## Limitations & Future Work

### Current Limitations
1. Helm charts are templated at build time, not runtime
2. Limited support for custom Helm values override
3. Fixed namespace (`llm-d`)
4. Chart version changes require operator rebuild

### Recommended Enhancements
1. **Dynamic Helm Integration**: Use Helm SDK to deploy charts at runtime
2. **Full Values Support**: Allow complete Helm values.yaml override
3. **Namespace Configuration**: Make namespace configurable
4. **Chart Upgrade Logic**: Implement proper Helm upgrade/rollback
5. **Chart Repository Config**: Make chart URLs and versions configurable

## References

- **LLM-D ModelService Chart**: https://llm-d-incubation.github.io/llm-d-modelservice/
- **LLM-D Infra Chart**: https://llm-d-incubation.github.io/llm-d-infra/
- **Gateway API Inference Extension**: https://github.com/kubernetes-sigs/gateway-api-inference-extension
- **OpenDataHub Operator**: https://github.com/opendatahub-io/opendatahub-operator

## Questions & Support

For questions or issues with this integration, refer to:
- `docs/llmd-integration.md` - Full integration guide
- `internal/controller/components/llmd/README.md` - Component README
- OpenDataHub documentation - https://opendatahub.io/
8 changes: 8 additions & 0 deletions PROJECT
Original file line number Diff line number Diff line change
Expand Up @@ -144,6 +144,14 @@ resources:
kind: LlamaStackOperator
path: github.com/opendatahub-io/opendatahub-operator/v2/api/components/v1alpha1
version: v1alpha1
- api:
crdVersion: v1alpha1
controller: true
domain: platform.opendatahub.io
group: components
kind: Llmd
path: github.com/opendatahub-io/opendatahub-operator/v2/api/components/v1alpha1
version: v1alpha1
- api:
crdVersion: v1alpha1
controller: true
Expand Down
Loading