70 changes: 70 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,70 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

> **Context Optimization**: This file is structured for efficient agent usage. The "Agent Routing" section defines what context each agent needs. When spawning subagents, pass only relevant sections—not the entire file. Sections marked `<!-- reference -->` are lookup tables; don't include them in agent prompts unless specifically needed.


## Agent Routing

**MANDATORY: All implementation work MUST be performed by subagents.** Never directly edit code, configuration, or documentation in the parent conversation. Instead, always delegate to the appropriate specialized agent from the table below. The parent conversation should only coordinate agents, pass context between them, and communicate results to the user.

Do NOT ask the user which agent to use - pick the appropriate one based on what files or features are being modified.

| Task Type | Agent | When to Use |
|-----------|-------|-------------|
| UI/Frontend | `datum-platform:frontend-dev` | React, TypeScript, CSS, anything in `ui/` directory |
| Go Backend | `datum-platform:api-dev` | Go code in `cmd/`, `internal/`, `pkg/` directories |
| Infrastructure | `datum-platform:sre` | Kustomize, Dockerfile, CI/CD, `config/` directory, `.infra/` for deployment |
| Tests | `datum-platform:test-engineer` | Writing or fixing Go tests |
| Code Review | `datum-platform:code-reviewer` | After implementation, before committing |
| Documentation | `datum-platform:tech-writer` | README, docs/, guides, API documentation |
| Architecture | `Plan` | Designing new features or significant refactors |
| Exploration | `Explore` | Understanding codebase structure or finding code |

**Key principles:**
- **Always use subagents** — never write code, edit files, or run build/test commands directly in the parent conversation
- Use agents proactively without being asked
- For multi-step tasks, use the appropriate agent for each step (launch independent agents in parallel when possible)
- After making code changes, always use `code-reviewer` to validate
- For UI changes, run `npm run build` and `npm run test:e2e` to verify
- **Always test infrastructure changes in a test environment before opening a PR** - Deploy to the test-infra KIND cluster (`task test-infra:cluster-up`) and verify resources work correctly before pushing changes to staging/production repos
- **Use Telepresence for debugging staging issues** - When investigating bugs that only reproduce in staging, intercept the service and run it locally with `task test-infra:telepresence:intercept SERVICE=<name>`. See the "Remote Debugging with Telepresence" section.

### Agent Context Requirements

Each agent only needs specific context. When spawning agents, pass minimal relevant info in prompts—don't repeat the entire CLAUDE.md:

| Agent | Required Context | Skip (don't include in prompt) |
|-------|-----------------|--------------------------------|
| `frontend-dev` | UI commands, file paths in `ui/` | Go architecture, ClickHouse, NATS, data pipeline |
| `api-dev` | Go patterns, API resource types, key directories | UI commands, dev environment setup, migrations |
| `sre` | Config structure, build commands, deployment | Code architecture details, CEL patterns |
| `test-engineer` | Test commands, package being tested | Full architecture, deployment, UI |
| `Explore` | Key directories, architecture overview | Build commands, dev setup, deployment |
| `code-reviewer` | Architecture, multi-tenancy model, conventions | Dev environment, build commands |
| `tech-writer` | API resources, architecture overview | Implementation details, build commands |

### Agent Output Guidelines

Agents should return **concise summaries** to minimize context bloat in the parent conversation:

| Agent | Return | Don't Return |
|-------|--------|--------------|
| `Explore` | File paths + 1-line descriptions | Full file contents, extensive code quotes |
| `api-dev` | What was changed + file paths | Full diffs, unchanged code |
| `frontend-dev` | Components modified + any build errors | Full file contents |
| `code-reviewer` | Numbered findings list with file:line refs | Full code blocks for context |
| `test-engineer` | Pass/fail summary + failure messages only | Full test output, passing test details |
| `sre` | Changed manifests + deployment notes | Full YAML contents |

### Multi-Step Task Decomposition

For complex tasks, decompose to minimize per-agent context:

1. **Explore first** (use `model: "haiku"`): Find relevant files → return only paths
2. **Plan if needed**: Design approach → return bullet points only
3. **Implement** (sonnet): Work on specific files identified in step 1
4. **Review**: Check only the changed files

**Critical**: Pass only what's needed between steps. Don't re-explore what's already known.
9 changes: 8 additions & 1 deletion cmd/search/indexer/command.go
@@ -46,6 +46,9 @@ type ResourceIndexerOptions struct {
BatchSize int
FlushInterval time.Duration
BatchMaxConcurrentUploads int

// Multi-tenancy settings.
EnableMultiTenancy bool
}

// NewResourceIndexerOptions creates a new ResourceIndexerOptions with default values.
@@ -65,6 +68,7 @@ func NewResourceIndexerOptions() *ResourceIndexerOptions {
MeilisearchMaxRetries: 3,
MeilisearchRetryDelay: 500 * time.Millisecond,
BatchMaxConcurrentUploads: 100,
EnableMultiTenancy: false,
}
}

@@ -89,6 +93,9 @@ func (o *ResourceIndexerOptions) AddFlags(fs *pflag.FlagSet) {
fs.IntVar(&o.MeilisearchMaxRetries, "meilisearch-max-retries", o.MeilisearchMaxRetries, "The maximum number of retries for transient Meilisearch errors.")
fs.DurationVar(&o.MeilisearchRetryDelay, "meilisearch-retry-delay", o.MeilisearchRetryDelay, "The base delay between Meilisearch retries.")
fs.IntVar(&o.BatchMaxConcurrentUploads, "batch-max-concurrent-uploads", o.BatchMaxConcurrentUploads, "The maximum number of concurrent uploads to Meilisearch.")

// Multi-tenancy
fs.BoolVar(&o.EnableMultiTenancy, "enable-multi-tenancy", o.EnableMultiTenancy, "Enable multi-tenant mode to index resources from all project control planes.")
}

// Validate checks if the resource indexer options are valid.
@@ -295,7 +302,7 @@ func Run(o *ResourceIndexerOptions, ctx context.Context) error {
auditBatcher.Start(ctx)
reindexBatcher.Start(ctx)

- auditIdx := indexer.NewIndexer(auditConsumer, indexPolicyCache, auditBatcher)
+ auditIdx := indexer.NewIndexer(auditConsumer, indexPolicyCache, auditBatcher, o.EnableMultiTenancy)
reindexIdx := indexer.NewReindexConsumer(reindexJSConsumer, reindexPolicyCache, reindexBatcher)

klog.Info("Starting audit indexer and re-index consumer...")
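The deployment manifests in this change pass the new flag as `--enable-multi-tenancy=$(ENABLE_MULTI_TENANCY)`, so the value always arrives in `--flag=value` form. The sketch below illustrates that registration pattern with the standard library `flag` package as a stand-in for `pflag` (an assumption made so the example has no external dependencies); `indexerOptions` and `parseArgs` are hypothetical names, not part of this PR.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// indexerOptions is a hypothetical, trimmed-down stand-in for ResourceIndexerOptions.
type indexerOptions struct {
	EnableMultiTenancy bool
}

// parseArgs registers the flag with the struct's current value as its default,
// mirroring the AddFlags pattern in the diff, and then parses args.
func parseArgs(args []string) (*indexerOptions, error) {
	o := &indexerOptions{EnableMultiTenancy: false}
	fs := flag.NewFlagSet("resource-indexer", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the example output
	fs.BoolVar(&o.EnableMultiTenancy, "enable-multi-tenancy", o.EnableMultiTenancy,
		"Enable multi-tenant mode to index resources from all project control planes.")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return o, nil
}

func main() {
	cases := [][]string{
		{},                               // flag omitted: the default applies
		{"--enable-multi-tenancy=true"},  // value set by the multi-tenant overlay
		{"--enable-multi-tenancy=false"}, // value set by the base manifest
		{"--enable-multi-tenancy="},      // empty substitution: not a valid bool
	}
	for _, args := range cases {
		o, err := parseArgs(args)
		if err != nil {
			fmt.Printf("%q: error: %v\n", args, err)
			continue
		}
		fmt.Printf("%q: EnableMultiTenancy=%v\n", args, o.EnableMultiTenancy)
	}
}
```

In this stand-in, an empty value is rejected at parse time, which is one reason to keep an explicit `"false"` default for `ENABLE_MULTI_TENANCY` in the base manifest rather than an empty string.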
49 changes: 49 additions & 0 deletions cmd/search/manager/command.go
@@ -3,6 +3,7 @@ package manager
import (
"context"
"crypto/tls"
"errors"
"fmt"
"os"
"time"
@@ -12,6 +13,7 @@ import (
"github.com/spf13/cobra"
"github.com/spf13/pflag"
"go.miloapis.net/search/internal/indexer"
"go.miloapis.net/search/internal/tenant"
"go.miloapis.net/search/pkg/apis/search/install"
"k8s.io/apimachinery/pkg/runtime"
utilruntime "k8s.io/apimachinery/pkg/util/runtime"
@@ -59,6 +61,10 @@ type ControllerManagerOptions struct {
NatsTLSCA string
NatsTLSCert string
NatsTLSKey string

// Multi-tenancy settings.
EnableMultiTenancy bool
ProjectLabelSelector string
}

// NewControllerManagerOptions creates a new ControllerManagerOptions with default values
@@ -77,6 +83,7 @@ func NewControllerManagerOptions() *ControllerManagerOptions {
MeilisearchDomain: "http://meilisearch.meilisearch-system.svc.cluster.local:7700",
NatsURL: "nats://nats.nats-system.svc.cluster.local:4222",
NatsReindexSubject: "reindex.all",
EnableMultiTenancy: false,
}
}

@@ -107,6 +114,10 @@ func (o *ControllerManagerOptions) AddFlags(fs *pflag.FlagSet) {
fs.StringVar(&o.NatsTLSCA, "nats-tls-ca", o.NatsTLSCA, "The path to the NATS TLS CA file.")
fs.StringVar(&o.NatsTLSCert, "nats-tls-cert", o.NatsTLSCert, "The path to the NATS TLS certificate file.")
fs.StringVar(&o.NatsTLSKey, "nats-tls-key", o.NatsTLSKey, "The path to the NATS TLS key file.")

// Multi-tenancy
fs.BoolVar(&o.EnableMultiTenancy, "enable-multi-tenancy", o.EnableMultiTenancy, "Enable multi-tenant mode to index resources from all project control planes.")
fs.StringVar(&o.ProjectLabelSelector, "project-label-selector", o.ProjectLabelSelector, "Label selector to filter which projects are indexed (empty = all projects).")
}

// Validate validates the options
@@ -243,6 +254,43 @@ func Run(o *ControllerManagerOptions, ctx context.Context) error {

reindexPub := indexer.NewReindexPublisher(js, o.NatsReindexSubject)

// Build TenantRegistry based on deployment mode.
var registry tenant.TenantRegistry
if o.EnableMultiTenancy {
// Create a PolicyCache backed by the manager's shared informer cache.
// requireReadyCondition=true ensures only fully-initialized policies
// (index created, attributes synced) are included in the cache.
policyCache, err := indexer.NewPolicyCache(mgr.GetCache(), true)
if err != nil {
setupLog.Error(err, "unable to create policy cache")
os.Exit(1)
}
if err := policyCache.RegisterHandlers(ctx); err != nil {
setupLog.Error(err, "unable to register policy cache handlers")
os.Exit(1)
}

// ProjectWatcher handles tenant lifecycle: on disengagement it purges all
// tenant documents from each index.
projectWatcher := tenant.NewProjectWatcher(policyCache, searchSDK)

multiRegistry := tenant.NewMultiTenantRegistry(
cfg,
dynamicClient,
o.ProjectLabelSelector,
projectWatcher.OnTenantEngaged,
projectWatcher.OnTenantDisengaged,
)
go func() {
if err := multiRegistry.Run(ctx); err != nil && !errors.Is(err, context.Canceled) {
setupLog.Error(err, "MultiTenantRegistry stopped unexpectedly")
}
}()
registry = multiRegistry
} else {
registry = tenant.NewSingleTenantRegistry(dynamicClient)
}

if err = (&policycontroller.ResourceIndexPolicyReconciler{
Client: mgr.GetClient(),
Scheme: mgr.GetScheme(),
@@ -251,6 +299,7 @@ func Run(o *ControllerManagerOptions, ctx context.Context) error {
DynamicClient: dynamicClient,
RESTMapper: mgr.GetRESTMapper(),
ReindexPublisher: reindexPub,
TenantRegistry: registry,
}).SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "ResourceIndexPolicy")
os.Exit(1)
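The `Run` changes above choose a registry implementation by deployment mode and wire lifecycle callbacks into the multi-tenant one. The following is a minimal sketch of that shape only. The `TenantRegistry` interface here is a reduced, hypothetical version (the real `tenant.TenantRegistry` is not shown in this diff), and `Engage`, `Disengage`, and `newRegistry` are invented for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// TenantRegistry is a hypothetical, reduced interface used only for this sketch.
type TenantRegistry interface {
	Tenants() []string
}

// singleTenantRegistry always reports one implicit tenant.
type singleTenantRegistry struct{}

func (singleTenantRegistry) Tenants() []string { return []string{"local"} }

// multiTenantRegistry tracks engaged projects and fires lifecycle callbacks.
// It is not safe for concurrent use; a real registry would need locking.
type multiTenantRegistry struct {
	active       map[string]bool
	onEngaged    func(project string)
	onDisengaged func(project string)
}

// Tenants returns the engaged project names in sorted order.
func (r *multiTenantRegistry) Tenants() []string {
	names := make([]string, 0, len(r.active))
	for name := range r.active {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// Engage records a project once and notifies the engaged callback.
func (r *multiTenantRegistry) Engage(project string) {
	if r.active[project] {
		return
	}
	r.active[project] = true
	if r.onEngaged != nil {
		r.onEngaged(project)
	}
}

// Disengage forgets a known project and notifies the disengaged callback.
func (r *multiTenantRegistry) Disengage(project string) {
	if !r.active[project] {
		return
	}
	delete(r.active, project)
	if r.onDisengaged != nil {
		r.onDisengaged(project)
	}
}

// newRegistry mirrors the if/else on EnableMultiTenancy in Run.
func newRegistry(multiTenant bool, onEngaged, onDisengaged func(string)) TenantRegistry {
	if !multiTenant {
		return singleTenantRegistry{}
	}
	return &multiTenantRegistry{
		active:       map[string]bool{},
		onEngaged:    onEngaged,
		onDisengaged: onDisengaged,
	}
}

func main() {
	var purged []string
	reg := newRegistry(true,
		func(p string) { fmt.Println("engaged:", p) },
		func(p string) { purged = append(purged, p) }, // stand-in for purging the tenant's documents
	)
	multi := reg.(*multiTenantRegistry)
	multi.Engage("project-a")
	multi.Engage("project-b")
	multi.Disengage("project-a")
	fmt.Println("active:", reg.Tenants(), "purged:", purged)
	fmt.Println("single-tenant:", newRegistry(false, nil, nil).Tenants())
}
```

Keeping both modes behind one interface means the reconciler only needs the `TenantRegistry` field and does not branch on the deployment mode itself.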
6 changes: 6 additions & 0 deletions config/base/controller-manager/deployment.yaml
@@ -48,6 +48,8 @@ spec:
- --nats-tls-cert=$(NATS_TLS_CERT)
- --nats-tls-key=$(NATS_TLS_KEY)
- --leader-elect-resource-namespace=$(LEADER_ELECT_RESOURCE_NAMESPACE)
- --enable-multi-tenancy=$(ENABLE_MULTI_TENANCY)
- --project-label-selector=$(PROJECT_LABEL_SELECTOR)
env:
- name: POD_NAMESPACE
valueFrom:
@@ -77,6 +79,10 @@ spec:
value: ""
- name: LEADER_ELECT_RESOURCE_NAMESPACE
value: ""
- name: ENABLE_MULTI_TENANCY
value: "false"
- name: PROJECT_LABEL_SELECTOR
value: ""
- name: MEILISEARCH_API_KEY
valueFrom:
secretKeyRef:
3 changes: 3 additions & 0 deletions config/base/resource-indexer/deployment.yaml
@@ -28,6 +28,7 @@ spec:
- --nats-tls-cert=$(NATS_TLS_CERT)
- --nats-tls-key=$(NATS_TLS_KEY)
- --meilisearch-domain=$(MEILISEARCH_DOMAIN)
- --enable-multi-tenancy=$(ENABLE_MULTI_TENANCY)
env:
- name: NATS_URL
value: "nats://nats.nats-system.svc.cluster.local:4222"
@@ -47,6 +48,8 @@ spec:
value: "AUDIT_EVENTS"
- name: MEILISEARCH_DOMAIN
value: "http://meilisearch.meilisearch-system.svc.cluster.local:7700"
- name: ENABLE_MULTI_TENANCY
value: "false"
- name: MEILISEARCH_API_KEY
valueFrom:
secretKeyRef:
Original file line number Diff line number Diff line change
@@ -7,3 +7,10 @@ spec:
spec:
serviceAccountName: search-controller-manager
automountServiceAccountToken: true
containers:
- name: manager
env:
- name: ENABLE_MULTI_TENANCY
value: "true"
- name: PROJECT_LABEL_SELECTOR
value: ""
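The overlay above leaves `PROJECT_LABEL_SELECTOR` empty, which the flag help text describes as "all projects". How the registry applies the selector is not shown in this diff, so the sketch below only illustrates those semantics with a deliberately simplified matcher. `matchesSelector` is a hypothetical helper that supports comma-separated `key=value` pairs and nothing else; it is not the Kubernetes label selector parser.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesSelector reports whether labels satisfy a simplified selector made of
// comma-separated key=value requirements. An empty selector matches everything.
// Set-based and inequality operators are intentionally unsupported here.
func matchesSelector(selector string, labels map[string]string) (bool, error) {
	selector = strings.TrimSpace(selector)
	if selector == "" {
		return true, nil
	}
	for _, req := range strings.Split(selector, ",") {
		key, value, ok := strings.Cut(strings.TrimSpace(req), "=")
		if !ok || key == "" {
			return false, fmt.Errorf("unsupported requirement %q", req)
		}
		if got, present := labels[key]; !present || got != value {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	// The label keys and values below are made up for the example.
	project := map[string]string{"tier": "paid", "region": "us"}
	for _, sel := range []string{"", "tier=paid", "tier=paid,region=eu", "tier"} {
		ok, err := matchesSelector(sel, project)
		fmt.Printf("selector %q: match=%v err=%v\n", sel, ok, err)
	}
}
```

With an empty selector every project is indexed, so narrowing the set of indexed projects in a given environment would only require setting `PROJECT_LABEL_SELECTOR` in that environment's overlay.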
Original file line number Diff line number Diff line change
@@ -12,6 +12,14 @@ rules:
- get
- list
- watch
- apiGroups:
- resourcemanager.miloapis.com
resources:
- projects
verbs:
- get
- list
- watch
- apiGroups:
- search.miloapis.com
resources:
Original file line number Diff line number Diff line change
@@ -7,3 +7,8 @@ spec:
spec:
serviceAccountName: resource-indexer
automountServiceAccountToken: true
containers:
- name: indexer
env:
- name: ENABLE_MULTI_TENANCY
value: "true"
23 changes: 23 additions & 0 deletions config/samples/policy_v1alpha1_dnszone_index_policy.yaml
@@ -0,0 +1,23 @@
apiVersion: search.miloapis.com/v1alpha1
kind: ResourceIndexPolicy
metadata:
name: dnszone-index-policy
spec:
targetResource:
group: dns.networking.miloapis.com
version: v1alpha1
kind: DNSZone

conditions:
- name: has-name
expression: "metadata.name != ''"

fields:
- path: ".metadata.name"
searchable: true
- path: ".metadata.namespace"
searchable: true
- path: ".spec.domainName"
searchable: true
- path: ".spec.dnsZoneClassName"
searchable: true
29 changes: 29 additions & 0 deletions config/samples/policy_v1alpha1_domain_index_policy.yaml
@@ -0,0 +1,29 @@
apiVersion: search.miloapis.com/v1alpha1
kind: ResourceIndexPolicy
metadata:
name: domain-index-policy
spec:
targetResource:
group: networking.datumapis.com
version: v1alpha
kind: Domain

conditions:
- name: has-name
expression: "metadata.name != ''"

fields:
- path: ".metadata.name"
searchable: true
- path: ".metadata.namespace"
searchable: true
- path: ".spec.domainName"
searchable: true
- path: ".status.apex"
searchable: true
- path: ".status.nameservers[0].hostname"
searchable: true
- path: ".status.registration.registrar.name"
searchable: true
- path: ".status.registration.registry.name"
searchable: true
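The sample policies list field paths such as `.status.nameservers[0].hostname`. To make the path syntax concrete, here is a rough sketch of resolving such a path against a decoded resource. `lookup` is a hypothetical helper written for this illustration, it handles only dotted keys with an optional single `[n]` index per segment, and the object contents are made up. The indexer's actual extraction logic is not part of this diff.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// lookup walks obj along a path like ".status.nameservers[0].hostname".
// It returns false if any key is missing, an index is out of range,
// or an intermediate value has an unexpected type.
func lookup(obj map[string]any, path string) (any, bool) {
	var cur any = obj
	for _, seg := range strings.Split(strings.TrimPrefix(path, "."), ".") {
		name, index, hasIndex := seg, 0, false
		if open := strings.Index(seg, "["); open >= 0 && strings.HasSuffix(seg, "]") {
			n, err := strconv.Atoi(seg[open+1 : len(seg)-1])
			if err != nil {
				return nil, false
			}
			name, index, hasIndex = seg[:open], n, true
		}
		m, ok := cur.(map[string]any)
		if !ok {
			return nil, false
		}
		if cur, ok = m[name]; !ok {
			return nil, false
		}
		if hasIndex {
			list, ok := cur.([]any)
			if !ok || index < 0 || index >= len(list) {
				return nil, false
			}
			cur = list[index]
		}
	}
	return cur, true
}

func main() {
	// Made-up Domain object; only the shape matters for the example.
	raw := `{
	  "metadata": {"name": "example-com", "namespace": "default"},
	  "spec": {"domainName": "example.com"},
	  "status": {"nameservers": [{"hostname": "ns1.example.net"}]}
	}`
	var obj map[string]any
	if err := json.Unmarshal([]byte(raw), &obj); err != nil {
		panic(err)
	}
	for _, p := range []string{
		".metadata.name",
		".spec.domainName",
		".status.nameservers[0].hostname",
		".status.registration.registrar.name", // absent in this object
	} {
		v, ok := lookup(obj, p)
		fmt.Printf("%s -> %v (found=%v)\n", p, v, ok)
	}
}
```

Status paths can legitimately be absent, for example before a controller has populated `status`, so a resolver along these lines needs to treat a missing field as "skip" rather than as an error.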