feat: Add MLFlow integration #646
Conversation
Reviewer's Guide

Adds MLFlow integration by granting EvalHub service accounts the necessary Kubernetes RBAC via RoleBindings to the built-in "edit" ClusterRole, and wiring MLFlow-related environment variables and projected service account token/CA volumes into the EvalHub deployment.

Sequence diagram for MLFlow kubernetes-auth SubjectAccessReview with the new RoleBinding:

```mermaid
sequenceDiagram
    actor User
    participant EvalHubPod
    participant MlflowServer
    participant KubeApi
    participant RbacAuthz
    User->>EvalHubPod: Invoke MLFlow-tracked operation
    EvalHubPod->>EvalHubPod: Read MLFLOW_TOKEN_PATH
    EvalHubPod->>MlflowServer: HTTP request with SA token
    MlflowServer->>KubeApi: SubjectAccessReview for token in workspace namespace
    KubeApi->>RbacAuthz: Evaluate permissions using RoleBindings and ClusterRoles
    RbacAuthz->>RbacAuthz: Find RoleBinding evalhub-mlflow-proxy/jobs
    RbacAuthz->>RbacAuthz: Resolve ClusterRole edit
    RbacAuthz-->>KubeApi: Allow if edit grants requested verbs
    KubeApi-->>MlflowServer: SubjectAccessReview allowed
    MlflowServer-->>EvalHubPod: Request succeeds
    EvalHubPod-->>User: Operation completed
```
Flow diagram for createServiceAccount with MLFlow RoleBindings:

```mermaid
graph TD
    Start["Start createServiceAccount"] --> CreateSA["Create main ServiceAccount"]
    CreateSA --> CreateJobsSA["Create jobs ServiceAccount"]
    CreateJobsSA --> RBProxy["Create evalhub proxy RoleBinding (existing)"]
    RBProxy --> RBJobsProxy["Create jobs proxy RoleBinding (existing)"]
    RBJobsProxy --> RBMlflowProxy["createMLFlowAccessRoleBinding for main SA (suffix proxy)"]
    RBMlflowProxy --> RBMlflowJobs["createMLFlowAccessRoleBinding for jobs SA (suffix jobs)"]
    RBMlflowJobs --> End["Done"]
```
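The RoleBinding construction that the flow above describes can be sketched as follows. This is a minimal illustration using simplified stand-in types, not the operator's actual code, which builds `rbacv1.RoleBinding` objects through the controller-runtime client; the `"<instance>-mlflow-<suffix>"` name format is an assumption inferred from the diagram.

```go
package main

import "fmt"

// Simplified stand-ins for the rbacv1 types the operator would use.
type RoleRef struct{ APIGroup, Kind, Name string }
type Subject struct{ Kind, Name, Namespace string }
type RoleBinding struct {
	Name, Namespace string
	Subjects        []Subject
	RoleRef         RoleRef
}

// buildMLFlowAccessRoleBinding mirrors the naming visible in the diagrams
// ("evalhub-mlflow-proxy" / "evalhub-mlflow-jobs"): the suffix distinguishes
// the main and jobs service accounts, and the RoleRef points at a ClusterRole.
func buildMLFlowAccessRoleBinding(instance, namespace, saName, suffix, clusterRole string) RoleBinding {
	return RoleBinding{
		Name:      fmt.Sprintf("%s-mlflow-%s", instance, suffix),
		Namespace: namespace,
		Subjects:  []Subject{{Kind: "ServiceAccount", Name: saName, Namespace: namespace}},
		RoleRef:   RoleRef{APIGroup: "rbac.authorization.k8s.io", Kind: "ClusterRole", Name: clusterRole},
	}
}

func main() {
	rb := buildMLFlowAccessRoleBinding("evalhub", "workspace", "evalhub-sa", "proxy", "edit")
	fmt.Println(rb.Name, "->", rb.RoleRef.Name) // evalhub-mlflow-proxy -> edit
}
```

Because the RoleBinding lives in the workspace namespace while referencing a ClusterRole, the grant stays namespace-scoped even though the role definition is cluster-wide.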
📝 Walkthrough

This PR refactors EvalHub RBAC from proxy-centric to API and MLFlow-centric authorization with per-instance scoping. It introduces Service CA and MLFlow token handling in deployments, replaces monolithic resource-manager roles with granular least-privilege roles, and updates tests and manifests accordingly.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Operator as Operator Reconciler
    participant K8sAPI as Kubernetes API
    participant RBAC as RBAC System
    participant Pod as EvalHub Pod
    Operator->>K8sAPI: Create per-instance API Access Role
    K8sAPI-->>Operator: Role created
    Operator->>K8sAPI: Create API Access RoleBinding<br/>(binds Role to API SA)
    K8sAPI-->>Operator: RoleBinding created
    Operator->>K8sAPI: Create MLFlow Access RoleBinding<br/>(binds ClusterRole to API SA)
    K8sAPI-->>Operator: RoleBinding created
    Operator->>K8sAPI: Create Deployment with volumes<br/>(service-ca, mlflow-token)
    K8sAPI-->>Operator: Deployment created
    K8sAPI->>Pod: Inject service-ca volume<br/>and mlflow-token volume
    K8sAPI->>Pod: Mount volumes as read-only
    K8sAPI->>Pod: Set env vars (MLFLOW_CA_CERT_PATH, MLFLOW_TOKEN_PATH)
    Pod->>RBAC: Check permissions for operations
    RBAC-->>Pod: Authorized via per-instance Role<br/>and MLFlow ClusterRole
    Pod->>Pod: Access service-ca from mount
    Pod->>Pod: Access mlflow-token from mount
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: 3 passed, 1 failed (warning).
Hey - I've found 1 issue and left some high-level feedback:
- Using the built-in `edit` ClusterRole for MLFlow access is quite broad; if MLFlow only needs a subset of verbs/resources for its SubjectAccessReview checks, consider defining a narrower Role/ClusterRole to reduce the permission surface.
- The `ServiceAccountTokenProjection` for the MLFlow token volume does not set an `Audience`; consider specifying a dedicated audience expected by MLFlow to avoid overly generic tokens that could be reused by other consumers.
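The audience suggestion above could look roughly like this. It is a sketch with a simplified stand-in for `corev1.ServiceAccountTokenProjection` (the real type uses pointer fields such as `*int64` for the expiry); the audience string `"mlflow"` and the 3600-second expiry are assumptions, not values from the PR.

```go
package main

import "fmt"

// Simplified stand-in for corev1.ServiceAccountTokenProjection.
type ServiceAccountTokenProjection struct {
	Audience          string
	ExpirationSeconds int64
	Path              string
}

// mlflowTokenProjection sets a dedicated audience so the projected token is
// rejected by any consumer that does not present that audience, limiting
// replay against other services.
func mlflowTokenProjection() ServiceAccountTokenProjection {
	return ServiceAccountTokenProjection{
		Audience:          "mlflow", // hypothetical dedicated audience
		ExpirationSeconds: 3600,     // assumed short-lived token
		Path:              "token",
	}
}

func main() {
	p := mlflowTokenProjection()
	fmt.Printf("audience=%s expiry=%ds path=%s\n", p.Audience, p.ExpirationSeconds, p.Path)
}
```

Verifiers on the MLFlow side would then need to be configured to accept (and require) the same audience.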
## Individual Comments
### Comment 1
<location> `controllers/evalhub/service_accounts.go:261` </location>
<code_context>
+// MLFlow access uses the built-in "edit" ClusterRole which provides the permissions
+// that MLFlow's kubernetes-auth plugin checks via SubjectAccessReview.
+const mlflowAccessClusterRoleName = "edit"
+
+// createMLFlowAccessRoleBinding creates a RoleBinding for a ServiceAccount to the "edit"
</code_context>
<issue_to_address>
**🚨 suggestion (security):** Binding to the built-in "edit" ClusterRole may grant broader permissions than strictly required for MLFlow access.
Relying on the broad "edit" role is convenient but likely over-privileged for MLFlow’s SubjectAccessReview needs. Consider defining a dedicated ClusterRole with only the specific verbs/resources required by the kubernetes-auth plugin and binding to that instead to limit blast radius if MLFlow credentials are compromised.
Suggested implementation:
```golang
// MLFlow access uses a dedicated ClusterRole which provides the minimal permissions
// that MLFlow's kubernetes-auth plugin checks via SubjectAccessReview.
const mlflowAccessClusterRoleName = "mlflow-access"
```
```golang
// createMLFlowAccessRoleBinding creates a RoleBinding for a ServiceAccount to the
// mlflow-access ClusterRole in the instance namespace. This allows the ServiceAccount
// to pass MLFlow's kubernetes-auth SubjectAccessReview checks in the workspace namespace
// while avoiding the broader permissions of the built-in "edit" role.
```
To fully implement the least-privilege approach:
1. Define a `ClusterRole` named `mlflow-access` in your RBAC manifests (YAML) with only the specific verbs/resources required by the MLFlow kubernetes-auth plugin’s SubjectAccessReview checks.
2. Ensure that any existing references in manifests or code that assumed binding to the built-in `edit` ClusterRole are updated to use the new `mlflow-access` role instead.
</issue_to_address>
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@controllers/evalhub/service_accounts.go`:
- Around line 85-95: The RoleBindings for MLFlow are created unconditionally —
wrap the two calls to createMLFlowAccessRoleBinding (for serviceAccountName and
jobsServiceAccountName) in the same MLFlow-enabled check used elsewhere (e.g.,
the deployment MLFlow gate in deployment.go); only call
r.createMLFlowAccessRoleBinding(ctx, instance, serviceAccountName, "proxy") and
r.createMLFlowAccessRoleBinding(ctx, instance, jobsServiceAccountName, "jobs")
when the MLFlow feature flag/config check (the function or boolean you use to
decide MLFlow deployment) returns true so RBAC is only applied when MLFlow is
configured.
- Around line 259-261: The constant mlflowAccessClusterRoleName currently points
to the built-in "edit" ClusterRole which is overly permissive and does not grant
MLFlow the specific SubjectAccessReview permission it needs; create a minimal
custom ClusterRole (e.g., name it
"trustyai-service-operator-evalhub-mlflow-access") that grants
authorization.k8s.io/subjectaccessreviews:create (and optionally
authentication.k8s.io/tokenreviews:create) and replace the value of
mlflowAccessClusterRoleName with that custom role name; ensure any RBAC
manifests/creation logic uses the new ClusterRole name so the service account
binds to the minimal custom role rather than "edit".
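The minimal ClusterRole described above can be expressed as the following rule set. This is a sketch using a simplified stand-in for `rbacv1.PolicyRule`; the role name and the inclusion of `tokenreviews:create` follow the review comment, which marks the latter as optional.

```go
package main

import "fmt"

// Simplified stand-in for rbacv1.PolicyRule.
type PolicyRule struct {
	APIGroups []string
	Resources []string
	Verbs     []string
}

// mlflowAccessRules captures the least-privilege grants the review asks for:
// subjectaccessreviews:create, plus the optional tokenreviews:create.
func mlflowAccessRules() []PolicyRule {
	return []PolicyRule{
		{APIGroups: []string{"authorization.k8s.io"}, Resources: []string{"subjectaccessreviews"}, Verbs: []string{"create"}},
		{APIGroups: []string{"authentication.k8s.io"}, Resources: []string{"tokenreviews"}, Verbs: []string{"create"}},
	}
}

func main() {
	for _, r := range mlflowAccessRules() {
		fmt.Printf("%s/%s: %v\n", r.APIGroups[0], r.Resources[0], r.Verbs)
	}
}
```

Binding to a role like this instead of `edit` means a leaked MLFlow token can only perform access reviews, not mutate workload resources.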
🧹 Nitpick comments (3)
controllers/evalhub/deployment.go (2)
122-133: MLFlow environment variables are injected unconditionally.
`MLFLOW_CA_CERT_PATH`, `MLFLOW_WORKSPACE`, and `MLFLOW_TOKEN_PATH` are always set regardless of whether MLFlow integration is actually needed. This is consistent with the unconditional volume mounts (flagged above). If you gate the volumes behind an MLFlow-enabled check, these env vars should be gated similarly.
146-155: MLFlow volume mounts are also unconditionally added — keep them in sync with any gating applied to the volumes.
Same concern as the volumes and env vars above. If the volumes are gated or made optional, these mounts should follow the same pattern.
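The gating pattern suggested in these two nitpicks could be sketched like this. The mount paths and the shape of the feature check are assumptions for illustration; the PR defines the real mount points and any enablement flag.

```go
package main

import "fmt"

type EnvVar struct{ Name, Value string }

// buildMLFlowEnv appends the MLFlow env vars only when MLFlow is enabled,
// mirroring the gating the review asks for. The paths are assumed values.
func buildMLFlowEnv(mlflowEnabled bool) []EnvVar {
	var env []EnvVar
	if mlflowEnabled {
		env = append(env,
			EnvVar{Name: "MLFLOW_CA_CERT_PATH", Value: "/etc/mlflow/ca/service-ca.crt"},
			EnvVar{Name: "MLFLOW_TOKEN_PATH", Value: "/etc/mlflow/token/token"},
		)
	}
	return env
}

func main() {
	fmt.Println(len(buildMLFlowEnv(false)), len(buildMLFlowEnv(true))) // 0 2
}
```

Applying the same boolean to the volumes, volume mounts, env vars, and RoleBindings keeps all four in lockstep, which is the consistency point the reviewer is making.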
controllers/evalhub/service_accounts.go (1)
266-331: Consider extracting common RoleBinding reconciliation into a shared helper.
The create-or-update-RoleBinding pattern (check existence → create if missing → update subjects or delete+recreate if RoleRef changed) is repeated across `createResourceManagementRoleBinding`, `createJobsResourceManagementRoleBinding`, `createJobsProxyRoleBinding`, and now `createMLFlowAccessRoleBinding`. A shared helper like `reconcileRoleBinding(ctx, instance, name, subjects, roleRef)` would reduce ~50 lines of duplication per call site and make future additions less error-prone.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@controllers/evalhub/service_accounts.go`:
- Around line 85-95: The comment above the MLFlow RoleBinding creation is stale
(it mentions the "edit" ClusterRole) — update the comment to reflect that
createMLFlowAccessRoleBinding now binds to the custom ClusterRole
"evalhub-mlflow-access" (used in createMLFlowAccessRoleBinding). Edit the
comment to describe that MLFlow's kubernetes-auth plugin validates tokens via
SubjectAccessReview against the workspace namespace and that the custom
"evalhub-mlflow-access" ClusterRole provides the required permissions for both
service accounts (serviceAccountName and jobsServiceAccountName).
🧹 Nitpick comments (1)
controllers/evalhub/service_accounts.go (1)
259-334: Consider extracting a shared `reconcileRoleBinding` helper to reduce duplication.
`createMLFlowAccessRoleBinding`, `createResourceManagementRoleBinding`, `createJobsResourceManagementRoleBinding`, and `createJobsProxyRoleBinding` all follow the identical get-or-create/update/delete-recreate pattern, differing only in the RoleBinding name, labels, subjects, and roleRef. A single parameterized helper would eliminate ~80 lines of near-duplicate code and make it easier to consistently fix any reconciliation bugs across all bindings.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@controllers/evalhub/deployment_test.go`:
- Around line 374-375: The test assertions for evalhub container VolumeMounts
are outdated: update the Expect(...VolumeMounts).To(HaveLen(...)) checks so they
match the current mounts (evalhub-config, service-ca, mlflow-token and optional
DB secret). Specifically, change the assertion that currently expects 2 mounts
(the Expect(container.VolumeMounts).To(HaveLen(2)) in the DB-configured test) to
expect 4 mounts, and change the assertion that expects 1 mount (the
Expect(container.VolumeMounts).To(HaveLen(1)) in the non-DB test) to expect 3
mounts; leave the existing deployment.Spec.Template.Spec.Volumes expectation
as-is. Use the container variable/identifier used in the tests to locate the two
assertions to update.
🧹 Nitpick comments (1)
controllers/evalhub/service_accounts.go (1)
260-335: Consider extracting a generic RoleBinding reconciliation helper.
`createMLFlowAccessRoleBinding`, `createResourceManagementRoleBinding`, `createJobsResourceManagementRoleBinding`, and `createJobsProxyRoleBinding` share nearly identical get→create / compare→update / delete→recreate logic. A single `reconcileRoleBinding(ctx, instance, name, namespace, labels, subjects, roleRef)` helper would eliminate ~100 lines of duplication and make future additions (like this one) trivial.
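The get→create / compare→update / delete→recreate pattern this nitpick describes can be sketched as below. A map stands in for cluster state, and simplified types replace the rbacv1 ones; on a real cluster the delete+recreate branch exists because a RoleBinding's `roleRef` is immutable.

```go
package main

import "fmt"

type RoleRef struct{ APIGroup, Kind, Name string }
type Subject struct{ Kind, Name, Namespace string }
type RoleBinding struct {
	Name     string
	Subjects []Subject
	RoleRef  RoleRef
}

// reconcileRoleBinding sketches the shared reconciliation pattern against an
// in-memory store. It returns the action taken so callers can log it.
func reconcileRoleBinding(store map[string]*RoleBinding, desired RoleBinding) string {
	existing, ok := store[desired.Name]
	if !ok {
		store[desired.Name] = &desired
		return "created"
	}
	if existing.RoleRef != desired.RoleRef {
		// roleRef is immutable on real RoleBindings: delete and recreate.
		delete(store, desired.Name)
		store[desired.Name] = &desired
		return "recreated"
	}
	existing.Subjects = desired.Subjects // refresh mutable fields in place
	return "updated"
}

func main() {
	store := map[string]*RoleBinding{}
	rb := RoleBinding{Name: "evalhub-mlflow-proxy", RoleRef: RoleRef{Kind: "ClusterRole", Name: "edit"}}
	fmt.Println(reconcileRoleBinding(store, rb)) // created
	fmt.Println(reconcileRoleBinding(store, rb)) // updated
	rb.RoleRef.Name = "mlflow-access"
	fmt.Println(reconcileRoleBinding(store, rb)) // recreated
}
```

Each existing `create...RoleBinding` function would then shrink to computing its name, labels, subjects, and roleRef, and delegating to the helper.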
tarilabs
left a comment
adds the necessary Operator and EH Job grants
/lgtm
thanks @ruivieira
@coderabbitai full review
✅ Actions performed: full review triggered.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
controllers/evalhub/service_accounts.go (1)
331-394: ⚠️ Potential issue | 🔴 Critical
Auth-reviewer CRB correctly omits owner references — but the label issue remains.
Good: the comment at line 367 correctly explains why owner references cannot be set on cluster-scoped resources. The subjects/roleRef comparison and update logic is sound.
However, this is where the `app.kubernetes.io/name` label is set to the full CRB name (line 343), which is the root cause of the pipeline failures flagged above.
🤖 Fix all issues with AI agents
In `@controllers/evalhub/proxy_rbac_test.go`:
- Around line 17-23: The test failures are caused by cluster role binding names
(produced by generateAuthReviewerClusterRoleBindingName) exceeding Kubernetes'
63-char label limit and being placed verbatim into the app.kubernetes.io/name
label in service_accounts.go; fix service_accounts.go so the code that assigns
the app.kubernetes.io/name label for service accounts (the label assignment
logic near where service accounts are created) enforces Kubernetes label
length/format constraints by truncating or hashing the generated name to a
DNS-1123 compatible string no longer than 63 characters (preserve uniqueness,
e.g., keep a human-readable prefix and append a short hash), and use that
normalized value instead of the raw generateAuthReviewerClusterRoleBindingName
output.
In `@controllers/evalhub/service_accounts.go`:
- Around line 33-35: generateAuthReviewerClusterRoleBindingName builds a long
string (<name>-<namespace>-auth-reviewer-crb) that is later used as the
app.kubernetes.io/name label in createAuthReviewerClusterRoleBinding, which can
exceed Kubernetes' 63-char label limit; modify the code so the full CRB
metadata.name still uses generateAuthReviewerClusterRoleBindingName but the
label uses a safe, truncation-or-hash-derived value (e.g., take the first 63
characters or use a stable short hash of the full name) before setting
app.kubernetes.io/name in createAuthReviewerClusterRoleBinding; ensure the
label-producing logic is deterministic and referenced by both
createAuthReviewerClusterRoleBinding and any other places reading that label.
🧹 Nitpick comments (4)
controllers/evalhub/rbac_manifests_test.go (2)
38-49: Minor: only the first rule containing `configmaps` is checked.
If the manifest ever has multiple rules referencing `configmaps` (unlikely but possible), only the first one's verbs are validated. Current behavior is fine given the manifest structure, but worth noting.
54-65: Consider using `reflect.DeepEqual` or `slices.Equal` for the sorted comparison.
The manual element-by-element comparison works correctly but could be simplified.
♻️ Optional simplification

```diff
+	"reflect"
 ...
 	sort.Strings(gotVerbs)
 	sort.Strings(wantVerbs)
-	if len(gotVerbs) != len(wantVerbs) {
-		t.Fatalf("unexpected verbs for configmaps: got=%v want=%v", gotVerbs, wantVerbs)
-	}
-	for i := range wantVerbs {
-		if gotVerbs[i] != wantVerbs[i] {
-			t.Fatalf("unexpected verbs for configmaps: got=%v want=%v", gotVerbs, wantVerbs)
-		}
+	if !reflect.DeepEqual(gotVerbs, wantVerbs) {
+		t.Fatalf("unexpected verbs for configmaps: got=%v want=%v", gotVerbs, wantVerbs)
 	}
```

controllers/evalhub/service_accounts.go (2)
38-128: `createServiceAccount` has grown into a large orchestration function creating ~8 RBAC resources.
This function now creates a ServiceAccount, a ClusterRoleBinding, 2 Roles, and 5 RoleBindings. If any step fails mid-way, partial RBAC state is left behind (though reconciliation will retry). Consider extracting an `ensureRBACForInstance` helper to improve readability and make the orchestration explicit.
Not a blocker given the reconciler's idempotent retry design, but worth considering for maintainability.
504-531: `equalPolicyRules` uses positional comparison — fragile if rule order changes.
The function compares rules by index position. If the desired rules are reordered (e.g., during a refactor), existing roles will be unnecessarily updated. For the current usage where both sides are constructed in the same code this is acceptable, but consider sorting before comparison for robustness.
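An order-insensitive comparison along the lines suggested could look like this. It is a sketch using a simplified stand-in for `rbacv1.PolicyRule`; the `fmt.Sprint`-based sort key is a convenience choice, not how the operator necessarily implements it.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// Simplified stand-in for rbacv1.PolicyRule.
type PolicyRule struct {
	APIGroups []string
	Resources []string
	Verbs     []string
}

// canonicalize deep-copies the rules, sorts each rule's string slices, then
// sorts the rules themselves, so equality no longer depends on declaration order.
func canonicalize(rules []PolicyRule) []PolicyRule {
	out := make([]PolicyRule, len(rules))
	for i, r := range rules {
		out[i].APIGroups = append([]string(nil), r.APIGroups...)
		out[i].Resources = append([]string(nil), r.Resources...)
		out[i].Verbs = append([]string(nil), r.Verbs...)
		sort.Strings(out[i].APIGroups)
		sort.Strings(out[i].Resources)
		sort.Strings(out[i].Verbs)
	}
	sort.Slice(out, func(i, j int) bool { return fmt.Sprint(out[i]) < fmt.Sprint(out[j]) })
	return out
}

func equalPolicyRulesUnordered(a, b []PolicyRule) bool {
	return reflect.DeepEqual(canonicalize(a), canonicalize(b))
}

func main() {
	a := []PolicyRule{{Resources: []string{"pods"}, Verbs: []string{"get", "list"}}, {Resources: []string{"configmaps"}, Verbs: []string{"get"}}}
	b := []PolicyRule{{Resources: []string{"configmaps"}, Verbs: []string{"get"}}, {Resources: []string{"pods"}, Verbs: []string{"list", "get"}}}
	fmt.Println(equalPolicyRulesUnordered(a, b)) // true
}
```

The deep copy matters: sorting the input's inner slices in place would silently mutate the caller's desired-state objects.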
```diff
 var _ = Describe("EvalHub API RBAC", func() {
 	const (
-		testNamespacePrefix     = "evalhub-proxy-rbac-test"
+		testNamespacePrefix     = "evalhub-api-rbac-test"
 		operatorNamespacePrefix = "operator-system"
-		evalHubName             = "proxy-rbac-evalhub"
+		evalHubName             = "api-rbac-evalhub"
 		configMapName           = "trustyai-service-operator-config"
 	)
```
All pipeline failures in this file trace to app.kubernetes.io/name label exceeding 63 chars.
The test namespace names include nanosecond timestamps (line 38), making them ~40 chars long. Combined with evalHubName in generateAuthReviewerClusterRoleBindingName, the CRB name exceeds 63 characters and is then set verbatim as the app.kubernetes.io/name label in service_accounts.go (line 343). Kubernetes rejects label values >63 chars.
This is addressed in the service_accounts.go review comment — the fix needs to happen there, not in the tests. The test is correctly exposing a real production-time risk.
🤖 Prompt for AI Agents
In `@controllers/evalhub/proxy_rbac_test.go` around lines 17 - 23, The test
failures are caused by cluster role binding names (produced by
generateAuthReviewerClusterRoleBindingName) exceeding Kubernetes' 63-char label
limit and being placed verbatim into the app.kubernetes.io/name label in
service_accounts.go; fix service_accounts.go so the code that assigns the
app.kubernetes.io/name label for service accounts (the label assignment logic
near where service accounts are created) enforces Kubernetes label length/format
constraints by truncating or hashing the generated name to a DNS-1123 compatible
string no longer than 63 characters (preserve uniqueness, e.g., keep a
human-readable prefix and append a short hash), and use that normalized value
instead of the raw generateAuthReviewerClusterRoleBindingName output.
```go
func generateAuthReviewerClusterRoleBindingName(instance *evalhubv1alpha1.EvalHub) string {
	return instance.Name + "-" + instance.Namespace + "-auth-reviewer-crb"
}
```
CRB name used as label value can exceed Kubernetes' 63-character limit — root cause of all pipeline failures.
generateAuthReviewerClusterRoleBindingName produces <name>-<namespace>-auth-reviewer-crb. This value is then set as the app.kubernetes.io/name label (line 343), but Kubernetes label values are capped at 63 characters. With even moderately sized instance names and namespace names this limit is easily breached, as confirmed by the pipeline failures.
The .metadata.name field allows up to 253 characters, so the CRB name itself is fine — only the label needs to be truncated or computed differently.
Proposed fix

```diff
 func generateAuthReviewerClusterRoleBindingName(instance *evalhubv1alpha1.EvalHub) string {
 	return instance.Name + "-" + instance.Namespace + "-auth-reviewer-crb"
 }
+
+// truncateLabel truncates a string to the Kubernetes label value limit (63 chars).
+func truncateLabel(s string, maxLen int) string {
+	if len(s) <= maxLen {
+		return s
+	}
+	return s[:maxLen]
+}
```

Then in `createAuthReviewerClusterRoleBinding` (line 343):
```diff
 Labels: map[string]string{
 	"app":                        "eval-hub",
-	"app.kubernetes.io/name":     clusterRoleBindingName,
+	"app.kubernetes.io/name":     truncateLabel(clusterRoleBindingName, 63),
 	"app.kubernetes.io/instance": instance.Name,
 	"app.kubernetes.io/part-of":  "eval-hub",
 	"app.kubernetes.io/version":  constants.Version,
 },
```

🤖 Prompt for AI Agents
In `@controllers/evalhub/service_accounts.go` around lines 33 - 35,
generateAuthReviewerClusterRoleBindingName builds a long string
(<name>-<namespace>-auth-reviewer-crb) that is later used as the
app.kubernetes.io/name label in createAuthReviewerClusterRoleBinding, which can
exceed Kubernetes' 63-char label limit; modify the code so the full CRB
metadata.name still uses generateAuthReviewerClusterRoleBindingName but the
label uses a safe, truncation-or-hash-derived value (e.g., take the first 63
characters or use a stable short hash of the full name) before setting
app.kubernetes.io/name in createAuthReviewerClusterRoleBinding; ensure the
label-producing logic is deterministic and referenced by both
createAuthReviewerClusterRoleBinding and any other places reading that label.
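The hash-based variant the prompt above prefers over plain truncation could look like this. It is a sketch: the 8-hex-character suffix length is an assumption, and the result should additionally satisfy the DNS-1123 character rules if instance names can contain unusual characters.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// labelSafe keeps a value within the 63-character Kubernetes label limit.
// Unlike plain truncation, it appends a short hash of the full name so two
// long names sharing a 63-char prefix still map to distinct label values.
func labelSafe(s string) string {
	const maxLen = 63
	if len(s) <= maxLen {
		return s
	}
	sum := sha256.Sum256([]byte(s))
	suffix := "-" + hex.EncodeToString(sum[:])[:8] // 9 chars total
	return s[:maxLen-len(suffix)] + suffix
}

func main() {
	long := strings.Repeat("a", 80) + "-auth-reviewer-crb"
	fmt.Println(len(labelSafe(long)), labelSafe("short"))
}
```

Because the hash is derived deterministically from the full name, every code path that reads or writes the label computes the same value, which is the determinism requirement the prompt calls out.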
@ruivieira: The following test failed.
tarilabs
left a comment
thanks @ruivieira
minor comments below for your 👀
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: ppadashe-psp, tarilabs. It still needs approval from an approver for each of the affected files.
Co-authored-by: Matteo Mortari <matteo.mortari@gmail.com>
New changes are detected. LGTM label has been removed. |
Merged 353b718 into trustyai-explainability:main
Summary by Sourcery
Integrate EvalHub with MLFlow by configuring RBAC, environment variables, and pod volumes required for Kubernetes-authenticated access.