Skip to content

feat: create agent RBAC resources on the control plane side#7268

Open
Park-Jiyeonn wants to merge 1 commit intokarmada-io:masterfrom
Park-Jiyeonn:feat/secure-agent-rbac
Open

feat: create agent RBAC resources on the control plane side#7268
Park-Jiyeonn wants to merge 1 commit intokarmada-io:masterfrom
Park-Jiyeonn:feat/secure-agent-rbac

Conversation

@Park-Jiyeonn
Copy link
Copy Markdown
Contributor

@Park-Jiyeonn Park-Jiyeonn commented Mar 8, 2026

What this PR does:

This PR improves the security of pull-mode cluster bootstrap, as discussed in #7147.

Previously, joining a pull-mode cluster required karmadactl register to use a bootstrap credential bound to the highly privileged system:karmada:agent:rbac-generator role so it could generate the control-plane RBAC needed by karmada-agent. If a member cluster were compromised, that credential could be abused.

So, here is the new flow:

  • The system:karmada:agent:rbac-generator role is removed.
  • The bootstrap token is now only used for certificate bootstrap.
  • After obtaining its client certificate, karmada-agent creates the pull-mode Cluster object and waits for control-plane bootstrap to complete.
  • The cluster_controller watches pull-mode Cluster objects and automatically creates the cluster-specific RBAC resources required by that agent.
  • When the cluster is unregistered, the cluster_controller also cleans up those RBAC resources.
  • The old RBAC generation logic is removed from karmadactl register and karmadactl unregister.

Which issue this PR fixes:

Closes #7147

Copilot AI review requested due to automatic review settings March 8, 2026 15:16
@karmada-bot karmada-bot added the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Mar 8, 2026
@karmada-bot
Copy link
Copy Markdown
Contributor

Welcome @Park-Jiyeonn! It looks like this is your first PR to karmada-io/karmada 🎉

@karmada-bot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign xishanyongye-chang for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@karmada-bot karmada-bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Mar 8, 2026
@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the security posture of Karmada by revamping the RBAC management for pull-mode karmada-agent registrations. Instead of requiring a highly privileged bootstrap token to generate RBAC resources on the agent side, the process has been shifted to the control plane. The karmada-agent now uses a minimally privileged token to create its Cluster object, and the control plane's cluster_controller then takes over to generate and manage the exact RBAC permissions needed for that specific agent. This change reduces the attack surface by limiting the scope of bootstrap tokens and centralizing RBAC provisioning.

Highlights

  • Enhanced Security Model: The bootstrap token for karmada-agent now operates with minimal privileges, only allowing the creation of Cluster objects and CSRs, eliminating the high-privileged system:karmada:agent:rbac-generator user.
  • Control Plane RBAC Generation: The cluster_controller on the control plane is now responsible for dynamically generating precise, minimal RBAC resources for each pull-mode karmada-agent upon Cluster object creation.
  • Agent-Side Simplification: The karmada-agent now creates its Cluster object and then waits for the control plane to provision its necessary RBAC, removing the complex RBAC generation logic from karmadactl register and unregister.
  • Automated Cleanup: The cluster_controller automatically cleans up the agent's RBAC resources when a cluster is unregistered.
Changelog
  • artifacts/deploy/bootstrap-token-configuration.yaml
    • Renamed ClusterRole and ClusterRoleBinding from system:karmada:agent-rbac-generator to system:karmada:agent-cluster.
    • Restricted permissions for system:karmada:agent-cluster to only create verbs on cluster.karmada.io/clusters resources.
    • Changed the subject kind in ClusterRoleBinding from User to Group for system:karmada:agents.
  • charts/karmada/templates/_karmada_bootstrap_token_configuration.tpl
    • Updated ClusterRole and ClusterRoleBinding names and permissions to align with the new minimal RBAC for agent cluster creation.
    • Changed the subject kind in ClusterRoleBinding from User to Group.
  • cmd/agent/app/agent.go
    • Added logic to pre-set secret references before cluster creation.
    • Implemented a polling mechanism (waitForPullModeAgentRBACReady) for the agent to wait until its RBAC resources are ready on the control plane.
    • Removed the direct call to util.RegisterClusterInControllerPlane and replaced it with a sequence of generateClusterInControllerPlane and waitForPullModeAgentRBACReady.
  • pkg/controllers/cluster/agent_rbac.go
    • Added a new file containing functions to generate, ensure, and clean up agent-specific ClusterRoles, ClusterRoleBindings, Roles, and RoleBindings in the control plane.
  • pkg/controllers/cluster/cluster_controller.go
    • Integrated ensureAgentRBACResourcesInControlPlane into syncCluster for pull-mode clusters.
    • Integrated cleanupAgentRBACResourcesInControlPlane into removeCluster for pull-mode clusters.
  • pkg/controllers/cluster/cluster_controller_test.go
    • Added new test cases for ensureAgentRBACResourcesInControlPlane and cleanupAgentRBACResourcesInControlPlane.
  • pkg/karmadactl/cmdinit/bootstraptoken/agent/tlsbootstrap.go
    • Defined new constants KarmadaAgentClusterRoleName and KarmadaAgentClusterRoleBindingName.
    • Added AllowKarmadaAgentsToRegisterClusters function to create the necessary minimal RBAC for agents to register clusters.
  • pkg/karmadactl/cmdinit/bootstraptoken/agent/tlsbootstrap_test.go
    • Added test cases for the AllowKarmadaAgentsToRegisterClusters function.
  • pkg/karmadactl/cmdinit/karmada/deploy.go
    • Removed the call to the deprecated grantAccessPermissionToAgentRBACGenerator function.
    • Added a call to the new bootstrapagent.AllowKarmadaAgentsToRegisterClusters function.
  • pkg/karmadactl/cmdinit/karmada/rbac.go
    • Removed constants and the function related to the old karmada-agent-rbac-generator.
  • pkg/karmadactl/cmdinit/karmada/rbac_test.go
    • Removed the test for the grantAccessPermissionToAgentRBACGenerator function.
  • pkg/karmadactl/register/register.go
    • Removed the AgentRBACGenerator constant.
    • Removed the rbacResources field and related logic for direct RBAC generation and cleanup.
    • Removed the EnsureNecessaryResourcesExistInControlPlane function.
  • pkg/karmadactl/unregister/unregister.go
    • Removed the rbacResources field and related logic for RBAC cleanup.
  • pkg/util/cluster.go
    • Modified Validate to allow agents to skip duplicate cluster validation if they lack list permissions.
    • Refactored CreateOrUpdateClusterObject to prioritize creation and handle AlreadyExists errors gracefully by attempting an an update.
  • pkg/util/cluster_test.go
    • Added new test cases for Validate and CreateOrUpdateClusterObject to cover the updated logic.
Activity
  • This pull request introduces a new feature to enhance security as discussed in issue Create RBAC resources for karmada-agent on the control plane side #7147.
  • The author, Park-Jiyeonn, has implemented the proposed changes to shift RBAC generation for pull-mode agents to the control plane.
  • No specific review comments or further progress updates are available in the provided context.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances security by shifting the creation of karmada-agent RBAC resources from client-side bootstrap tokens to the cluster-controller on the control plane, which now provisions least-privilege RBAC for new pull-mode clusters. However, the current implementation introduces critical and high-severity RBAC misconfigurations. The cluster_controller allows agents to specify arbitrary namespaces for their RBAC resources, risking privilege escalation into sensitive namespaces like kube-system. Additionally, agents are granted excessive cluster-wide permissions to list all clusters and manage all leases, leading to information disclosure and potential Denial of Service. These vulnerabilities require stricter validation of agent-provided namespaces and adherence to the principle of least privilege for generated RBAC rules. Furthermore, a new function needs refactoring to align with the project's style guide regarding parameter count and to improve maintainability through documentation.

Comment on lines +125 to +133
func getClusterNamespace(cluster *clusterv1alpha1.Cluster) string {
if cluster.Spec.SecretRef != nil && cluster.Spec.SecretRef.Namespace != "" {
return cluster.Spec.SecretRef.Namespace
}
if cluster.Spec.ImpersonatorSecretRef != nil && cluster.Spec.ImpersonatorSecretRef.Namespace != "" {
return cluster.Spec.ImpersonatorSecretRef.Namespace
}
return defaultPullModeClusterNamespace
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-critical critical

The cluster_controller determines the namespace for agent-specific RBAC resources based on the SecretRef.Namespace or ImpersonatorSecretRef.Namespace fields in the Cluster object's specification. Since the karmada-agent has permission to create and potentially update its own Cluster object, a compromised agent can specify an arbitrary namespace (e.g., kube-system or karmada-system). The cluster_controller will then create a Role and RoleBinding in that namespace, granting the agent permissions to create any Secret and manage specific secrets in that sensitive namespace. This allows an attacker to escalate privileges by interfering with system components or accessing sensitive credentials.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This namespace-selection model predates this PR. --cluster-namespace is an existing user-facing option, and the previous karmadactl register flow also generated the agent secret Role/RoleBinding in that namespace.

Comment on lines +162 to +166
APIGroups: []string{"cluster.karmada.io"},
Resources: []string{"clusters"},
Verbs: []string{"create", "list", "watch"},
},
{
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The generateClusterRole function grants every agent list and watch permissions on all clusters resources at the cluster scope. This allows any registered agent to discover and monitor all other clusters in the Karmada federation, leaking sensitive metadata such as API endpoints, cluster IDs, and status information. This violates the principle of least privilege, as an agent only needs access to its own Cluster object.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous karmadactl-generated agent ClusterRole already granted the same clusters access, and this PR preserves that existing steady-state RBAC instead of broadening it.

Comment on lines +183 to +187
APIGroups: []string{"coordination.k8s.io"},
Resources: []string{"leases"},
Verbs: []string{"get", "create", "update"},
},
{
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The generateClusterRole function grants every agent get, create, and update permissions on all leases resources across all namespaces. Karmada uses leases for cluster heartbeats and potentially for leader election. An attacker with control over one agent can use these permissions to modify or delete leases belonging to other clusters or system components, leading to a Denial of Service (by making other clusters appear offline) or interfering with the control plane's operational logic. These permissions should be restricted using ResourceNames to only allow the agent to manage its own lease.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This lease permission also existed in the previous agent RBAC and is preserved by this PR rather than newly introduced by it.

Comment on lines +152 to +157
- apiGroups:
- cluster.karmada.io
resources:
- clusters
verbs:
- create
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The RBAC configuration grants the system:karmada:agents group the create permission on all clusters resources. Since every registered agent's certificate belongs to this group, any compromised agent can create arbitrary new Cluster objects in the control plane. This can be used to clutter the system, block the registration of legitimate clusters by pre-empting their names, or trigger the automatic RBAC generation for fake clusters. The create permission should be restricted to the bootstrap token group used during initial registration.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, the Cluster object was created during karmadactl register. In this PR, it is created by karmada-agent after certificate bootstrap so that the privileged system:karmada:agent:rbac-generator path can be removed entirely.

I agree that constraining cluster creation further would require an additional mechanism beyond RBAC alone (for example admission-level validation or a different bootstrap handshake), so I see that as follow-up scope rather than part of this PR.

Comment on lines +155 to +161
{
APIGroups: []string{"cluster.karmada.io"},
Resources: []string{"clusters"},
ResourceNames: []string{clusterName},
Verbs: []string{"get", "delete"},
},
{
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The generateClusterRole function grants get and delete permissions on the agent's own Cluster resource but omits the update and patch verbs. However, the karmada-agent implementation attempts to update its own Cluster object's specification (e.g., in pkg/util/cluster.go). Without these permissions, the agent will be unable to update its metadata (like API endpoints or provider info) after the initial registration, leading to a functional failure or a state where the control plane has stale information about the cluster.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was based on an earlier revision. In the current implementation the agent no longer updates its own Cluster spec after registration;

return true, nil
}

func waitForPullModeAgentRBACReady(ctx context.Context, controlPlaneKubeClient kubeclientset.Interface, karmadaClient karmadaclientset.Interface, clusterName, clusterNamespace string, timeout time.Duration) error {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This function has 6 parameters, which exceeds the recommended limit of 5 according to the repository style guide. To improve readability and maintainability, consider encapsulating these parameters into a struct.

Additionally, this function contains key logic for the agent's startup sequence but lacks a doc comment. Adding a comment explaining its purpose—polling until the control plane has provisioned the necessary RBAC permissions for the agent—would improve code clarity.

For example:

type rbacReadyOptions struct {
	controlPlaneKubeClient kubeclientset.Interface
	karmadaClient          karmadaclientset.Interface
	clusterName            string
	clusterNamespace       string
	timeout                time.Duration
}

// waitForPullModeAgentRBACReady polls the control plane to check if the required RBAC resources
// for the pull-mode agent have been created. It performs dry-run checks for cluster, secret,
// and work access until they are all permitted or the timeout is reached.
func waitForPullModeAgentRBACReady(ctx context.Context, opts rbacReadyOptions) error {
    // ...
}
References
  1. A function should not have more than 5 parameters. If it exceeds this, consider refactoring the function or encapsulating the parameters. (link)
  2. Add doc comments to unexported functions that contain key logic to improve maintainability.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This no longer applies to the current revision because waitForPullModeAgentRBACReady() was removed during the refactor.`

@Park-Jiyeonn Park-Jiyeonn force-pushed the feat/secure-agent-rbac branch from fa62429 to afda1ce Compare March 8, 2026 15:21
@karmada-bot karmada-bot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Mar 8, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR shifts pull-mode karmada-agent RBAC resource creation/cleanup from edge-side CLI operations to the Karmada control plane, reducing the privileges required during cluster registration and removing the legacy “RBAC generator” user flow.

Changes:

  • Add control-plane cluster_controller logic to generate and clean up per-agent RBAC resources for pull-mode clusters.
  • Update karmada-agent startup flow to create the Cluster object first and wait until its RBAC becomes available before proceeding with registration steps that require those permissions.
  • Remove RBAC-generator-related logic from karmadactl register/unregister and update bootstrap-token RBAC manifests to the new, narrower permissions model.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
pkg/util/cluster.go Skips duplicate-cluster validation for agents on Forbidden list errors; refactors create-or-update to “create-first then fallback to get/update”.
pkg/util/cluster_test.go Adds/updates tests for the new agent validation behavior and create-first semantics.
pkg/karmadactl/unregister/unregister.go Removes CLI-side deletion/manual reporting of karmada-agent RBAC resources on the control plane.
pkg/karmadactl/unregister/unregister_test.go Adjusts unregister test setup to reflect removed RBAC resource handling.
pkg/karmadactl/register/register.go Removes legacy RBAC-generator code path from pull-mode registration.
pkg/karmadactl/cmdinit/karmada/rbac.go Removes “agent RBAC generator” cluster-admin-like RBAC creation.
pkg/karmadactl/cmdinit/karmada/rbac_test.go Drops tests for removed RBAC-generator grant function.
pkg/karmadactl/cmdinit/karmada/deploy.go Adds creation of RBAC allowing agents to create Cluster objects during initial registration.
pkg/karmadactl/cmdinit/bootstraptoken/agent/tlsbootstrap.go Adds RBAC helper to allow karmada agents to create Cluster objects.
pkg/karmadactl/cmdinit/bootstraptoken/agent/tlsbootstrap_test.go Adds tests verifying the new cluster-role/cluster-rolebinding creation/update behavior.
pkg/controllers/cluster/cluster_controller.go Ensures pull-mode agent RBAC on sync; cleans it up on cluster removal.
pkg/controllers/cluster/cluster_controller_test.go Adds tests for ensuring and cleaning up agent RBAC resources.
pkg/controllers/cluster/agent_rbac.go Introduces control-plane RBAC resource generation/cleanup for pull-mode agents.
cmd/agent/app/agent.go Creates the Cluster object early; polls until RBAC is usable before continuing registration.
charts/karmada/templates/_karmada_bootstrap_token_configuration.tpl Updates bootstrap RBAC manifests from RBAC-generator user to agent cluster creation permissions.
artifacts/deploy/bootstrap-token-configuration.yaml Updates the static bootstrap-token RBAC manifest accordingly.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +486 to +505
secret := &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Namespace: clusterNamespace,
Name: clusterName,
},
Data: map[string][]byte{
clusterv1alpha1.SecretTokenKey: []byte("dummy-token"),
clusterv1alpha1.SecretCADataKey: []byte("dummy-ca-data"),
},
}
_, err = controlPlaneKubeClient.CoreV1().Secrets(clusterNamespace).Create(ctx, secret, metav1.CreateOptions{
DryRun: []string{metav1.DryRunAll},
})
if apierrors.IsForbidden(err) || apierrors.IsNotFound(err) {
return false, nil
}
if err != nil && !apierrors.IsAlreadyExists(err) {
return false, err
}

Copy link

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

waitForPullModeAgentRBACReady() always does a dry-run Secret create in clusterNamespace to decide RBAC readiness. If the Cluster was created without SecretRef/ImpersonatorSecretRef (e.g. --report-secrets=None), the controller will generate secret RBAC in its default namespace (karmada-cluster), but this check still targets the user-supplied clusterNamespace and can remain Forbidden forever when a non-default --cluster-namespace is used. Consider skipping the Secret check when secrets aren’t being reported, or deriving the namespace from the Cluster spec (same logic as the controller) before probing permissions.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This no longer applies to the current revision because waitForPullModeAgentRBACReady() was removed during the refactor.`

Comment on lines +112 to +115
if err := c.Client.Create(ctx, obj); err != nil && !apierrors.IsAlreadyExists(err) {
return err
}
return nil
Copy link

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createObjectIfNotExists() treats AlreadyExists as success and never reconciles spec/rules. If a per-cluster RBAC object already exists with outdated or broader rules (e.g., upgrades from older versions), the controller will leave it unchanged even though syncCluster() calls this an “ensure”. Consider switching to a create-or-update/patch approach so the generated RBAC converges to the intended minimal rules.

Suggested change
if err := c.Client.Create(ctx, obj); err != nil && !apierrors.IsAlreadyExists(err) {
return err
}
return nil
// Try to create the object first.
if err := c.Client.Create(ctx, obj); err == nil {
return nil
} else if !apierrors.IsAlreadyExists(err) {
// If the error is something other than AlreadyExists, surface it.
return err
}
// If the object already exists, fetch it and reconcile its spec/rules
// with the desired state represented by obj.
key := client.ObjectKeyFromObject(obj)
existingObj, ok := obj.DeepCopyObject().(client.Object)
if !ok {
// Should not happen for well-formed client.Object types, but if it does
// we fail fast rather than silently skipping reconciliation.
return fmt.Errorf("failed to deep-copy object for reconciliation: %T", obj)
}
if err := c.Client.Get(ctx, key, existingObj); err != nil {
if apierrors.IsNotFound(err) {
// Object was deleted after the initial AlreadyExists; try to create again.
return c.Client.Create(ctx, obj)
}
return err
}
switch desired := obj.(type) {
case *rbacv1.Role:
current, ok := existingObj.(*rbacv1.Role)
if !ok {
return fmt.Errorf("existing object type mismatch for Role: %T", existingObj)
}
current.Rules = desired.Rules
current.Labels = desired.Labels
current.Annotations = desired.Annotations
case *rbacv1.ClusterRole:
current, ok := existingObj.(*rbacv1.ClusterRole)
if !ok {
return fmt.Errorf("existing object type mismatch for ClusterRole: %T", existingObj)
}
current.Rules = desired.Rules
current.Labels = desired.Labels
current.Annotations = desired.Annotations
case *rbacv1.RoleBinding:
current, ok := existingObj.(*rbacv1.RoleBinding)
if !ok {
return fmt.Errorf("existing object type mismatch for RoleBinding: %T", existingObj)
}
current.Subjects = desired.Subjects
current.RoleRef = desired.RoleRef
current.Labels = desired.Labels
current.Annotations = desired.Annotations
case *rbacv1.ClusterRoleBinding:
current, ok := existingObj.(*rbacv1.ClusterRoleBinding)
if !ok {
return fmt.Errorf("existing object type mismatch for ClusterRoleBinding: %T", existingObj)
}
current.Subjects = desired.Subjects
current.RoleRef = desired.RoleRef
current.Labels = desired.Labels
current.Annotations = desired.Annotations
default:
// For object kinds not explicitly handled here, fall back to a no-op
// on updates to avoid unintentionally mutating unknown types.
return nil
}
return c.Client.Update(ctx, existingObj)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already addressed in the current revision. The controller now reconciles agent RBAC with CreateOrUpdate instead of treating AlreadyExists as success, so existing objects are converged to the desired rules.`

Comment on lines +135 to +141
func generateAgentRBACResources(clusterName, clusterNamespace string) *agentRBACResources {
return &agentRBACResources{
clusterRoles: []*rbacv1.ClusterRole{generateClusterRole(clusterName)},
clusterRoleBindings: []*rbacv1.ClusterRoleBinding{generateClusterRoleBinding(clusterName)},
roles: []*rbacv1.Role{generateSecretAccessRole(clusterName, clusterNamespace), generateWorkAccessRole(clusterName)},
roleBindings: []*rbacv1.RoleBinding{generateSecretAccessRoleBinding(clusterName, clusterNamespace), generateWorkAccessRoleBinding(clusterName)},
}
Copy link

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generateAgentRBACResources() always includes secret-related Role/RoleBinding, even when the Cluster has no SecretRef/ImpersonatorSecretRef. That grants secret create/get/patch permissions in a (potentially shared) namespace even when the agent isn’t configured to report or use secrets. Consider only generating the secret RBAC resources when at least one of the secret refs is set on the Cluster spec, to better match the “minimal RBAC” goal.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous karmadactl flow also generated the secret-related Role/RoleBinding as part of the agent RBAC bundle. This PR significantly improves the security posture by removing the bootstrap-time all-privileged rbac-generator identity; it does not attempt to fully minimize every steady-state agent permission in the same change.

Comment on lines +144 to +149
func generateClusterRole(clusterName string) *rbacv1.ClusterRole {
clusterRoleName := fmt.Sprintf("system:karmada:%s:agent", clusterName)
return &rbacv1.ClusterRole{
TypeMeta: metav1.TypeMeta{
APIVersion: rbacv1.SchemeGroupVersion.String(),
Kind: "ClusterRole",
Copy link

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RBAC generation logic here largely duplicates pkg/karmadactl/register/register.go (GenerateClusterRole/Binding, GenerateSecretAccessRole, etc.). Having the same rule sets defined in two places increases drift risk when permissions evolve. Consider extracting the shared RBAC builder into a non-CLI package (e.g. pkg/util/cluster/...) that both the controller and karmadactl can reuse.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was addressed during the refactor. The agent RBAC builders were extracted into a shared utility package, and karmadactl no longer carries its own copy of those control-plane RBAC rules.`

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 8, 2026

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 35.37037% with 349 lines in your changes missing coverage. Please review.
✅ Project coverage is 42.15%. Comparing base (3424bc7) to head (30833a7).

Files with missing lines Patch % Lines
pkg/util/agentrbac.go 0.00% 172 Missing ⚠️
pkg/karmadactl/register/register.go 44.21% 51 Missing and 2 partials ⚠️
pkg/controllers/cluster/cluster_controller.go 58.33% 38 Missing and 7 partials ⚠️
cmd/agent/app/agent.go 0.00% 34 Missing ⚠️
pkg/util/credential.go 67.00% 20 Missing and 13 partials ⚠️
pkg/karmadactl/cmdinit/karmada/rbac.go 0.00% 9 Missing ⚠️
...dactl/cmdinit/bootstraptoken/agent/tlsbootstrap.go 77.77% 1 Missing and 1 partial ⚠️
pkg/karmadactl/cmdinit/karmada/deploy.go 0.00% 1 Missing ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7268      +/-   ##
==========================================
+ Coverage   42.03%   42.15%   +0.11%     
==========================================
  Files         874      875       +1     
  Lines       53551    53731     +180     
==========================================
+ Hits        22511    22648     +137     
- Misses      29349    29376      +27     
- Partials     1691     1707      +16     
Flag Coverage Δ
unittests 42.15% <35.37%> (+0.11%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Park-Jiyeonn
Copy link
Copy Markdown
Contributor Author

some tests failed, I'll fix it

@Park-Jiyeonn Park-Jiyeonn marked this pull request as draft March 8, 2026 16:13
@karmada-bot karmada-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 8, 2026
@RainbowMango RainbowMango added this to the v1.18 milestone Mar 9, 2026
Previously, joining a pull-mode cluster required the bootstrap token to have full admin-like permissions (via the `system:karmada:agent:rbac-generator` role) just so `karmadactl register` could create the necessary RBAC resources for the agent. This was a security risk.

This commit shifts that responsibility to the control plane:
- Removed the over-privileged `rbac-generator` role.
- Granted the bootstrap token minimal permissions (only allowed to create `Cluster` objects).
- The `karmada-agent` now creates a `Cluster` object on startup and waits for its RBAC to be ready.
- The `cluster_controller` on the control plane watches for new pull-mode clusters and automatically provisions (and cleans up) the specific RBAC resources for that agent.
- Cleaned up the old RBAC generation logic from `karmadactl`.

Signed-off-by: Park.Jiyeon <jiyeonnn2@icloud.com>
@Park-Jiyeonn Park-Jiyeonn force-pushed the feat/secure-agent-rbac branch from afda1ce to 30833a7 Compare March 22, 2026 15:58
@karmada-bot karmada-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 22, 2026
@Park-Jiyeonn Park-Jiyeonn marked this pull request as ready for review March 23, 2026 01:54
@karmada-bot karmada-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 23, 2026
@karmada-bot karmada-bot requested a review from calvin0327 March 23, 2026 01:54
@karmada-bot karmada-bot requested a review from Garrybest March 23, 2026 01:54
@Park-Jiyeonn
Copy link
Copy Markdown
Contributor Author

Park-Jiyeonn commented Mar 23, 2026

The latest commit has been pushed and the related unit tests are now passing (Operator / Test on Kubernetes (v1.34.0) failed due to a transient network issue and appears unrelated to this PR). I also fixed #7295 separately while debugging this change locally, and that fix has already been merged. This PR is ready for review.

@Park-Jiyeonn
Copy link
Copy Markdown
Contributor Author

This PR includes a small e2e update because it changes karmada-agent bootstrap behavior. The previous karmadactl_test flow hardcoded docker.io/karmada/karmada-agent:latest, which could pull a published image instead of the newly built agent image for this PR. The e2e now prefers a locally built agent image and loads it into kind so the test actually exercises the updated agent code.

@Park-Jiyeonn
Copy link
Copy Markdown
Contributor Author

Hey @RainbowMango @zhzhuang-zju @XiShanYongYe-Chang , could you please help review this PR?
It’s quite a big one, so I really appreciate your time!
Let me know if there’s anything else I can do to make the review easier.

@XiShanYongYe-Chang
Copy link
Copy Markdown
Member

They have gone to attend the KubeCon conference, so it might take some time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Create RBAC resources for karmada-agent on the control plane side

6 participants