@ljluestc

This PR implements the feature requested in GitHub issue #3459 to enable interpreting customized scheduling results in the ResourceInterpreter.

The issue describes two critical scenarios that need to be addressed:

  1. Remainder Distribution Problem: When replicas are divided across clusters with equal weights and the count does not divide evenly (including the replicas = 1 case), Karmada currently assigns the remainder to the same cluster every time, which leads to imbalanced resource utilization.

  2. Member Cluster Synchronization: When HPA controllers work independently in member clusters, replica changes need to be synchronized back to the control plane to maintain a consistent global perspective and prevent workload disruption during cluster transitions.
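
The remainder problem in item 1 can be illustrated with a small standalone sketch. It is not Karmada's division code: `divideEqually`, `divideWithOffset`, and `offsetFor` are hypothetical helpers, and `TargetCluster` is a simplified stand-in with only a name and a replica count. The sketch assumes the remainder is handed out in list order, then shows one possible alternative that starts the remainder at a per-workload offset.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// TargetCluster is a simplified stand-in for a scheduling result entry.
type TargetCluster struct {
	Name     string
	Replicas int32
}

// divideEqually splits replicas across equally weighted clusters and gives
// the remainder to the first clusters in list order, so the same clusters
// receive the extra replicas on every call.
func divideEqually(replicas int32, clusters []string) []TargetCluster {
	return divideWithOffset(replicas, clusters, 0)
}

// divideWithOffset performs the same split, but starts handing out the
// remainder at position offset (hypothetical alternative strategy).
func divideWithOffset(replicas int32, clusters []string, offset int) []TargetCluster {
	n := len(clusters)
	if n == 0 {
		return nil
	}
	base := replicas / int32(n)
	rem := int(replicas % int32(n))
	out := make([]TargetCluster, 0, n)
	for i, name := range clusters {
		// Position of cluster i relative to the offset, wrapped into [0, n).
		pos := ((i-offset)%n + n) % n
		r := base
		if pos < rem {
			r++
		}
		out = append(out, TargetCluster{Name: name, Replicas: r})
	}
	return out
}

// offsetFor derives a stable offset from the workload name (an assumption
// made for this sketch; any deterministic per-workload value would do).
func offsetFor(workload string, n int) int {
	if n <= 0 {
		return 0
	}
	h := fnv.New32a()
	h.Write([]byte(workload))
	return int(h.Sum32() % uint32(n))
}

func main() {
	clusters := []string{"member1", "member2", "member3"}
	for _, w := range []string{"web", "api", "worker"} {
		fmt.Println(w,
			divideEqually(1, clusters),
			divideWithOffset(1, clusters, offsetFor(w, len(clusters))))
	}
}
```

With `replicas = 1`, `divideEqually` always places the single replica on `member1`, whereas the offset variant spreads different workloads across different clusters while staying deterministic for any one workload.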

This PR adds a new InterpretSchedulingResult operation to the ResourceInterpreter framework that allows customizing how replicas are distributed among clusters through:

  • Declarative Lua scripts for custom scheduling logic
  • Webhook-based customization for external scheduling services
  • Third-party resource support for extended scheduling capabilities
  • Full backward compatibility with existing behavior

Which issue(s) this PR fixes

Fixes #3459

Special notes for your reviewer

This implementation provides a flexible framework for custom scheduling algorithms while maintaining full backward compatibility. The feature is opt-in and only activates when explicitly configured.

Key architectural decisions:

  • Added new InterpreterOperationInterpretSchedulingResult operation type
  • Extended ResourceInterpreter interface with InterpretSchedulingResult() method
  • Implemented in all interpreter types (Default, Declarative, Webhook, Third-party)
  • Added comprehensive test coverage including unit and integration tests

Does this PR introduce a user-facing change?

`ResourceInterpreter`: Added new `InterpretSchedulingResult` operation to enable customizing replica distribution across clusters. This allows:

1. Custom scheduling logic for distributing remainders across clusters instead of always assigning them to the same cluster
2. Synchronization of replica changes from member clusters back to control plane when HPA controllers work independently

Users can now configure custom scheduling behavior through:
- Lua scripts in ResourceInterpreterCustomization
- Webhook services for external scheduling logic
- Third-party resource interpreters

The feature is backward compatible and only activates when explicitly configured.

Detailed Implementation

API Extensions

  • Added InterpreterOperationInterpretSchedulingResult operation type to configv1alpha1
  • Extended ResourceInterpreter interface with InterpretSchedulingResult(object, schedulingResult) ([]TargetCluster, error) method
  • Added SchedulingResultInterpretation type for declarative configurations
  • Extended webhook request/response attributes to include scheduling results
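
A minimal sketch of what the extended interface could look like, based only on the signature quoted above. `Object` is a stand-in for the unstructured resource object, `TargetCluster` is simplified, and `InterpreterFunc` and `dropEmpty` are hypothetical names introduced for illustration; none of them are taken from the PR's diff.

```go
package main

import "fmt"

// TargetCluster is a simplified stand-in for a scheduling result entry.
type TargetCluster struct {
	Name     string
	Replicas int32
}

// Object stands in for the resource object passed to the interpreter
// (an assumption; the real type is not shown in this PR description).
type Object map[string]any

// SchedulingResultInterpreter mirrors the method shape described above.
type SchedulingResultInterpreter interface {
	InterpretSchedulingResult(object Object, schedulingResult []TargetCluster) ([]TargetCluster, error)
}

// InterpreterFunc adapts a plain function to the interface (hypothetical helper).
type InterpreterFunc func(Object, []TargetCluster) ([]TargetCluster, error)

// InterpretSchedulingResult calls the wrapped function.
func (f InterpreterFunc) InterpretSchedulingResult(o Object, r []TargetCluster) ([]TargetCluster, error) {
	return f(o, r)
}

// dropEmpty is an illustrative customization: it removes clusters that were
// assigned zero replicas and rejects negative replica counts.
func dropEmpty(_ Object, result []TargetCluster) ([]TargetCluster, error) {
	out := make([]TargetCluster, 0, len(result))
	for _, c := range result {
		if c.Replicas < 0 {
			return nil, fmt.Errorf("cluster %q has negative replicas %d", c.Name, c.Replicas)
		}
		if c.Replicas > 0 {
			out = append(out, c)
		}
	}
	return out, nil
}

func main() {
	var interp SchedulingResultInterpreter = InterpreterFunc(dropEmpty)
	got, err := interp.InterpretSchedulingResult(
		Object{"kind": "Deployment"},
		[]TargetCluster{{Name: "member1", Replicas: 1}, {Name: "member2", Replicas: 0}},
	)
	fmt.Println(got, err)
}
```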

Interpreter Implementations

  • Default Interpreter: Returns scheduling result unchanged (maintains backward compatibility)
  • Declarative Interpreter: Supports Lua scripts for custom scheduling logic
  • Webhook Interpreter: Enables external services to customize scheduling results
  • Third-party Interpreter: Supports third-party resource scheduling customization
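
The opt-in behavior described above (customizations apply only when configured, otherwise the result passes through unchanged) might be sketched as a fallback chain. The `handled` return value, the `interpret` helper, and the `swapForStatefulSet` customization are assumptions made for this sketch, and the order in which the real interpreters are consulted is not stated in this PR description.

```go
package main

import "fmt"

// TargetCluster is a simplified stand-in for a scheduling result entry.
type TargetCluster struct {
	Name     string
	Replicas int32
}

// interpreter is a hypothetical function shape: handled reports whether a
// customization is configured for the given kind.
type interpreter func(kind string, result []TargetCluster) (out []TargetCluster, handled bool, err error)

// defaultInterpreter returns a copy of the scheduling result unchanged.
func defaultInterpreter(_ string, result []TargetCluster) ([]TargetCluster, bool, error) {
	return append([]TargetCluster(nil), result...), true, nil
}

// interpret tries each custom interpreter in order and falls back to the
// default pass-through when none of them handles the kind.
func interpret(kind string, result []TargetCluster, custom ...interpreter) ([]TargetCluster, error) {
	for _, in := range custom {
		out, handled, err := in(kind, result)
		if err != nil {
			return nil, err
		}
		if handled {
			return out, nil
		}
	}
	out, _, err := defaultInterpreter(kind, result)
	return out, err
}

// swapForStatefulSet is an illustrative customization that only applies to
// StatefulSets: it keeps the cluster order but reverses the replica counts.
func swapForStatefulSet(kind string, result []TargetCluster) ([]TargetCluster, bool, error) {
	if kind != "StatefulSet" {
		return nil, false, nil
	}
	out := append([]TargetCluster(nil), result...)
	for i, j := 0, len(out)-1; i < j; i, j = i+1, j-1 {
		out[i].Replicas, out[j].Replicas = out[j].Replicas, out[i].Replicas
	}
	return out, true, nil
}

func main() {
	result := []TargetCluster{{Name: "member1", Replicas: 2}, {Name: "member2", Replicas: 1}}
	for _, kind := range []string{"Deployment", "StatefulSet"} {
		out, err := interpret(kind, result, swapForStatefulSet)
		fmt.Println(kind, out, err)
	}
}
```

Because the default path copies its input, callers can modify the returned slice without affecting the scheduler's original result.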

Lua VM Integration

  • Added InterpretSchedulingResult() method to Lua VM for script execution
  • Implemented ConvertLuaResultToTargetClusters() for Lua-to-Go conversion
  • Full Lua script support for complex scheduling algorithms
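
One way to picture the Lua-to-Go conversion step is sketched below. It is not the PR's `ConvertLuaResultToTargetClusters()`: it assumes the table returned by the script has already been serialized to JSON, and `decodeTargetClusters`, the JSON field names, and the validation rules are all hypothetical choices for illustration. Only the Go standard library is used.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// TargetCluster is a simplified stand-in; the JSON field names are assumed.
type TargetCluster struct {
	Name     string `json:"name"`
	Replicas int32  `json:"replicas,omitempty"`
}

// decodeTargetClusters turns a JSON-encoded script result into Go values and
// rejects entries that a scheduler could not act on.
func decodeTargetClusters(raw []byte) ([]TargetCluster, error) {
	var clusters []TargetCluster
	if err := json.Unmarshal(raw, &clusters); err != nil {
		return nil, fmt.Errorf("decode script result: %w", err)
	}
	seen := make(map[string]bool, len(clusters))
	for _, c := range clusters {
		switch {
		case c.Name == "":
			return nil, errors.New("script result has an entry without a cluster name")
		case c.Replicas < 0:
			return nil, fmt.Errorf("cluster %q has negative replicas %d", c.Name, c.Replicas)
		case seen[c.Name]:
			return nil, fmt.Errorf("cluster %q appears more than once", c.Name)
		}
		seen[c.Name] = true
	}
	return clusters, nil
}

func main() {
	raw := []byte(`[{"name":"member1","replicas":2},{"name":"member2","replicas":1}]`)
	clusters, err := decodeTargetClusters(raw)
	fmt.Println(clusters, err)
}
```

Validating at the conversion boundary keeps a faulty script from producing a scheduling result with duplicate or nameless clusters.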

Add new InterpretSchedulingResult operation to enable customizing how replicas
are distributed among clusters. This addresses GitHub issue karmada-io#3459 by allowing:

1. Custom scheduling logic for distributing remainders across clusters
   instead of always assigning them to the same cluster

2. Synchronization of replica changes from member clusters back to control plane
   when HPA controllers work independently

Key changes:
- Add InterpreterOperationInterpretSchedulingResult operation type
- Extend ResourceInterpreter interface with InterpretSchedulingResult method
- Implement InterpretSchedulingResult in all interpreter types:
  * Default: returns result unchanged (backward compatible)
  * Declarative: supports Lua scripts for custom logic
  * Webhook: enables external scheduling services
  * Third-party: supports third-party resource scheduling
- Add SchedulingResultInterpretation type for declarative configs
- Extend webhook request/response attributes for scheduling results
- Add Lua VM support for scheduling result interpretation
- Comprehensive test coverage for all implementations

This provides a flexible framework for custom scheduling algorithms while
maintaining full backward compatibility.
@karmada-bot karmada-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 29, 2025
@karmada-bot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign rainbowmango for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@karmada-bot karmada-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 29, 2025
@XiShanYongYe-Chang
Member

Hi @ljluestc, thank you for participating in Karmada.

I would like to know, apart from the issue you mentioned, what is your user scenario?

Development

Successfully merging this pull request may close these issues.

Enable Interpreting Customized Scheduling Result in ResourceInterpreter as well
