Skip to content

[Scheduler] backoffQ uses incorrect sorting function, potentially blocking ready-to-retry bindings #6986

@rayo1uo

Description

@rayo1uo

What happened:
I was reading source code of scheduling_queue.go and noticed that backoffQ is initialized used the Less function:

backoffQ: heap.NewWithRecorder(BindingKeyFunc, Less, metrics.NewBackoffBindingsRecorder())

The Less function (defined in types.go) sorts bindings based on Priority and Timestamp(added to scheduling queue):

func Less(bInfo1, bInfo2 *QueuedBindingInfo) bool {
	p1 := bInfo1.Priority
	p2 := bInfo2.Priority
	return (p1 > p2) || (p1 == p2 && bInfo1.Timestamp.Before(bInfo2.Timestamp))
}

However, backoffQ is designed to hold bindings that are waiting for a backoff period to expire. Bindings which have completed backoff are poped from backoffQ heap and will be moved to ActiveQ for scheduling.
The flushing logic in flushBackoffQCompleted relies on the heap property to efficiently find expired bindings:

func (bq *prioritySchedulingQueue) flushBackoffQCompleted() {
    // ...
    for {
        bInfo, ok := bq.backoffQ.Peek()
        // ...
        // If the head of the heap is still backing off, we assume subsequent elements are too
        // and break the loop.
        if bq.isBindingBackingoff(bInfo) {
            break
        }
        // ...
    }
}

What you expected to happen:

backoffQ should be sorted by Backoff Expiry Time. If backoffQ is sorted by priority, a high-priority binding with a long backoff duration could sit at the top of the heap. This would cause flushBackoffQCompleted to break early, blocking other lower-priority bindings that have already completed their backoff period from being moved to activeQ.

How to reproduce it (as minimally and precisely as possible):

We can assume a test case:

  1. Create a high-priority Binding A that fails scheduling and enters backoffQ with a long backoff duration (e.g., 10s).
  2. Create a low-priority Binding B that fails scheduling and enters backoffQ with a short backoff duration (e.g., 1s).
  3. Because A has higher priority, it will be at the top of backoffQ.
  4. Wait for 2 seconds. Binding B is ready to retry, but flushBackoffQCompleted checks A, sees it's not ready, and stops. Binding B is blocked until A is ready.

Anything else we need to know?:

We should implement a Less function for backoffQ that compares the backoff expiry time of two bindings.

Environment:

  • Karmada version: master branch
  • kubectl-karmada or karmadactl version (the result of kubectl-karmada version or karmadactl version):
  • Others:

Metadata

Metadata

Assignees

Labels

kind/bugCategorizes issue or PR as related to a bug.

Type

No type

Projects

Status

No status

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions