Build pods with the same build ID may be launched twice. #3259

@sakka2

What happened:
A build pod with the same build ID may be launched twice under the following conditions.

Prerequisite:
Assume multiple buildcluster-queue-workers (buildcluster-queue-worker-A, buildcluster-queue-worker-B) are running, and a message with buildId: 12345 is queued in RabbitMQ.

  1. buildcluster-queue-worker-A receives the build message (buildId: 12345) from RabbitMQ and starts a build pod (podname: 12345-abcd).
  2. Before buildcluster-queue-worker-A can return the Ack to RabbitMQ, communication between buildcluster-queue-worker-A and RabbitMQ fails for some reason, so the Ack is never delivered.
  3. Because the message was never acknowledged, RabbitMQ requeues the message for the build (buildId: 12345).
  4. buildcluster-queue-worker-B receives the requeued message and starts another build pod (podname: 12345-efgh). As a result, two build pods with the same build ID are active.
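
The sequence above can be sketched as a small in-memory simulation. This is only an illustration of the at-least-once redelivery described in the steps, not the actual buildcluster-queue-worker code: `Message`, `Cluster`, `handle`, and `simulate` are hypothetical names, and the broker and the build cluster are replaced by plain Python objects.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    """Hypothetical stand-in for a queued build message."""
    build_id: int
    acked: bool = False


@dataclass
class Cluster:
    """Hypothetical in-memory stand-in for the build cluster."""
    pods: List[str] = field(default_factory=list)

    def start_pod(self, build_id: int) -> str:
        # Pod names follow the "<buildId>-<suffix>" pattern from the report.
        name = f"{build_id}-{uuid.uuid4().hex[:4]}"
        self.pods.append(name)
        return name


def handle(cluster: Cluster, msg: Message, ack_succeeds: bool) -> None:
    """A worker starts the pod first and only then tries to ack."""
    cluster.start_pod(msg.build_id)
    if ack_succeeds:
        msg.acked = True


def simulate() -> List[str]:
    cluster = Cluster()
    msg = Message(build_id=12345)
    # Steps 1-2: worker A starts a pod, but its Ack never reaches the broker.
    handle(cluster, msg, ack_succeeds=False)
    # Steps 3-4: the unacked message is requeued and worker B handles it.
    if not msg.acked:
        handle(cluster, msg, ack_succeeds=True)
    return cluster.pods


if __name__ == "__main__":
    print(simulate())  # two pod names sharing the 12345- prefix
```

Because the pod is started before the Ack is confirmed, nothing in this flow tells worker B that a pod for the same build already exists.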

What you expected to happen:
If two build pods are started for the same build ID, the one started last should be stopped.
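
One possible way to express this expectation is a reconciliation step that, for each build ID, keeps the earliest pod and selects every later one for stopping. The sketch below is an assumption about how that could look, not the project's implementation: the `Pod` record, its `started_at` field, and `pods_to_stop` are hypothetical, and it reads "the last build process" as the pod with the later start time.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Pod:
    """Hypothetical record of a running build pod."""
    name: str
    build_id: int
    started_at: float  # assumed start timestamp, in seconds


def pods_to_stop(pods: List[Pod]) -> List[Pod]:
    """Return every pod that is not the earliest one for its build ID."""
    earliest: Dict[int, Pod] = {}
    for pod in pods:
        current = earliest.get(pod.build_id)
        if current is None or pod.started_at < current.started_at:
            earliest[pod.build_id] = pod
    # Anything that is not the earliest pod of its build is a duplicate.
    return [pod for pod in pods if earliest[pod.build_id] is not pod]


if __name__ == "__main__":
    running = [
        Pod("12345-abcd", 12345, 100.0),
        Pod("12345-efgh", 12345, 105.0),
    ]
    for pod in pods_to_stop(running):
        print(pod.name)
```

With the two pods from the scenario, only 12345-efgh is selected, so the build that started first continues undisturbed.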

How to reproduce it:
This occurs only in limited situations, but it can happen when multiple buildcluster-queue-workers are running and only some of them have an unstable connection to RabbitMQ.
