[Bug]: Instance added to terminating fleet #3449

@jvstme

Description

Steps to reproduce

  1. Apply the fleet configuration:

    $ cat fleet.dstack.yml
    type: fleet
    name: test
    nodes: 1..
    
    $ dstack apply -f fleet.dstack.yml -y
  2. Once an instance is added to the fleet in provisioning status, delete the fleet.

    $ dstack delete -f fleet.dstack.yml -y
  3. Immediately after that, without waiting for the fleet to be fully deleted, submit a run to that fleet.

    $ cat run.dstack.yml
    type: dev-environment
    ide: vscode
    
    $ dstack apply -f run.dstack.yml --fleet test -y

Actual behaviour

If the timing is right, the run adds a new instance to the fleet even though the fleet is in the terminating status. The fleet then finishes terminating and disappears from the dstack fleet list, but the instance remains in the cloud and cannot be terminated from dstack. The run keeps running on that instance.

$ dstack fleet
 FLEET  INSTANCE  BACKEND  RESOURCES  PRICE  STATUS  CREATED     

$ dstack ps
 NAME             BACKEND          GPU  PRICE           STATUS   SUBMITTED  
 nervous-liger-1  aws (eu-west-3)  -    $0.0297 (spot)  running  9 mins ago

$ dstack event --since 10m
[2026-01-07 09:15:24] [👤admin] [fleet test] Fleet status changed ACTIVE -> TERMINATING
[2026-01-07 09:15:31] [👤admin] [run nervous-liger-1] Run submitted. Status: SUBMITTED
[2026-01-07 09:15:31] [job nervous-liger-1-0-0] Job created on run submission. Status: SUBMITTED
[2026-01-07 09:15:51] [fleet test] Fleet status changed TERMINATING -> TERMINATED
[2026-01-07 09:15:58] [job nervous-liger-1-0-0] Job status changed SUBMITTED -> PROVISIONING
[2026-01-07 09:15:58] [instance test-0, job nervous-liger-1-0-0] Instance created for job. Instance status: PROVISIONING (1/1 blocks busy)
[2026-01-07 09:15:59] [run nervous-liger-1] Run status changed SUBMITTED -> PROVISIONING
[2026-01-07 09:16:36] [job nervous-liger-1-0-0] Job status changed PROVISIONING -> PULLING
[2026-01-07 09:17:21] [job nervous-liger-1-0-0] Job status changed PULLING -> RUNNING
[2026-01-07 09:17:23] [run nervous-liger-1] Run status changed PROVISIONING -> RUNNING
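
The ordering problem in the event log above can be checked mechanically. The following is a minimal sketch, not part of dstack: it assumes the log format shown in this report, matches on the literal event messages, and assumes the log covers a single fleet. The helper names `parse_ts` and `late_instances` are hypothetical.

```python
import re
from datetime import datetime

# Three lines copied from the `dstack event` output in this report.
EVENTS = """\
[2026-01-07 09:15:24] [👤admin] [fleet test] Fleet status changed ACTIVE -> TERMINATING
[2026-01-07 09:15:51] [fleet test] Fleet status changed TERMINATING -> TERMINATED
[2026-01-07 09:15:58] [instance test-0, job nervous-liger-1-0-0] Instance created for job. Instance status: PROVISIONING (1/1 blocks busy)
"""

TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")


def parse_ts(line):
    """Return the leading timestamp of an event line, or None if absent."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")


def late_instances(log):
    """List (line, seconds) for instances created after termination began.

    Assumption: the log describes one fleet, so fleet names are not compared.
    """
    terminating_at = None
    late = []
    for line in log.splitlines():
        ts = parse_ts(line)
        if ts is None:
            continue
        if "Fleet status changed ACTIVE -> TERMINATING" in line:
            terminating_at = ts
        elif "Instance created for job" in line and terminating_at is not None:
            late.append((line, (ts - terminating_at).total_seconds()))
    return late


for line, delay in late_instances(EVENTS):
    print(f"{delay:.0f}s after termination started: {line}")
```

With the events from this report, the sketch flags instance test-0 as created 34 seconds after the fleet entered TERMINATING.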

Expected behaviour

The run fails, and no new instances are added to the terminating fleet.
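
The expected behaviour can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not dstack's implementation: `Fleet`, `FleetStatus`, `FleetNotActiveError`, `add_instance`, and `start_termination` are hypothetical names. The point it shows is that the status check and the instance insertion happen under the same lock, so a concurrent delete cannot slip in between them.

```python
import threading
from dataclasses import dataclass, field
from enum import Enum


class FleetStatus(Enum):
    ACTIVE = "active"
    TERMINATING = "terminating"
    TERMINATED = "terminated"


class FleetNotActiveError(Exception):
    """Raised when an instance is requested for a fleet that is not active."""


@dataclass
class Fleet:
    # Hypothetical in-memory stand-in for a fleet record.
    name: str
    status: FleetStatus = FleetStatus.ACTIVE
    instances: list = field(default_factory=list)
    lock: threading.Lock = field(default_factory=threading.Lock, repr=False)


def add_instance(fleet, instance_name):
    """Add an instance only while the fleet is ACTIVE.

    The check and the append share one critical section, so the status
    cannot change between them.
    """
    with fleet.lock:
        if fleet.status is not FleetStatus.ACTIVE:
            raise FleetNotActiveError(
                f"fleet {fleet.name} is {fleet.status.value}; not adding {instance_name}"
            )
        fleet.instances.append(instance_name)


def start_termination(fleet):
    """Mark the fleet as terminating under the same lock."""
    with fleet.lock:
        fleet.status = FleetStatus.TERMINATING


fleet = Fleet(name="test")
add_instance(fleet, "test-0")   # accepted: the fleet is active
start_termination(fleet)
try:
    add_instance(fleet, "test-1")  # rejected: the fleet is terminating
except FleetNotActiveError as exc:
    print(f"run fails: {exc}")
```

In a real server the equivalent guard would more likely be a database row lock or a conditional update in the same transaction that creates the instance. The threading lock here only stands in for that.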

dstack version

0.20.2

Labels

bug (Something isn't working)
