yascheduler: Node deallocation failure leads to zombie servers #151
Open
Description
yascheduler fails to properly terminate cloud nodes (Hetzner) after task completion.
hcloud server list output:
ID NAME STATUS IPV4 IPV6 PRIVATE NET LOCATION AGE
125115508 node-mhequwrf running 135.181.90.247 2a01:4f9:c011:b3ef::/64 - hel1 1d
125148160 node-njtzvnal running 65.109.224.173 2a01:4f9:c013:9b4f::/64 - hel1 1d
125148184 node-weaciubj running 204.168.204.51 2a01:4f9:c010:a192::/64 - hel1 1d
yanodes output:
ip=65.109.224.173 ncpus=MAX enabled=True occupied_by=aiida-284095 (task_id=6428) hetzner
ip=204.168.204.51 ncpus=MAX enabled=True occupied_by=aiida-284079 (task_id=6427) hetzner
yascheduler.log:
Details
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup fleur engine...
INFO:yascheduler.Scheduler:Disconnecting from machines: 135.181.90.247
DEBUG:yascheduler.Scheduler.RemoteMachine:root@135.181.90.247:Close connection
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner:DELETED 135.181.90.247
WARNING:yascheduler.Scheduler.CloudAPIManager.hetzner:Setup node 135.181.90.247 failed - deallocate
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner:NODE 135.181.90.247 NOT DELETED AS UNKNOWN
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner:CREATED 135.181.90.247
DEBUG:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Open connection
DEBUG:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Open connection
DEBUG:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Open connection
INFO:backoff:Backing off create(...) for 1.9s (ConnectionRefusedError: [Errno 111] Connect call failed ('135.181.90.247', 22))
DEBUG:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Open connection
INFO:backoff:Backing off create(...) for 1.4s (ConnectionRefusedError: [Errno 111] Connect call failed ('135.181.90.247', 22))
DEBUG:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Open connection
DEBUG:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Detected platform: linux
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:CPUs count: 32
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup fleur engine...
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup of fleur engine is done...
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup pcrystal engine...
DEBUG:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Uploading files (/data/engines/pcrystal/Pcrystal) to data/engines/pcrystal
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup of pcrystal engine is done...
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup pproperties engine...
DEBUG:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Uploading files (/data/engines/pproperties/Pproperties) to data/engines/pproperties
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup fleur engine...
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup of fleur engine is done...
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup pcrystal engine...
DEBUG:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Uploading files (/data/engines/pcrystal/Pcrystal) to data/engines/pcrystal
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup of pcrystal engine is done...
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup pproperties engine...
DEBUG:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Uploading files (/data/engines/pproperties/Pproperties) to data/engines/pproperties
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup fleur engine...
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup of fleur engine is done...
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup pcrystal engine...
DEBUG:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Uploading files (/data/engines/pcrystal/Pcrystal) to data/engines/pcrystal
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup of pcrystal engine is done...
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup pproperties engine...
DEBUG:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Uploading files (/data/engines/pproperties/Pproperties) to data/engines/pproperties
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup fleur engine...
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup of fleur engine is done...
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup pcrystal engine...
DEBUG:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Uploading files (/data/engines/pcrystal/Pcrystal) to data/engines/pcrystal
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup of pcrystal engine is done...
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup pproperties engine...
DEBUG:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Uploading files (/data/engines/pproperties/Pproperties) to data/engines/pproperties
INFO:yascheduler.Scheduler.CloudAPIManager.hetzner.RemoteMachine:root@135.181.90.247:Setup fleur engine...
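Comparing the two listings above, 135.181.90.247 (node-mhequwrf) is still running in Hetzner but is absent from the yanodes output, which matches the DELETED / NOT DELETED AS UNKNOWN sequence in the log. Below is a minimal sketch for spotting such orphaned servers by diffing the two outputs. It assumes the plain-text column layout pasted in this issue (IPv4 in the fourth column of `hcloud server list`, `ip=` tokens in yanodes). The helper names are hypothetical and not part of yascheduler or hcloud.

```python
# Sample inputs copied from this issue; in practice these would be the
# captured stdout of `hcloud server list` and `yanodes`.
HCLOUD_LIST = """\
ID NAME STATUS IPV4 IPV6 PRIVATE NET LOCATION AGE
125115508 node-mhequwrf running 135.181.90.247 2a01:4f9:c011:b3ef::/64 - hel1 1d
125148160 node-njtzvnal running 65.109.224.173 2a01:4f9:c013:9b4f::/64 - hel1 1d
125148184 node-weaciubj running 204.168.204.51 2a01:4f9:c010:a192::/64 - hel1 1d
"""

YANODES = """\
ip=65.109.224.173 ncpus=MAX enabled=True occupied_by=aiida-284095 (task_id=6428) hetzner
ip=204.168.204.51 ncpus=MAX enabled=True occupied_by=aiida-284079 (task_id=6427) hetzner
"""


def hcloud_ips(text):
    """Map IPv4 -> server name, assuming IPv4 is the 4th whitespace-separated column."""
    servers = {}
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 4:
            servers[fields[3]] = fields[1]
    return servers


def yanodes_ips(text):
    """Collect the addresses from the `ip=` tokens of the yanodes output."""
    ips = set()
    for line in text.splitlines():
        for token in line.split():
            if token.startswith("ip="):
                ips.add(token[len("ip="):])
    return ips


def find_orphans(hcloud_text, yanodes_text):
    """Return servers that exist in the cloud but are unknown to the scheduler."""
    known = yanodes_ips(yanodes_text)
    return {ip: name for ip, name in hcloud_ips(hcloud_text).items() if ip not in known}


orphans = find_orphans(HCLOUD_LIST, YANODES)
print(orphans)  # {'135.181.90.247': 'node-mhequwrf'}
```

A check like this only reports the mismatch; it does not explain why the scheduler lost track of the node.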