Killing time #8

agoscinski · 2025-05-07T14:14:49Z

No description provided.

When `uv` creates a virtual environment, it does not activate it automatically. This is not a problem, because the subsequent commands in `.readthedocs.yml`, `uv sync` and `uv run`, both pick up the created environment automatically. This commit clarifies the behavior for developers.

Co-authored-by: Daniel Hollas <[email protected]>

The killing process is convoluted because it is performed partly in `tasks.py:Waiting` and partly in `process.py:Process`. The architecture tried to split killing into two parts: one responsible for cancelling the job in the scheduler (`tasks.py:Waiting`), and one responsible for killing the process by transitioning it to the KILLED state. Here is a summary of these two steps.

Killing the plumpy process, in `calcjob/process:Process`:

- Event: `KillMessage` (sent through RabbitMQ by `verdi`)
- Call chain: `kill -> self.runner.controller.kill_process` (sends the message to kill)

Killing the scheduler job, in `calcjob/tasks:Waiting` (the task running the actual CalcJob):

- Events: `CalcJobMonitorAction.KILL` (through monitoring), `KillInterruption` (through `verdi`)
- Call chain: `execute -> _kill_job -> task_kill_job -> do_kill -> execmanager.kill_calculation`

In this PR I am moving most of the killing logic to the process to simplify the design. This is required to fix a bug that appears when two kill commands are sent. The first kill command sends the `KillInterruption` (from `process.py:Process`, with part of the logic in the parent class) to `tasks.py:Waiting`, which receives it and starts cancelling the scheduler job. Since the cancellation is only triggered in the `except` branch that handles the `KillInterruption`, it cannot be repeated when the user invokes a second kill command. This bug was introduced by PR TODO (the one that introduced force kill), because it also started to fix the timeout issue (`verdi process kill` ignoring the timeout).

Moving all killing logic to the process, as done in this PR, solves the problem: the cancellation of the scheduler job now lives in the process class, where it is re-invoked on each kill command. That class holds the function that is invoked when a worker receives a kill message through RMQ.

I have added very verbose comments for the review, which I will remove later. I must say the kill process does not seem well tested, as I did not have to adapt much in the tests. The tests in `test_work_chain.py` need some adaptation so that they can also kill a scheduler job in a dummy manner.
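To make the double-kill problem concrete, here is a minimal toy sketch of the two designs. It is not the aiida-core or plumpy code: `ToyScheduler`, `old_waiting_task`, `old_kill`, `run_old` and `NewProcess` are hypothetical names, the local `KillInterruption` class is only a stand-in, and the assumption that the first cancel attempt fails (for example because of a timeout) is added purely for illustration.

```python
import asyncio


class KillInterruption(Exception):
    """Local stand-in for the interruption delivered to the waiting task."""


class ToyScheduler:
    """Hypothetical scheduler whose first cancel attempt fails (assumed, e.g. a timeout)."""

    def __init__(self, failures_before_success=1):
        self.failures_left = failures_before_success
        self.cancel_attempts = 0
        self.job_cancelled = False

    def cancel_job(self):
        self.cancel_attempts += 1
        if self.failures_left > 0:
            self.failures_left -= 1
            return False
        self.job_cancelled = True
        return True


async def old_waiting_task(interrupt, scheduler):
    """Old layout: the job is cancelled only in the except branch of the task."""
    try:
        await interrupt
    except KillInterruption:
        # This branch runs at most once, so a failed cancel is never retried.
        scheduler.cancel_job()


def old_kill(interrupt):
    """Deliver the interruption; a future can only be interrupted once."""
    if not interrupt.done():
        interrupt.set_exception(KillInterruption())


async def run_old(scheduler):
    interrupt = asyncio.get_running_loop().create_future()
    task = asyncio.create_task(old_waiting_task(interrupt, scheduler))
    await asyncio.sleep(0)  # let the task reach its await
    old_kill(interrupt)  # first kill command
    old_kill(interrupt)  # second kill command: silently does nothing
    await task
    return scheduler.cancel_attempts, scheduler.job_cancelled


class NewProcess:
    """New layout: kill() itself cancels the job, so every kill command retries."""

    def __init__(self, scheduler):
        self.scheduler = scheduler
        self.state = "WAITING"

    def kill(self):
        if self.scheduler.cancel_job():
            self.state = "KILLED"
        return self.state == "KILLED"
```

In this toy model, `asyncio.run(run_old(ToyScheduler()))` makes a single cancel attempt even though two kill commands were sent, and the job stays alive. With `NewProcess`, the second `kill()` call retries the cancellation and the process reaches the KILLED state.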

agoscinski force-pushed the main branch from 2a128ef to e257b3c Compare May 7, 2025 14:15

agoscinski force-pushed the killing-time branch 2 times, most recently from f76c985 to 2e78d73 Compare May 7, 2025 21:39

agoscinski force-pushed the killing-time branch from 2e78d73 to 7cf9ccd Compare May 8, 2025 21:04

agoscinski force-pushed the killing-time branch from 7cf9ccd to 1269ac8 Compare May 8, 2025 21:15

[pre-commit.ci] auto fixes from pre-commit.com hooks

b1690d7

for more information, see https://pre-commit.ci
