@JinZhou5042 JinZhou5042 commented Nov 5, 2025

Proposed Changes

Modularize the logic for evicting a random worker and expose it as an API.

The upper-level executor can then choose the eviction policy: static (evictions at fixed time intervals) or dynamic (evictions driven by workflow completion progress).
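
For illustration only, the two policies could look roughly like this on the executor side. Every name here (`demo_policy`, `demo_release_state`, `demo_should_release`) is hypothetical and not part of TaskVine. The sketch only decides when to act; the call to the API exposed by this PR would happen wherever it returns true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy selector; not a TaskVine type. */
enum demo_policy { DEMO_POLICY_STATIC, DEMO_POLICY_DYNAMIC };

/* Hypothetical bookkeeping kept by the upper-level executor. */
struct demo_release_state {
	enum demo_policy policy;
	double interval_s;     /* static: seconds between evictions */
	double last_release_s; /* static: time of the previous eviction */
	double progress_step;  /* dynamic: completion fraction between evictions */
	double next_progress;  /* dynamic: next completion fraction that triggers one */
};

/* Return true when the chosen policy says a worker should be evicted now.
 * now_s is the current time in seconds; completed_fraction is in [0, 1]. */
bool demo_should_release(struct demo_release_state *s, double now_s, double completed_fraction)
{
	assert(s != NULL);

	if (s->policy == DEMO_POLICY_STATIC) {
		/* Static: fire once every interval_s seconds. */
		if (now_s - s->last_release_s >= s->interval_s) {
			s->last_release_s = now_s;
			return true;
		}
		return false;
	}

	/* Dynamic: fire each time progress crosses the next threshold. */
	if (completed_fraction >= s->next_progress) {
		s->next_progress += s->progress_step;
		return true;
	}
	return false;
}
```

Keeping the decision outside the manager means the manager only needs the single "evict one random worker" entry point, and the executor can swap policies without touching it.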

Merge Checklist

The following items must be completed before PRs can be merged.
Check these off to verify you have completed all steps.

  • make test Run local tests prior to pushing.
  • make format Format source code to comply with lint policies. Note that some lint errors can only be resolved manually (e.g., Python)
  • make lint Run lint on source code prior to pushing.
  • Manual Update: Update the manual to reflect user-visible changes.
  • Type Labels: Select a github label for the type: bugfix, enhancement, etc.
  • Product Labels: Select a github label for the product: TaskVine, Makeflow, etc.
  • PR RTM: Mark your PR as ready to merge.

@JinZhou5042 JinZhou5042 self-assigned this Nov 5, 2025
@JinZhou5042 JinZhou5042 requested review from btovar and dthain November 5, 2025 20:15
}

/** Evict a random worker to simulate a failure. */
int vine_manager_evict_a_random_worker(struct vine_manager *q)

Let's call it release_random_worker. Eviction means kicking a worker off a compute node.

struct list *candidates_list = list_create();
char *key;
struct vine_worker_info *w;
HASH_TABLE_ITERATE(q->worker_table, key, w)

The hash table supports iteration from a random starting point. You can use that instead of building a list of all the workers, which could be expensive.
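
A rough sketch of the idea, using a plain array as a stand-in for the worker table so that it is self-contained. The names `demo_worker`, `demo_find_from_offset`, and `demo_pick_random_worker` are hypothetical; the real code would use the hash table's own random-start iteration rather than this array scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a worker record; not struct vine_worker_info. */
struct demo_worker {
	const char *hostname;
	int is_releasable; /* nonzero if this worker is a valid candidate */
};

/* Scan the table once, beginning at index start and wrapping around.
 * Return the index of the first candidate found, or -1 if there is none.
 * No temporary candidate list is allocated. */
int demo_find_from_offset(const struct demo_worker *workers, size_t count, size_t start)
{
	if (count == 0) {
		return -1;
	}
	assert(start < count);

	for (size_t i = 0; i < count; i++) {
		size_t idx = (start + i) % count;
		if (workers[idx].is_releasable) {
			return (int)idx;
		}
	}
	return -1;
}

/* Pick a random starting offset, then take the first candidate after it. */
int demo_pick_random_worker(const struct demo_worker *workers, size_t count)
{
	if (count == 0) {
		return -1;
	}
	size_t start = (size_t)rand() % count;
	return demo_find_from_offset(workers, count, start);
}
```

One trade-off worth noting: when some workers are not candidates, this selection is not perfectly uniform, because a candidate that follows a run of non-candidates is chosen more often. For simulating failures that is likely acceptable, and it avoids allocating and filling a list on every call.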
