**Walkthrough**

Adds asynchronous, array-capable SLURM submission and status logic to the executor: new `run_jobs` and `run_array_jobs` orchestration, thread-safe reporting, array-aware status lookup and settings, supporting helpers, tests, docs, and minor CI/package metadata updates. The public API gains array-related settings and two `Executor` methods.

**Changes**
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    participant Executor
    participant ThreadPool as "Submission Worker\n(ThreadPoolExecutor)"
    participant MainLoop as "Main Event Loop\n(asyncio)"
    participant SLURM as "SLURM (sbatch/sacct)"
    participant Job as "JobExecutorInterface"

    Executor->>MainLoop: prepare submission (run_jobs)
    alt group of jobs -> array
        Executor->>ThreadPool: submit run_array_jobs(tasks)
        ThreadPool->>SLURM: sbatch --array ... (create array)
        SLURM-->>ThreadPool: jobid
        ThreadPool->>MainLoop: _report_job_submission_threadsafe(job_info)
        MainLoop-->>Executor: record submitted tasks (external_jobid mapping)
    else single or grouped non-array
        Executor->>ThreadPool: submit run_job(job)
        ThreadPool->>SLURM: sbatch (single job)
        SLURM-->>ThreadPool: jobid
        ThreadPool->>MainLoop: _report_job_submission_threadsafe(job_info)
        MainLoop-->>Executor: record submitted job
    end
    MainLoop->>SLURM: query_job_status (sacct/squeue) for job/task ids
    SLURM-->>MainLoop: status info
    MainLoop->>Executor: resolve statuses (task-level vs parent-level)
    alt failure detected
        MainLoop->>Executor: _report_job_error_threadsafe(...)
    else success
        MainLoop->>Executor: report_job_success(...)
    end
```
**Estimated code review effort**: 🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (1)
snakemake_executor_plugin_slurm/__init__.py (1)
`111-124`: **Enhance error handling and logging**

The new method should follow the same error handling and logging patterns as the rest of the codebase. Consider adding:
- Error handling for empty job lists
- Logging for job grouping operations
- Consistent error propagation
Here's a suggested implementation:
```diff
 def run_jobs(self, jobs: List[JobExecutorInterface]):
+    if not jobs:
+        return
+
+    self.logger.debug(f"Grouping {len(jobs)} jobs by rule name")
+    try:
         for rule_name, group in groupby(jobs, key=lambda job: job.rule.name):
             same_rule_jobs = list(group)
+            self.logger.debug(
+                f"Processing {len(same_rule_jobs)} jobs for rule '{rule_name}'"
+            )
             if len(same_rule_jobs) == 1:
                 self.run_job(same_rule_jobs[0])
             else:
                 # Temporary implementation: submit jobs individually
                 for job in same_rule_jobs:
                     self.run_job(job)
+    except Exception as e:
+        raise WorkflowError(f"Failed to process job group: {str(e)}")
```
🧰 Tools
🪛 Ruff (0.8.0)
114-114: Using the generator returned from `itertools.groupby()` more than once will do nothing on the second usage (B031)
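As a side note on the Ruff hint: `itertools.groupby` only groups *consecutive* equal keys and hands out single-use sub-iterators, which is why callers typically sort first and materialize each group immediately (as `list(group)` does in the suggestion above). A small illustration:

```python
from itertools import groupby

jobs = ["align", "sort", "align", "sort", "align"]

# Without sorting, groupby splits equal-but-non-adjacent items into separate groups.
unsorted_groups = [(key, len(list(group))) for key, group in groupby(jobs)]
# Sorting first yields exactly one group per distinct key.
sorted_groups = [(key, len(list(group))) for key, group in groupby(sorted(jobs))]

print(unsorted_groups)  # [('align', 1), ('sort', 1), ('align', 1), ('sort', 1), ('align', 1)]
print(sorted_groups)    # [('align', 3), ('sort', 2)]
```

So grouping jobs by `job.rule.name` only collects all jobs of a rule into one group if the incoming job list is already ordered by rule name; otherwise the jobs should be sorted with the same key first.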
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (1)
`snakemake_executor_plugin_slurm/__init__.py` (2 hunks)
🧰 Additional context used
🪛 Ruff (0.8.0)
snakemake_executor_plugin_slurm/__init__.py
114-114: Using the generator returned from itertools.groupby() more than once will do nothing on the second usage
(B031)
🔇 Additional comments (1)
snakemake_executor_plugin_slurm/__init__.py (1)
8-8: LGTM!
The groupby import is correctly placed with other standard library imports and is required for the new job array functionality.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Plan: support aggregation of jobs in the Snakemake scheduler before they are sent to the executor. This can be done by supporting a new standard resource. This way we avoid the case where jobs drop into the executor in a too fine-grained way.
Actionable comments posted: 0
♻️ Duplicate comments (1)
snakemake_executor_plugin_slurm/__init__.py (1)
`177-185`: ⚠️ Potential issue: **Implement temporary handling for multi-job groups.**
The current implementation uses an ellipsis (...) as a placeholder, which will raise a NotImplementedError. Until the array job submission is implemented, we should handle these jobs individually.
Apply this improvement to handle multi-job groups temporarily:
```diff
 else:
     # TODO submit as array
     # share code with run_job
-
-    # TODO in the future: give a hint to the scheduler to select preferably
-    # many jobs from the same rule if possible, in order to have
-    # more efficient array jobs. This should be somehow tunable, because
-    # it might contradict other efficiency goals.
-    ...
+    # Temporary implementation: submit jobs individually until array support is added
+    for job in same_rule_jobs:
+        self.run_job(job)
```
🧹 Nitpick comments (2)
snakemake_executor_plugin_slurm/__init__.py (2)
`172-172`: **Add docstring to document the method's purpose and parameters.**

The method lacks documentation explaining its purpose and parameters.
Apply this improvement:
```diff
 def run_jobs(self, jobs: List[JobExecutorInterface]):
+    """Process and submit multiple jobs, potentially as job arrays.
+
+    Args:
+        jobs: List of jobs to be submitted for execution.
+    """
```
`173-173`: **Fix unused loop variable.**

The loop control variable `rule_name` is not used within the loop body. Apply this improvement:

```diff
-for rule_name, group in groupby(jobs, key=lambda job: job.rule.name):
+for _rule_name, group in groupby(jobs, key=lambda job: job.rule.name):
```
🧰 Tools
🪛 Ruff (0.8.2)
173-173: Loop control variable `rule_name` not used within loop body. Rename unused `rule_name` to `_rule_name` (B007)
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
`snakemake_executor_plugin_slurm/__init__.py` (2 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
snakemake_executor_plugin_slurm/__init__.py
173-173: Loop control variable rule_name not used within loop body
Rename unused rule_name to _rule_name
(B007)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: testing
🔇 Additional comments (1)
snakemake_executor_plugin_slurm/__init__.py (1)
`9-9`: **LGTM!**

The `groupby` import is correctly placed with other standard library imports and is necessary for the new functionality.
Actionable comments posted: 1
♻️ Duplicate comments (1)
snakemake_executor_plugin_slurm/__init__.py (1)
`179-185`: ⚠️ Potential issue: **Implement placeholder for array jobs.**
The current implementation uses an ellipsis (...) as a placeholder, which will raise a NotImplementedError. Until the array job submission is implemented, we should handle these jobs individually.
```diff
 else:
     # TODO submit as array
     # share code with run_job
     # TODO in the future: give a hint to the scheduler to select preferably
     # many jobs from the same rule if possible, in order to have
     # more efficient array jobs. This should be somehow tunable, because
     # it might contradict other efficiency goals.
-    ...
+    # Temporary implementation: submit jobs individually until array support is added
+    for job in same_rule_jobs:
+        self.run_job(job)
```
🧹 Nitpick comments (1)
snakemake_executor_plugin_slurm/__init__.py (1)
`173-173`: **Rename unused variable `rule_name` to `_rule_name`**

The variable `rule_name` is defined in the loop statement but never used within the loop body. To follow Python conventions, prefix unused variables with an underscore.

```diff
 def run_jobs(self, jobs: List[JobExecutorInterface]):
-    for rule_name, group in groupby(jobs, key=lambda job: job.rule.name):
+    for _rule_name, group in groupby(jobs, key=lambda job: job.rule.name):
```
🧰 Tools
🪛 Ruff (0.8.2)
173-173: Loop control variable `rule_name` not used within loop body. Rename unused `rule_name` to `_rule_name` (B007)
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
`snakemake_executor_plugin_slurm/__init__.py` (2 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
snakemake_executor_plugin_slurm/__init__.py
173-173: Loop control variable rule_name not used within loop body
Rename unused rule_name to _rule_name
(B007)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: testing
This PR was marked as stale because it has been open for 6 months with no activity. |
…f they are not single jobs
@cademirch & @fbartusch final testing is not really possible yet, as the dependency can only be set once the jobstep plugin release is ready.
@cmeesters Works on my cluster. |
❌ Poetry Test Failures
On my laptop I get one failing test case. I exclude tests.py, as those tests need a SLURM cluster.
Click to expand error log
```text
(base) fbartusch@dell:~/github/snakemake-executor-plugin-slurm$ poetry run pytest --ignore=tests/tests.py
============================= test session starts =============================
platform linux -- Python 3.12.2, pytest-8.4.2, pluggy-1.6.0
rootdir: /home/fbartusch/github/snakemake-executor-plugin-slurm
configfile: pyproject.toml
collected 117 items

tests/test_array_jobs.py .................................           [ 28%]
tests/test_array_status_lookup.py ...                                [ 30%]
tests/test_cli.py F.                                                 [ 32%]
tests/test_parsing.py ...........                                    [ 41%]
tests/test_partition_selection.py ...............................    [ 68%]
tests/test_scontrol_parsing.py .....                                 [ 72%]
tests/test_time_conversion.py ................................       [100%]

================================== FAILURES ===================================
_______________________ test_jobname_prefix_applied ___________________________

    def test_jobname_prefix_applied():
        executor = _make_executor("testprefix")
        with patch.object(Executor, "warn_on_jobcontext", return_value=None):
            with patch(
                "snakemake_executor_plugin_slurm.uuid.uuid4",
                return_value=uuid.UUID("00000000-0000-0000-0000-000000000000"),
            ):
>               executor.__post_init__(test_mode=True)

tests/test_cli.py:42:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

snakemake_executor_plugin_slurm/__init__.py:483: in __post_init__
    get_max_array_size(), int(self.workflow.executor_settings.array_limit)
    ^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def get_max_array_size() -> int:
        """
        Function to get the maximum array size for SLURM job arrays. This is used
        to determine how many jobs can be submitted in a single array job.

        Returns:
            The maximum array size for SLURM job arrays, as an integer.
            Defaults to 1000 if the SLURM_ARRAY_MAX environment variable is not set
            or cannot be parsed as an integer.
        """
        max_array_size_str = None
        scontrol_cmd = "scontrol show config"
        try:
            res = subprocess.run(
                shlex.split(scontrol_cmd),
                capture_output=True,
                text=True,
                timeout=5,
            )
            out = (res.stdout or "") + (res.stderr or "")
            m = re.search(r"MaxArraySize\s*=?\s*(\d+)", out, re.IGNORECASE)
            if m:
                max_array_size_str = m.group(1)
        except (subprocess.SubprocessError, OSError):
            max_array_size_str = None
        try:
>           max_array_size = int(max_array_size_str)

E           TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'

snakemake_executor_plugin_slurm/utils.py:46: TypeError
=========================== short test summary info ===========================
FAILED tests/test_cli.py::test_jobname_prefix_applied - TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
======================== 1 failed, 116 passed in 0.93s ========================
```
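The failure stems from `int(None)`: when `scontrol` is absent (as on a laptop without SLURM), `max_array_size_str` stays `None` and the final `int()` call raises. A defensive variant that actually falls back to the documented default of 1000 might look like the following — a sketch based on the traceback above, not the merged fix:

```python
import re
import shlex
import subprocess


def get_max_array_size(default: int = 1000) -> int:
    """Return SLURM's MaxArraySize, or `default` when scontrol is
    unavailable or its output cannot be parsed."""
    try:
        res = subprocess.run(
            shlex.split("scontrol show config"),
            capture_output=True,
            text=True,
            timeout=5,
        )
        out = (res.stdout or "") + (res.stderr or "")
        m = re.search(r"MaxArraySize\s*=?\s*(\d+)", out, re.IGNORECASE)
        if m:
            return int(m.group(1))
    except (subprocess.SubprocessError, OSError):
        pass  # no SLURM available: fall through to the default
    return default
```

Returning inside the `try` (or guarding `int()` with an `is not None` check) avoids ever calling `int(None)`, so the test passes on machines without SLURM while real clusters still report their configured limit.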
fbartusch left a comment:
With the one test fixed, and with it running on @cmeesters's SLURM cluster and mine, the PR is ready for approval.
This is the corresponding PR to snakemake/snakemake-executor-plugin-slurm#174 - handling selection and decompression of job information for SLURM array jobs.

## Summary by CodeRabbit

* **New Features**
  * Enhanced SLURM array-job support: runtime detection of array task index, optional per-task command handling (supports hex-encoded compressed per-task commands), configurable array-exec option, and clearer parsing/error messages.
* **Tests**
  * Added unit tests for array-exec parsing covering JSON, Python-literal, compact-mapping formats, and invalid inputs.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
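The per-task command handling described above (hex-encoded, compressed commands selected by the array task index) can be sketched as follows. `SLURM_ARRAY_TASK_ID` is the standard variable SLURM sets inside an array task; the newline-joined, zlib-over-hex payload here is an assumed encoding for illustration, not necessarily the plugin's actual wire format:

```python
import os
import zlib


def encode_commands(commands: list[str]) -> str:
    # Pack the per-task commands into one hex-encoded, compressed payload
    # that can be passed safely through a single environment variable.
    payload = "\n".join(commands).encode()
    return zlib.compress(payload).hex()


def select_command(encoded: str, task_id: int) -> str:
    # Decompress the payload and pick the command for this array task.
    commands = zlib.decompress(bytes.fromhex(encoded)).decode().split("\n")
    return commands[task_id]


encoded = encode_commands(["echo task0", "echo task1", "echo task2"])
# Inside a running array task, SLURM sets SLURM_ARRAY_TASK_ID.
task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "1"))
print(select_command(encoded, task_id))  # prints the command for this task
```

The submitting side would call something like `encode_commands` once per array, and each task would recover its own command at runtime from its index — which is what "runtime detection of array task index" in the release notes refers to.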
…te 'testcases' - will become important for similar test cases, too
…tor-plugin-slurm into feat/jobarrays
…plementation could not handle non-local jobs
…e - otherwise too verbose for non array job
…plus to accomodate for the extra info payload
🤖 I have created a release *beep* *boop*

## [2.6.0](v2.5.4...v2.6.0) (2026-03-26)

### Features

* job arrays ([#174](#174)) ([a51a5bf](a51a5bf))
* Set tmpspace with help of gres="tmpspace:10G" syntax ([#444](#444)) ([c6f6658](c6f6658))

### Bug Fixes

* logo ([#439](#439)) ([c4e3ec3](c4e3ec3))

### Documentation

* add admin documentation ([#436](#436)) ([8f7491f](8f7491f))

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

## Summary by CodeRabbit

* **New Features**
  * Added support for job arrays
  * Added tmpspace configuration capability via `gres="tmpspace:10G"`
* **Bug Fixes**
  * Fixed logo display issue
* **Documentation**
  * Added admin documentation
Summary by CodeRabbit