Commit bf4eb67

Add comprehensive documentation for Slurm scheduler support
Co-authored-by: transientlunatic <4365778+transientlunatic@users.noreply.github.com>
1 parent 51fbdd0 commit bf4eb67

2 files changed, 128 additions and 23 deletions

docs/source/api/schedulers.rst

Lines changed: 65 additions & 2 deletions
@@ -1,10 +1,73 @@
 The Schedulers module
 =====================
 
-This module contains the logic for interacting with schedulers, for example, ``HTCondor``.
+This module contains the logic for interacting with schedulers, for example, ``HTCondor`` and ``Slurm``.
 
 The scheduler module provides a unified interface for submitting and managing jobs across different scheduling systems.
-Currently supported schedulers include HTCondor, with Slurm support planned for the future.
+Currently supported schedulers include:
+
+* **HTCondor**: The High-Throughput Computing (HTC) workload manager commonly used in scientific computing
+* **Slurm**: The Simple Linux Utility for Resource Management, widely used in HPC clusters
+
+Configuration
+-------------
+
+Asimov can automatically detect which scheduler is available on your system during project initialization.
+You can also manually configure the scheduler type in your ``.asimov/asimov.conf`` file:
+
+For HTCondor::
+
+    [scheduler]
+    type = htcondor
+
+    [condor]
+    user = your_username
+    scheduler = optional_schedd_name
+
+For Slurm::
+
+    [scheduler]
+    type = slurm
+
+    [slurm]
+    user = your_username
+    partition = optional_partition_name
+
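Review note on the Configuration section: the added text says the scheduler is detected at project initialization but does not say how. One plausible approach, sketched below as an assumption and not as asimov's actual logic, is to look for each scheduler's submit command on ``PATH``. The function name ``detect_scheduler`` and the preference order are hypothetical; ``condor_submit`` and ``sbatch`` are the standard submit commands for HTCondor and Slurm.

```python
import shutil
from typing import Callable, Optional

# Standard submission executables for each scheduler.
# The order (HTCondor first) is an assumption made for this sketch.
CANDIDATES = (("htcondor", "condor_submit"), ("slurm", "sbatch"))


def detect_scheduler(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first scheduler type whose submit command is on PATH.

    ``which`` is injectable so the lookup can be exercised without
    either scheduler being installed.
    """
    for scheduler_type, executable in CANDIDATES:
        if which(executable) is not None:
            return scheduler_type
    return None
```

The returned string matches the values used for ``type`` in the ``[scheduler]`` section, so a caller could write it straight into the configuration file. Returning ``None`` lets the caller decide whether to fall back to a default or ask the user.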
+Using the Scheduler API
+------------------------
+
+Pipelines can use the scheduler API through the ``self.scheduler`` property, which provides
+a scheduler-agnostic interface for job submission and management.
+
+Example::
+
+    # Submit a DAG file
+    cluster_id = self.scheduler.submit_dag(
+        dag_file="/path/to/dag.dag",
+        batch_name="my-analysis"
+    )
+
+    # Query job status
+    jobs = self.scheduler.query_all_jobs()
+
+    # Delete a job
+    self.scheduler.delete(cluster_id)
+
+DAG File Translation
+--------------------
+
+When using Slurm, HTCondor DAG (Directed Acyclic Graph) files are automatically converted
+to equivalent Slurm batch scripts with job dependencies. This allows pipelines that generate
+HTCondor DAGs (such as bilby, bayeswave, and lalinference) to work seamlessly with Slurm.
+
+The conversion handles:
+
+* Job dependencies (PARENT-CHILD relationships)
+* Job submission files
+* Batch names and job metadata
+
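Review note on the DAG File Translation section: a short illustration of the dependency mapping might help readers here. The sketch below is not the converter added in this commit. It only shows the general idea under stated assumptions: read ``JOB`` and ``PARENT ... CHILD ...`` lines from a DAG file, order the jobs so parents come first, and express each job's parents as an ``sbatch --dependency=afterok:...`` option. All function names are hypothetical, and other DAG keywords (``VARS``, ``RETRY``, ``SCRIPT`` and so on) are ignored.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def parse_dag(text):
    """Return ({job: submit_file}, {job: set_of_parent_jobs}) from DAG text."""
    jobs, parents_of = {}, {}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0].startswith("#"):
            continue  # skip blank lines and comments
        keyword = tokens[0].upper()
        if keyword == "JOB" and len(tokens) >= 3:
            jobs[tokens[1]] = tokens[2]
            parents_of.setdefault(tokens[1], set())
        elif keyword == "PARENT":
            upper = [token.upper() for token in tokens]
            if "CHILD" not in upper:
                continue  # malformed line; a real converter should report it
            split = upper.index("CHILD")
            parents, children = tokens[1:split], tokens[split + 1:]
            for child in children:
                parents_of.setdefault(child, set()).update(parents)
    return jobs, parents_of


def submission_order(parents_of):
    """Order jobs so that every parent is submitted before its children."""
    return list(TopologicalSorter(parents_of).static_order())


def dependency_flag(parent_job_ids):
    """Build the sbatch option that waits for all parents to succeed."""
    if not parent_job_ids:
        return ""
    return "--dependency=afterok:" + ":".join(str(i) for i in parent_job_ids)
```

A converter built this way would submit jobs in ``submission_order``, record the Slurm job ID returned for each one, and pass the parents' IDs to ``dependency_flag`` when submitting a child. ``afterok`` starts the child only if every parent exits successfully, which is the closest match to the default DAGMan behaviour.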
+API Reference
+-------------
 
 .. automodule:: asimov.scheduler
    :members:

docs/source/scheduler-integration.rst

Lines changed: 63 additions & 21 deletions
@@ -6,9 +6,13 @@ This guide explains how to use the scheduler abstraction in asimov pipelines and
 Overview
 --------
 
-Asimov now includes a scheduler abstraction layer that provides a uniform interface for
-interacting with different job schedulers (HTCondor, Slurm, etc.). This reduces code
-duplication and makes it easier to switch between schedulers.
+Asimov includes a scheduler abstraction layer that provides a uniform interface for
+interacting with different job schedulers. Currently supported schedulers:
+
+* **HTCondor**: The High-Throughput Computing (HTC) workload manager
+* **Slurm**: The Simple Linux Utility for Resource Management
+
+This abstraction reduces code duplication and makes it easy to switch between schedulers.
 
 Using the Scheduler in Pipelines
 ---------------------------------
@@ -47,13 +51,16 @@ DAG Submission
 
 DAG submission (via ``submit_dag`` methods) now uses the scheduler API. For HTCondor backends,
 this wraps the Python bindings (e.g., ``htcondor.Submit.from_dag()``) rather than calling
-``condor_submit_dag`` directly. The scheduler property remains available in these methods for
-any additional, non-DAG job submissions that may be needed.
+``condor_submit_dag`` directly.
+
+For Slurm backends, HTCondor DAG files are automatically converted to Slurm batch scripts
+with proper job dependencies. This allows pipelines that generate HTCondor DAGs (such as
+bilby, bayeswave, and lalinference) to work seamlessly with Slurm.
 
 Using the Scheduler in CLI Commands
 ------------------------------------
 
-The monitor loop and other CLI commands can use the scheduler API directly:
+The monitor loop and other CLI commands use the scheduler API:
 
 .. code-block:: python
 
@@ -75,16 +82,30 @@ The monitor loop and other CLI commands can use the scheduler API directly:
     job = create_job_from_dict(job_dict)
     cluster_id = scheduler.submit(job)
 
-The ``asimov monitor start`` and ``asimov monitor stop`` commands now support the
-``--use-scheduler-api`` flag to use the new scheduler API directly:
+Monitor Daemon
+~~~~~~~~~~~~~~
+
+The monitor daemon behavior differs based on the scheduler:
+
+**HTCondor**: Uses HTCondor's cron functionality to run periodic jobs
 
 .. code-block:: bash
 
-    # Use the new scheduler API
-    asimov monitor start --use-scheduler-api
-
-    # Use the legacy interface (default)
-    asimov monitor start
+    asimov start # Submits a recurring job via HTCondor
+    asimov stop # Removes the HTCondor job
+
+**Slurm**: Uses system cron to schedule periodic monitor runs
+
+.. code-block:: bash
+
+    asimov start # Creates a cron job (requires python-crontab)
+    asimov stop # Removes the cron job
+
+For Slurm support, install the optional dependency:
+
+.. code-block:: bash
+
+    pip install asimov[slurm]
 
 Backward Compatibility
 ----------------------
@@ -106,26 +127,47 @@ This means existing code continues to work without modification:
 Configuration
 -------------
 
-You can configure the scheduler in your ``asimov.conf`` file:
+Asimov automatically detects which scheduler is available during ``asimov init``.
+You can manually configure the scheduler in your ``asimov.conf`` file:
+
+**HTCondor Configuration**
 
 .. code-block:: ini
 
     [scheduler]
     type = htcondor
 
     [condor]
+    user = your_username
     scheduler = my-schedd.example.com # Optional: specific schedd
 
-Future Schedulers
------------------
-
-When Slurm or other schedulers are fully implemented, you'll be able to switch by
-simply changing the configuration:
+**Slurm Configuration**
 
 .. code-block:: ini
 
     [scheduler]
     type = slurm
+
+    [slurm]
+    user = your_username
+    partition = compute # Optional: specific partition
+    cron_minute = */15 # Optional: monitor frequency (default: every 15 minutes)
+
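Review note on the Slurm configuration example: the ``# Optional: ...`` text sits on the same line as the value. Whether that is safe depends on how the file is parsed, which this diff does not show. If it is read with Python's ``configparser`` under default settings (an assumption), inline comments are not stripped and become part of the value. The sketch below demonstrates the difference; ``read_slurm_section`` is a hypothetical helper, not an asimov function.

```python
import configparser

SAMPLE = """
[slurm]
user = your_username
partition = compute # Optional: specific partition
cron_minute = */15 # Optional: monitor frequency
"""


def read_slurm_section(text, strip_inline_comments):
    """Parse ``text`` and return the [slurm] section as a plain dict."""
    kwargs = {}
    if strip_inline_comments:
        # Treat " #" after a value as the start of a comment.
        kwargs["inline_comment_prefixes"] = ("#",)
    parser = configparser.ConfigParser(**kwargs)
    parser.read_string(text)
    return dict(parser["slurm"])


default = read_slurm_section(SAMPLE, strip_inline_comments=False)
stripped = read_slurm_section(SAMPLE, strip_inline_comments=True)

print(repr(default["partition"]))   # comment text is kept in the value
print(repr(stripped["partition"]))  # only the partition name remains
```

If the parser in use does keep inline comments, placing each comment on its own line above the option avoids passing a partition name or cron expression with trailing text to Slurm or cron.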
+Switching Schedulers
+--------------------
 
-All code using the scheduler API will automatically use the new scheduler without
-requiring any code changes.
+To switch between schedulers, simply update the ``[scheduler]`` section in your
+``asimov.conf`` file and restart your workflows. All code using the scheduler API
+will automatically use the new scheduler without requiring any code changes.
+
+Example: Switching from HTCondor to Slurm
+
+.. code-block:: ini
+
+    # Before
+    [scheduler]
+    type = htcondor
+
+    # After
+    [scheduler]
+    type = slurm
