Commit 81cba9f

Merge branch 'main' into str
2 parents e4c2c3d + 64282e0 commit 81cba9f

21 files changed: +302 -159 lines changed

QuickStart.md

Lines changed: 5 additions & 5 deletions
@@ -1,23 +1,23 @@
 # Quick Start Guide

-This document will guide you through the install procedure and your first hello world example.
+This document will guide you through the install procedure and your first Hello World example.

 - [Requirements](#requirements)
-- [Install psij](#install-psij)
+- [Install PSI/J](#install-psij)
 - [Hello World example](#hello-world)

 ## Requirements
 - python3.7+

-## Install psij
+## Install PSI/J

-If you have conda installed you might want to start from a fresh environment. This part is not installing psij but setting up a new environment with the specified python version:
+If you have conda installed you might want to start from a fresh environment. This part is not installing PSI/J but setting up a new environment with the specified python version:

 1. `conda create -n psij python=3.7`
 2. `conda activate psij`


-Install psij from the GitHub repository:
+Install PSI/J from the GitHub repository:

 1. Clone the repository into your working directory:

README-dev.md

Lines changed: 4 additions & 6 deletions
@@ -1,6 +1,4 @@
-# Low Level Development Stuff
-
-## Building the Documentation
+# Building the Documentation

 There are two ways to build the documentation. One is the plain one, where
 the plain Sphinx output is desired, and the other is the themed version that
@@ -13,7 +11,7 @@ is meant to integrate with the website.
 resulting pages, such as pages being cut off at the bottom. Please use
 a simple http server as detailed below.

-### Building the Standalone Documentation
+## Building the Standalone Documentation

 1. Make sure you have the documentation dependencies installed:
```sh
@@ -28,7 +26,7 @@ is meant to integrate with the website.
 The output will be in `docs/.build`


-### Building the Themed Documentation
+## Building the Themed Documentation

 This builds the themed version of the docs as well as the website. The steps
 are:
@@ -68,7 +66,7 @@ web site. The themed documentation will be found under the "Documentation"
 tab.


-### Release Process
+## Release Process

 Here are the steps for putting out a fresh release to Pypi.

docs/api.rst

Lines changed: 23 additions & 22 deletions
@@ -11,7 +11,7 @@ The Job-related classes listed in this section (``Job``, ``JobSpec``,
 ``ResourceSpec``, and ``JobAttributes``) are independent of
 executor implementations. The authors strongly recommend that users
 program against these classes rather than adding executor-specific
-configuration options, to the extent possible.
+configuration options.

 .. autoclass:: psij.Job
 :members:
@@ -29,7 +29,7 @@ Job Modifiers
 ^^^^^^^^^^^^^

 There can be a lot of configuration information that goes into each
-resource manager job. Its walltime, partition/queue, the number of nodes
+resource manager job, including its walltime, partition/queue, the number of nodes
 it needs, what kind of nodes, what quality of service the job requires, and
 so on.

@@ -78,7 +78,7 @@ Rather than:
 Executors can be
 installed from multiple sources, so the precise list of executors
 available to a specific installation of the PSI/J Python library can vary.
-In order to get a list of available executors, you can run, in a
+To get a list of available executors, run the following in a
 terminal:

 .. code-block:: shell
@@ -87,7 +87,7 @@ terminal:


 JobExecutor Base Class
-^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^

 The ``psij.JobExecutor`` class is abstract, but offers concrete static methods
 for registering, fetching, and listing subclasses of itself.
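The pattern described in this hunk's last context lines, an abstract class that still offers concrete static methods for registering, fetching, and listing its subclasses, can be sketched roughly as follows. This is a self-contained illustration only: `ExecutorBase`, `EchoExecutor`, `register`, and `list_names` are hypothetical names chosen for the example and are not taken from the PSI/J code base.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class ExecutorBase(ABC):
    """Hypothetical abstract base class; NOT the real psij.JobExecutor."""

    # Maps a short name to the concrete subclass registered under it.
    _registry: Dict[str, Type['ExecutorBase']] = {}

    @abstractmethod
    def submit(self, job: str) -> str:
        """Submit a job and return an identifier for it."""

    @staticmethod
    def register(name: str, cls: Type['ExecutorBase']) -> None:
        """Make a concrete subclass available under the given name."""
        ExecutorBase._registry[name] = cls

    @staticmethod
    def get_instance(name: str) -> 'ExecutorBase':
        """Instantiate the subclass registered under the given name."""
        try:
            cls = ExecutorBase._registry[name]
        except KeyError:
            raise ValueError(f'unknown executor: {name}') from None
        return cls()

    @staticmethod
    def list_names() -> List[str]:
        """Return the names of all registered subclasses, sorted."""
        return sorted(ExecutorBase._registry)


class EchoExecutor(ExecutorBase):
    """Toy subclass (an assumption for this example) that runs nothing."""

    def submit(self, job: str) -> str:
        return f'echo:{job}'


ExecutorBase.register('echo', EchoExecutor)
```

With this sketch, client code only ever names the base class: `ExecutorBase.get_instance('echo')` returns an `EchoExecutor`, while instantiating `ExecutorBase` directly fails because `submit` is abstract.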
@@ -99,43 +99,43 @@ The concrete executor implementations provided by this version of PSI/J Python
 are:

 Cobalt
-^^^^^^
+""""""""""""""""""""""

 .. autoclass:: psij.executors.batch.cobalt.CobaltJobExecutor
 :noindex:

 Flux
-^^^^
+""""""""""""""""""""""

 .. autoclass:: psij.executors.flux.FluxJobExecutor
 :noindex:

 LSF
-^^^
+""""""""""""""""""""""

 .. autoclass:: psij.executors.batch.lsf.LsfJobExecutor
 :noindex:

 PBS
-^^^
+""""""""""""""""""""""

 .. autoclass:: psij.executors.batch.pbspro.PBSProJobExecutor
 :noindex:

 Slurm
-^^^^^
+""""""""""""""""""""""

 .. autoclass:: psij.executors.batch.slurm.SlurmJobExecutor
 :noindex:

 Local
-^^^^^
+""""""""""""""""""""""

 .. autoclass:: psij.executors.local.LocalJobExecutor
 :noindex:

 Radical Pilot
-^^^^^^^^^^^^^
+""""""""""""""""""""""

 .. autoclass:: psij.executors.rp.RPJobExecutor
 :noindex:
@@ -144,13 +144,15 @@ Radical Pilot


 Launchers
-~~~~~~~~~
+----------------

 Launchers are mechanisms to start the actual jobs on batch schedulers
 once a set of nodes has been allocated for the job. In essence, launchers
 are wrappers around the job executable which can provide additional
 features, such as setting up an MPI environment, starting a copy of the
-job executable on each allocated node, etc. To get a launcher instance,
+job executable on each allocated node, etc.
+
+To get a launcher instance,
 call :meth:`Launcher.get_instance(name) <psij.launcher.Launcher.get_instance>`
 with ``name`` being the name of a launcher. Like job executors,
 launchers are plugins and can come from various places. To obtain a list
@@ -172,58 +174,57 @@ concrete static methods for registering and fetching subclasses of itself.
 The PSI/J Python library comes with a core set of launchers, which are:

 aprun
-^^^^^
+""""""""""""""""""""""

 .. autoclass:: psij.launchers.aprun.AprunLauncher
 :members:
 :noindex:

 jsrun
-^^^^^
+""""""""""""""""""""""

 .. autoclass:: psij.launchers.jsrun.JsrunLauncher
 :members:
 :noindex:

 srun
-^^^^
+""""""""""""""""""""""

 .. autoclass:: psij.launchers.srun.SrunLauncher
 :members:
 :noindex:

 mpirun
-^^^^^^
+""""""""""""""""""""""

 .. autoclass:: psij.launchers.mpirun.MPILauncher
 :members:
 :noindex:

 single
-^^^^^^
+""""""""""""""""""""""

 .. autoclass:: psij.launchers.single.SingleLauncher
 :members:
 :noindex:

 multiple
-^^^^^^^^
+""""""""""""""""""""""

 .. autoclass:: psij.launchers.multiple.MultipleLauncher
 :members:
 :noindex:

 Other Package Contents
-~~~~~~~~~~~~~~~~~~~~~~
+----------------

 .. automodule:: psij.exceptions
 :members:
 :noindex:


 API Reference
-~~~~~~~~~~~~~
-
+----------------
 .. toctree::

 .generated/modules

docs/development/programming.rst

Lines changed: 8 additions & 8 deletions
@@ -11,15 +11,15 @@ which should be read first.

 The PSI/J specification splits implementations into two main parts:

-- The core classes, containing scheduler-agnostic code. Client code, wanting
+- **Core classes** containing scheduler-agnostic code. Client code,
 to maintain portability, should only directly reference the core classes.
-- Executors and launchers, which are specific to scheduler implementations
+- **Executors and launchers**, which are specific to scheduler implementations
 and can be used interchangeably, provided that the underlying scheduler or
 launcher implementation exists.

 Nearly all of the core classes described in the PSI/J Specification are simple
 property containers and the behavior of the few exceptions is thoroughly
-documented therein. There are, however, a few areas that are specific to the
+documented therein. There are, however, a few areas specific to the
 current PSI/J Python implementation which are mostly a matter of implementation
 and are not documented by the specification. These are:

@@ -74,8 +74,8 @@ register the executor.

 If an error occurs after a descriptor is loaded but before the actual executor
 or launcher class is loaded, that error is stored. Successive attempts to
-instantiate that executor/launcher using
-:meth:`~psij.job_executor.JobExecutor.get_instance` or
+instantiate that executor using
+:meth:`~psij.job_executor.JobExecutor.get_instance` or launcher using
 :meth:`~psij.job_launcher.Launcher.get_instance` will result in the
 stored exception being raised. This prevents packages with broken
 implementations of executors or launchers from reporting errors unless there
@@ -93,7 +93,7 @@ a Local Resource Manager (LRM) that allows job submission by pointing a
 *submit* command (a tool accessible through a standard POSIX `exec()`) to a
 file that contains all relevant job information. It also assumes that there
 exist commands for cancelling the job and for querying for the status of one
-or more jobs previously submitted.
+or more previously submitted jobs.

 The general workflow used by the batch scheduler executor to submit a job is as
 follows:
@@ -104,7 +104,7 @@ follows:
 the `name` of the implementing class. The submit script is generated using
 the
 :meth:`~psij.executors.batch.batch_scheduler_executor.BatchSchedulerExecutor.generate_submit_script`
-method of the implementing class.
+method of the implementing class.

 2. Execute the command returned by
 :meth:`~psij.executors.batch.batch_scheduler_executor.BatchSchedulerExecutor.get_submit_command` to
@@ -150,7 +150,7 @@ launchers. Consequently, launcher scripts also take care of redirecting the
 standard streams of the actual launcher tool, which is assumed to properly
 aggregate the output streams of the job ranks.

-In addition to the functions above, PSI/J launchers also take care of invoking
+In addition to the functions above, PSI/J launchers also invoke
 the pre- and post-launch scripts.

 Since script based launchers are interchangeable, they must have a well

docs/development/tutorial_add_executor.rst

Lines changed: 13 additions & 17 deletions
@@ -11,24 +11,24 @@ that looks like SLURM or PBSPro.
 What Is an Executor and Why Might You Want to Add One?
 ------------------------------------------------------

-PSI/J provides a common interface for obtaining allocations on compute resources.
+PSI/J provides a common interface for obtaining allocations on compute resources. Usually, those compute resources will already have a batch scheduler in place (for example, SLURM).

-Usually, those compute resources will already have some batch scheduler in place (for example, SLURM).
-
-A PSI/J executor is the code that tells the core of PSI/J how to interact with
-such a batch scheduler so that it can provide a common interface to applications.
+A PSI/J executor is the code that tells the core of PSI/J how to interact with a batch scheduler so that it can provide a common interface to applications.

 A PSI/J executor needs to implement the abstract methods defined on the :class:`psij.job_executor.JobExecutor` base class.
 The documentation for that class has reference material for each of the methods that won't be repeated here.

 For batch scheduler systems, the :class:`.BatchSchedulerExecutor` subclass provides further useful structure to help implement JobExecutor.
 This tutorial will focus on using BatchSchedulerExecutor as a base, rather than implementing JobExecutor directly.

-The batch scheduler executor is based around a model where interactions with a local resource manager happen via command line invocations.
+The batch scheduler executor is based on a model where interactions with a local resource manager happen via command line invocations.
 For example, with PBS `qsub` and `qstat` commands are used to submit a request and to see status.

 To use BatchSchedulerExecutor for a new local resource manager that uses this command line interface, subclass BatchSchedulerExecutor and add in code that understands how to form the command lines necessary to submit a request for an allocation and to get allocation status. This tutorial will do that for PBSPro.

+Adding an Executor
+------------------
+
 First set up a directory structure::

 mkdir project/
@@ -44,14 +44,13 @@ We're going to create three source files in this directory structure:

 * ``psij-descriptors/pbspro_descriptor.py`` - This file tells the PSI/J core what this package implements.

-First, we'll build a skeleton that won't work, and see that it doesn't work in the test suite. Then we'll build up to the full functionality.
-
 Prerequisites:

-* You have the psij-python package installed already and are able to run whatever basic verification you think is necessary.
+* You have the psij-python package installed and are able to run whatever basic verification you think is necessary.

 * You are able to submit to PBS Pro on a local system.

+First, we'll build a skeleton that won't work, and see that it doesn't work in the test suite. Then we'll build up to the full functionality.

 A Not-implemented Stub
 ----------------------
@@ -135,8 +134,7 @@ Now running the same pytest command will give a different error further along in

 This default BatchSchedulerExecutor code needs a configuration object and none was supplied.

-A configuration object can contain configuration specific to this particular executor. However,
-for now we are not going to specify a custom configuration object and instead will re-use
+A configuration object can contain configuration specific to this particular executor. For now we are not going to specify a custom configuration object and instead will re-use
 the BatchSchedulerExecutorConfig supplied by the PSI/J core.

 Define a new __init__ method that will define a default configuration::
@@ -172,15 +170,13 @@ To implement submission, we need to implement three methods:

 You can read the docstrings for each of these methods for more information, but briefly the submission process is:

-1. ``generate_submit_script`` should generate a submit script specific to the batch scheduler.
+1. ``generate_submit_script`` generates a submit script specific to the batch scheduler.

-2. ``get_submit_command`` should return the command line necessary to submit that script to the batch scheduler.
+2. ``get_submit_command`` returns the command line necessary to submit that script to the batch scheduler.

 The output of that command should be interpreted by ``job_id_from_submit_output`` to extract a batch scheduler specific job ID,
 which can be used later when cancelling a job or getting job status.
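The division of labour between the three submission methods can be sketched with plain functions. This is a simplified stand-in, not the tutorial's executor class: the signatures are hypothetical, the script body is a placeholder, and it assumes the scheduler prints the job ID as the last whitespace-separated token of the submit output, as PBS `qsub` typically does.

```python
from pathlib import PurePosixPath
from typing import List


def generate_submit_script(executable: str) -> str:
    """Step 1 (hypothetical signature): build the text of a submit script."""
    # A real executor would fill a scheduler-specific template here.
    return '#!/bin/bash\n' + executable + '\n'


def get_submit_command(script_path: PurePosixPath) -> List[str]:
    """Step 2: return the command line that submits the script."""
    return ['qsub', str(script_path)]


def job_id_from_submit_output(out: str) -> str:
    """Step 3: extract the scheduler's job ID from the submit output.

    Assumption: the ID is the last whitespace-separated token.
    """
    return out.strip().split()[-1]
```

Keeping the three steps separate means only the command line and the output parsing need to change when targeting a scheduler whose submit tool behaves differently.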

-So let's implement those.
-
 In line with other PSI/J executors, we're going to delegate script generation to a template based helper. So add a line to initialize a :py:class:`.TemplatedScriptGenerator` in the
 executor initializer, pointing at a (as yet non-existent) template file, and replace ``generate_submit_script`` with a delegated call to `TemplatedScriptGenerator`::

@@ -278,7 +274,7 @@ Implementing Status

 PSI/J needs to ask the batch scheduler for the status of jobs that it has submitted. This can be done with ``BatchSchedulerExecutor`` by overriding these two methods, which we stubbed out as not-implemented earlier on:

-* :py:meth:`.BatchSchedulerExecutor.get_status_command` - Like ``get_submit_command``, this should return a batch scheduler specific command line, this time to output job status.
+* :py:meth:`.BatchSchedulerExecutor.get_status_command` - Like ``get_submit_command``, this should return a batch scheduler-specific command line, this time to output job status.

 * :py:meth:`.BatchSchedulerExecutor.parse_status_output` - This will interpret the output of the above status command, a bit like ``job_id_from_submit_output``.
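A rough, self-contained sketch of how these two status methods fit together is shown below. Everything in it is an assumption made for illustration: the `State` enum stands in for whatever state type the executor reports, the state letters in `_STATE_MAP` are examples only, and the two-column "job ID, state letter" input is a made-up simplification of real `qstat` output.

```python
from enum import Enum
from typing import Dict, List


class State(Enum):
    """Hypothetical job states for this sketch."""
    QUEUED = 'queued'
    ACTIVE = 'active'
    COMPLETED = 'completed'


# Illustrative, non-exhaustive mapping from scheduler letters to states.
_STATE_MAP = {'Q': State.QUEUED, 'R': State.ACTIVE, 'F': State.COMPLETED}


def get_status_command(job_ids: List[str]) -> List[str]:
    """Return a command line that asks the scheduler about the given jobs."""
    return ['qstat'] + list(job_ids)


def parse_status_output(out: str) -> Dict[str, State]:
    """Turn simplified 'job_id letter' lines into a job ID to state mapping."""
    states: Dict[str, State] = {}
    for line in out.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        job_id, letter = fields
        if letter not in _STATE_MAP:
            # Fail loudly rather than guess at an unmapped scheduler state.
            raise ValueError(f'unknown state {letter!r} for job {job_id}')
        states[job_id] = _STATE_MAP[letter]
    return states
```

Raising on an unmapped letter is one possible design choice; it makes gaps in the state map visible during testing instead of silently misreporting a job.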

@@ -407,7 +403,7 @@ The _STATE_MAP given here is also not exhaustive: if PBS Pro qstat returns a dif
 How to Distribute Your Executor
 -------------------------------

-If you want to share your executor with others, here are two ways:
+If you want to share your executor with others:

 1. You can make a Python package and distribute that as an add-on without needing to interact with the PSI/J project.
