
Commit dfe713d

Copyedits for grammar and standardization
1 parent 17ed4c3 · commit dfe713d


docs/user_guide.rst

Lines changed: 35 additions & 35 deletions
@@ -2,7 +2,7 @@
 User Guide
 ==========
 
-Who PSI/J is for
+Who PSI/J Is for
 ----------------
 
 PSI/J is a Python library for submitting and managing HPC jobs via arbitrary
@@ -13,17 +13,17 @@ LSF, Flux, Cobalt, PBS, and your local machine, we think you will find that
 PSI/J simplifies your work considerably.
 
 
-Who PSI/J is (probably) not for
+Who PSI/J Is (Probably) Not for
 -------------------------------
 
-If you were sure that you will only *ever* be launching jobs on ORNL's Summit
+If you are sure that you will only *ever* be launching jobs on ORNL's Summit
 system, and you don't care about any other cluster or machine, then you may as
 well interact with LSF (the resource manager on Summit) directly, rather than
 indirectly through PSI/J. In that case PSI/J would not really be adding much
 other than complexity.
 
 If you write application code that is meant to run on various HPC clusters, but
-which never make calls to the underlying resource manager (e.g. by calling into
+which never makes calls to the underlying resource manager (e.g. by calling into
 Flux's client library, or executing ``srun``/``jsrun``/``aprun`` etc.), then
 PSI/J will not help you. This is likely your situation if you are a developer
 working on a MPI-based science simulation, since we have observed that it is
@@ -57,7 +57,7 @@ What is a JobExecutor?
 
 A :class:`JobExecutor <psij.job_executor.JobExecutor>` represents a specific RM,
 e.g. Slurm, on which the job is being executed. Generally, when jobs are
-submitted, they will be queued for a variable period of time, depending on how
+submitted they will be queued for a variable period of time, depending on how
 busy the target machine is. Once the job is started, its executable is
 launched and runs to completion, and the job will be marked as completed.
 
@@ -79,8 +79,8 @@ PSI/J currently provides executors for the following backends:
 - `pbspro`: `Altair's PBS-Professional <https://www.altair.com/pbs-professional>`_
 - `cobalt`: `ALCF's Cobalt job scheduler <https://www.alcf.anl.gov/support/user-guides/theta/queueing-and-running-jobs/job-and-queue-scheduling/index.html>`_
 
-We encourage the contribution of executors for additional backends - please
-reference the `developers documentation
+We encourage the contribution of executors for additional backends—please
+reference the `developer documentation
 <development/tutorial_add_executor.html>`_ for details.
 
 
@@ -109,14 +109,14 @@ Slurm // Local // LSF // PBS // Cobalt
     ex.submit(job)
 
 And by way of comparison, other backends can be selected with the tabs above.
-Note that the only difference is the argument to the get_instance method.
+Note that the only difference is the argument to the ``get_instance`` method.
 
 The ``JobExecutor`` implementation will translate all PSI/J API activities into the
 respective backend commands and run them on the backend, while at the same time
 monitoring the backend jobs for failure, completion or other state updates.
 
 Assuming there are no errors, you should see a new entry in your resource
-manager’s queue after running that example above.
+manager’s queue after running the example above.
 
 
 Multiple Jobs
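
For reference, the submission pattern this hunk refers to looks roughly like
the sketch below, assuming the top-level ``psij`` imports; only the string
passed to ``get_instance`` selects the backend:

.. code-block:: python

    from psij import Job, JobSpec, JobExecutor

    # The backend name is the only thing that changes between the tabs:
    # 'slurm', 'local', 'lsf', 'pbspro', 'cobalt', ...
    ex = JobExecutor.get_instance('local')
    job = Job(JobSpec(executable='/bin/date'))
    ex.submit(job)
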
@@ -142,18 +142,18 @@ Every :class:`JobExecutor <psij.job_executor.JobExecutor>` can handle arbitrary
 numbers of jobs (tested with up to 64k jobs).
 
 
-Configuring your Job
+Configuring Your Job
 --------------------
 
 In the example above, the ``executable='/bin/date'`` part tells PSI/J that we want
 the job to run the ``/bin/date`` command. But there are other parts to the job
 which can be configured:
 
-- arguments for the job executable
-- environment the job is running in
-- destination for standard output and error streams
-- resource requirements for the job's execution
-- accounting details to be used
+- Arguments for the job executable
+- Environment the job is running in
+- Destination for standard output and error streams
+- Resource requirements for the job's execution
+- Accounting details to be used
 
 That information is encoded in the ``JobSpec`` which is used to create the
 ``Job`` instance.
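
A hedged sketch of how these parts might be set on a ``JobSpec``; the keyword
names ``arguments``, ``environment``, and ``stdout_path`` are assumptions
drawn from the attribute names used later in this guide:

.. code-block:: python

    from psij import Job, JobSpec

    spec = JobSpec(
        executable='/bin/echo',
        arguments=['hello', 'world'],     # arguments for the job executable
        environment={'GREETING': 'hi'},   # environment the job runs in
        stdout_path='/tmp/echo.out',      # destination for standard output
    )
    job = Job(spec)
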
@@ -226,32 +226,32 @@ redirected to files by setting the ``stdout_path`` and ``stderr_path`` attribute
     spec.stdout_path = '/tmp/date.out'
     spec.stderr_path = '/tmp/date.err'
 
-The job's standard input stream can also be redirected to read from a file, by
+A job's standard input stream can also be redirected to read from a file by
 setting the ``spec.stdin_path`` attribute.
 
 
 Job Resources
 ^^^^^^^^^^^^^
 
 A job submitted to a cluster is allocated a specific set of resources to run on.
-The amount and type of resources are defined by a resource specification
-``ResourceSpec`` which becomes a part of the job specification. The resource
+The number and type of resources are defined by a resource specification,
+``ResourceSpec``, which becomes part of the job specification. The resource
 specification supports the following attributes:
 
-- ``node_count``: allocate that number of compute nodes to the job. All
+- ``node_count``: Allocate that number of compute nodes to the job. All
   cpu-cores and gpu-cores on the allocated node can be exclusively used by the
   submitted job.
-- ``processes_per_node``: on the allocated nodes, execute that given number of
+- ``processes_per_node``: On the allocated nodes, execute that given number of
   processes.
-- ``process_count``: the total number of processes (MPI ranks) to be started
-- ``cpu_cores_per_process``: the number of cpu cores allocated to each launched
-  process. PSI/J uses the system definition of a cpu core which may refer to
-  a physical cpu core or to a virtual cpu core, also known as a hardware thread.
-- ``gpu_cores_per_process``: the number of gpu cores allocated to each launched
-  process. The system definition of an gpu core is used, but usually refers
+- ``process_count``: The total number of processes (MPI ranks) to be started.
+- ``cpu_cores_per_process``: The number of cpu cores allocated to each launched
+  process. PSI/J uses the system definition of a cpu core, which may refer to
+  a physical cpu core or to a virtual cpu core (also known as a hardware thread).
+- ``gpu_cores_per_process``: The number of gpu cores allocated to each launched
+  process. The system definition of a gpu core is used, but usually refers
   to a full physical GPU.
 - ``exclusive_node_use``: When this boolean flag is set to ``True``, then PSI/J
-  will ensure that no other jobs, neither of the same user nor of other users
+  will ensure that no other jobs, neither from the same user nor from other users
   of the same system, will run on any of the compute nodes on which processes
   for this job are launched.
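
A sketch of a resource specification built from these attributes; the
versioned class name ``ResourceSpecV1`` is an assumption about the concrete
implementation, since the guide only names ``ResourceSpec``:

.. code-block:: python

    from psij import Job, JobSpec, ResourceSpecV1

    spec = JobSpec(executable='/bin/date')
    spec.resources = ResourceSpecV1(
        node_count=2,              # two compute nodes for this job
        processes_per_node=4,      # four processes started on each node
        cpu_cores_per_process=8,   # eight cpu cores per launched process
    )
    job = Job(spec)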

@@ -274,14 +274,14 @@ node count contradicts the value of ``process_count / processes_per_node``:
     # the line above should raise an 'psij.InvalidJobException' exception
 
 
-Processes versus ranks
+Processes Versus Ranks
 """"""""""""""""""""""
 
-All processes of the job will share a single MPI communicator
+All processes of a job will share a single MPI communicator
 (`MPI_COMM_WORLD`), independent of their placement, and the term `rank` (which
 usually refers to an MPI rank) is thus equivalent. However, jobs started with
 a single process instance may, depending on the executor implementation, not get
-an MPI communicator. How Jobs are launched can be specified by the `launcher`
+an MPI communicator. How jobs are launched can be specified by the `launcher`
 attribute of the ``JobSpec``, as documented below.
 
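
A sketch of selecting a launcher so that all job processes share an MPI
communicator; ``'mpirun'`` is an assumed launcher name here, and the next
section shows the same pattern with ``'srun'``:

.. code-block:: python

    from psij import Job, JobSpec

    # Launch through mpirun so the processes share MPI_COMM_WORLD;
    # the executable path is hypothetical.
    spec = JobSpec(executable='/path/to/mpi_app', launcher='mpirun')
    job = Job(spec)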

@@ -296,7 +296,7 @@ like so: ``JobSpec(..., launcher='srun')``.
 Scheduling Information
 ^^^^^^^^^^^^^^^^^^^^^^
 
-To specify resource-manager-specific information, like queues/partitions,
+To specify resource manager-specific information, like queues/partitions,
 runtime, and so on, create a :class:`JobAttributes
 <psij.job_attributes.JobAttributes>` and set it with ``JobSpec(...,
 attributes=my_job_attributes)``:
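
A sketch of the ``JobAttributes`` pattern described here; the ``queue_name``
and ``duration`` attribute names, and the ``debug`` queue, are assumptions
for illustration:

.. code-block:: python

    from datetime import timedelta
    from psij import Job, JobSpec, JobAttributes

    attrs = JobAttributes(
        queue_name='debug',               # scheduler queue/partition (assumed name)
        duration=timedelta(minutes=30),   # requested runtime
    )
    job = Job(JobSpec(executable='/bin/date', attributes=attrs))
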
@@ -357,10 +357,10 @@ to call the :meth:`wait <psij.job.Job.wait>` method with no arguments:
 
 The :meth:`wait <psij.job.Job.wait>` call will return once the job has reached
 a terminal state, which almost always means that it finished or was
-cancelled.
+canceled.
 
 To distinguish jobs that complete successfully from ones that fail or
-are cancelled, fetch the status of the job after calling
+are canceled, fetch the status of the job after calling
 :meth:`wait <psij.job.Job.wait>`:
 
 .. code-block:: python
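
The next hunk completes that snippet; for context, a fuller sketch of the
check described above, where the ``JobState`` enum and its ``COMPLETED``
member are assumptions about the library's state names:

.. code-block:: python

    from psij import JobState

    job.wait()
    if job.status.state == JobState.COMPLETED:
        print('job finished successfully')
    else:
        print(f'job failed or was canceled: {job.status}')
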
@@ -369,7 +369,7 @@ are cancelled, fetch the status of the job after calling
     print(str(job.status))
 
 
-Canceling your Job
+Canceling Your Job
 ^^^^^^^^^^^^^^^^^^
 
 If supported by the underlying job scheduler, PSI/J jobs can be canceled by
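
A minimal sketch of cancelation, assuming a ``cancel`` method on ``Job``;
after cancelation, ``wait`` still returns once the job reaches a terminal
state:

.. code-block:: python

    ex.submit(job)
    job.cancel()   # ask the underlying scheduler to cancel the job
    job.wait()     # returns once the job reaches a terminal state
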
@@ -381,7 +381,7 @@ Status Callbacks
 
 Waiting for jobs to complete with :meth:`wait <psij.job.Job.wait>` is fine if
 you don't mind blocking while you wait for a single job to complete. However, if
-you want to wait on multiple jobs without blocking, or you want to get updates
+you want to wait on multiple jobs without blocking or you want to get updates
 when jobs start running, you can attach a callback to a :class:`JobExecutor
 <psij.job_executor.JobExecutor>` which will fire whenever any job submitted to
 that executor changes status.
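
A sketch of the callback mechanism; the method name
``set_job_status_callback`` and the ``(job, status)`` callback signature are
assumptions about the executor API:

.. code-block:: python

    from psij import Job, JobSpec, JobExecutor

    def callback(job, status):
        # Fires whenever any job submitted to this executor changes status
        print(f'job {job.id} is now {status.state}')

    ex = JobExecutor.get_instance('local')
    ex.set_job_status_callback(callback)
    ex.submit(Job(JobSpec(executable='/bin/date')))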
