Commit ae4acf5

Merge pull request #354 from ExaWorks/docs_absolute_paths
Added a note about relative/absolute paths for job files and some of …
2 parents 901230d + ae9ee37 commit ae4acf5

1 file changed: +39 -0 lines changed

src/psij/job_spec.py

Lines changed: 39 additions & 0 deletions
@@ -23,6 +23,45 @@ def __init__(self, name: Optional[str] = None, executable: Optional[str] = None,
         """
         Constructs a `JobSpec` object while allowing its properties to be initialized.
 
+        .. note::
+            A note about paths.
+
+            It is strongly recommended that paths to `std*_path`, `directory`, etc. be specified
+            as absolute. While paths can be relative, and there are cases when it is desirable to
+            specify them as relative, it is important to understand what the implications are.
+
+            Paths in a specification refer to paths *that are accessible to the machine where the
+            job is running*. In most cases, that will be different from the machine on which the
+            job is launched (i.e., where PSI/J is invoked from). This means that a given path may
+            or may not point to the same file in both the location where the job is running and the
+            location where the job is launched from.
+
+            For example, if launching jobs from a login node of a cluster, the path `/tmp/foo.txt`
+            will likely refer to locally mounted drives on both the login node and the compute
+            node(s) where the job is running. However, since they are local mounts, the file
+            `/tmp/foo.txt` written by a job running on the compute node will not be visible by
+            opening `/tmp/foo.txt` on the login node. If an output file written on a compute node
+            needs to be accessed on a login node, that file should be placed on a shared filesystem.
+            However, even by doing so, there is no guarantee that the shared filesystem is mounted
+            under the same mount point on both login and compute nodes. While this is an unlikely
+            scenario, it remains a possibility.
+
+            When relative paths are specified, even when they point to files on a shared filesystem
+            as seen from the submission side (i.e., login node), the job working directory may be
+            different from the working directory of the application that is launching the job. For
+            example, an application that uses PSI/J to launch jobs on a cluster may be invoked from
+            (and have its working directory set to) `/home/foo`, where `/home` is a mount point for
+            a shared filesystem accessible by compute nodes. The launched job may specify
+            `stdout_path=Path('bar.txt')`, which would resolve to `/home/foo/bar.txt`. However, the
+            job may start in `/tmp` on the compute node, and its standard output will be redirected
+            to `/tmp/bar.txt`.
+
+            Relative paths are useful when there is a need to refer to the job directory that the
+            scheduler chooses for the job, which is not generally known until the job is started by
+            the scheduler. In such a case, one must leave the `spec.directory` attribute empty and
+            refer to files inside the job directory using relative paths.
+
+
         :param name: A name for the job. The name plays no functional role except that
             :class:`~psij.JobExecutor` implementations may attempt to use the name to label the
             job as presented by the underlying implementation.
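
The relative-path pitfall described in the added note can be illustrated without PSI/J at all. The following is a minimal sketch using only the standard library. The `resolve` helper is hypothetical and is not part of PSI/J. The directories `/home/foo` and `/tmp` are taken from the note's own example.

```python
from pathlib import PurePosixPath


def resolve(path: str, working_dir: str) -> PurePosixPath:
    """Resolve `path` as a process whose working directory is `working_dir` would.

    Hypothetical helper for illustration only; it is not a PSI/J function.
    """
    p = PurePosixPath(path)
    # Absolute paths ignore the working directory; relative ones are joined to it.
    return p if p.is_absolute() else PurePosixPath(working_dir) / p


submit_cwd = "/home/foo"  # working directory of the application launching the job
job_cwd = "/tmp"          # directory in which the job happens to start

# The same relative path names two different files.
relative_on_submit = resolve("bar.txt", submit_cwd)
relative_on_job = resolve("bar.txt", job_cwd)
print(relative_on_submit)  # /home/foo/bar.txt
print(relative_on_job)     # /tmp/bar.txt

# An absolute path names the same location regardless of working directory.
absolute_on_submit = resolve("/home/foo/bar.txt", submit_cwd)
absolute_on_job = resolve("/home/foo/bar.txt", job_cwd)
print(absolute_on_submit == absolute_on_job)  # True
```

The sketch only models path resolution. Whether `/home/foo/bar.txt` is actually the same file on both nodes still depends on the shared filesystem being mounted at the same mount point, as the note points out.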
