@@ -225,36 +225,32 @@ To find out more, see the ``job.err`` file.
225
225
226
226
.. cylc-scope::
227
227
228
- If you're struggling to track down the error, you might want to restart the
229
- workflow in debug mode and run the task again :
228
+ If you're struggling to track down the error, you might want to put the
229
+ workflow into debug mode::
230
230
231
- .. TODO Update this advice after https://github.com/cylc/cylc-flow/issues/5829
231
+    cylc verbosity DEBUG <workflow-id>
232
232
233
- .. code-block :: console
233
+ When a workflow is running in debug mode, all jobs will create a ``job.xtrace``
234
+ file when run, in addition to ``job.err``. This can help you to locate the error
235
+ within the job script.
234
236
235
- # shut the workflow down (leave any active jobs running)
236
- $ cylc stop --now --now <workflow>
237
- # restart the workflow in debug mode
238
- $ cylc play <workflow> --debug
239
- # re-run all failed task(s)
240
- $ cylc trigger '<workflow>//*:failed'
237
+ You can also start workflows in debug mode::
241
238
242
- When a workflow is running in debug mode, all jobs will create a ``job.xtrace ``
243
- file which can help you to locate the error within the job script.
239
+    cylc play --debug <workflow-id>
244
240
245
241
246
- My workflow shutdown unexpectedly
247
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
242
+ My workflow shut down unexpectedly
243
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
248
244
249
245
When a Cylc scheduler shuts down, it should leave behind a log message explaining why.
250
246
251
- E.G. this message means that a workflow shutdown because it was told to:
247
+ E.g. this message means that a workflow shut down because it was told to:
252
248
253
249
.. code-block::
254
250
255
251
Workflow shutting down - REQUEST(CLEAN)
256
252
257
- If a workflow shutdown due to a critical problem, you should find some
253
+ If a workflow shut down due to a critical problem, you should find some
258
254
traceback in this log. If this traceback doesn't look like it comes from your
259
255
system, please report it to the Cylc developers for investigation (on
260
256
GitHub or Discourse).
@@ -276,6 +272,10 @@ Why isn't my task running?
276
272
To find out why a task is not being run, use the ``cylc show`` command.
277
273
This will list the task's prerequisites and xtriggers.
278
274
275
+ Note, at present ``cylc show`` can only display
276
+ :term:`active tasks <active task>`. Waiting tasks beyond the
277
+ :term:`n=0 window <n-window>` have no satisfied prerequisites.
278
+
279
279
Note, tasks which are held |task-held| will not be run; use ``cylc release``
280
280
to release a held task.
281
281
@@ -296,7 +296,7 @@ If something has gone wrong during installation, an error should have been
296
296
logged to a file in this directory:
297
297
``$HOME/cylc-run/<workflow-id>/log/remote-install/``.
298
298
299
- `` If you need to access files from a remote platform (e.g. 2-stage ``fcm_make ``),
299
+ If you need to access files from a remote platform (e.g. 2-stage ``fcm_make``),
300
300
ensure that a task has submitted to it before you do so. If needed you can use
301
301
a blank "dummy" task to ensure that remote installation is completed *before*
302
302
you run any tasks which require this, e.g.:
@@ -312,12 +312,12 @@ Conda / Mamba environment activation fails
312
312
Some Conda packages rely on activation scripts which are run when you call the
313
313
activate command.
314
314
315
- Sadly , some of these scripts don't defend against command failure or unset
316
- environment variables causing them to fail when configured in Cylc `` *script ``
317
- (see also :ref: `troubleshooting.my_job_failed ` for details).
315
+ Unfortunately, some of these scripts don't defend against command failure or
316
+ unset environment variables, causing them to fail when configured in Cylc
317
+ ``*script`` (see also :ref:`troubleshooting.my_job_failed` for details).
318
318
319
319
To avoid this, run ``set +eu`` before activating your environment. This turns
320
- off some Bash safety features allowing environment activation to complete.
320
+ off some Bash safety features, allowing environment activation to complete.
321
321
Remember to run ``set -eu`` afterwards to turn these features back on.
322
322
323
323
.. code-block:: cylc
@@ -360,7 +360,7 @@ E.G. the following error:
360
360
361
361
FileNotFoundError: [Errno 2] No such file or directory: 'ssh'
362
362
363
- Means that ``ssh `` is not installed.
363
+ Means that ``ssh`` is not installed or not in your ``$PATH``.
364
364
365
365
See :ref:`non-python-requirements` for details on system requirements.
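
To tell "not installed" apart from "not in ``$PATH``", you can ask the shell
where it finds the executable. The following is a minimal sketch rather than
anything provided by Cylc; ``have_command`` is a hypothetical helper name used
only for illustration:

```shell
# Check whether an executable can be found on the current $PATH.
# `have_command` is a hypothetical helper, not a Cylc command.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command ssh; then
    echo "ssh found at: $(command -v ssh)"
else
    echo "ssh not found: check that it is installed and that \$PATH includes it" >&2
fi
```

Run this in the same environment the scheduler runs in, since ``$PATH`` may
differ between your interactive shell and the scheduler's environment.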
366
366
@@ -376,8 +376,8 @@ a remote platform.
376
376
This either means that:
377
377
378
378
1. The platform is down (e.g. all login nodes are offline).
379
- 2. There is a network problem (e.g. you cannot connect to the login nodes).
380
- 3. The platform is not correctly configured.
379
+ 2. Or, there is a network problem (e.g. you cannot connect to the login nodes).
380
+ 3. Or, the platform is not correctly configured.
381
381
382
382
Check the scheduler log; you might find some stderr associated with this
383
383
message.
@@ -395,15 +395,14 @@ note that this defaults to the platform name if not explicitly set.
395
395
``OperationalError: disk I/O error``
396
396
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
397
397
398
- This means that something was unable to write to a file when it would expect to
399
- have been able to.
398
+ This means that Cylc was unable to write to the database.
400
399
401
- This error usually occurs if when you have exceeded you filesystem quota.
400
+ This error usually occurs when you have exceeded your filesystem quota.
402
401
403
- If a Cylc workflow cannot write to the filesystem, it will shutdown. Once
402
+ If a Cylc scheduler cannot write to the filesystem, it will shut down. Once
404
403
you've cleared out enough space for the workflow to continue you should be able
405
- to safely restart it as you would normally using ``cylc play ``, the workflow
406
- will continue where it left off.
404
+ to safely restart it as you would normally using ``cylc play``. The workflow
405
+ will continue from where it left off.
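
Before restarting, it can help to confirm how much space is actually
available. The sketch below uses only the standard ``df`` and ``du``
utilities; ``report_usage`` is a hypothetical helper, and the
``$HOME/cylc-run`` location is an assumption based on the default run
directory. Quota reporting tools are site specific, so check your site's
documentation for those.

```shell
# Report free space on the filesystem holding a directory, then the
# directory's own size. `report_usage` is a hypothetical helper for
# illustration, not a Cylc command.
report_usage() {
    df -h "$1"    # free space on the filesystem containing the directory
    du -sh "$1"   # total size of the directory itself
}

# Assumes workflows live under the default ~/cylc-run location.
if [ -d "$HOME/cylc-run" ]; then
    report_usage "$HOME/cylc-run"
fi
```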
407
406
408
407
409
408
``socket.gaierror``
@@ -418,7 +417,7 @@ login nodes you submit jobs to).
418
417
Cylc expects each host to have a unique and stable fully qualified domain name
419
418
(FQDN) and to be identifiable from other hosts on the network using this name.
420
419
421
- I.E. If a host identifies itself with an FQDN, then we should be able to look it
420
+ I.e., if a host identifies itself with an FQDN, then we should be able to look it up
422
421
from another host by this FQDN. If we can't, then Cylc can't tell which host is
423
422
which and will not be able to function properly.
424
423
@@ -429,7 +428,7 @@ DNS setup is consistent.
429
428
Sometimes we do not have control over the platforms we use and it is not
430
429
possible to compel system administrators to address these issues. If this is
431
430
the case, you can fall back to IP address based host identification which may
432
- work (i.e. use IP address rather than host names, makes logs less human
431
+ work (i.e. use IP addresses rather than host names, which makes logs less human
433
432
readable). As a last resort you can also hard-code the host name for each host.
434
433
435
434
For more information, see
@@ -449,10 +448,10 @@ increase this limit.
449
448
``Cannot determine whether workflow is running on <host>``
450
449
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
451
450
452
- When a Cylc workflow runs, it creates a :term: `contact file ` which tells us on
451
+ When Cylc runs a workflow, it creates a :term:`contact file` which tells us on
453
452
which host and port it can be contacted.
454
453
455
- If the workflow cannot be contacted, Cylc will attempt to check whether the
454
+ If the scheduler cannot be contacted, Cylc will attempt to check whether the
456
455
process is still running to ensure it hasn't crashed.
457
456
458
457
If you are seeing this error message, it means that Cylc was unable to