-[Files in this docs folder](#files-in-this-docs-folder)
2
+
-[Task types overview](#task-types-overview)
3
+
-[Transfers](#transfers)
4
+
-[Executions](#executions)
5
+
-[Batches](#batches)
6
+
-[Templates and variable resolution](#templates-and-variable-resolution)
2
7
3
8
This folder contains documentation for the Open Task Framework (OTF). Use these docs to understand core concepts, the package layout, how plugins and handlers are structured, and concrete examples for using built-in handlers.
4
9
5
-
## Table of contents
6
-
7
-
-[Architecture](./architecture.md)
8
-
-[Remote handlers](./remotehandlers.md)
9
-
-[Task handlers](./taskhandlers.md)
10
-
-[Plugins](./plugins.md)
11
-
-[Lookup plugins](./plugins/lookup.md)
12
-
13
10
## Files in this docs folder
14
11
15
12
-`architecture.md` — high-level architecture and component responsibilities
@@ -18,9 +15,40 @@ This folder contains documentation Open Task Framework (OTF). Use these docs to
18
15
-`plugins.md` — index of built-in plugins and how to author new ones
19
16
-`plugins/lookup.md` — details for the lookup plugin family
20
17
21
-
## Quick start
18
+
## Task types overview
19
+
20
+
As mentioned in the [README.md](../README.md), OTF supports three main task types: Transfers, Executions and Batches. Task payloads are JSON-based (either plain `.json` or Jinja2 `.json.j2` templates). See `docs/architecture.md` for the full rendering/parsing pipeline.
21
+
22
+
## Transfers
23
+
24
+
Transfers move files from a single source to one or more destinations. Supported protocols include SFTP/SSH and local filesystem handlers. Key features:
25
+
26
+
- File polling and filewatch (wait for files to appear)
27
+
- Log watching for specific patterns
28
+
- Conditional selection by file size, age, and count
29
+
- Post-copy actions: archive, delete, or move source files
30
+
31
+
Transfers can operate in "direct" mode (remote-to-remote where supported) or via a local staging step if protocols differ.
32
+
33
+
## Executions
34
+
35
+
Execution tasks run commands on remote hosts (via SSH or local execution handlers). Execution handlers capture stdout, stderr, and the exit code, and may include PID tokenization for advanced lifecycle management (useful for `kill`).
36
+
37
+
## Batches
38
+
39
+
Batches compose multiple tasks (executions, transfers, or nested batches). They support:
40
+
41
+
- Ordered execution using `order_id`
42
+
- Explicit `dependencies` for DAG-style control
43
+
-`timeout`, `continue_on_fail`, and `retry_on_rerun` per-task options
44
+
- Resumption via log markers so batches can be rerun from the last known state
45
+
46
+
Batches are documented in more detail in `docs/taskhandlers.md`.
47
+
48
+
## Templates and variable resolution
22
49
23
-
1. Read `architecture.md` to understand the components.
24
-
2. Inspect `src/opentaskpy/remotehandlers` and `src/opentaskpy/plugins` for concrete handler implementations and examples.
50
+
- All task/config payloads are JSON-based. Files are either plain `.json` or Jinja2 templates with `.json.j2`.
51
+
- The loader pipeline renders `.json.j2` templates using Jinja2 and available plugin helpers, then parses the rendered text as JSON and validates against `src/opentaskpy/config/schemas/`.
52
+
- When editing templates, ensure the rendered output is valid JSON and that required variables are present in the rendering context.
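The render, parse, and validate steps above can be sketched with the standard library alone. This is only an illustration: `string.Template` (with `$var` placeholders) stands in for Jinja2, and the template text, the `load_task` helper, and the `REQUIRED_KEYS` set are hypothetical rather than taken from OTF or its schemas.

```python
import json
from string import Template

# Hypothetical template text. Real OTF templates use Jinja2 syntax ({{ var }});
# string.Template ($var) is a standard-library stand-in for this sketch.
TEMPLATE_TEXT = '{"type": "execution", "hostname": "$host", "command": "$command"}'

# Illustrative required keys, not the real OTF schema.
REQUIRED_KEYS = {"type", "hostname", "command"}


def load_task(template_text: str, context: dict) -> dict:
    """Render the template, parse it as JSON, then check required keys."""
    # substitute() raises KeyError if a variable is missing from the context.
    rendered = Template(template_text).substitute(context)
    # json.loads raises json.JSONDecodeError if the rendered text is not valid JSON.
    task = json.loads(rendered)
    missing = REQUIRED_KEYS - task.keys()
    if missing:
        raise ValueError(f"missing required keys: {sorted(missing)}")
    return task


task = load_task(TEMPLATE_TEXT, {"host": "server1", "command": "ls -l"})
print(task)
```

A missing variable fails at render time and malformed output fails at parse time, which mirrors the two checks recommended when editing templates.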
25
53
26
-
If you want additional docs (diagrams, developer onboarding checklist, or API docs), tell me which and I will add them.
54
+
If you want this page removed instead of updated, tell me and I'll delete it.
# Lookup Plugins — built-in plugin index and guidance
2
2
3
3
Plugins are small, reusable helpers that can be referenced in task payloads to compute values at runtime (for example: lookups, templating helpers). Plugins live under `src/opentaskpy/plugins/` and are intentionally lightweight.
-[Where plugins are used](#where-plugins-are-used)
9
-
-[How to author a plugin](#how-to-author-a-plugin)
10
-
-[Example usage](#example-usage)
11
-
-[Notes](#notes)
7
+
-[Lookup Plugins — built-in plugin index and guidance](#lookup-plugins--built-in-plugin-index-and-guidance)
8
+
-[Table of contents](#table-of-contents)
9
+
-[Built-in plugins](#built-in-plugins)
10
+
-[How to write a plugin](#how-to-write-a-plugin)
11
+
-[Notes](#notes)
12
12
13
-
## Built-in plugin families
13
+
## Built-in plugins
14
14
15
-
-`lookup` — helpers that resolve values from different sources. Built-in lookup plugins include:
16
-
-`lookup.file` — read a value from a local file
17
-
-`lookup.http_json` — fetch JSON over HTTP and extract a value
18
-
-`lookup.random_number` — generate a random number (useful for tests)
15
+
-`lookup.file` — read a value from a local file
16
+
-`lookup.http_json` — fetch JSON over HTTP and extract a value
17
+
-`lookup.random_number` — generate a random number (useful for tests)
19
18
20
-
## Where plugins are used
19
+
These are included primarily for demonstration purposes; they are unlikely to be useful in production.
21
20
22
-
- Variable interpolation and templating across tasks and examples under `examples/`
23
-
- Tests that need small deterministic helpers without external dependencies
21
+
## How to write a plugin
24
22
25
-
## How to author a plugin
26
-
27
-
1. Create a new module under `src/opentaskpy/plugins/<family>/`.
28
-
2. Export a callable that accepts a plugin configuration dict and returns a value or raises a descriptive exception.
29
-
3. Add unit tests under `tests/` covering expected inputs and errors.
30
-
31
-
## Example usage
32
-
33
-
Example — using `lookup.http_json` (pseudo-config)
34
-
35
-
```yaml
36
-
someVar: !lookup.http_json
37
-
url: "https://api.example.com/data"
38
-
path: "items[0].id"
39
-
```
23
+
1. Create a new module in your own configuration under a directory named `plugins`. Plugins are auto-discovered if they live under your configuration directory.
24
+
2. Write a function that performs the task you need and returns the appropriate result, optionally taking arguments from the Jinja template.
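As a rough sketch of the two steps above, a plugin module might look like the following. The module name `first_line`, the `plugin_name` variable, and the `run(**kwargs)` signature are assumptions made for illustration; check the built-in lookup plugins for the interface OTF actually expects.

```python
import os
import tempfile

# Hypothetical plugin module, e.g. <config dir>/plugins/first_line.py.
# The module-level name and the run(**kwargs) signature are assumptions.
plugin_name = "first_line"


def run(**kwargs) -> str:
    """Return the first line of the file given by the 'path' argument."""
    # Validate input early and raise a clear error for missing fields.
    if "path" not in kwargs:
        raise ValueError("first_line plugin requires a 'path' argument")
    with open(kwargs["path"], encoding="utf-8") as handle:
        return handle.readline().strip()


# Usage: write a small file, then call the plugin as a template might.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello\nworld\n")
value = run(path=tmp.name)
os.remove(tmp.name)
print(value)
```

The function does one small read and returns, which keeps it cheap if it is evaluated whenever a template is rendered.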
40
25
41
26
## Notes
42
27
43
-
- Keep plugins deterministic when possible to make tests reliable.
44
-
- Avoid long-running IO in plugins used by unit tests; mock external calls in tests.
28
+
- Plugins should be very simple and return almost immediately. Unless lazy loading is used, they are often called on every task execution, so slow calls will slow down the startup of everything.
45
29
- Plugins should validate their input and raise clear exceptions for missing fields or bad responses.
-[Example: execution task using SSH handler](#example-execution-task-using-ssh-handler)
12
+
-[Notes and caveats](#notes-and-caveats)
13
+
-[Extending or adding a handler](#extending-or-adding-a-handler)
12
14
13
15
## Available handlers
14
16
15
-
-`ssh.py` — SSHExecution and SSHTransfer helpers (depends on local SSH client or paramiko-like behavior). Used for remote command execution and staging file transfers.
17
+
-`ssh.py` — SSHExecution and SSHTransfer helpers (depends on local SSH client or paramiko-like behavior). Used for remote command execution and file transfers.
16
18
-`sftp.py` — SFTPTransfer: file transfer over SFTP
17
19
-`local.py` — LocalTransfer/Execution: runs commands and moves files on the local filesystem (useful for testing and local workflows)
18
20
-`email.py` — Email transfer helper for sending files via email
19
21
-`dummy.py` — No-op handlers used for testing and examples
20
-
-`scripts/` — utilities used by handlers to run or wrap platform scripts
21
22
22
23
## Referencing handlers
23
24
24
25
A handler is referenced by its importable class path in the `protocol.name` field, for example:
- Handlers vary in their required `protocol.credentials`. Consult individual handler docstrings and tests for exact fields.
52
+
- Handlers vary in their required `protocol.credentials`. Consult individual handler schemas and tests for exact fields.
52
53
- For networked handlers, ensure integration test services are available in `test/docker-compose.yml`.
53
54
-`local` handler is safe to use in CI for unit tests; it avoids external network dependencies.
54
55
@@ -57,7 +58,3 @@ A handler is referenced by its importable class path in the `protocol.name` fiel
57
58
1. Implement a concrete class in `src/opentaskpy/remotehandlers/` that subclasses the appropriate abstract base (see `remotehandler.py`).
58
59
2. Add schema entries if your handler expects new protocol fields.
59
60
3. Write unit tests and, if appropriate, a small integration test using `test/docker-compose.yml` fixtures.
60
-
61
-
## Where to find tests
62
-
63
-
- See `tests/test_plugin_file.py`, `tests/test_remotehandler.py`, and other tests that exercise handlers for examples of usage and expected return values.
- Invoke transfer handler methods for listing, pulling, pushing, and final move to destination
16
18
- Apply post-copy actions (move, delete) as configured
17
-
-`batch.py` — `BatchTaskHandler`: orchestrates multiple sub-tasks (either executionor transfer). Useful for multi-target deployments or multi-step workflows.
19
+
-`batch.py` — `BatchTaskHandler`: orchestrates multiple sub-tasks (execution, transfer, or batch). Useful for multi-step workflows.
18
20
19
-
Design notes
21
+
## Design notes
20
22
21
23
- Each handler focuses on orchestration; concrete remote behavior lives in `remotehandlers` implementations.
22
-
- Handlers must return structured result objects for consistent test assertions. Typical result fields include: `success` (bool), `result` (dict), `errors` (list), and `logs` (list) or a flattened `stdout`/`stderr` pair.
23
24
- Handlers should be resilient to partial failures in batch operations and should provide granular results per sub-task.
24
25
25
-
Examples
26
-
27
-
Invoke an `execution` task programmatically (pseudo-code):
28
-
29
-
```py
30
-
from opentaskpy.taskhandlers.taskhandler import TaskHandler
31
-
32
-
manifest = {...} # validated dict
33
-
handler = TaskHandler()
34
-
result = handler.handle(manifest)
35
-
print(result)
36
-
```
37
-
38
-
Testing handlers
39
-
40
-
- Unit tests should mock remote handlers where possible (use `dummy.py` handler).
41
-
- Integration tests can instantiate real handlers against local test services defined in `test/docker-compose.yml`.
42
-
43
-
Batch handling details (flow and options)
26
+
## Batch handling details (flow and options)
44
27
45
28
The `Batch` handler orchestrates multiple sub-tasks defined in a `batch` task manifest. Each batch task entry contains control fields that determine ordering, dependency, failure handling, timeout, and rerun behavior.
46
29
@@ -56,31 +39,25 @@ Key fields available for each sub-task in the batch manifest:
56
39
Batch orchestration behavior:
57
40
58
41
- Ordering: `batch.tasks` are sorted by `order_id`.
59
-
- Dependencies: a task will not be started until all dependencies' statuses are in `COMPLETED`.
42
+
- Dependencies: a task will not be started until all dependencies' statuses are `COMPLETED`.
60
43
- Execution model: each runnable task is started in a separate thread. The batch loop polls task statuses and enforces timeouts.
61
44
- Failure semantics:
62
-
- If a task fails and `continue_on_fail` is false, the sub-task is marked FAILED and the batch will not proceed with tasks that depend on it. The overall batch will ultimately return a non-zero exit code.
63
-
- If `continue_on_fail` is true, the sub-task is marked COMPLETED and the batch continues.
45
+
- If a task fails and `continue_on_fail` is false (the default), the sub-task is marked FAILED and the batch will not proceed with tasks that depend on it. The overall batch will ultimately return a non-zero exit code.
46
+
- If `continue_on_fail` is true, the sub-task is marked COMPLETED and the batch continues. The overall batch will still return a non-zero exit code.
64
47
- Restart / resume semantics:
65
48
- On startup the batch inspects the most recent batch log file to locate `__OTF_BATCH_TASK_MARKER__` marks to determine which tasks previously completed. Tasks marked as completed are skipped unless `retry_on_rerun` is true.
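The ordering, dependency, and failure rules above can be illustrated with a small simulation. This is a simplified sketch, not the `Batch` implementation: it runs tasks sequentially instead of in threads, assumes `dependencies` refer to `order_id` values, and uses a hypothetical `outcomes` mapping and `NOT_STARTED` status in place of real task execution.

```python
def run_batch(tasks, outcomes):
    """Simulate ordering, dependency, and continue_on_fail handling.

    tasks: list of dicts with 'order_id' and optional 'dependencies'
           and 'continue_on_fail' keys.
    outcomes: maps order_id -> True (success) or False (failure); a
              hypothetical stand-in for actually running each task.
    Returns (statuses, exit_code).
    """
    statuses = {}
    exit_code = 0
    # Ordering: tasks are processed in order_id order.
    for task in sorted(tasks, key=lambda t: t["order_id"]):
        oid = task["order_id"]
        deps = task.get("dependencies", [])
        # Dependencies: start only once every dependency is COMPLETED.
        if any(statuses.get(dep) != "COMPLETED" for dep in deps):
            statuses[oid] = "NOT_STARTED"  # illustrative status name
            continue
        if outcomes[oid]:
            statuses[oid] = "COMPLETED"
        else:
            # Any failure makes the overall batch exit code non-zero.
            exit_code = 1
            if task.get("continue_on_fail", False):
                statuses[oid] = "COMPLETED"
            else:
                statuses[oid] = "FAILED"
    return statuses, exit_code


tasks = [
    {"order_id": 1, "continue_on_fail": True},
    {"order_id": 2, "dependencies": [1]},
    {"order_id": 3},
    {"order_id": 4, "dependencies": [3]},
]
outcomes = {1: False, 2: True, 3: False, 4: True}
statuses, exit_code = run_batch(tasks, outcomes)
print(statuses, exit_code)
```

In this run, task 2 still starts because task 1 sets `continue_on_fail`, while task 4 never starts because task 3 failed without it; the exit code is non-zero in both cases.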
66
49
67
-
Killing and timeouts:
50
+
### Killing and timeouts
68
51
69
-
- A sub-task running longer than `timeout` will be marked `TIMED_OUT`. The batch will set the sub-task's `kill_event` and wait for the thread to stop; if it does not stop in time, the thread is canceled.
52
+
- A sub-task running longer than `timeout` will be marked `TIMED_OUT`. The batch will set the sub-task's `kill_event` and wait for the thread to stop; if it does not stop in time, the thread is cancelled.
70
53
- A global kill_event passed to `Batch.run(kill_event)` will stop all running sub-tasks gracefully by setting each sub-task's `kill_event`.
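The timeout and `kill_event` interaction described above follows a common cooperative-cancellation pattern. The sketch below shows that pattern with `threading.Event`; the `worker` and `run_with_timeout` functions are hypothetical and do not reflect the actual `Batch` internals.

```python
import threading
import time


def worker(kill_event: threading.Event, results: dict) -> None:
    """Pretend to do long-running work until asked to stop."""
    # Check the event regularly so the task can stop promptly.
    while not kill_event.is_set():
        time.sleep(0.01)
    results["stopped"] = True


def run_with_timeout(timeout: float) -> str:
    """Run the worker in a thread and enforce a timeout on it."""
    kill_event = threading.Event()
    results = {}
    thread = threading.Thread(
        target=worker, args=(kill_event, results), daemon=True
    )
    thread.start()
    thread.join(timeout)  # wait up to `timeout` seconds
    if thread.is_alive():
        # Timed out: signal the worker, then give it a moment to exit.
        kill_event.set()
        thread.join(1.0)
        return "TIMED_OUT"
    return "COMPLETED"


status = run_with_timeout(0.05)
print(status)
```

Because the worker only stops when it notices the event, a task that never checks its `kill_event` cannot be shut down gracefully this way.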
71
54
72
-
Logging and resumption
55
+
### Logging and resumption
73
56
74
-
- The batch writes ordered log markers using `__OTF_BATCH_TASK_MARKER__: ORDER_ID::<order_id>::TASK::<task_id>::<status>` so reruns can detect state and take appropriate action.
57
+
- The batch writes ordered log markers using `__OTF_BATCH_TASK_MARKER__: ORDER_ID::<order_id>::TASK::<task_id>::<status>` so reruns can detect state and take appropriate action. Note that this only applies to runs that fail gracefully. If a batch is killed (for example, via a kill command), it cannot rename its log file with the `_failed` suffix, so that run is not taken into account for resumption. If log files from earlier failed runs exist, the batch uses those instead; otherwise it starts from scratch.
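To make the marker format concrete, the following sketch parses marker lines and collects the tasks whose latest status is `COMPLETED`. It is an illustration only: the regular expression assumes a numeric `order_id`, the sample log lines and task names are invented, and the real resumption logic may differ.

```python
import re

# Assumes order_id is numeric and the status is a single word.
MARKER_RE = re.compile(
    r"__OTF_BATCH_TASK_MARKER__: "
    r"ORDER_ID::(?P<order_id>\d+)::TASK::(?P<task_id>.+)::(?P<status>\w+)"
)


def completed_order_ids(log_lines):
    """Return the order_ids whose most recent marker is COMPLETED."""
    latest = {}
    for line in log_lines:
        match = MARKER_RE.search(line)
        if match:
            # Later markers for the same order_id overwrite earlier ones.
            latest[int(match.group("order_id"))] = match.group("status")
    return {oid for oid, status in latest.items() if status == "COMPLETED"}


# Invented sample log lines for illustration.
sample_log = [
    "2024-01-01 10:00:00 __OTF_BATCH_TASK_MARKER__: ORDER_ID::1::TASK::copy-files::COMPLETED",
    "2024-01-01 10:00:05 __OTF_BATCH_TASK_MARKER__: ORDER_ID::2::TASK::run-report::FAILED",
    "2024-01-01 10:00:06 some unrelated log line",
]
done = completed_order_ids(sample_log)
print(done)
```

On a rerun, tasks in the returned set would be the candidates to skip, subject to `retry_on_rerun`.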
75
58
76
-
Best practices when authoring batch manifests
59
+
### Best practices when authoring batch definitions
77
60
78
-
- Prefer explicit `dependencies` for complex DAGs rather than relying purely on ordering.
79
-
- Set reasonable `timeout` values for long-running jobs and ensure handlers support graceful shutdown when `kill_event` is set.
61
+
- Prefer explicit `dependencies` for complex tasks rather than relying purely on ordering.
62
+
- Set reasonable `timeout` values for long-running jobs and ensure handlers support graceful shutdown when `kill_event` is set. Ensure the timeout is longer than any filewatch timeout defined within a transfer; otherwise the transfer will be killed before the filewatch has finished.
80
63
- Use `continue_on_fail` only when downstream tasks are tolerant of upstream failures.