Commit 954dd5d

Update docs
1 parent 2765a77 commit 954dd5d

5 files changed: 84 additions & 156 deletions

docs/overview.md

Lines changed: 41 additions & 13 deletions
@@ -1,15 +1,12 @@
1-
# Open Task Framework — Overview
1+
- [Files in this docs folder](#files-in-this-docs-folder)
2+
- [Task types overview](#task-types-overview)
3+
- [Transfers](#transfers)
4+
- [Executions](#executions)
5+
- [Batches](#batches)
6+
- [Templates and variable resolution](#templates-and-variable-resolution)
27

38
This folder contains documentation for the Open Task Framework (OTF). Use these docs to understand core concepts, the package layout, and how plugins and handlers are structured, and to find concrete examples of using built-in handlers.
49

5-
## Table of contents
6-
7-
- [Architecture](./architecture.md)
8-
- [Remote handlers](./remotehandlers.md)
9-
- [Task handlers](./taskhandlers.md)
10-
- [Plugins](./plugins.md)
11-
- [Lookup plugins](./plugins/lookup.md)
12-
1310
## Files in this docs folder
1411

1512
- `architecture.md` — high-level architecture and component responsibilities
@@ -18,9 +15,40 @@ This folder contains documentation Open Task Framework (OTF). Use these docs to
1815
- `plugins.md` — index of built-in plugins and how to author new ones
1916
- `plugins/lookup.md` — details for the lookup plugin family
2017

21-
## Quick start
18+
## Task types overview
19+
20+
As mentioned in the [README.md](../README.md), OTF supports three main task types: Transfers, Executions and Batches. Task payloads are JSON-based (either plain `.json` or Jinja2 `.json.j2` templates). See `docs/architecture.md` for the full rendering/parsing pipeline.
21+
22+
## Transfers
23+
24+
Transfers move files from a single source to one or more destinations. Supported protocols include SFTP/SSH and local filesystem handlers. Key features:
25+
26+
- File polling and filewatch (wait for files to appear)
27+
- Log watching for specific patterns
28+
- Conditional selection by file size, age, and count
29+
- Post-copy actions: archive, delete, or move source files
30+
31+
Transfers can operate in "direct" mode (remote-to-remote where supported) or via a local staging step if protocols differ.
32+
33+
## Executions
34+
35+
Execution tasks run commands on remote hosts (via SSH or local execution handlers). Execution handlers capture stdout, stderr, and the exit code, and may include PID tokenization for advanced lifecycle management (useful for `kill`).
36+
37+
## Batches
38+
39+
Batches compose multiple tasks (executions, transfers, or nested batches). They support:
40+
41+
- Ordered execution using `order_id`
42+
- Explicit `dependencies` for DAG-style control
43+
- `timeout`, `continue_on_fail`, and `retry_on_rerun` per-task options
44+
- Resumption via log markers so batches can be rerun from the last known state
45+
46+
Batches are documented in more detail in `docs/taskhandlers.md`.
47+
48+
## Templates and variable resolution
2249

23-
1. Read `architecture.md` to understand the components.
24-
2. Inspect `src/opentaskpy/remotehandlers` and `src/opentaskpy/plugins` for concrete handler implementations and examples.
50+
- All task/config payloads are JSON-based. Files are either plain `.json` or Jinja2 templates with a `.json.j2` extension.
51+
- The loader pipeline renders `.json.j2` templates using Jinja2 and available plugin helpers, then parses the rendered text as JSON and validates against `src/opentaskpy/config/schemas/`.
52+
- When editing templates, ensure the rendered output is valid JSON and that required variables are present in the rendering context.
2553

26-
If you want additional docs (diagrams, developer onboarding checklist, or API docs), tell me which and I will add them.
54+
If you want this page removed instead of updated, tell me and I'll delete it.
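The render-then-parse flow described under "Templates and variable resolution" in `docs/overview.md` can be sketched roughly as follows. This is a minimal illustration, not OTF's loader: the `render` function is a toy stand-in for Jinja2 that only substitutes `{{ name }}` placeholders, `load_task` is a hypothetical helper name, and schema validation is left out entirely.

```python
import json
import re

# Matches simple {{ name }} placeholders only. Real Jinja2 supports far more.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template_text: str, context: dict) -> str:
    """Toy stand-in for Jinja2 rendering (assumption, not the OTF loader)."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in context:
            # A required variable is absent from the rendering context.
            raise KeyError(f"missing template variable: {name}")
        return str(context[name])

    return PLACEHOLDER.sub(substitute, template_text)


def load_task(template_text: str, context: dict) -> dict:
    """Render the template, then parse the rendered text as JSON."""
    rendered = render(template_text, context)
    try:
        return json.loads(rendered)
    except json.JSONDecodeError as exc:
        raise ValueError(f"rendered template is not valid JSON: {exc}") from exc


template = '{"type": "execution", "directory": "{{ work_dir }}", "command": "ls -la"}'
task = load_task(template, {"work_dir": "/tmp"})
print(task["directory"])
```

Failing early on a missing variable or on invalid rendered JSON mirrors the two checks the section asks template authors to make before a task is run.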

docs/plugins.md

Lines changed: 15 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -1,45 +1,29 @@
1-
# Plugins — built-in plugin index and guidance
1+
# Lookup Plugins — built-in plugin index and guidance
22

33
Plugins are small, reusable helpers that can be referenced in task payloads to compute values at runtime (for example: lookups, templating helpers). Plugins live under `src/opentaskpy/plugins/` and are intentionally lightweight.
44

55
## Table of contents
66

7-
- [Built-in plugin families](#built-in-plugin-families)
8-
- [Where plugins are used](#where-plugins-are-used)
9-
- [How to author a plugin](#how-to-author-a-plugin)
10-
- [Example usage](#example-usage)
11-
- [Notes](#notes)
7+
- [Lookup Plugins — built-in plugin index and guidance](#lookup-plugins--built-in-plugin-index-and-guidance)
8+
- [Table of contents](#table-of-contents)
9+
- [Built-in plugins](#built-in-plugins)
10+
- [How to write a plugin](#how-to-write-a-plugin)
11+
- [Notes](#notes)
1212

13-
## Built-in plugin families
13+
## Built-in plugins
1414

15-
- `lookup` — helpers that resolve values from different sources. Built-in lookup plugins include:
16-
- `lookup.file` — read a value from a local file
17-
- `lookup.http_json` — fetch JSON over HTTP and extract a value
18-
- `lookup.random_number` — generate a random number (useful for tests)
15+
- `lookup.file` — read a value from a local file
16+
- `lookup.http_json` — fetch JSON over HTTP and extract a value
17+
- `lookup.random_number` — generate a random number (useful for tests)
1918

20-
## Where plugins are used
19+
These are primarily included for demonstration purposes; they are unlikely to be useful in production.
2120

22-
- Variable interpolation and templating across tasks and examples under `examples/`
23-
- Tests that need small deterministic helpers without external dependencies
21+
## How to write a plugin
2422

25-
## How to author a plugin
26-
27-
1. Create a new module under `src/opentaskpy/plugins/<family>/`.
28-
2. Export a callable that accepts a plugin configuration dict and returns a value or raises a descriptive exception.
29-
3. Add unit tests under `tests/` covering expected inputs and errors.
30-
31-
## Example usage
32-
33-
Example — using `lookup.http_json` (pseudo-config)
34-
35-
```yaml
36-
someVar: !lookup.http_json
37-
url: "https://api.example.com/data"
38-
path: "items[0].id"
39-
```
23+
1. Create a new module in your own configuration under a directory named `plugins`. Plugins are auto-discovered if they live under your configuration directory.
24+
2. Write a function that performs the work you need and returns the appropriate result, optionally taking arguments from the Jinja template.
4025

4126
## Notes
4227

43-
- Keep plugins deterministic when possible to make tests reliable.
44-
- Avoid long-running IO in plugins used by unit tests; mock external calls in tests.
28+
- Plugins should be very simple and return almost immediately. They are often called on every task execution (unless lazy loading is used), so slow calls delay the startup of every task.
4529
- Plugins should validate their input and raise clear exceptions for missing fields or bad responses.
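As an illustration of the plugin guidance in `docs/plugins.md`, a small lookup-style plugin might look like the sketch below. The module-level `plugin_name`, the `run(**kwargs)` entry point, and the `path` argument are assumptions made for this example; check the built-in plugins under `src/opentaskpy/plugins/` for the actual convention.

```python
from pathlib import Path

# Hypothetical plugin name, used here only for error messages.
plugin_name = "first_line"


def run(**kwargs) -> str:
    """Return the first line of the file named by the `path` argument.

    Arguments are assumed to arrive as keyword arguments from the template.
    """
    # Validate input and fail with a clear message, as the notes recommend.
    if "path" not in kwargs:
        raise ValueError(f"{plugin_name} plugin requires a 'path' argument")

    path = Path(kwargs["path"])
    if not path.is_file():
        raise FileNotFoundError(f"{plugin_name} plugin: no such file: {path}")

    # Read a single line only, so the call returns quickly even for large files.
    with path.open(encoding="utf-8") as handle:
        return handle.readline().rstrip("\n")
```

The function does one cheap local read and nothing else, which keeps it in line with the note that plugins should return almost immediately.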

docs/plugins/lookup.md

Lines changed: 0 additions & 58 deletions
This file was deleted.

docs/remotehandlers.md

Lines changed: 11 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -4,28 +4,29 @@ This document describes the built-in remote handlers found in `src/opentaskpy/re
44

55
## Table of contents
66

7-
- [Available handlers](#available-handlers)
8-
- [Referencing handlers](#referencing-handlers)
9-
- [Notes and caveats](#notes-and-caveats)
10-
- [Extending or adding a handler](#extending-or-adding-a-handler)
11-
- [Where to find tests](#where-to-find-tests)
7+
- [Remote Handlers — built-in implementations](#remote-handlers--built-in-implementations)
8+
- [Table of contents](#table-of-contents)
9+
- [Available handlers](#available-handlers)
10+
- [Referencing handlers](#referencing-handlers)
11+
- [Example: execution task using SSH handler](#example-execution-task-using-ssh-handler)
12+
- [Notes and caveats](#notes-and-caveats)
13+
- [Extending or adding a handler](#extending-or-adding-a-handler)
1214

1315
## Available handlers
1416

15-
- `ssh.py` — SSHExecution and SSHTransfer helpers (depends on local SSH client or paramiko-like behavior). Used for remote command execution and staging file transfers.
17+
- `ssh.py` — SSHExecution and SSHTransfer helpers (depends on local SSH client or paramiko-like behavior). Used for remote command execution and file transfers.
1618
- `sftp.py` — SFTPTransfer: file transfer over SFTP
1719
- `local.py` — LocalTransfer/Execution: runs commands and moves files on the local filesystem (useful for testing and local workflows)
1820
- `email.py` — Email transfer helper for sending files via email
1921
- `dummy.py` — No-op handlers used for testing and examples
20-
- `scripts/` — utilities used by handlers to run or wrap platform scripts
2122

2223
## Referencing handlers
2324

2425
A handler is referenced by name in the `protocol.name` field, for example:
2526

2627
```json
2728
"protocol": {
28-
"name": "opentaskpy.remotehandlers.sftp.SFTPTransfer",
29+
"name": "sftp",
2930
"credentials": { "username": "user", "password": "pw" }
3031
}
3132
```
@@ -40,15 +41,15 @@ A handler is referenced by its importable class path in the `protocol.name` fiel
4041
"directory": "/tmp",
4142
"command": "ls -la",
4243
"protocol": {
43-
"name": "opentaskpy.remotehandlers.ssh.SSHExecution",
44+
"name": "ssh",
4445
"credentials": { "username": "test", "password": "pw" }
4546
}
4647
}
4748
```
4849

4950
## Notes and caveats
5051

51-
- Handlers vary in their required `protocol.credentials`. Consult individual handler docstrings and tests for exact fields.
52+
- Handlers vary in their required `protocol.credentials`. Consult individual handler schemas and tests for exact fields.
5253
- For networked handlers, ensure integration test services are available in `test/docker-compose.yml`.
5354
- `local` handler is safe to use in CI for unit tests; it avoids external network dependencies.
5455

@@ -57,7 +58,3 @@ A handler is referenced by its importable class path in the `protocol.name` fiel
5758
1. Implement a concrete class in `src/opentaskpy/remotehandlers/` that subclasses the appropriate abstract base (see `remotehandler.py`).
5859
2. Add schema entries if your handler expects new protocol fields.
5960
3. Write unit tests and, if appropriate, a small integration test using `test/docker-compose.yml` fixtures.
60-
61-
## Where to find tests
62-
63-
- See `tests/test_plugin_file.py`, `tests/test_remotehandler.py`, and other tests that exercise handlers for examples of usage and expected return values.

docs/taskhandlers.md

Lines changed: 17 additions & 40 deletions
Original file line numberDiff line numberDiff line change
@@ -5,42 +5,25 @@ This document explains the responsibilities and usage patterns for the core task
55
Core handlers
66

77
- `taskhandler.py` — a lightweight facade that selects and invokes a concrete handler based on the incoming task's `type` field.
8-
- `execution.py` (`ExecutionTaskHandler`): handles `execution` tasks. Responsibilities:
8+
- `execution.py` (`ExecutionTaskHandler`): handles `execution` tasks. Responsibilities include:
9+
910
- Validate the task manifest (schema)
1011
- Instantiate the configured `Execution` handler
1112
- Execute the command remotely or locally, stream output, collect results, and handle process termination if requested
12-
- `transfer.py` (`TransferTaskHandler`): handles `transfer` tasks. Responsibilities:
13+
14+
- `transfer.py` (`TransferTaskHandler`): handles `transfer` tasks. Responsibilities include:
1315
- Validate transfer payloads
1416
- Handle staging directories (worker staging)
1517
- Invoke transfer handler methods for listing, pulling, pushing, and final move to destination
1618
- Apply post-copy actions (move, delete) as configured
17-
- `batch.py` (`BatchTaskHandler`): orchestrates multiple sub-tasks (either execution or transfer). Useful for multi-target deployments or multi-step workflows.
19+
- `batch.py` (`BatchTaskHandler`): orchestrates multiple sub-tasks (execution, transfer, or batch). Useful for multi-step workflows.
1820

19-
Design notes
21+
## Design notes
2022

2123
- Each handler focuses on orchestration; concrete remote behavior lives in `remotehandlers` implementations.
22-
- Handlers must return structured result objects for consistent test assertions. Typical result fields include: `success` (bool), `result` (dict), `errors` (list), and `logs` (list) or a flattened `stdout`/`stderr` pair.
2324
- Handlers should be resilient to partial failures in batch operations and should provide granular results per sub-task.
2425

25-
Examples
26-
27-
Invoke an `execution` task programmatically (pseudo-code):
28-
29-
```py
30-
from opentaskpy.taskhandlers.taskhandler import TaskHandler
31-
32-
manifest = {...} # validated dict
33-
handler = TaskHandler()
34-
result = handler.handle(manifest)
35-
print(result)
36-
```
37-
38-
Testing handlers
39-
40-
- Unit tests should mock remote handlers where possible (use `dummy.py` handler).
41-
- Integration tests can instantiate real handlers against local test services defined in `test/docker-compose.yml`.
42-
43-
Batch handling details (flow and options)
26+
## Batch handling details (flow and options)
4427

4528
The `Batch` handler orchestrates multiple sub-tasks defined in a `batch` task manifest. Each batch task entry contains control fields that determine ordering, dependency, failure handling, timeout, and rerun behavior.
4629

@@ -56,31 +39,25 @@ Key fields available for each sub-task in the batch manifest:
5639
Batch orchestration behavior:
5740

5841
- Ordering: `batch.tasks` are sorted by `order_id`.
59-
- Dependencies: a task will not be started until all dependencies' statuses are in `COMPLETED`.
42+
- Dependencies: a task will not be started until all dependencies' statuses are `COMPLETED`.
6043
- Execution model: each runnable task is started in a separate thread. The batch loop polls task statuses and enforces timeouts.
6144
- Failure semantics:
62-
- If a task fails and `continue_on_fail` is false, the sub-task is marked FAILED and the batch will not proceed with tasks that depend on it. The overall batch will ultimately return a non-zero exit code.
63-
- If `continue_on_fail` is true, the sub-task is marked COMPLETED and the batch continues.
45+
- If a task fails and `continue_on_fail` is false (the default), the sub-task is marked FAILED and the batch will not proceed with tasks that depend on it. The overall batch will ultimately return a non-zero exit code.
46+
- If `continue_on_fail` is true, the sub-task is marked COMPLETED and the batch continues. The overall batch will still return a non-zero exit code.
6447
- Restart / resume semantics:
6548
- On startup the batch inspects the most recent batch log file for `__OTF_BATCH_TASK_MARKER__` markers to determine which tasks previously completed. Tasks marked as completed are skipped unless `retry_on_rerun` is true.
6649

67-
Killing and timeouts:
50+
### Killing and timeouts
6851

69-
- A sub-task running longer than `timeout` will be marked `TIMED_OUT`. The batch will set the sub-task's `kill_event` and wait for the thread to stop; if it does not stop in time, the thread is canceled.
52+
- A sub-task running longer than `timeout` will be marked `TIMED_OUT`. The batch will set the sub-task's `kill_event` and wait for the thread to stop; if it does not stop in time, the thread is cancelled.
7053
- A global `kill_event` passed to `Batch.run(kill_event)` will stop all running sub-tasks gracefully by setting each sub-task's `kill_event`.
7154

72-
Logging and resumption
55+
### Logging and resumption
7356

74-
- The batch writes ordered log markers using `__OTF_BATCH_TASK_MARKER__: ORDER_ID::<order_id>::TASK::<task_id>::<status>` so reruns can detect state and take appropriate action.
57+
- The batch writes ordered log markers using `__OTF_BATCH_TASK_MARKER__: ORDER_ID::<order_id>::TASK::<task_id>::<status>` so reruns can detect state and take appropriate action. Note that this only applies to runs that failed gracefully. If a batch is killed, for example via a kill command, it cannot rename its log file with the `_failed` suffix, so that run is not taken into account for resumption. If earlier failed log files exist, the batch uses those instead; otherwise it starts from scratch.
7558

76-
Best practices when authoring batch manifests
59+
### Best practices when authoring batch definitions
7760

78-
- Prefer explicit `dependencies` for complex DAGs rather than relying purely on ordering.
79-
- Set reasonable `timeout` values for long-running jobs and ensure handlers support graceful shutdown when `kill_event` is set.
61+
- Prefer explicit `dependencies` for complex tasks rather than relying purely on ordering.
62+
- Set reasonable `timeout` values for long-running jobs and ensure handlers support graceful shutdown when `kill_event` is set. Ensure each timeout is longer than any file watch timeout defined within its transfers; otherwise the transfer will be killed before the file watch has finished.
8063
- Use `continue_on_fail` only when downstream tasks are tolerant of upstream failures.
81-
82-
Where to find related tests
83-
84-
- `tests/test_taskhandler_transfer_dummy.py`
85-
- `tests/test_taskhandler_execution_local.py`
86-
- `tests/test_taskhandler_batch.py`
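The resume and dependency rules described for batches in `docs/taskhandlers.md` can be sketched as below. This is an illustrative model under stated assumptions, not the `Batch` handler's code: it assumes `dependencies` lists `order_id` values, the `INFO` log prefix is invented for the example, and the helper names (`completed_order_ids`, `initial_statuses`, `runnable_tasks`) are hypothetical. Threads, timeouts, and `continue_on_fail` are not modelled.

```python
import re

# Marker format taken from the docs:
# __OTF_BATCH_TASK_MARKER__: ORDER_ID::<order_id>::TASK::<task_id>::<status>
MARKER = re.compile(
    r"__OTF_BATCH_TASK_MARKER__: ORDER_ID::(?P<order_id>\d+)"
    r"::TASK::(?P<task_id>.+)::(?P<status>\w+)\s*$"
)


def completed_order_ids(log_lines):
    """Return order_ids whose latest marker in a previous log is COMPLETED."""
    latest = {}
    for line in log_lines:
        match = MARKER.search(line)
        if match:
            latest[int(match["order_id"])] = match["status"]
    return {oid for oid, status in latest.items() if status == "COMPLETED"}


def initial_statuses(tasks, previous_log_lines):
    """Pre-mark tasks to skip: completed last time and not retry_on_rerun."""
    done = completed_order_ids(previous_log_lines)
    return {
        task["order_id"]: "COMPLETED"
        for task in tasks
        if task["order_id"] in done and not task.get("retry_on_rerun", False)
    }


def runnable_tasks(tasks, statuses):
    """Return unstarted order_ids, ascending, whose dependencies are all COMPLETED."""
    ready = []
    for task in sorted(tasks, key=lambda t: t["order_id"]):
        if task["order_id"] in statuses:
            continue  # already completed, running, or failed
        deps = task.get("dependencies", [])
        if all(statuses.get(dep) == "COMPLETED" for dep in deps):
            ready.append(task["order_id"])
    return ready


tasks = [
    {"order_id": 1, "task_id": "fetch"},
    {"order_id": 2, "task_id": "process", "dependencies": [1]},
    {"order_id": 3, "task_id": "report", "dependencies": [2]},
]
previous_log = [
    "INFO __OTF_BATCH_TASK_MARKER__: ORDER_ID::1::TASK::fetch::COMPLETED",
    "INFO __OTF_BATCH_TASK_MARKER__: ORDER_ID::2::TASK::process::FAILED",
]
statuses = initial_statuses(tasks, previous_log)
print(runnable_tasks(tasks, statuses))  # [2]
```

On this rerun, task 1 is skipped because its marker shows `COMPLETED`, task 2 is runnable because its only dependency completed, and task 3 waits until task 2 reaches `COMPLETED`.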
