Update docs

adammcdonagh · adammcdonagh · commit 954dd5d6427c · 2025-10-31T15:54:06.000Z
diff --git a/docs/overview.md b/docs/overview.md
@@ -1,15 +1,12 @@
-# Open Task Framework — Overview
+- [Files in this docs folder](#files-in-this-docs-folder)
+- [Task types overview](#task-types-overview)
+- [Transfers](#transfers)
+- [Executions](#executions)
+- [Batches](#batches)
+- [Templates and variable resolution](#templates-and-variable-resolution)
 
 This folder contains documentation Open Task Framework (OTF). Use these docs to understand core concepts, the package layout, how plugins and handlers are structured, and concrete examples for using built-in handlers.
 
-## Table of contents
-
-- [Architecture](./architecture.md)
-- [Remote handlers](./remotehandlers.md)
-- [Task handlers](./taskhandlers.md)
-- [Plugins](./plugins.md)
-- [Lookup plugins](./plugins/lookup.md)
-
 ## Files in this docs folder
 
 - `architecture.md` — high-level architecture and component responsibilities
@@ -18,9 +15,40 @@ This folder contains documentation Open Task Framework (OTF). Use these docs to
 - `plugins.md` — index of built-in plugins and how to author new ones
 - `plugins/lookup.md` — details for the lookup plugin family
 
-## Quick start
+## Task types overview
+
+As mentioned in the [README.md](../README.md), OTF supports three main task types: Transfers, Executions and Batches. Task payloads are JSON-based (either plain `.json` or Jinja2 `.json.j2` templates). See `docs/architecture.md` for the full rendering/parsing pipeline.
+
+## Transfers
+
+Transfers move files from a single source to one or more destinations. Supported protocols include SFTP/SSH and local filesystem handlers. Key features:
+
+- File polling and filewatch (wait for files to appear)
+- Log watching for specific patterns
+- Conditional selection by file size, age, and count
+- Post-copy actions: archive, delete, or move source files
+
+Transfers can operate in "direct" mode (remote-to-remote where supported) or via a local staging step if protocols differ.
+
+## Executions
+
+Execution tasks run commands on remote hosts (via SSH or local execution handlers). Execution handlers capture stdout, stderr, exit code and may include PID tokenization for advanced lifecycle management (useful for `kill`).
+
+## Batches
+
+Batches compose multiple tasks (executions, transfers, or nested batches). They support:
+
+- Ordered execution using `order_id`
+- Explicit `dependencies` for DAG-style control
+- `timeout`, `continue_on_fail`, and `retry_on_rerun` per-task options
+- Resumption via log markers so batches can be rerun from the last known state
+
+Batches are documented in more detail in `docs/taskhandlers.md`.
+
+## Templates and variable resolution
 
-1. Read `architecture.md` to understand the components.
-2. Inspect `src/opentaskpy/remotehandlers` and `src/opentaskpy/plugins` for concrete handler implementations and examples.
+- All task/config payloads are JSON-based. Files are either plain `.json` or Jinja2 templates with `.json.j2`.
+- The loader pipeline renders `.json.j2` templates using Jinja2 and available plugin helpers, then parses the rendered text as JSON and validates against `src/opentaskpy/config/schemas/`.
+- When editing templates, ensure the rendered output is valid JSON and that required variables are present in the rendering context.
 
-If you want additional docs (diagrams, developer onboarding checklist, or API docs), tell me which and I will add them.
+If you want this page removed instead of updated, tell me and I'll delete it.
diff --git a/docs/plugins.md b/docs/plugins.md
@@ -1,45 +1,29 @@
-# Plugins — built-in plugin index and guidance
+# Lookup Plugins — built-in plugin index and guidance
 
 Plugins are small, reusable helpers that can be referenced in task payloads to compute values at runtime (for example: lookups, templating helpers). Plugins live under `src/opentaskpy/plugins/` and are intentionally lightweight.
 
 ## Table of contents
 
-- [Built-in plugin families](#built-in-plugin-families)
-- [Where plugins are used](#where-plugins-are-used)
-- [How to author a plugin](#how-to-author-a-plugin)
-- [Example usage](#example-usage)
-- [Notes](#notes)
+- [Lookup Plugins — built-in plugin index and guidance](#lookup-plugins--built-in-plugin-index-and-guidance)
+  - [Table of contents](#table-of-contents)
+  - [Built-in plugins](#built-in-plugins)
+  - [How to write a plugin](#how-to-write-a-plugin)
+  - [Notes](#notes)
 
-## Built-in plugin families
+## Built-in plugins
 
-- `lookup` — helpers that resolve values from different sources. Built-in lookup plugins include:
-  - `lookup.file` — read a value from a local file
-  - `lookup.http_json` — fetch JSON over HTTP and extract a value
-  - `lookup.random_number` — generate a random number (useful for tests)
+- `lookup.file` — read a value from a local file
+- `lookup.http_json` — fetch JSON over HTTP and extract a value
+- `lookup.random_number` — generate a random number (useful for tests)
 
-## Where plugins are used
+These are included primarily for demonstration purposes; they are unlikely to be useful in production.
 
-- Variable interpolation and templating across tasks and examples under `examples/`
-- Tests that need small deterministic helpers without external dependencies
+## How to write a plugin
 
-## How to author a plugin
-
-1. Create a new module under `src/opentaskpy/plugins/<family>/`.
-2. Export a callable that accepts a plugin configuration dict and returns a value or raises a descriptive exception.
-3. Add unit tests under `tests/` covering expected inputs and errors.
-
-## Example usage
-
-Example — using `lookup.http_json` (pseudo-config)
-
-```yaml
-someVar: !lookup.http_json
-  url: "https://api.example.com/data"
-  path: "items[0].id"
-```
+1. Create a new module in your own configuration under a directory named `plugins`. Plugins are auto-discovered if they live under your configuration directory.
+2. Write a function that performs the task you need and returns the appropriate result, optionally taking arguments from the Jinja template.
 
 ## Notes
 
-- Keep plugins deterministic when possible to make tests reliable.
-- Avoid long-running IO in plugins used by unit tests; mock external calls in tests.
+- Plugins should be very simple and return almost immediately. Unless lazy loading is used, they are typically called on every task execution, so slow calls will slow down startup for everything.
 - Plugins should validate their input and raise clear exceptions for missing fields or bad responses.
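The "How to write a plugin" steps in `docs/plugins.md` could be illustrated with a minimal sketch like the one below. It assumes, without confirmation from this diff, that a plugin module exposes a single `run(**kwargs)` function whose keyword arguments come from the Jinja template call; the `path` argument name and the file name `plugins/file_value.py` are hypothetical.

```python
# plugins/file_value.py (hypothetical module name)
from pathlib import Path


def run(**kwargs: str) -> str:
    """Return the stripped contents of a small text file.

    Assumption: the framework calls run() with keyword arguments taken
    from the template; the 'path' argument is illustrative only.
    """
    # Validate input and raise a clear exception for missing fields.
    if "path" not in kwargs:
        raise ValueError("missing required argument: 'path'")

    path = Path(kwargs["path"])
    if not path.is_file():
        raise FileNotFoundError(f"lookup file not found: {path}")

    # Keep the work small so the plugin returns almost immediately.
    return path.read_text(encoding="utf-8").strip()
```

The sketch follows the notes above: it validates its input, raises descriptive exceptions, and does no slow I/O beyond a single local read.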
diff --git a/docs/plugins/lookup.md b/docs/plugins/lookup.md
diff --git a/docs/remotehandlers.md b/docs/remotehandlers.md
@@ -4,28 +4,29 @@ This document describes the built-in remote handlers found in `src/opentaskpy/re
 
 ## Table of contents
 
-- [Available handlers](#available-handlers)
-- [Referencing handlers](#referencing-handlers)
-- [Notes and caveats](#notes-and-caveats)
-- [Extending or adding a handler](#extending-or-adding-a-handler)
-- [Where to find tests](#where-to-find-tests)
+- [Remote Handlers — built-in implementations](#remote-handlers--built-in-implementations)
+  - [Table of contents](#table-of-contents)
+  - [Available handlers](#available-handlers)
+  - [Referencing handlers](#referencing-handlers)
+    - [Example: execution task using SSH handler](#example-execution-task-using-ssh-handler)
+  - [Notes and caveats](#notes-and-caveats)
+  - [Extending or adding a handler](#extending-or-adding-a-handler)
 
 ## Available handlers
 
-- `ssh.py` — SSHExecution and SSHTransfer helpers (depends on local SSH client or paramiko-like behavior). Used for remote command execution and staging file transfers.
+- `ssh.py` — SSHExecution and SSHTransfer helpers (depends on local SSH client or paramiko-like behavior). Used for remote command execution and file transfers.
 - `sftp.py` — SFTPTransfer: file transfer over SFTP
 - `local.py` — LocalTransfer/Execution: runs commands and moves files on the local filesystem (useful for testing and local workflows)
 - `email.py` — Email transfer helper for sending files via email
 - `dummy.py` — No-op handlers used for testing and examples
-- `scripts/` — utilities used by handlers to run or wrap platform scripts
 
 ## Referencing handlers
 
 A handler is referenced by its importable class path in the `protocol.name` field, for example:
 
 ```json
 "protocol": {
-  "name": "opentaskpy.remotehandlers.sftp.SFTPTransfer",
+  "name": "sftp",
   "credentials": { "username": "user", "password": "pw" }
 }
 ```
@@ -40,15 +41,15 @@ A handler is referenced by its importable class path in the `protocol.name` fiel
   "directory": "/tmp",
   "command": "ls -la",
   "protocol": {
-    "name": "opentaskpy.remotehandlers.ssh.SSHExecution",
+    "name": "ssh",
     "credentials": { "username": "test", "password": "pw" }
   }
 }
 ```
 
 ## Notes and caveats
 
-- Handlers vary in their required `protocol.credentials`. Consult individual handler docstrings and tests for exact fields.
+- Handlers vary in their required `protocol.credentials`. Consult individual handler schemas and tests for exact fields.
 - For networked handlers, ensure integration test services are available in `test/docker-compose.yml`.
 - `local` handler is safe to use in CI for unit tests; it avoids external network dependencies.
 
@@ -57,7 +58,3 @@ A handler is referenced by its importable class path in the `protocol.name` fiel
 1. Implement a concrete class in `src/opentaskpy/remotehandlers/` that subclasses the appropriate abstract base (see `remotehandler.py`).
 2. Add schema entries if your handler expects new protocol fields.
 3. Write unit tests and, if appropriate, a small integration test using `test/docker-compose.yml` fixtures.
-
-## Where to find tests
-
-- See `tests/test_plugin_file.py`, `tests/test_remotehandler.py`, and other tests that exercise handlers for examples of usage and expected return values.
diff --git a/docs/taskhandlers.md b/docs/taskhandlers.md
@@ -5,42 +5,25 @@ This document explains the responsibilities and usage patterns for the core task
 Core handlers
 
 - `taskhandler.py` — a lightweight facade that selects and invokes a concrete handler based on the incoming task's `type` field.
-- `execution.py` — `ExecutionTaskHandler`: handles `execution` tasks. Responsibilities:
+- `execution.py` — `ExecutionTaskHandler`: handles `execution` tasks. Responsibilities include:
+
   - Validate the task manifest (schema)
   - Instantiate the configured `Execution` handler
   - Execute the command remotely or locally, stream output, collect results, and handle process termination if requested
-- `transfer.py` — `TransferTaskHandler`: handles `transfer` tasks. Responsibilities:
+
+- `transfer.py` — `TransferTaskHandler`: handles `transfer` tasks. Responsibilities include:
   - Validate transfer payloads
   - Handle staging directories (worker staging)
   - Invoke transfer handler methods for listing, pulling, pushing, and final move to destination
   - Apply post-copy actions (move, delete) as configured
-- `batch.py` — `BatchTaskHandler`: orchestrates multiple sub-tasks (either execution or transfer). Useful for multi-target deployments or multi-step workflows.
+- `batch.py` — `BatchTaskHandler`: orchestrates multiple sub-tasks (execution, transfer, or batch). Useful for multi-step workflows.
 
-Design notes
+## Design notes
 
 - Each handler focuses on orchestration; concrete remote behavior lives in `remotehandlers` implementations.
-- Handlers must return structured result objects for consistent test assertions. Typical result fields include: `success` (bool), `result` (dict), `errors` (list), and `logs` (list) or a flattened `stdout`/`stderr` pair.
 - Handlers should be resilient to partial failures in batch operations and should provide granular results per sub-task.
 
-Examples
-
-Invoke an `execution` task programmatically (pseudo-code):
-
-```py
-from opentaskpy.taskhandlers.taskhandler import TaskHandler
-
-manifest = {...}  # validated dict
-handler = TaskHandler()
-result = handler.handle(manifest)
-print(result)
-```
-
-Testing handlers
-
-- Unit tests should mock remote handlers where possible (use `dummy.py` handler).
-- Integration tests can instantiate real handlers against local test services defined in `test/docker-compose.yml`.
-
-Batch handling details (flow and options)
+## Batch handling details (flow and options)
 
 The `Batch` handler orchestrates multiple sub-tasks defined in a `batch` task manifest. Each batch task entry contains control fields that determine ordering, dependency, failure handling, timeout, and rerun behavior.
 
@@ -56,31 +39,25 @@ Key fields available for each sub-task in the batch manifest:
 Batch orchestration behavior:
 
 - Ordering: `batch.tasks` are sorted by `order_id`.
-- Dependencies: a task will not be started until all dependencies' statuses are in `COMPLETED`.
+- Dependencies: a task will not be started until all dependencies' statuses are `COMPLETED`.
 - Execution model: each runnable task is started in a separate thread. The batch loop polls task statuses and enforces timeouts.
 - Failure semantics:
-  - If a task fails and `continue_on_fail` is false, the sub-task is marked FAILED and the batch will not proceed with tasks that depend on it. The overall batch will ultimately return a non-zero exit code.
-  - If `continue_on_fail` is true, the sub-task is marked COMPLETED and the batch continues.
+  - If a task fails and `continue_on_fail` is false (the default), the sub-task is marked FAILED and the batch will not proceed with tasks that depend on it. The overall batch will ultimately return a non-zero exit code.
+  - If `continue_on_fail` is true, the sub-task is marked COMPLETED and the batch continues. The overall batch will still return a non-zero exit code.
 - Restart / resume semantics:
   - On startup the batch inspects the most recent batch log file to locate `__OTF_BATCH_TASK_MARKER__` marks to determine which tasks previously completed. Tasks marked as completed are skipped unless `retry_on_rerun` is true.
 
-Killing and timeouts:
+### Killing and timeouts
 
-- A sub-task running longer than `timeout` will be marked `TIMED_OUT`. The batch will set the sub-task's `kill_event` and wait for the thread to stop; if it does not stop in time, the thread is canceled.
+- A sub-task running longer than `timeout` will be marked `TIMED_OUT`. The batch will set the sub-task's `kill_event` and wait for the thread to stop; if it does not stop in time, the thread is cancelled.
 - A global kill_event passed to `Batch.run(kill_event)` will stop all running sub-tasks gracefully by setting each sub-task's `kill_event`.
 
-Logging and resumption
+### Logging and resumption
 
-- The batch writes ordered log markers using `__OTF_BATCH_TASK_MARKER__: ORDER_ID::<order_id>::TASK::<task_id>::<status>` so reruns can detect state and take appropriate action.
+- The batch writes ordered log markers using `__OTF_BATCH_TASK_MARKER__: ORDER_ID::<order_id>::TASK::<task_id>::<status>` so reruns can detect state and take appropriate action. Note that this only applies to runs that fail gracefully. If a batch is killed via a kill command, for example, it cannot rename its log file with the `_failed` suffix, so that run is not taken into account for resumption. If there are previously failed log files, the batch will use those instead; otherwise it starts from scratch.
 
-Best practices when authoring batch manifests
+### Best practices when authoring batch definitions
 
-- Prefer explicit `dependencies` for complex DAGs rather than relying purely on ordering.
-- Set reasonable `timeout` values for long-running jobs and ensure handlers support graceful shutdown when `kill_event` is set.
+- Prefer explicit `dependencies` for complex tasks rather than relying purely on ordering.
+- Set reasonable `timeout` values for long-running jobs and ensure handlers support graceful shutdown when `kill_event` is set. Ensure the timeout is longer than any file watch timeouts defined within transfers; otherwise the transfers will be killed before file watching has finished.
 - Use `continue_on_fail` only when downstream tasks are tolerant of upstream failures.
-
-Where to find related tests
-
-- `tests/test_taskhandler_transfer_dummy.py`
-- `tests/test_taskhandler_execution_local.py`
-- `tests/test_taskhandler_batch.py`
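
The log-marker format described under "Logging and resumption" in `docs/taskhandlers.md` can be read back with a short parser. The following is only a sketch of the idea, not the framework's actual resumption code: the function names are hypothetical, and it assumes `order_id` is an integer and that statuses are upper-case words such as `COMPLETED`, `FAILED`, or `TIMED_OUT`.

```python
import re
from typing import Dict, Set

# Marker layout taken from the docs:
# __OTF_BATCH_TASK_MARKER__: ORDER_ID::<order_id>::TASK::<task_id>::<status>
MARKER_RE = re.compile(
    r"__OTF_BATCH_TASK_MARKER__: "
    r"ORDER_ID::(?P<order_id>\d+)::TASK::(?P<task_id>.+?)::(?P<status>[A-Z_]+)\s*$"
)


def last_statuses(log_text: str) -> Dict[int, str]:
    """Map each order_id to the last status recorded for it in the log."""
    statuses: Dict[int, str] = {}
    for line in log_text.splitlines():
        match = MARKER_RE.search(line)
        if match:
            # Later markers overwrite earlier ones for the same order_id.
            statuses[int(match.group("order_id"))] = match.group("status")
    return statuses


def tasks_to_skip(statuses: Dict[int, str], retry_on_rerun: Dict[int, bool]) -> Set[int]:
    """Return order_ids that completed previously and are not flagged for retry."""
    return {
        order_id
        for order_id, status in statuses.items()
        if status == "COMPLETED" and not retry_on_rerun.get(order_id, False)
    }
```

Given a previous log, `tasks_to_skip(last_statuses(log_text), {...})` yields the sub-tasks a rerun could skip, which mirrors the documented rule that completed tasks are skipped unless `retry_on_rerun` is true.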