|
| 1 | +# YAML Chain Definition Format for pg_timetable |
| 2 | + |
| 3 | +This document specifies the YAML format used to define chains of scheduled tasks in pg_timetable. |
| 4 | + |
| 5 | +## YAML Schema |
| 6 | + |
| 7 | +```yaml |
| 8 | +# Top-level structure |
| 9 | +chains: |
| 10 | +  - name: "chain-name" # Required: chain_name (TEXT, unique) |
| 11 | +    schedule: "* * * * *" # Required: run_at (cron format) |
| 12 | +    live: true # Optional: live (BOOLEAN), default: false |
| 13 | +    max_instances: 1 # Optional: max_instances (INTEGER) |
| 14 | +    timeout: 30000 # Optional: timeout in milliseconds (INTEGER) |
| 15 | +    self_destruct: false # Optional: self_destruct (BOOLEAN), default: false |
| 16 | +    exclusive: false # Optional: exclusive_execution (BOOLEAN), default: false |
| 17 | +    client_name: "worker-1" # Optional: client_name (TEXT) |
| 18 | +    on_error: "SELECT log_error()" # Optional: on_error SQL (TEXT) |
| 19 | + |
| 20 | +    tasks: # Required: array of tasks |
| 21 | +      - name: "task-1" # Optional: task_name (TEXT) |
| 22 | +        kind: "SQL" # Optional: kind (SQL|PROGRAM|BUILTIN), default: SQL |
| 23 | +        command: "SELECT $1, $2" # Required: command (TEXT) |
| 24 | +        parameters: # Optional: parameters (one entry per execution) |
| 25 | +          - ["value1", 42] # First execution with these parameters |
| 26 | +          - ["value2", 99] # Second execution with different parameters |
| 27 | +        run_as: "postgres" # Optional: run_as (TEXT) - role for SET ROLE |
| 28 | +        connect_string: "postgresql://user@host/otherdb" # Optional: database_connection (TEXT) |
| 29 | +        ignore_error: false # Optional: ignore_error (BOOLEAN), default: false |
| 30 | +        autonomous: false # Optional: autonomous (BOOLEAN), default: false |
| 31 | +        timeout: 5000 # Optional: timeout in milliseconds (INTEGER) |
| 32 | + |
| 33 | +      - name: "task-2" |
| 34 | +        kind: "PROGRAM" |
| 35 | +        command: "bash" |
| 36 | +        parameters: [["-c", "echo hello"]] # One execution; the inner array holds the command-line arguments |
| 37 | +        ignore_error: true |
| 38 | +``` |
| 39 | + |
| 40 | +## Field Mappings |
| 41 | + |
| 42 | +### Chain Level |
| 43 | + |
| 44 | +| YAML Field | DB Column | Type | Default | Description | |
| 45 | +|------------|-----------|------|---------|-------------| |
| 46 | +| `name` | `chain_name` | TEXT | **required** | Unique chain identifier | |
| 47 | +| `schedule` | `run_at` | cron | **required** | Cron-style schedule | |
| 48 | +| `live` | `live` | BOOLEAN | `false` | Whether chain is active | |
| 49 | +| `max_instances` | `max_instances` | INTEGER | `null` | Max parallel instances | |
| 50 | +| `timeout` | `timeout` | INTEGER | `0` | Chain timeout (ms) | |
| 51 | +| `self_destruct` | `self_destruct` | BOOLEAN | `false` | Delete after success | |
| 52 | +| `exclusive` | `exclusive_execution` | BOOLEAN | `false` | Pause all other chains while this chain runs | |
| 53 | +| `client_name` | `client_name` | TEXT | `null` | Restrict to specific client | |
| 54 | +| `on_error` | `on_error` | TEXT | `null` | Error handling SQL | |
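
The chain-level mapping above can be sketched as a small lookup table. This is a minimal illustration, assuming the YAML has already been parsed into a Python dict; `CHAIN_FIELDS` and `chain_to_row` are hypothetical names and are not part of pg_timetable.

```python
# Hypothetical mapping: YAML key -> (DB column, default), taken from the table above.
CHAIN_FIELDS = {
    "name": ("chain_name", None),
    "schedule": ("run_at", None),
    "live": ("live", False),
    "max_instances": ("max_instances", None),
    "timeout": ("timeout", 0),
    "self_destruct": ("self_destruct", False),
    "exclusive": ("exclusive_execution", False),
    "client_name": ("client_name", None),
    "on_error": ("on_error", None),
}
REQUIRED_CHAIN_KEYS = {"name", "schedule"}


def chain_to_row(chain: dict) -> dict:
    """Translate one parsed YAML chain into a column -> value dict."""
    missing = REQUIRED_CHAIN_KEYS - chain.keys()
    if missing:
        raise ValueError(f"missing required chain fields: {sorted(missing)}")
    # Keys absent from the YAML fall back to the documented defaults.
    return {
        column: chain.get(key, default)
        for key, (column, default) in CHAIN_FIELDS.items()
    }
```

For example, `chain_to_row({"name": "daily-report", "schedule": "0 9 * * *"})` yields a row with `live` set to `False` and `timeout` set to `0`.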
| 55 | + |
| 56 | +### Task Level |
| 57 | + |
| 58 | +| YAML Field | DB Column | Type | Default | Description | |
| 59 | +|------------|-----------|------|---------|-------------| |
| 60 | +| `name` | `task_name` | TEXT | `null` | Task description | |
| 61 | +| `kind` | `kind` | ENUM | `'SQL'` | Command type (SQL/PROGRAM/BUILTIN) | |
| 62 | +| `command` | `command` | TEXT | **required** | Command to execute | |
| 63 | +| `parameters` | via `timetable.parameter` | Array of JSON values | `null` | One entry per execution; each entry is stored as its own JSONB row, ordered by `order_id` | |
| 64 | +| `run_as` | `run_as` | TEXT | `null` | Role for SET ROLE | |
| 65 | +| `connect_string` | `database_connection` | TEXT | `null` | Connection string | |
| 66 | +| `ignore_error` | `ignore_error` | BOOLEAN | `false` | Continue on error | |
| 67 | +| `autonomous` | `autonomous` | BOOLEAN | `false` | Execute outside transaction | |
| 68 | +| `timeout` | `timeout` | INTEGER | `0` | Task timeout (ms) | |
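
To illustrate how `parameters` differs from the other task fields, the sketch below expands a parsed `parameters` array into one row per entry. It assumes `order_id` starts at 1 and follows array position; that numbering, and the helper name `parameter_rows`, are assumptions made for illustration rather than documented pg_timetable behavior.

```python
import json


def parameter_rows(parameters):
    """Return (order_id, jsonb_text) pairs, one per entry of `parameters`."""
    if parameters is None:
        return []  # the task runs without parameters
    # Each entry is serialized on its own, so it can be stored as one JSONB value.
    return [
        (order_id, json.dumps(value))
        for order_id, value in enumerate(parameters, start=1)
    ]
```

With the schema example, `parameter_rows([["value1", 42], ["value2", 99]])` produces two rows, one for each execution of the task.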
| 69 | + |
| 70 | +## Task Ordering |
| 71 | + |
| 72 | +Tasks are ordered sequentially within a chain based on their array position. The system will automatically assign appropriate `task_order` values with spacing (e.g., 10, 20, 30) to allow future insertions. |
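
The spacing scheme described above can be sketched as follows. The step of 10 comes from the example values in the text; the function name `assign_task_order` is hypothetical, and the real importer may choose its values differently.

```python
def assign_task_order(tasks, step=10):
    """Return copies of `tasks` with `task_order` derived from array position."""
    # Position 0 -> 10, position 1 -> 20, ... leaving gaps for later insertions.
    return [
        {**task, "task_order": (index + 1) * step}
        for index, task in enumerate(tasks)
    ]
```

A task added later between the first and second task could then take any unused value between 10 and 20 without renumbering the rest.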
| 73 | + |
| 74 | +## Examples |
| 75 | + |
| 76 | +### Simple SQL Job |
| 77 | + |
| 78 | +```yaml |
| 79 | +chains: |
| 80 | +  - name: "daily-report" |
| 81 | +    schedule: "0 9 * * *" # 9 AM daily |
| 82 | +    live: true |
| 83 | +    tasks: |
| 84 | +      - name: "generate-report" |
| 85 | +        command: "CALL generate_daily_report()" |
| 86 | +``` |
| 87 | + |
| 88 | +### Multi-task Chain |
| 89 | + |
| 90 | +```yaml |
| 91 | +chains: |
| 92 | +  - name: "etl-pipeline" |
| 93 | +    schedule: "0 2 * * *" # 2 AM daily |
| 94 | +    live: true |
| 95 | +    max_instances: 1 |
| 96 | +    timeout: 3600000 # 1 hour |
| 97 | + |
| 98 | +    tasks: |
| 99 | +      - name: "extract-data" |
| 100 | +        command: "SELECT extract_sales_data($1)" |
| 101 | +        parameters: [["2023-01-01"]] # One execution with a single value for $1 |
| 102 | + |
| 103 | +      - name: "transform-data" |
| 104 | +        command: "CALL transform_sales_data()" |
| 105 | +        autonomous: true |
| 106 | + |
| 107 | +      - name: "load-data" |
| 108 | +        command: "CALL load_to_warehouse()" |
| 109 | +        ignore_error: false |
| 110 | +``` |
| 111 | + |
| 112 | +### Program Task |
| 113 | + |
| 114 | +```yaml |
| 115 | +chains: |
| 116 | +  - name: "backup-job" |
| 117 | +    schedule: "0 3 * * 0" # Sunday 3 AM |
| 118 | +    live: true |
| 119 | + |
| 120 | +    tasks: |
| 121 | +      - name: "pg-dump" |
| 122 | +        kind: "PROGRAM" |
| 123 | +        command: "pg_dump" |
| 124 | +        parameters: |
| 125 | +          - ["-h", "localhost", "-U", "postgres", "-d", "mydb", "-f", "/backups/mydb.sql"] |
| 126 | +``` |
| 127 | + |
| 128 | +## Validation Rules |
| 129 | + |
| 130 | +1. **Required Fields**: Each chain requires `name`, `schedule`, and `tasks`; each task requires `command` |
| 131 | +2. **Unique Names**: Chain names must be unique across the database |
| 132 | +3. **Valid Cron**: Schedule must be a valid cron expression with 5 fields |
| 133 | +4. **Valid Kind**: Task kind must be one of: SQL, PROGRAM, BUILTIN |
| 134 | +5. **Parameter Types**: Parameters can be any JSON-compatible type (strings, numbers, booleans, arrays, objects) and are stored as individual JSONB values |
| 135 | +6. **Timeout Values**: Must be non-negative integers (milliseconds) |
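
The rules above can be sketched as a single validation pass over the parsed document. This is a rough illustration under stated assumptions, not the validator pg_timetable ships: `validate_chains` is a hypothetical name, uniqueness is checked only within one file (not against the database), the cron check only counts fields, and rule 5 is approximated by testing whether each parameter serializes to JSON.

```python
import json

VALID_KINDS = {"SQL", "PROGRAM", "BUILTIN"}


def _is_valid_timeout(value):
    # bool is a subclass of int in Python, so reject it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


def validate_chains(chains):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    seen_names = set()
    for chain in chains:
        label = chain.get("name", "<unnamed>")
        # Rule 1: required chain fields.
        for field in ("name", "schedule", "tasks"):
            if not chain.get(field):
                errors.append(f"{label}: missing required field '{field}'")
        # Rule 2: unique names (within this document only).
        if "name" in chain:
            if chain["name"] in seen_names:
                errors.append(f"{label}: duplicate chain name")
            seen_names.add(chain["name"])
        # Rule 3: five whitespace-separated cron fields (syntax is not checked).
        schedule = chain.get("schedule")
        if isinstance(schedule, str) and len(schedule.split()) != 5:
            errors.append(f"{label}: schedule must have 5 fields")
        # Rule 6: chain timeout.
        if "timeout" in chain and not _is_valid_timeout(chain["timeout"]):
            errors.append(f"{label}: timeout must be a non-negative integer")
        for index, task in enumerate(chain.get("tasks") or [], start=1):
            task_label = f"{label}/task {index}"
            if not task.get("command"):
                errors.append(f"{task_label}: missing required field 'command'")
            # Rule 4: kind defaults to SQL when omitted.
            if task.get("kind", "SQL") not in VALID_KINDS:
                errors.append(f"{task_label}: invalid kind")
            # Rule 5: every parameter entry must be JSON-serializable.
            try:
                json.dumps(task.get("parameters"))
            except (TypeError, ValueError):
                errors.append(f"{task_label}: parameters are not JSON-compatible")
            if "timeout" in task and not _is_valid_timeout(task["timeout"]):
                errors.append(f"{task_label}: timeout must be a non-negative integer")
    return errors
```

Collecting every problem into a list, rather than raising on the first one, lets a caller report all issues in a file at once.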