Commit c119f44

Merge pull request #48 from NREL/feat/parameterization
Support zip parameterization
2 parents 2729371 + 32c9d38 commit c119f44

11 files changed: +685 -18 lines changed
docs/src/reference/parameterization.md

Lines changed: 79 additions & 0 deletions
@@ -323,6 +323,84 @@ job "train_lr{lr:.4f}_bs{batch_size}" {
 }
 ```

+## Parameter Modes
+
+By default, when multiple parameters are specified, Torc generates the Cartesian product of all parameter values. You can change this behavior using `parameter_mode`.
+
+### Product Mode (Default)
+
+The default mode generates all possible combinations:
+
+```yaml
+jobs:
+  - name: job_{a}_{b}
+    command: echo {a} {b}
+    parameters:
+      a: "[1, 2, 3]"
+      b: "['x', 'y', 'z']"
+    # parameter_mode: product # This is the default
+```
+
+This creates 3 × 3 = **9 jobs**: `job_1_x`, `job_1_y`, `job_1_z`, `job_2_x`, etc.
+
+### Zip Mode
+
+Use `parameter_mode: zip` to pair parameters element-wise (like Python's `zip()` function). All parameter lists must have the same length.
+
+```yaml
+jobs:
+  - name: train_{dataset}_{model}
+    command: python train.py --dataset={dataset} --model={model}
+    parameters:
+      dataset: "['cifar10', 'mnist', 'imagenet']"
+      model: "['resnet', 'cnn', 'transformer']"
+    parameter_mode: zip
+```
+
+This creates **3 jobs** (not 9):
+- `train_cifar10_resnet`
+- `train_mnist_cnn`
+- `train_imagenet_transformer`
+
+**When to use zip mode:**
+- Pre-determined parameter pairings (dataset A always uses model X)
+- Corresponding input/output file pairs
+- Parallel arrays where position matters
+
+**Error handling:**
+If parameter lists have different lengths in zip mode, Torc will return an error:
+```
+All parameters must have the same number of values when using 'zip' mode.
+Parameter 'dataset' has 3 values, but 'model' has 2 values.
+```
+
+### KDL Syntax
+
+```kdl
+job "train_{dataset}_{model}" {
+    command "python train.py --dataset={dataset} --model={model}"
+    parameters {
+        dataset "['cifar10', 'mnist', 'imagenet']"
+        model "['resnet', 'cnn', 'transformer']"
+    }
+    parameter_mode "zip"
+}
+```
+
+### JSON5 Syntax
+
+```json5
+{
+  name: "train_{dataset}_{model}",
+  command: "python train.py --dataset={dataset} --model={model}",
+  parameters: {
+    dataset: "['cifar10', 'mnist', 'imagenet']",
+    model: "['resnet', 'cnn', 'transformer']"
+  },
+  parameter_mode: "zip"
+}
+```
+
 ## Best Practices

 1. **Use descriptive parameter names** - `lr` not `x`, `batch_size` not `b`
@@ -332,3 +410,4 @@ job "train_lr{lr:.4f}_bs{batch_size}" {
 5. **Consider parameter dependencies** - Some parameter combinations may be invalid
 6. **Prefer shared parameters for multi-job workflows** - Use `use_parameters` to avoid repeating definitions
 7. **Use selective inheritance** - Only inherit the parameters each job actually needs
+8. **Use zip mode for paired parameters** - When parameters have a 1:1 correspondence, use `parameter_mode: zip`
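
Reviewer note (not part of the commit): the difference between the two modes documented above can be sketched in a few lines of Rust. This is only an illustration under simplifying assumptions: the helper names `product_pairs` and `zip_pairs` are hypothetical, values are plain string slices rather than Torc's parameter types, and only two parameters are handled.

```rust
/// Cartesian product: every value of `a` combined with every value of `b`.
/// Hypothetical helper for illustration; not a function from the Torc codebase.
fn product_pairs<'a>(a: &[&'a str], b: &[&'a str]) -> Vec<(&'a str, &'a str)> {
    let mut out = Vec::with_capacity(a.len() * b.len());
    for &x in a {
        for &y in b {
            out.push((x, y));
        }
    }
    out
}

/// Zip: the i-th value of `a` paired with the i-th value of `b`.
/// Returns an error when the lists differ in length, following the documented rule.
fn zip_pairs<'a>(a: &[&'a str], b: &[&'a str]) -> Result<Vec<(&'a str, &'a str)>, String> {
    if a.len() != b.len() {
        return Err(format!(
            "length mismatch: {} values vs {} values",
            a.len(),
            b.len()
        ));
    }
    Ok(a.iter().copied().zip(b.iter().copied()).collect())
}
```

With the three datasets and three models from the reference example, `product_pairs` yields nine pairs while `zip_pairs` yields three, in list order.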

docs/src/tutorials/simple-params.md

Lines changed: 27 additions & 0 deletions
@@ -254,6 +254,32 @@ Torc creates these relationships:

Each chain is independent—`eval_2` doesn't wait for `train_1`.

+## Parameter Modes: Product vs Zip
+
+By default, multiple parameters create a **Cartesian product** (all combinations). For **paired parameters**, use `parameter_mode: zip`:
+
+```yaml
+jobs:
+  # Default (product): 3 × 3 = 9 jobs
+  - name: train_{dataset}_{model}
+    command: python train.py --dataset={dataset} --model={model}
+    parameters:
+      dataset: "['cifar10', 'mnist', 'imagenet']"
+      model: "['resnet', 'vgg', 'transformer']"
+
+  # Zip mode: 3 paired jobs (cifar10+resnet, mnist+vgg, imagenet+transformer)
+  - name: paired_{dataset}_{model}
+    command: python train.py --dataset={dataset} --model={model}
+    parameters:
+      dataset: "['cifar10', 'mnist', 'imagenet']"
+      model: "['resnet', 'vgg', 'transformer']"
+    parameter_mode: zip
+```
+
+Use zip mode when parameters have a 1:1 correspondence (e.g., input/output file pairs, pre-determined configurations).
+
+See [Parameterization Reference](../reference/parameterization.md#parameter-modes) for details.
+
 ## What You Learned

 In this tutorial, you learned:
@@ -263,6 +289,7 @@ In this tutorial, you learned:
 - ✅ Format specifiers (`{i:03d}`, `{lr:.4f}`) for consistent naming
 - ✅ How parameterized files create one-to-one dependencies
 - ✅ The efficiency of parameter-matched dependencies (each chain runs independently)
+- ✅ The difference between product (default) and zip parameter modes

 ## Next Steps

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+{
+  // Zip Parameter Mode Example
+  // Demonstrates using parameter_mode: "zip" for paired parameter expansion
+  //
+  // When using Cartesian product (default): 3 datasets x 3 models = 9 jobs
+  // When using zip mode: 3 pairs = 3 jobs (dataset[i] paired with model[i])
+  //
+  // This is useful when you have pre-determined parameter combinations rather than
+  // wanting to test all possible combinations.
+
+  "name": "zip_parameter_example",
+  "description": "Example workflow demonstrating zip parameter mode for paired parameters",
+
+  "jobs": [
+    // Training jobs using zip mode - each dataset is paired with a specific model
+    // Instead of 3x3=9 combinations, we get exactly 3 jobs:
+    // - cifar10 with resnet
+    // - mnist with cnn
+    // - imagenet with transformer
+    {
+      "name": "train_{dataset}_{model}",
+      "command": "python train.py --dataset={dataset} --model={model}",
+      "output_files": ["model_{dataset}_{model}"],
+      "parameters": {
+        "dataset": "['cifar10', 'mnist', 'imagenet']",
+        "model": "['resnet', 'cnn', 'transformer']"
+      },
+      "parameter_mode": "zip"
+    },
+    // Evaluation jobs also use zip mode to match the training jobs
+    {
+      "name": "evaluate_{dataset}_{model}",
+      "command": "python evaluate.py --model=/models/{dataset}_{model}.pt",
+      "depends_on": ["train_{dataset}_{model}"],
+      "input_files": ["model_{dataset}_{model}"],
+      "parameters": {
+        "dataset": "['cifar10', 'mnist', 'imagenet']",
+        "model": "['resnet', 'cnn', 'transformer']"
+      },
+      "parameter_mode": "zip"
+    }
+  ],
+
+  // File specifications also support zip mode
+  "files": [
+    {
+      "name": "model_{dataset}_{model}",
+      "path": "/models/{dataset}_{model}.pt",
+      "parameters": {
+        "dataset": "['cifar10', 'mnist', 'imagenet']",
+        "model": "['resnet', 'cnn', 'transformer']"
+      },
+      "parameter_mode": "zip"
+    }
+  ]
+}
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+// Zip Parameter Mode Example
+// Demonstrates using parameter_mode "zip" for paired parameter expansion
+//
+// When using Cartesian product (default): 3 datasets x 3 models = 9 jobs
+// When using zip mode: 3 pairs = 3 jobs (dataset[i] paired with model[i])
+//
+// This is useful when you have pre-determined parameter combinations rather than
+// wanting to test all possible combinations.
+
+name "zip_parameter_example"
+description "Example workflow demonstrating zip parameter mode for paired parameters"
+
+// Training jobs using zip mode - each dataset is paired with a specific model
+// Instead of 3x3=9 combinations, we get exactly 3 jobs:
+// - cifar10 with resnet
+// - mnist with cnn
+// - imagenet with transformer
+job "train_{dataset}_{model}" {
+    command "python train.py --dataset={dataset} --model={model}"
+    output_files "model_{dataset}_{model}"
+    parameters {
+        dataset "['cifar10', 'mnist', 'imagenet']"
+        model "['resnet', 'cnn', 'transformer']"
+    }
+    parameter_mode "zip"
+}
+
+// Evaluation jobs also use zip mode to match the training jobs
+job "evaluate_{dataset}_{model}" {
+    command "python evaluate.py --model=/models/{dataset}_{model}.pt"
+    depends_on_job "train_{dataset}_{model}"
+    input_files "model_{dataset}_{model}"
+    parameters {
+        dataset "['cifar10', 'mnist', 'imagenet']"
+        model "['resnet', 'cnn', 'transformer']"
+    }
+    parameter_mode "zip"
+}
+
+// File specifications also support zip mode
+file "model_{dataset}_{model}" {
+    path "/models/{dataset}_{model}.pt"
+    parameters {
+        dataset "['cifar10', 'mnist', 'imagenet']"
+        model "['resnet', 'cnn', 'transformer']"
+    }
+    parameter_mode "zip"
+}
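
Reviewer note (not part of the commit): the example files above rely on the training jobs, evaluation jobs, and file specs all expanding the same two lists in zip mode, so that the i-th `evaluate_...` job refers to the i-th `train_...` job and the i-th model file. A rough Rust sketch of that name expansion follows. The helpers `fill_template` and `expand_zip` are hypothetical and are not Torc's `substitute_parameters`; they handle only plain `{name}` placeholders (no format specifiers such as `{lr:.4f}`) and assume both lists already have equal length.

```rust
/// Replace each `{name}` placeholder with its value.
/// Plain placeholders only; format specifiers are out of scope for this sketch.
fn fill_template(template: &str, values: &[(&str, &str)]) -> String {
    let mut out = template.to_string();
    for (name, value) in values {
        // "{{{}}}" renders as "{name}": escaped braces around the parameter name.
        out = out.replace(&format!("{{{}}}", name), value);
    }
    out
}

/// Expand a template once per zipped (dataset, model) pair.
/// Assumes equal-length lists; `Iterator::zip` would silently truncate otherwise.
fn expand_zip(template: &str, datasets: &[&str], models: &[&str]) -> Vec<String> {
    datasets
        .iter()
        .zip(models.iter())
        .map(|(d, m)| fill_template(template, &[("dataset", *d), ("model", *m)]))
        .collect()
}
```

Calling `expand_zip` with `"train_{dataset}_{model}"` and then with `"/models/{dataset}_{model}.pt"` on the same lists gives names and paths that line up index by index, which is why every spec in the example repeats both the parameter lists and `parameter_mode "zip"`.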

examples/yaml/diamond_workflow.yaml

Lines changed: 6 additions & 6 deletions
@@ -39,19 +39,19 @@ jobs:
 # File definitions - representing the data files in the workflow
 files:
   - name: f1
-    path: /Users/dthom/repos/torc/f1.json
+    path: f1.json

   - name: f2
-    path: /Users/dthom/repos/torc/f2.json
+    path: f2.json

   - name: f3
-    path: /Users/dthom/repos/torc/f3.json
+    path: f3.json

   - name: f4
-    path: /Users/dthom/repos/torc/f4.json
+    path: f4.json

   - name: f5
-    path: /Users/dthom/repos/torc/f5.json
+    path: f5.json

   - name: f6
-    path: /Users/dthom/repos/torc/f6.json
+    path: f6.json
Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
+# Zip Parameter Mode Example
+# Demonstrates using parameter_mode: zip for paired parameter expansion
+#
+# When using Cartesian product (default): 3 datasets × 3 models = 9 jobs
+# When using zip mode: 3 pairs = 3 jobs (dataset[i] paired with model[i])
+#
+# This is useful when you have pre-determined parameter combinations rather than
+# wanting to test all possible combinations.
+
+name: zip_parameter_example
+description: Example workflow demonstrating zip parameter mode for paired parameters
+
+jobs:
+  # Setup job - runs once before all training jobs
+  - name: setup_environment
+    command: |
+      echo "Setting up training environment"
+      mkdir -p /models /results
+
+  # Training jobs using zip mode - each dataset is paired with a specific model
+  # Instead of 3×3=9 combinations, we get exactly 3 jobs:
+  # - cifar10 with resnet
+  # - mnist with cnn
+  # - imagenet with transformer
+  - name: train_{dataset}_{model}
+    command: |
+      python train.py \
+        --dataset={dataset} \
+        --model={model} \
+        --output=/models/{dataset}_{model}.pt
+    depends_on:
+      - setup_environment
+    output_files:
+      - model_{dataset}_{model}
+    parameters:
+      dataset: "['cifar10', 'mnist', 'imagenet']"
+      model: "['resnet', 'cnn', 'transformer']"
+    parameter_mode: zip
+
+  # Evaluation job that processes all models
+  # Note: This job uses the same parameters with zip mode to wait for the
+  # correct corresponding training jobs
+  - name: evaluate_{dataset}_{model}
+    command: |
+      python evaluate.py \
+        --model=/models/{dataset}_{model}.pt \
+        --output=/results/{dataset}_{model}_metrics.json
+    depends_on:
+      - train_{dataset}_{model}
+    input_files:
+      - model_{dataset}_{model}
+    output_files:
+      - metrics_{dataset}_{model}
+    parameters:
+      dataset: "['cifar10', 'mnist', 'imagenet']"
+      model: "['resnet', 'cnn', 'transformer']"
+    parameter_mode: zip
+
+  # Final aggregation job
+  - name: aggregate_results
+    command: |
+      python aggregate.py \
+        --input-dir=/results \
+        --output=/results/summary.json
+    # Use regex to depend on all evaluation jobs
+    depends_on_regexes:
+      - "evaluate_.*"
+
+# File specifications also support zip mode
+files:
+  - name: model_{dataset}_{model}
+    path: /models/{dataset}_{model}.pt
+    parameters:
+      dataset: "['cifar10', 'mnist', 'imagenet']"
+      model: "['resnet', 'cnn', 'transformer']"
+    parameter_mode: zip
+
+  - name: metrics_{dataset}_{model}
+    path: /results/{dataset}_{model}_metrics.json
+    parameters:
+      dataset: "['cifar10', 'mnist', 'imagenet']"
+      model: "['resnet', 'cnn', 'transformer']"
+    parameter_mode: zip

src/client/parameter_expansion.rs

Lines changed: 45 additions & 0 deletions
@@ -235,6 +235,51 @@ pub fn cartesian_product(
     result
 }

+/// Zip parameter values together (like Python's zip function)
+/// All parameter lists must have the same length
+/// Given a map of parameter names to value lists, returns a vector where
+/// the i-th element contains the i-th value from each parameter
+pub fn zip_parameters(
+    params: &HashMap<String, Vec<ParameterValue>>,
+) -> Result<Vec<HashMap<String, ParameterValue>>, String> {
+    if params.is_empty() {
+        return Ok(vec![HashMap::new()]);
+    }
+
+    // Check that all parameter lists have the same length
+    let lengths: Vec<(&String, usize)> = params.iter().map(|(k, v)| (k, v.len())).collect();
+    let first_len = lengths[0].1;
+
+    for (name, len) in &lengths {
+        if *len != first_len {
+            return Err(format!(
+                "All parameters must have the same number of values when using 'zip' mode. \
+                 Parameter '{}' has {} values, but '{}' has {} values.",
+                lengths[0].0, first_len, name, len
+            ));
+        }
+    }
+
+    if first_len == 0 {
+        return Ok(vec![]);
+    }
+
+    // Convert HashMap to Vec for consistent ordering
+    let param_vec: Vec<(&String, &Vec<ParameterValue>)> = params.iter().collect();
+
+    // Zip the values together
+    let mut result = Vec::with_capacity(first_len);
+    for i in 0..first_len {
+        let mut combo = HashMap::new();
+        for (param_name, param_values) in &param_vec {
+            combo.insert((*param_name).clone(), param_values[i].clone());
+        }
+        result.push(combo);
+    }
+
+    Ok(result)
+}
+
 /// Substitute parameter values into a template string
 /// Supports both {param_name} and {param_name:format} syntax
 pub fn substitute_parameters(template: &str, params: &HashMap<String, ParameterValue>) -> String {
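
Reviewer note (not part of the commit): `zip_parameters` takes `lengths[0]` from a `HashMap` iteration, and Rust does not specify `HashMap` iteration order, so which parameter is named first in the mismatch error may differ between runs. The standalone sketch below shows one possible way to make that message stable by iterating in key order. It is a simplified stand-in rather than the committed code: the name `zip_sorted`, the use of `BTreeMap`, and `String` values in place of `ParameterValue` are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the zip step. `BTreeMap` and `String` values are
/// assumptions of this sketch, not the types used in the commit.
fn zip_sorted(
    params: &BTreeMap<String, Vec<String>>,
) -> Result<Vec<BTreeMap<String, String>>, String> {
    // With no parameters there is exactly one (empty) combination.
    let (first_name, first_values) = match params.iter().next() {
        Some(entry) => entry,
        None => return Ok(vec![BTreeMap::new()]),
    };
    let expected = first_values.len();

    // BTreeMap iterates in key order, so the names in the error are stable.
    for (name, values) in params {
        if values.len() != expected {
            return Err(format!(
                "Parameter '{}' has {} values, but '{}' has {} values.",
                first_name,
                expected,
                name,
                values.len()
            ));
        }
    }

    // Row i holds the i-th value of every parameter; empty lists give no rows.
    Ok((0..expected)
        .map(|i| {
            params
                .iter()
                .map(|(name, values)| (name.clone(), values[i].clone()))
                .collect()
        })
        .collect())
}
```

The behavior for empty input and for empty lists follows the committed function: no parameters produce a single empty combination, and zero-length lists produce no combinations.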
