README.md: 6 additions & 0 deletions
@@ -364,6 +364,12 @@ hyp create hyp-pytorch-job \
 |`--accelerator-partition-limit`| INTEGER | No | Limit for the number of accelerator partitions (minimum: 1) |
 |`--preferred-topology`| TEXT | No | Preferred topology annotation for scheduling |
 |`--required-topology`| TEXT | No | Required topology annotation for scheduling |
+|`--max-node-count`| INTEGER | No | Maximum number of nodes|
+|`--elastic-replica-increment-step`| INTEGER | No | Scaling step size for elastic training. Provide either this or elastic-replica-discrete-values|
+|`--elastic-graceful-shutdown-timeout-in-seconds`| INTEGER | No | Graceful shutdown timeout in seconds for elastic scaling operations|
+|`--elastic-scaling-timeout-in-seconds`| INTEGER | No | Scaling timeout for elastic training|
+|`--elastic-scale-up-snooze-time-in-seconds`| INTEGER | No | Timeout period after job restart during which no scale up/workload admission is allowed|
+|`--elastic-replica-discrete-values`| ARRAY | No | Alternative to elastic-replica-increment-step. Provides exact values for total replicas count (array of integers)|
 |`--debug`| FLAG | No | Enable debug mode (default: false) |

 |`--memory-limit`| FLOAT | No | Limit for the amount of memory in GiB |
 |`--preferred-topology`| TEXT | No | Preferred topology annotation for scheduling |
 |`--required-topology`| TEXT | No | Required topology annotation for scheduling |
+|`--max-node-count`| INTEGER | No | Maximum number of nodes|
+|`--elastic-replica-increment-step`| INTEGER | No | Scaling step size for elastic training. Provide either this or elastic-replica-discrete-values|
+|`--elastic-graceful-shutdown-timeout-in-seconds`| INTEGER | No | Graceful shutdown timeout in seconds for elastic scaling operations|
+|`--elastic-scaling-timeout-in-seconds`| INTEGER | No | Scaling timeout for elastic training|
+|`--elastic-scale-up-snooze-time-in-seconds`| INTEGER | No | Timeout period after job restart during which no scale up/workload admission is allowed|
+|`--elastic-replica-discrete-values`| ARRAY | No | Alternative to elastic-replica-increment-step. Provides exact values for total replicas count (array of integers)|
 |`--debug`| FLAG | No | Enable debug mode (default: false) |
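
The table describes `--elastic-replica-increment-step` and `--elastic-replica-discrete-values` as alternatives to each other. The Python sketch below illustrates one way that rule could be read. It is not the CLI's implementation: the function name, the `min_replicas` and `max_replicas` bounds, and the handling of out-of-range values and of the "neither option given" case are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def allowed_replica_counts(
    min_replicas: int,
    max_replicas: int,
    increment_step: Optional[int] = None,
    discrete_values: Optional[Sequence[int]] = None,
) -> List[int]:
    """Hypothetical helper: list the replica counts a job could scale to."""
    # The flag table says to provide either the step or the discrete values.
    if increment_step is not None and discrete_values is not None:
        raise ValueError("provide increment_step or discrete_values, not both")
    if min_replicas < 1 or max_replicas < min_replicas:
        raise ValueError("need 1 <= min_replicas <= max_replicas")

    if discrete_values is not None:
        # Assumption: values outside [min_replicas, max_replicas] are dropped.
        in_range = {v for v in discrete_values if min_replicas <= v <= max_replicas}
        return sorted(in_range)

    if increment_step is not None:
        if increment_step < 1:
            raise ValueError("increment_step must be a positive integer")
        # Assumption: counts start at min_replicas and grow by increment_step.
        return list(range(min_replicas, max_replicas + 1, increment_step))

    # Assumption: with neither option set, the replica count stays fixed.
    return [min_replicas]


print(allowed_replica_counts(2, 8, increment_step=2))           # [2, 4, 6, 8]
print(allowed_replica_counts(2, 8, discrete_values=[8, 2, 4]))  # [2, 4, 8]
```

Under these assumptions, a step of 2 between 2 and 8 replicas yields the same set of targets as listing `[2, 4, 6, 8]` explicitly, which is why only one of the two options needs to be supplied.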