Commit 85bf6a4
Add efa support in manifest for training jobs (#345)
* Update documentation for elastic training arguments
* nit: Add detail descriptions for array type
* Add efa support for training jobs
* address comment and add unit test for efa support
* fix: add efa check in quota allocation test
* Modify efa arg name and fix gpu integ test
1 parent a824151 commit 85bf6a4
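The diff bodies for this commit are not shown here, so the following is only a minimal sketch of what requesting EFA in a training job's container resources could look like. It assumes the standard `vpc.amazonaws.com/efa` extended resource advertised by the AWS EFA Kubernetes device plugin. The helper `with_efa` and its `efa_interfaces` parameter are hypothetical names chosen for illustration, not the identifiers introduced by this commit.

```python
from copy import deepcopy

# Extended resource name advertised by the AWS EFA Kubernetes device plugin.
EFA_RESOURCE_KEY = "vpc.amazonaws.com/efa"


def with_efa(resources: dict, efa_interfaces: int) -> dict:
    """Return a copy of a container `resources` dict with EFA requested.

    Hypothetical helper for illustration; not the function added in this commit.
    """
    if efa_interfaces < 0:
        raise ValueError("efa_interfaces must be non-negative")
    updated = deepcopy(resources)
    if efa_interfaces == 0:
        # Leave the manifest untouched when no EFA interface is asked for.
        return updated
    # Kubernetes requires requests == limits for extended resources,
    # so the same value is written to both sections.
    for section in ("requests", "limits"):
        updated.setdefault(section, {})[EFA_RESOURCE_KEY] = str(efa_interfaces)
    return updated


if __name__ == "__main__":
    gpu_only = {
        "requests": {"nvidia.com/gpu": "8"},
        "limits": {"nvidia.com/gpu": "8"},
    }
    print(with_efa(gpu_only, 4))
```

Writing the same value to both `requests` and `limits` keeps the pod spec valid for extended resources. Returning a copy keeps the caller's original dict unchanged.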
File tree
10 files changed, +285 -154 lines changed
- hyperpod-pytorch-job-template/hyperpod_pytorch_job_template/v1_1
- src/sagemaker/hyperpod
  - cli/constants
  - training
- test
  - integration_tests/training/cli
  - unit_tests
    - cli
    - training
Lines changed: 18 additions & 4 deletions
@@ -195,6 +195,16 @@
@@ -453,23 +463,27 @@
Lines changed: 10 additions & 0 deletions
@@ -305,6 +305,16 @@
Lines changed: 5 additions & 5 deletions
@@ -97,14 +97,14 @@
@@ -117,8 +117,8 @@
@@ -45,6 +45,7 @@