# CHANGELOG

## torchx-0.2.0

* Milestone: https://github.com/pytorch/torchx/milestone/4

* `torchx.schedulers`
  * DeviceMounts
    * New mount type `DeviceMount` that allows mounting a host device into a container on the supported schedulers (Docker, AWS Batch, Kubernetes). Custom accelerators and network devices such as Infiniband or Amazon EFA are now supported.
  * Slurm
    * Scheduler integration now supports `max_retries` the same way our other schedulers do. This only handles whole-job retries and does not support per-replica retries.
    * Autodetects the `nomem` setting by using `sinfo` to get the `Memory` setting for the specified partition
    * More robust slurmint script
  * Kubernetes
    * Support for Kubernetes device plugins/resource limits
      * Added a `devices` list of `(str, int)` tuples to role/resource
      * Added `devices.py` to map from named devices to `DeviceMount`s
      * Added logic in `kubernetes_scheduler` to add devices from the resource to resource limits
      * Added logic in `aws_batch_scheduler` and `docker_scheduler` to add `DeviceMount`s for any devices from the resource
    * Added a `priority_class` argument to the Kubernetes scheduler to set the `priorityClassName` of the Volcano job
  * Ray
    * Fixes for distributed training, now supported in Beta

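The named-device mapping described above (a device name on the resource resolving to one or more `DeviceMount`s) can be sketched generically. Everything below — the dataclass, the registry contents, and the helper — is an illustrative sketch, not TorchX's actual `devices.py`:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DeviceMount:
    """A host device mounted into the container (illustrative, not TorchX's class)."""
    src_path: str
    dst_path: str
    permissions: str = "rwm"  # Docker-style read/write/mknod permissions


# Hypothetical registry mapping a named device to the host device paths it needs.
_NAMED_DEVICES: Dict[str, List[DeviceMount]] = {
    "vpc.amazonaws.com/efa": [
        DeviceMount("/dev/infiniband/uverbs0", "/dev/infiniband/uverbs0"),
    ],
}


def get_device_mounts(devices: Dict[str, int]) -> List[DeviceMount]:
    """Resolve a resource's (name -> count) device map into DeviceMounts,
    skipping names with no registered mount."""
    mounts: List[DeviceMount] = []
    for name in devices:
        mounts.extend(_NAMED_DEVICES.get(name, []))
    return mounts
```

A scheduler integration would then translate each `DeviceMount` into its backend's native device option (e.g. Docker's `--device`).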
* `torchx.specs`
  * Moved factory/builder methods from the datastructure-specific `specs.api` module to the `specs.factory` module

* `torchx.runner`
  * Renamed the `stop` method to `cancel` for consistency; `Runner.stop` is now deprecated
  * Added a warning message when the `name` parameter is specified. It is used as part of the Session name, which is deprecated, making `name` obsolete.
  * New `TORCHXCONFIG` environment variable to specify the config file

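One way the `TORCHXCONFIG` variable can be used is to point the runner at an explicit config file rather than the default `.torchxconfig` lookup. The fragment below is a sketch only — the path is illustrative, and the exact section and key names should be checked against the TorchX configuration docs:

```ini
# Selected via: TORCHXCONFIG=/path/to/.torchxconfig torchx run ...
[cli:run]
scheduler = local_docker
```
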
* `torchx.components`
  * Removed `base` + `torch_dist_role` since users should prefer the `dist.ddp` component instead
  * Removed custom components for example apps in favor of using builtins
  * Added `env`, `max_retries`, and `mounts` arguments to `utils.sh`

* `torchx.cli`
  * Better parsing of configs from a string literal
  * Added support to delimit kv-pairs and list values with "," and ";" interchangeably
  * Allow the default scheduler to be specified via `.torchxconfig`
  * Better messaging on invalid scheduler names
  * Log message about how to disable workspaces
  * Job cancellation support via `torchx cancel <job>`

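The interchangeable "," / ";" delimiter behavior can be illustrated with a small standalone parser. This is a sketch of the behavior only, not TorchX's actual CLI parsing code:

```python
import re
from typing import Dict, List


def parse_kv_pairs(arg: str) -> Dict[str, str]:
    """Split 'K1=V1,K2=V2' or 'K1=V1;K2=V2' into a dict,
    accepting ',' and ';' interchangeably."""
    pairs: Dict[str, str] = {}
    for item in re.split(r"[,;]", arg):
        if not item:
            continue  # tolerate trailing/doubled delimiters
        key, _, value = item.partition("=")
        pairs[key] = value
    return pairs


def parse_list(arg: str) -> List[str]:
    """Split a list-valued argument on either ',' or ';'."""
    return [v for v in re.split(r"[,;]", arg) if v]
```

With this behavior, `a=1,b=2` and `a=1;b=2` parse identically, so users can pick whichever delimiter avoids shell quoting in their environment.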
* `torchx.workspace`
  * Support for `.dockerignore` files used as include lists, fixing some behavioral differences between how `.dockerignore` files are interpreted by TorchX and Docker

* Testing
  * Component tests now run sequentially
  * Components can be tested with a runner using the `components.components_test_base.ComponentTestCase#run_component()` method

* Additional Changes
  * Updated the Pyre configuration to preemptively guard against upcoming semantic changes
  * Formatting changes from black 22.3.0
  * Now using pyfmt with usort 1.0 and the new import-merging behavior
  * Added a script to automatically collect system diagnostics for reporting purposes

## torchx-0.1.2

Milestone: https://github.com/pytorch/torchx/milestones/3
0 commit comments