|
# CHANGELOG

## torchx-0.2.0

* Milestone: https://github.com/pytorch/torchx/milestone/4

* `torchx.schedulers`
  * DeviceMounts
    * New mount type `DeviceMount` that allows mounting a host device into a container on the supported schedulers (Docker, AWS Batch, Kubernetes). Custom accelerators and network devices such as InfiniBand or Amazon EFA are now supported.
  * Slurm
    * The scheduler integration now supports `max_retries` the same way our other schedulers do. This only handles whole-job retries; per-replica retries are not supported.
    * Autodetects the `nomem` setting by using `sinfo` to read the `Memory` setting of the specified partition.
    * More robust slurmint script.
  * Kubernetes
    * Support for k8s device plugins/resource limits:
      * Added a `devices` list of `(str, int)` tuples to role/resource.
      * Added `devices.py` to map named devices to `DeviceMount`s.
      * Added logic in `kubernetes_scheduler` to add devices from the resource to the resource limits.
      * Added logic in `aws_batch_scheduler` and `docker_scheduler` to add `DeviceMount`s for any devices in the resource.
    * Added a `priority_class` argument to the Kubernetes scheduler to set the `priorityClassName` of the Volcano job.
  * Ray
    * Fixes for distributed training, which is now supported in Beta.

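As a rough sketch of what the new mount type looks like (the field names and default below are illustrative assumptions, not the exact `torchx.specs` API), a `DeviceMount` pairs a host device path with the path it is exposed at inside the container:

```python
from dataclasses import dataclass


# Illustrative sketch only -- the real class ships with torchx;
# field names and the permissions default are assumptions here.
@dataclass
class DeviceMount:
    src_path: str             # device path on the host
    dst_path: str             # path the device appears at inside the container
    permissions: str = "rwm"  # read/write/mknod, mirroring Docker's --device flag


# e.g. exposing an InfiniBand device to the container under the same path
mount = DeviceMount(src_path="/dev/infiniband/uverbs0",
                    dst_path="/dev/infiniband/uverbs0")
print(mount.permissions)  # -> rwm
```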
* `torchx.specs`
  * Moved factory/builder methods from the datastructure-specific `specs.api` module to the `specs.factory` module.

* `torchx.runner`
  * Renamed the `stop` method to `cancel` for consistency; `Runner.stop` is now deprecated.
  * Added a warning message when the `name` parameter is specified. It is only used as part of the session name, which is deprecated, making `name` obsolete.
  * New `TORCHXCONFIG` environment variable for specifying the config file.

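The `stop` to `cancel` rename presumably follows the usual deprecate-and-delegate pattern; a minimal sketch under that assumption (not the actual torchx implementation):

```python
import warnings


class Runner:
    """Minimal sketch of the rename -- not the real torchx.runner.Runner."""

    def cancel(self, app_handle: str) -> str:
        # new canonical name, matching the CLI's `torchx cancel <job>`
        return f"cancelled {app_handle}"

    def stop(self, app_handle: str) -> str:
        # deprecated alias kept for backwards compatibility
        warnings.warn(
            "Runner.stop is deprecated, use Runner.cancel instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.cancel(app_handle)


# callers of the old name still work, but now see a DeprecationWarning
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    Runner().stop("some_scheduler://session/app_id")
assert caught[0].category is DeprecationWarning
```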
* `torchx.components`
  * Removed `base` and `torch_dist_role` since users should prefer the `dist.ddp` component instead.
  * Removed custom components for example apps in favor of using builtins.
  * Added `env`, `max_retries`, and `mounts` arguments to `utils.sh`.

* `torchx.cli`
  * Better parsing of configs from a string literal.
  * Added support for delimiting kv-pairs and list values with `,` and `;` interchangeably.
  * The default scheduler can now be specified via `.torchxconfig`.
  * Better messaging for invalid schedulers.
  * Added a log message explaining how to disable workspaces.
  * Job cancellation support via `torchx cancel <job>`.

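The interchangeable-delimiter parsing can be pictured roughly like this (a simplified sketch, not the torchx parser itself):

```python
import re


def parse_kv_pairs(s: str) -> dict:
    """Split KEY=VALUE pairs on ',' or ';' interchangeably."""
    # re.split on a character class treats both delimiters identically;
    # empty segments (e.g. from a trailing delimiter) are dropped
    pairs = [p for p in re.split(r"[,;]", s) if p]
    return dict(p.split("=", 1) for p in pairs)


print(parse_kv_pairs("FOO=1;BAR=2,BAZ=3"))
# -> {'FOO': '1', 'BAR': '2', 'BAZ': '3'}
```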
* `torchx.workspace`
  * Support for `.dockerignore` files used as include lists, fixing some behavioral differences between how `.dockerignore` files are interpreted by torchx and Docker.

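One way to picture the include-list interpretation (a deliberately simplified sketch using `fnmatch`; torchx's actual matching follows Docker's richer `.dockerignore` pattern rules):

```python
import fnmatch


def included(path: str, patterns: list) -> bool:
    # Treat the file's patterns as an *include* list:
    # a path is shipped to the workspace only if some pattern matches it.
    return any(fnmatch.fnmatch(path, pat) for pat in patterns)


patterns = ["*.py", "requirements.txt"]
print(included("train.py", patterns))       # -> True
print(included("data/blob.bin", patterns))  # -> False
```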
* Testing
  * Component tests now run sequentially.
  * Components can be tested with a runner using the `components.components_test_base.ComponentTestCase#run_component()` method.

* Additional Changes
  * Updated the Pyre configuration to preemptively guard against upcoming semantic changes.
  * Formatting changes from black 22.3.0.
  * Now using pyfmt with usort 1.0 and the new import-merging behavior.
  * Added a script to automatically gather system diagnostics for reporting purposes.

## torchx-0.1.2

Milestone: https://github.com/pytorch/torchx/milestones/3
|