Commit 46d1dce

pablo-garay and ko3n1g authored
Create CHANGELOG.md (#314)
* Create CHANGELOG.md
  Signed-off-by: Pablo Garay <[email protected]>
* Add entries to CHANGELOG.md
  Signed-off-by: Pablo Garay <[email protected]>
* Update CHANGELOG.md
  Signed-off-by: Pablo Garay <[email protected]>
* Update CHANGELOG.md
  Co-authored-by: oliver könig <[email protected]>
  Signed-off-by: Pablo Garay <[email protected]>
* add links

---------

Signed-off-by: Pablo Garay <[email protected]>
Co-authored-by: oliver könig <[email protected]>
1 parent 6b34a8e commit 46d1dce

1 file changed: +53 -0 lines changed

CHANGELOG.md

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+# Changelog
+
+<!-- Next changelog -->
+## NVIDIA Nemo Run 0.5.0
+
+
+- Fix docs warnings [#271](https://github.com/NVIDIA-NeMo/Run/pull/271)
+- Fix docs build [#269](https://github.com/NVIDIA-NeMo/Run/pull/269)
+- Support overlapped srun commands in Slurm Ray [#263](https://github.com/NVIDIA-NeMo/Run/pull/263)
+- Refactor DGXC Lepton data mover: switch to BatchJob with auto cleanup and sleep after every run [#265](https://github.com/NVIDIA-NeMo/Run/pull/265)
+- ci: Fix nemo fw template ref after migrating to new org [#256](https://github.com/NVIDIA-NeMo/Run/pull/256)
+- Enable Nsys gpu device metrics [#257](https://github.com/NVIDIA-NeMo/Run/pull/257)
+- Sync job code in local tunnel for Slurm Ray job [#254](https://github.com/NVIDIA-NeMo/Run/pull/254)
+- Change the create dist job function to support creating a single node [#240](https://github.com/NVIDIA-NeMo/Run/pull/240)
+- Making job names match Run:ai requirements and making errors more descriptive [#255](https://github.com/NVIDIA-NeMo/Run/pull/255)
+- Support for %j in slurm log retrieval [#252](https://github.com/NVIDIA-NeMo/Run/pull/252)
+- Add KubeRay tests for Ray APIs [#249](https://github.com/NVIDIA-NeMo/Run/pull/249)
+- Upgrade skypilot executor with 0.9.2 [#246](https://github.com/NVIDIA-NeMo/Run/pull/246)
+- Add user scoping for k8s backend and log level support for Ray APIs [#247](https://github.com/NVIDIA-NeMo/Run/pull/247)
+- Update to latest Lepton SDK [#248](https://github.com/NVIDIA-NeMo/Run/pull/248)
+- Add storage mount options to LeptonExecutor [#237](https://github.com/NVIDIA-NeMo/Run/pull/237)
+- Import guard k8s import in Ray Cluster and Job [#245](https://github.com/NVIDIA-NeMo/Run/pull/245)
+- Add RayJob and Slurm support for Ray APIs + integration with run.Experiment [#236](https://github.com/NVIDIA-NeMo/Run/pull/236)
+- ci: Enforce coverage [#238](https://github.com/NVIDIA-NeMo/Run/pull/238)
+- Fix bug with a CLI overwrite [#235](https://github.com/NVIDIA-NeMo/Run/pull/235)
+- Add LeptonExecutor support [#224](https://github.com/NVIDIA-NeMo/Run/pull/224)
+- Add cancel to docker executor [#233](https://github.com/NVIDIA-NeMo/Run/pull/233)
+- Change default log wait timeout to 10s [#232](https://github.com/NVIDIA-NeMo/Run/pull/232)
+- Add RayCluster API with Kuberay support [#222](https://github.com/NVIDIA-NeMo/Run/pull/222)
+- Add sbatch network arg [#230](https://github.com/NVIDIA-NeMo/Run/pull/230)
+- chore: Update package info [#227](https://github.com/NVIDIA-NeMo/Run/pull/227)
+- Add support for job groups for local executor [#220](https://github.com/NVIDIA-NeMo/Run/pull/220)
+- Roll back get_underlying_types change + introduce extract_constituent [#223](https://github.com/NVIDIA-NeMo/Run/pull/223)
+- Fix some bugs for --lazy in CLI [#179](https://github.com/NVIDIA-NeMo/Run/pull/179)
+- Adding support for modern type-hints [#221](https://github.com/NVIDIA-NeMo/Run/pull/221)
+- Fix bug in CLI with calling a factory-fn inside a list [#214](https://github.com/NVIDIA-NeMo/Run/pull/214)
+- Handle more edge cases in --help [#219](https://github.com/NVIDIA-NeMo/Run/pull/219)
+- Add autogenerated API reference content to the documentation [#190](https://github.com/NVIDIA-NeMo/Run/pull/190)
+- Handle Callable in --help to fix nemo llm export --help error [#217](https://github.com/NVIDIA-NeMo/Run/pull/217)
+- Ensure job directory creation for various schedulers [#216](https://github.com/NVIDIA-NeMo/Run/pull/216)
+- Adding support for ForwardRef in CLI [#176](https://github.com/NVIDIA-NeMo/Run/pull/176)
+- Add additional debug to DGXC data mover [#215](https://github.com/NVIDIA-NeMo/Run/pull/215)
+- Handle ctx in entrypoint for experiment [#213](https://github.com/NVIDIA-NeMo/Run/pull/213)
+- zozhang/dgxc executor data mover [#206](https://github.com/NVIDIA-NeMo/Run/pull/206)
+- Add support for YAML, TOML & JSON [#182](https://github.com/NVIDIA-NeMo/Run/pull/182)
+- Add clean mode for experiment to avoid printing any NeMo-Run specific logs [#208](https://github.com/NVIDIA-NeMo/Run/pull/208)
+- Fix seed for torchrun [#209](https://github.com/NVIDIA-NeMo/Run/pull/209)
+- Support torchrun multi node on local executor [#143](https://github.com/NVIDIA-NeMo/Run/pull/143)
+- Add nsys filename param [#205](https://github.com/NVIDIA-NeMo/Run/pull/205)
+- Add DGXCloudExecutor docs and update execution guide [#192](https://github.com/NVIDIA-NeMo/Run/pull/192)
+- Add --cuda-event-trace=false to nsys command [#180](https://github.com/NVIDIA-NeMo/Run/pull/180)
+
+
