Welcome to the NeMo-Run guides! This section provides comprehensive documentation on how to use NeMo-Run effectively for your machine learning experiments.
+
+## Getting Started
+
+If you're new to NeMo-Run, we recommend starting with:
+
+- **[Why Use NeMo-Run?](why-use-nemo-run.md)** - Understand the benefits and philosophy behind NeMo-Run
+- **[Configuration](configuration.md)** - Learn how to configure your ML tasks and experiments
+- **[Execution](execution.md)** - Discover how to run your experiments across different computing environments
+- **[Management](management.md)** - Master experiment tracking, reproducibility, and organization
+
+## Advanced Topics
+
+For more advanced usage:
+
+- **[Ray Integration](ray.md)** - Learn how to use NeMo-Run with Ray for distributed computing
+- **[CLI Reference](cli.md)** - Explore the command-line interface for NeMo-Run
+
+## Core Concepts
+
+NeMo-Run is built around three core responsibilities:
+
+1. **Configuration** - Define your ML experiments using a flexible, pythonic configuration system
+2. **Execution** - Run your experiments seamlessly across local machines, Slurm clusters, cloud providers, and more
+3. **Management** - Track, reproduce, and organize your experiments with built-in experiment management
+
+Each guide dives deep into these concepts with practical examples and best practices. Choose a guide above to get started!
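The split between the three responsibilities can be illustrated with a small standard-library sketch. This is not NeMo-Run code: `train`, `ToyExperiment`, and `run_local` are hypothetical names, and `functools.partial` only stands in for the idea of configuring a task before it is executed.

```python
from dataclasses import dataclass, field
from functools import partial
from typing import Any, Callable, Dict


def train(lr: float, epochs: int) -> str:
    """Stand-in task; a real task would train a model."""
    return f"trained for {epochs} epochs at lr={lr}"


@dataclass
class ToyExperiment:
    """Hypothetical record keeper; NOT the NeMo-Run Experiment class."""

    title: str
    results: Dict[str, Any] = field(default_factory=dict)

    def run_local(self, name: str, task: Callable[[], Any]) -> Any:
        # Execution: call the configured task in the current process.
        # Management: store the result under the task name for later lookup.
        self.results[name] = task()
        return self.results[name]


# Configuration: bind the arguments now, call the task later.
configured = partial(train, lr=0.01, epochs=3)

exp = ToyExperiment("toy-experiment")
output = exp.run_local("train", configured)
print(output)
```

Keeping the configured task separate from the code that runs it is what makes it possible to swap the execution environment without touching the task definition.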
docs/source/guides/management.md: 8 additions & 8 deletions
@@ -1,8 +1,8 @@
-# Management
+# Manage NeMo-Run

 The central component for management of tasks in NeMo-Run is the `Experiment` class. It allows you to define, launch, and manage complex workflows consisting of multiple tasks. This guide provides an overview of the `Experiment` class, its methods, and how to use it effectively.

-## **Creating an Experiment**
+## Creating an Experiment

 To create an experiment, you can instantiate the `Experiment` class by passing in a descriptive title:
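As a minimal sketch (not taken from this diff), the helper below creates an experiment with a title, enters its context, adds one task, and launches it. It assumes `nemo_run` is installed and that `run.Script(inline=...)` and `run.LocalExecutor()` are available in your version; the function name, the default title, and the task name `hello` are hypothetical.

```python
def create_and_run_experiment(title: str = "demo-experiment") -> None:
    """Create an experiment, add a single inline script, and run it locally."""
    # Imported lazily so that defining this helper does not require NeMo-Run.
    import nemo_run as run

    # `add` and `run` are called only inside the `with` block, because the
    # experiment has to be entered as a context manager first.
    with run.Experiment(title) as exp:
        exp.add(
            run.Script(inline="echo 'hello from NeMo-Run'"),  # assumed task type
            executor=run.LocalExecutor(),                     # assumed executor
            name="hello",                                     # hypothetical task name
        )
        # Run tasks one after another, stay attached, and stream the logs.
        exp.run(detach=False, sequential=True, tail_logs=True)
```

Calling `create_and_run_experiment()` would launch the task on the local machine; nothing runs when the function is merely defined.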
@@ -14,7 +14,7 @@ When executed, it will automatically generate a unique experiment ID for you, wh
 > [!NOTE] > `Experiment` is a context manager and `Experiment.add` and `Experiment.run` methods can currently only be used after entering the context manager.

-## **Adding Tasks**
+## Adding Tasks

 You can add tasks to an experiment using the `add` method. This method supports tasks of the following kind:
@@ -50,7 +50,7 @@ with run.Experiment("dag-experiment", log_level="INFO") as exp:
 )
 ```

-## **Launching an Experiment**
+## Launching an Experiment

 Once you have added all tasks to an experiment, you can launch it using the `run` method. This method takes several optional arguments, including `detach`, `sequential`, `tail_logs`, and `direct`:
 You can check the status of an experiment using the `status` method:
@@ -97,23 +97,23 @@ Task 2: simple.add.add_object
 - Local Directory: /home/your_user/.nemo_run/experiments/experiment_with_scripts/experiment_with_scripts_1730761155/simple.add.add_object
 ```

-## **Canceling a Task**
+## Canceling a Task

 You can cancel a task using the `cancel` method:

 ```python
 exp.cancel("task_id")
 ```

-## **Viewing Logs**
+## Viewing Logs

 You can view the logs of a task using the `logs` method:

 ```python
 exp.logs("task_id")
 ```

-## **Experiment output**
+## Experiment output

 Once an experiment is run, NeMo-Run displays information on ways to inspect and reproduce past experiments. This allows you to check logs, sync artifacts (in the future), cancel running tasks, and rerun an old experiment.
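Inspecting a past experiment might look like the sketch below. It assumes `nemo_run` is installed and that `Experiment.from_id` is available in your version to rebuild an experiment from its ID; the helper name `inspect_past_experiment` is hypothetical, while `status`, `logs`, and `cancel` are the methods described in this guide.

```python
def inspect_past_experiment(experiment_id: str, task_name: str) -> None:
    """Reload a past experiment and show its status and one task's logs."""
    # Imported lazily so that defining this helper does not require NeMo-Run.
    import nemo_run as run

    # Assumption: `from_id` reconstructs the experiment from its unique ID.
    experiment = run.Experiment.from_id(experiment_id)

    experiment.status()         # overall status of the experiment's tasks
    experiment.logs(task_name)  # logs for the given task
    # experiment.cancel(task_name) would cancel the task if it were still running.
```

With the ID and task name that appear in the status output earlier in this guide, the call would be `inspect_past_experiment("experiment_with_scripts_1730761155", "simple.add.add_object")`.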