**`docs/guides/dataset-overview.md`** (2 additions, 2 deletions)

````diff
@@ -1,6 +1,6 @@
 # Dataset Overview: LLM and VLM Datasets in NeMo Automodel
 
-This page summarizes the datasets already supported in NeMo Automodel for LLM and VLM, and shows how to plug in your own datasets via simple Python functions or purely through YAML using the `_target_` mechanism.
+This page summarizes the datasets already supported in NeMo Automodel for LLM and VLM, and shows how to plug in your own datasets using simple Python functions or directly through YAML using the `_target_` mechanism.
 
 - See also: [LLM datasets](llm/dataset.md) and [VLM datasets](vlm/dataset.md) for deeper, task-specific guides.
 
@@ -84,7 +84,7 @@ dataset:
   split: "0.99, 0.01, 0.00" # train, validation, test
   splits_to_build: "train"
 ```
-- See the detailed pretraining guide, [Megatron MCore Pretraining](llm/mcore-pretraining.md), which uses MegatronPretraining data.
+- See the detailed pretraining guide, [Megatron Core Dataset Pretraining](llm/pretraining.md), which uses MegatronPretraining data.
 
 > ⚠️ Note: Multi-turn conversational and tool-calling/function-calling dataset support is coming soon.
````
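The `_target_` mechanism referenced in the reworded paragraph can be illustrated with a minimal YAML sketch. The function path and its arguments below are hypothetical placeholders, not part of NeMo Automodel:

```yaml
# Minimal sketch of plugging in a custom dataset purely through YAML.
# `my_project.data.build_my_dataset` and its keyword arguments are
# hypothetical stand-ins for your own Python function.
dataset:
  _target_: my_project.data.build_my_dataset  # dotted path to your function
  data_path: ./data/train.jsonl               # forwarded as a keyword argument
  split: train                                # forwarded as a keyword argument
```

At instantiation time the dotted path is imported and called with the remaining keys as keyword arguments, so any importable function can serve as a dataset builder without touching the training loop.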
**`docs/guides/overview.md`** (8 additions, 8 deletions)

````diff
@@ -1,8 +1,8 @@
-## Recipes and E2E Examples
+## Recipes and End-to-End Examples
 
 NeMo Automodel is organized around two key concepts: recipes and components.
 
-Recipes are executable scripts configured via YAML files. Each recipe defines its own training and validation loop, orchestrated through a `step_scheduler`. It specifies the model, dataset, loss function, optimizer and scheduler, checkpointing, and distributed training settings—allowing end-to-end training with a single command.
+Recipes are executable scripts configured with YAML files. Each recipe defines its own training and validation loop, orchestrated through a `step_scheduler`. It specifies the model, dataset, loss function, optimizer, scheduler, checkpointing, and distributed training settings—allowing end-to-end training with a single command.
 
 Components are modular, plug-and-play building blocks referenced using the `_target_` field. These include models, datasets, loss functions, and distribution managers. Recipes assemble these components, making it easy to swap them out to change precision, distribution strategy, dataset, or task—without modifying the training loop itself.
 
@@ -14,7 +14,7 @@ This page maps the ready-to-run recipes found in the `examples/` directory to th
 ## Large Language Models (LLM)
 This section provides practical recipes and configurations for working with large language models across three core workflows: fine-tuning, pretraining, and knowledge distillation.
 
-### Fine-tuning
+### Fine-Tuning
 
 End-to-end fine-tuning recipes for many open models. Each subfolder contains YAML configurations showing task setups (e.g., SQuAD, HellaSwag), precision options (e.g., FP8), and parameter-efficient methods (e.g., LoRA/QLoRA).
 
@@ -30,7 +30,7 @@ Starter configurations and scripts for pretraining with datasets from different
 - Example models: GPT-2 baseline, NanoGPT, DeepSeek-V3, Moonlight 16B TE (Slurm)
 - How-to guides:
   - [LLM pretraining](llm/pretraining.md)
-  - [Pretraining with Megatron-Core datasets](llm/mcore-pretraining.md)
+  - [Pretraining with NanoGPT](llm/nanogpt-pretraining.md)
 
 ### Knowledge Distillation (KD)
 
@@ -49,11 +49,11 @@ Curated configurations for benchmarking different training stacks and settings (
 
 
 ## Vision Language Models (VLM)
-This section provides practical recipes and configurations for working with vision-language models, covering fine-tuning and generation workflows for multimodal tasks.
+This section provides practical recipes and configurations for working with vision language models, covering fine-tuning and generation workflows for multimodal tasks.
@@ -73,4 +73,4 @@ WAN 2.2 example for diffusion-based image generation.
 
 ---
 
-If you are new to the project, begin with the [Installation](installation.md) guide. Then, select a recipe category above and follow its linked how-to guide(s). The provided YAML configurations can serve as templates—customize them by adapting model names, datasets, and precision settings to match your specific needs.
+If you are new to the project, begin with the [Installation](installation.md) guide. Then, select a recipe category above and follow its linked how-to guide(s). The provided YAML configurations can serve as templates—customize them by adapting model names, datasets, and precision settings to match your specific needs.
````
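The recipe structure described in the changed paragraph (model, dataset, optimizer, scheduler, and a `step_scheduler` assembled in one YAML file) can be sketched roughly as follows. All section contents here are illustrative examples inferred from the prose; they are not the authoritative NeMo Automodel recipe schema:

```yaml
# Illustrative recipe layout only; key names other than `_target_` and
# `step_scheduler` are hypothetical placeholders, not a documented schema.
step_scheduler:
  max_steps: 1000           # hypothetical: length of the training loop
  val_every_steps: 100      # hypothetical: validation cadence

model:
  _target_: my_models.build_model      # placeholder component path

dataset:
  _target_: my_data.build_dataset      # placeholder component path

optimizer:
  _target_: torch.optim.AdamW          # swapping this component changes the optimizer
  lr: 3.0e-4
```

Because each section is a swappable component resolved through `_target_`, changing the dataset, optimizer, or precision means editing the YAML rather than the training loop itself.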