docs/get-started/index.md
+15 -10 (15 additions & 10 deletions)
@@ -13,48 +13,53 @@ modality: "universal"

 ## Before You Start

-Welcome to NeMo Curator! This toolkit enables you to curate large-scale datasets for training generative AI models across text, image, and video modalities.
+Welcome to NeMo Curator! This framework streamlines the curation and pre-processing of large-scale datasets for training generative AI models across text, image, audio and video modalities.

 **Who are these quickstarts for?**
-- Data scientists and ML engineers who want to quickly test NeMo Curator's capabilities
-- Users who want to run their first curation pipeline with minimal setup
-- Anyone exploring NeMo Curator before committing to a full production deployment
+- AI/ML engineers and researchers who want to quickly test NeMo Curator's capabilities
+- Users looking to run an initial curation pipeline with minimal setup
+- Individuals exploring NeMo Curator prior to a full production deployment

 **What you'll find here:**
-Each quickstart below gets you up and running with a specific modality in under 30 minutes. They include basic installation, sample data, and a working example.
+Each quickstart enables you to get started with a specific domain in less than 30 minutes. Quickstarts provide basic installation steps, sample data, and a working example.

 :::{tip}
-For production deployments, cluster configurations, or detailed system requirements, see the [Setup & Deployment documentation](admin-overview).
+For production deployments, cluster configurations, or detailed system requirements, refer to the [Setup & Deployment documentation](admin-overview).
 :::

 ---

 ## Modality Quickstarts

-The following quickstarts enable you to test out NeMo Curator for a given modality.
+The following quickstarts allow you to test NeMo Curator using a selected data modality.

 ::::{grid} 1 1 1 2
 :gutter: 1 1 1 2

 :::{grid-item-card} {octicon}`typography;1.5em;sd-mr-1` Text Curation Quickstart
 :link: gs-text
 :link-type: ref
-Set up your environment and run your first text curation pipeline with NeMo Curator. Learn how to install the toolkit, prepare your data, and use the pipeline architecture with modular stages to curate large-scale text datasets efficiently.
+Set up your environment and execute your first text curation pipeline with NeMo Curator. Instructions cover installation, data preparation, and use of the modular pipeline architecture for efficient large-scale text dataset curation.
-Set up your environment and install NeMo Curator's image modules. Learn about prerequisites, installation methods, and how to use the toolkit to curate large-scale image-text datasets for generative model training.
+Set up your environment and install the NeMo Curator image modules. The quickstart explains prerequisites, installation methods, and the use of the framework to curate large-scale image-text datasets for generative AI model training.

 :::

 :::{grid-item-card} {octicon}`video;1.5em;sd-mr-1` Video Curation Quickstart
 :link: gs-video
 :link-type: ref
-Set up your environment and run your first video curation pipeline. Learn about prerequisites, installation options, and how to split, encode, embed, and export curated clips at scale.
+Set up your environment and execute your first video curation pipeline. The instructions include prerequisites, installation options, and guidance on splitting, encoding, embedding, and exporting curated video clips at scale.
+Set up your environment and execute your first audio curation pipeline with NeMo Curator. Instructions cover installation, data preparation, and use of the modular pipeline architecture for efficient large-scale audio speech dataset curation.