nemo/NeMo-Data-Designer/README.md
37 additions & 36 deletions
@@ -2,34 +2,23 @@

 This directory contains the tutorial notebooks for getting started with NeMo Data Designer.

-## 📚 Table of Contents
+## 📦 Set Up the Environment

-### 🚀 Intro Tutorials
+We will use the `uv` package manager to set up our environment and install the necessary dependencies. If you don't have `uv` installed, you can follow the installation instructions from the [uv documentation](https://docs.astral.sh/uv/getting-started/installation/).
-|[1-the-basics.ipynb](./intro-tutorials/1-the-basics.ipynb)| Learn the basics of Data Designer by generating a simple product review dataset |
-|[2-structured-outputs-and-jinja-expressions.ipynb](./intro-tutorials/2-structured-outputs-and-jinja-expressions.ipynb)| Explore advanced data generation using structured outputs and Jinja expressions |
-|[3-seeding-with-a-dataset.ipynb](./intro-tutorials/3-seeding-with-a-dataset.ipynb)| Discover how to seed synthetic data generation with an external dataset |
-|[4-custom-model-configs.ipynb](./intro-tutorials/4-custom-model-configs.ipynb)| Master creating and using custom model configurations |
+Once you have `uv` installed, be sure you are in the `NeMo-Data-Designer` directory and run the following command:
-|[person-sampler-tutorial.ipynb](./advanced/person-samplers/person-sampler-tutorial.ipynb)| Persona Samplers | Generate realistic personas using the person sampler |
-|[clinical-trials.ipynb](./advanced/healthcare-datasets/clinical-trials.ipynb)| Healthcare | Build synthetic clinical trial datasets with realistic PII for testing data protection |
-|[insurance-claims.ipynb](./advanced/healthcare-datasets/insurance-claims.ipynb)| Healthcare | Create synthetic insurance claims datasets with realistic claim data and processing information |
-|[physician-notes-with-realistic-personal-details.ipynb](./advanced/healthcare-datasets/physician-notes-with-realistic-personal-details.ipynb)| Healthcare | Generate realistic patient data and physician notes with embedded personal information |
-|[w2-dataset.ipynb](./advanced/forms/w2-dataset.ipynb)| Forms & Documents | Generate synthetic W-2 tax form datasets with realistic employee and employer information |
-|[multi-turn-conversation.ipynb](./advanced/multi-turn-chat/multi-turn-conversation.ipynb)| Conversational AI | Build synthetic conversational data with realistic person details and multi-turn dialogues |
-|[visual-question-answering-using-vlm.ipynb](./advanced/multimodal/visual-question-answering-using-vlm.ipynb)| Multimodal | Create visual question answering datasets using Vision Language Models |
-|[product-question-answer-generator.ipynb](./advanced/qa-generation/product-question-answer-generator.ipynb)| Q&A Generation | Build product information datasets with corresponding questions and answers |
-|[generate-rag-evaluation-dataset.ipynb](./advanced/rag-examples/generate-rag-evaluation-dataset.ipynb)| RAG & Retrieval | Generate diverse RAG evaluation datasets for testing retrieval-augmented generation systems |
-|[text-to-python.ipynb](./advanced/text-to-code/text-to-python.ipynb)| Text-to-Code | Generate Python code from natural language instructions with validation and evaluation |
-|[text-to-python-evol.ipynb](./advanced/text-to-code/text-to-python-evol.ipynb)| Text-to-Code | Build advanced Python code generation with evolutionary improvements and iterative refinement |
-|[text-to-sql.ipynb](./advanced/text-to-code/text-to-sql.ipynb)| Text-to-Code | Create SQL queries from natural language descriptions with validation and testing |
+This will create a virtual environment and install the necessary dependencies. Activate the virtual environment by running the following command:
+
+```bash
+source .venv/bin/activate
+```
+
+Be sure to select this virtual environment as your kernel when running the notebooks.
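
One way to confirm that a notebook kernel is really running from this environment is to inspect the interpreter from a notebook cell. The following is a minimal sketch and not part of the README: the helper name `running_in_venv` is hypothetical, and it assumes the environment lives in a directory named `.venv`, as the activation command above suggests.

```python
import sys
from pathlib import Path


def running_in_venv(venv_dir_name: str = ".venv") -> bool:
    """Return True if the current interpreter belongs to a virtual environment
    whose directory is named `venv_dir_name` (assumed to be `.venv` here)."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix points at the base Python installation.
    in_any_venv = sys.prefix != sys.base_prefix
    return in_any_venv and Path(sys.prefix).name == venv_dir_name


# Show which interpreter the kernel uses and whether it matches the assumption.
print("Interpreter:", sys.executable)
print("Running in .venv:", running_in_venv())
```

If the second line prints `False`, the notebook is probably attached to a different kernel than the environment created above.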

 ## 🚀 Deploying the NeMo Data Designer Microservice

@@ -49,20 +38,32 @@ Alternatively, you can deploy the NeMo Data Designer microservice locally via Do

 To run the tutorial notebooks in the [advanced](./advanced/) directory, you will need to have NeMo Data Designer deployed locally. See the [deployment guide](http://docs.nvidia.com/nemo/microservices/latest/set-up/deploy-as-microservices/data-designer/docker-compose.html) for more details.
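
Before opening an advanced notebook, it can help to check that something is actually listening where the local deployment is expected. The sketch below is a generic HTTP reachability check, not an API of NeMo Data Designer: the function name `service_is_reachable` is hypothetical, and the address `http://localhost:8080` is only a placeholder assumption. Use the address given in the deployment guide.

```python
import urllib.error
import urllib.request


def service_is_reachable(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `base_url`, whatever the status code."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with an error status (for example 404).
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar failures.
        return False


if __name__ == "__main__":
    # Placeholder address (an assumption); replace it with your deployment's URL.
    print(service_is_reachable("http://localhost:8080"))
```

This only shows that an HTTP server responds at the address; it does not verify that the responding service is NeMo Data Designer or that it is healthy.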

-## 📦 Set Up the Environment

-We will use the `uv` package manager to set up our environment and install the necessary dependencies. If you don't have `uv` installed, you can follow the installation instructions from the [uv documentation](https://docs.astral.sh/uv/getting-started/installation/).
+## 📚 Tutorial Directory

-Once you have `uv` installed, be sure you are in the `NeMo-Data-Designer` directory and run the following command:
-
-```bash
-uv sync
-```
+### 🚀 Intro Tutorials

-This will create a virtual environment and install the necessary dependencies. Activate the virtual environment by running the following command:
+|[1-the-basics.ipynb](./intro-tutorials/1-the-basics.ipynb)| Learn the basics of Data Designer by generating a simple product review dataset |
+|[2-structured-outputs-and-jinja-expressions.ipynb](./intro-tutorials/2-structured-outputs-and-jinja-expressions.ipynb)| Explore advanced data generation using structured outputs and Jinja expressions |
+|[3-seeding-with-a-dataset.ipynb](./intro-tutorials/3-seeding-with-a-dataset.ipynb)| Discover how to seed synthetic data generation with an external dataset |
+|[4-custom-model-configs.ipynb](./intro-tutorials/4-custom-model-configs.ipynb)| Master creating and using custom model configurations |

-```bash
-source .venv/bin/activate
-```
+### 🎯 Advanced Tutorials

-Be sure to select this virtual environment as your kernel when running the notebooks.
+|[person-sampler-tutorial.ipynb](./advanced/person-samplers/person-sampler-tutorial.ipynb)| Persona Samplers | Generate realistic personas using the person sampler |
+|[clinical-trials.ipynb](./advanced/healthcare-datasets/clinical-trials.ipynb)| Healthcare | Build synthetic clinical trial datasets with realistic PII for testing data protection |
+|[insurance-claims.ipynb](./advanced/healthcare-datasets/insurance-claims.ipynb)| Healthcare | Create synthetic insurance claims datasets with realistic claim data and processing information |
+|[physician-notes-with-realistic-personal-details.ipynb](./advanced/healthcare-datasets/physician-notes-with-realistic-personal-details.ipynb)| Healthcare | Generate realistic patient data and physician notes with embedded personal information |
+|[w2-dataset.ipynb](./advanced/forms/w2-dataset.ipynb)| Forms & Documents | Generate synthetic W-2 tax form datasets with realistic employee and employer information |
+|[multi-turn-conversation.ipynb](./advanced/multi-turn-chat/multi-turn-conversation.ipynb)| Conversational AI | Build synthetic conversational data with realistic person details and multi-turn dialogues |
+|[visual-question-answering-using-vlm.ipynb](./advanced/multimodal/visual-question-answering-using-vlm.ipynb)| Multimodal | Create visual question answering datasets using Vision Language Models |
+|[product-question-answer-generator.ipynb](./advanced/qa-generation/product-question-answer-generator.ipynb)| Q&A Generation | Build product information datasets with corresponding questions and answers |
+|[generate-rag-evaluation-dataset.ipynb](./advanced/rag-examples/generate-rag-evaluation-dataset.ipynb)| RAG & Retrieval | Generate diverse RAG evaluation datasets for testing retrieval-augmented generation systems |
+|[text-to-python.ipynb](./advanced/text-to-code/text-to-python.ipynb)| Text-to-Code | Generate Python code from natural language instructions with validation and evaluation |
+|[text-to-python-evol.ipynb](./advanced/text-to-code/text-to-python-evol.ipynb)| Text-to-Code | Build advanced Python code generation with evolutionary improvements and iterative refinement |
+|[text-to-sql.ipynb](./advanced/text-to-code/text-to-sql.ipynb)| Text-to-Code | Create SQL queries from natural language descriptions with validation and testing |