-This directory contains the tutorial notebooks for getting started with NeMo Data Designer.
+This directory contains tutorial notebooks for getting started with NeMo Data Designer.

 ## 📦 Set Up the Environment

-We will use the `uv` package manager to set up our environment and install the necessary dependencies. If you don't have `uv` installed, you can follow the installation instructions from the [uv documentation](https://docs.astral.sh/uv/getting-started/installation/).
+We will use the `uv` package manager to set up our environment and install the necessary dependencies. If you don't have `uv` installed, follow the installation instructions from the [uv documentation](https://docs.astral.sh/uv/getting-started/installation/).

 Once you have `uv` installed, be sure you are in the `Nemo-Data-Designer` directory and run the following command:

@@ -22,48 +22,28 @@ Be sure to select this virtual environment as your kernel when running the noteb

 ## 🚀 Deploying the NeMo Data Designer Microservice

-To run these notebooks, you'll need the NeMo Data Designer microservice. You have two deployment options:
+To run the tutorial notebooks in this repository, you'll need access to a running instance of the NeMo Data Designer microservice.

-### ⚙️ Using the NeMo Data Designer Managed Service
-We have a [managed service of NeMo Data Designer](https://build.nvidia.com/nemo/data-designer) to help you get started quickly.
-
-Please refer to the [intro-tutorials](./intro-tutorials/) notebooks to learn how to connect to this service.
+You have two deployment options:

-**Note**: This managed service of NeMo Data Designer is intended to only help you get started. As a result, it can only be used to launch `preview` jobs. It can **not** be used to launch long running jobs. If you need to launch long-running jobs please deploy an instance of [NeMo Data Designer locally](#-deploy-the-nemo-data-designer-microservice-locally)
+### 🐳 Self-Hosted Deployment
+Deploy the NeMo Data Designer microservice locally via Docker Compose.

+Please see the [Installation Options](https://docs.nvidia.com/nemo/microservices/latest/design-synthetic-data-from-scratch-or-seeds/index.html#installation-options) section of the [NeMo Data Designer documentation](https://docs.nvidia.com/nemo/microservices/latest/design-synthetic-data-from-scratch-or-seeds/index.html) for more information.

-### 🐳 Deploy the NeMo Data Designer Microservice Locally

-Alternatively, you can deploy the NeMo Data Designer microservice locally via Docker Compose.
+### ⚙️ Managed Service
+We have a [managed service of NeMo Data Designer](https://build.nvidia.com/nemo/data-designer) to help you get started quickly.

-To run the tutorial notebooks in the [advanced](./advanced/), you will need to have NeMo Data Designer deployed locally. Please see the [deployment guide](http://docs.nvidia.com/nemo/microservices/latest/set-up/deploy-as-microservices/data-designer/docker-compose.html) for more details.
+**Note**: This managed service can only be used to launch `preview` jobs. It can **not** be used to launch long-running jobs. If you need to launch long-running jobs please deploy an instance of NeMo Data Designer locally.
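
Since the notebooks depend on a running instance of the microservice, it can help to confirm that something is listening at the expected address before opening a notebook. The sketch below is only a plain TCP check written for illustration. The host and port are hypothetical placeholders, not values taken from the NeMo Data Designer documentation, and a successful connection shows only that a port is open, not that the service behind it is Data Designer.

```python
import socket


def service_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unresolved host)
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical address: replace with wherever your deployment listens.
    HOST, PORT = "localhost", 8080
    status = "reachable" if service_reachable(HOST, PORT) else "not reachable"
    print(f"Service at {HOST}:{PORT} is {status}")
```

If the check fails for a self-hosted deployment, the Docker Compose setup described in the linked installation guide is the first place to look.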
-|[1-the-basics.ipynb](./intro-tutorials/1-the-basics.ipynb)| Learn the basics of Data Designer by generating a simple product review dataset |
-|[2-structured-outputs-and-jinja-expressions.ipynb](./intro-tutorials/2-structured-outputs-and-jinja-expressions.ipynb)| Explore advanced data generation using structured outputs and Jinja expressions |
-|[3-seeding-with-a-dataset.ipynb](./intro-tutorials/3-seeding-with-a-dataset.ipynb)| Discover how to seed synthetic data generation with an external dataset |
-|[4-custom-model-configs.ipynb](./intro-tutorials/4-custom-model-configs.ipynb)| Master creating and using custom model configurations |
-|[person-sampler-tutorial.ipynb](./advanced/person-samplers/person-sampler-tutorial.ipynb)| Persona Samplers | Generate realistic personas using the person sampler |
-|[clinical-trials.ipynb](./advanced/healthcare-datasets/clinical-trials.ipynb)| Healthcare | Build synthetic clinical trial datasets with realistic PII for testing data protection |
-|[insurance-claims.ipynb](./advanced/healthcare-datasets/insurance-claims.ipynb)| Healthcare | Create synthetic insurance claims datasets with realistic claim data and processing information |
-|[physician-notes-with-realistic-personal-details.ipynb](./advanced/healthcare-datasets/physician-notes-with-realistic-personal-details.ipynb)| Healthcare | Generate realistic patient data and physician notes with embedded personal information |
-|[w2-dataset.ipynb](./advanced/forms/w2-dataset.ipynb)| Forms & Documents | Generate synthetic W-2 tax form datasets with realistic employee and employer information |
-|[multi-turn-conversation.ipynb](./advanced/multi-turn-chat/multi-turn-conversation.ipynb)| Conversational AI | Build synthetic conversational data with realistic person details and multi-turn dialogues |
-|[visual-question-answering-using-vlm.ipynb](./advanced/multimodal/visual-question-answering-using-vlm.ipynb)| Multimodal | Create visual question answering datasets using Vision Language Models |
-|[product-question-answer-generator.ipynb](./advanced/qa-generation/product-question-answer-generator.ipynb)| Q&A Generation | Build product information datasets with corresponding questions and answers |
-|[generate-rag-evaluation-dataset.ipynb](./advanced/rag-examples/generate-rag-evaluation-dataset.ipynb)| RAG & Retrieval | Generate diverse RAG evaluation datasets for testing retrieval-augmented generation systems |
-|[text-to-python.ipynb](./advanced/text-to-code/text-to-python.ipynb)| Text-to-Code | Generate Python code from natural language instructions with validation and evaluation |
-|[text-to-python-evol.ipynb](./advanced/text-to-code/text-to-python-evol.ipynb)| Text-to-Code | Build advanced Python code generation with evolutionary improvements and iterative refinement |
-|[text-to-sql.ipynb](./advanced/text-to-code/text-to-sql.ipynb)| Text-to-Code | Create SQL queries from natural language descriptions with validation and testing |
+If you find yourself writing Data Designer tutorial notebooks (thank you 🫶), please check out the [TUTORIAL_STYLE_GUIDE.md](./TUTORIAL_STYLE_GUIDE.md) for best practices and style guidelines.
+
+#### Self-hosted tutorials:
+
+- [Getting Started](./self-hosted-tutorials/getting-started): Learn the foundations of generating synthetic data with Data Designer.
+
+- [Community Contributions](./self-hosted-tutorials/community-contributions/): Explore diverse use cases and advanced features in community-contributed notebooks.