Commit 68533c7

docs: fix links on notebooks and add %%capture on install cell (#134)
1 parent 6e65b10 commit 68533c7

9 files changed, +178 -171 lines changed

docs/colab_notebooks/1-the-basics.ipynb

Lines changed: 39 additions & 36 deletions
@@ -2,7 +2,7 @@
  "cells": [
  {
  "cell_type": "markdown",
- "id": "a4ac4d55",
+ "id": "39d7d274",
  "metadata": {},
  "source": [
  "# 🎨 Data Designer Tutorial: The Basics\n",
@@ -14,7 +14,7 @@
  },
  {
  "cell_type": "markdown",
- "id": "9e9f3c47",
+ "id": "60f1d002",
  "metadata": {},
  "source": [
  "### ⚡ Colab Setup\n",
@@ -25,17 +25,18 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "41b31194",
+ "id": "99c42292",
  "metadata": {},
  "outputs": [],
  "source": [
- "!pip install -qU data-designer"
+ "%%capture\n",
+ "!pip install -U data-designer"
  ]
  },
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "502b3aba",
+ "id": "2c959ca9",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -52,7 +53,7 @@
  },
  {
  "cell_type": "markdown",
- "id": "8c512fbc",
+ "id": "bc185897",
  "metadata": {},
  "source": [
  "### 📦 Import the essentials\n",
@@ -63,7 +64,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "8fae521f",
+ "id": "dc3a2d9d",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -84,20 +85,20 @@
  },
  {
  "cell_type": "markdown",
- "id": "e71d0256",
+ "id": "36c5f571",
  "metadata": {},
  "source": [
  "### ⚙️ Initialize the Data Designer interface\n",
  "\n",
  "- `DataDesigner` is the main object is responsible for managing the data generation process.\n",
  "\n",
- "- When initialized without arguments, the [default model providers](https://nvidia-nemo.github.io/DataDesigner/concepts/models/default-model-settings/) are used.\n"
+ "- When initialized without arguments, the [default model providers](https://nvidia-nemo.github.io/DataDesigner/latest/concepts/models/default-model-settings/) are used.\n"
  ]
  },
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "68fc7172",
+ "id": "61b23c70",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -106,7 +107,7 @@
  },
  {
  "cell_type": "markdown",
- "id": "9a821a27",
+ "id": "3c9b7cb6",
  "metadata": {},
  "source": [
  "### 🎛️ Define model configurations\n",
@@ -115,15 +116,15 @@
  "\n",
  "- The \"model alias\" is used to reference the model in the Data Designer config (as we will see below).\n",
  "\n",
- "- The \"model provider\" is the external service that hosts the model (see the [model config](https://nvidia-nemo.github.io/DataDesigner/concepts/models/default-model-settings/) docs for more details).\n",
+ "- The \"model provider\" is the external service that hosts the model (see the [model config](https://nvidia-nemo.github.io/DataDesigner/latest/concepts/models/default-model-settings/) docs for more details).\n",
  "\n",
  "- By default, we use [build.nvidia.com](https://build.nvidia.com/models) as the model provider.\n"
  ]
  },
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "a9515141",
+ "id": "b86f6217",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -155,7 +156,7 @@
  },
  {
  "cell_type": "markdown",
- "id": "3b940ab9",
+ "id": "1f089871",
  "metadata": {},
  "source": [
  "### 🏗️ Initialize the Data Designer Config Builder\n",
@@ -170,7 +171,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "ec21da7e",
+ "id": "3d666193",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -179,7 +180,7 @@
  },
  {
  "cell_type": "markdown",
- "id": "85b2324e",
+ "id": "e88c8881",
  "metadata": {},
  "source": [
  "## 🎲 Getting started with sampler columns\n",
@@ -196,7 +197,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "f49f435e",
+ "id": "79fb85c6",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -205,7 +206,7 @@
  },
  {
  "cell_type": "markdown",
- "id": "f582b642",
+ "id": "5106cc10",
  "metadata": {},
  "source": [
  "Let's start designing our product review dataset by adding product category and subcategory columns.\n"
@@ -214,7 +215,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "8cfc43b1",
+ "id": "22b97af1",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -295,7 +296,7 @@
  },
  {
  "cell_type": "markdown",
- "id": "2d0eea21",
+ "id": "4857b085",
  "metadata": {},
  "source": [
  "Next, let's add samplers to generate data related to the customer and their review.\n"
@@ -304,7 +305,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "b5e65724",
+ "id": "9e90b3cb",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -341,7 +342,7 @@
  },
  {
  "cell_type": "markdown",
- "id": "e6788771",
+ "id": "b36a153b",
  "metadata": {},
  "source": [
  "## 🦜 LLM-generated columns\n",
@@ -356,7 +357,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "a2705cd9",
+ "id": "4da88fe6",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -393,7 +394,7 @@
  },
  {
  "cell_type": "markdown",
- "id": "e3dd2f69",
+ "id": "5f1b9ac8",
  "metadata": {},
  "source": [
  "### 🔁 Iteration is key – preview the dataset!\n",
@@ -410,7 +411,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "c6e43147",
+ "id": "543e2f9c",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -420,7 +421,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "fab77d01",
+ "id": "26136a8a",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -431,7 +432,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "875ee6a6",
+ "id": "aca4360d",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -441,7 +442,7 @@
  },
  {
  "cell_type": "markdown",
- "id": "87b59e4b",
+ "id": "35ca0470",
  "metadata": {},
  "source": [
  "### 📊 Analyze the generated data\n",
@@ -454,7 +455,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "5d347f4c",
+ "id": "d55b402d",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -464,7 +465,7 @@
  },
  {
  "cell_type": "markdown",
- "id": "d2fb84f2",
+ "id": "245b48cf",
  "metadata": {},
  "source": [
  "### 🆙 Scale up!\n",
@@ -477,7 +478,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "71a31e85",
+ "id": "fc803eb0",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -487,7 +488,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "501e9092",
+ "id": "881c2043",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -500,7 +501,7 @@
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "6f217b4a",
+ "id": "d79860d4",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -512,16 +513,18 @@
  },
  {
  "cell_type": "markdown",
- "id": "4da82b0f",
+ "id": "b4b45176",
  "metadata": {},
  "source": [
  "## ⏭️ Next Steps\n",
  "\n",
  "Now that you've seen the basics of Data Designer, check out the following notebooks to learn more about:\n",
  "\n",
- "- [Structured outputs and jinja expressions](/notebooks/2-structured-outputs-and-jinja-expressions/)\n",
+ "- [Structured outputs and jinja expressions](https://nvidia-nemo.github.io/DataDesigner/latest/notebooks/2-structured-outputs-and-jinja-expressions/)\n",
  "\n",
- "- [Seeding synthetic data generation with an external dataset](/notebooks/3-seeding-with-a-dataset/)\n"
+ "- [Seeding synthetic data generation with an external dataset](https://nvidia-nemo.github.io/DataDesigner/latest/notebooks/3-seeding-with-a-dataset/)\n",
+ "\n",
+ "- [Providing images as context](https://nvidia-nemo.github.io/DataDesigner/latest/notebooks/4-providing-images-as-context/)\n"
  ]
  }
  ],
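
The link changes in this file turn site-relative `/notebooks/...` targets into absolute URLs under the `latest` docs root, so they still resolve when the notebook is opened in Colab. A minimal sketch of how such a rewrite could be applied to a parsed notebook follows. It is not necessarily how this commit was produced: `DOCS_BASE`, `absolutize_links`, and `fix_notebook` are hypothetical names chosen for illustration, and the base URL is simply copied from the added lines above.

```python
import re

# Assumed docs root, copied from the absolute URLs added in the diff above.
DOCS_BASE = "https://nvidia-nemo.github.io/DataDesigner/latest"

# Markdown link targets that begin with "/notebooks/", e.g. "](/notebooks/foo/)".
RELATIVE_LINK = re.compile(r"\]\((/notebooks/[^)\s]*)\)")


def absolutize_links(text, base=DOCS_BASE):
    """Prefix site-relative /notebooks/ link targets with the docs root."""
    return RELATIVE_LINK.sub(lambda m: "](" + base + m.group(1) + ")", text)


def fix_notebook(nb, base=DOCS_BASE):
    """Rewrite links in every markdown cell of a parsed .ipynb dict, in place."""
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue  # code cells are left untouched
        source = cell.get("source", [])
        if isinstance(source, str):
            # A cell's source may be stored as one string...
            cell["source"] = absolutize_links(source, base)
        else:
            # ...or as a list of lines, as in the diff above.
            cell["source"] = [absolutize_links(line, base) for line in source]
    return nb
```

A notebook loaded with `json.load` can be passed to `fix_notebook` and written back with `json.dump`. Links that are already absolute do not match the pattern, so running the function twice leaves the notebook unchanged.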
