Merged
77 changes: 37 additions & 40 deletions docs/colab_notebooks/1-the-basics.ipynb
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "markdown",
"id": "9f804f90",
"id": "56daa304",
"metadata": {},
"source": [
"# 🎨 Data Designer Tutorial: The Basics\n",
@@ -14,7 +14,7 @@
},
{
"cell_type": "markdown",
"id": "9cb786eb",
"id": "8734a74a",
"metadata": {},
"source": [
"### ⚡ Colab Setup\n",
@@ -25,7 +25,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "7f45ea56",
"id": "45510d11",
"metadata": {},
"outputs": [],
"source": [
@@ -36,7 +36,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "ea86e81e",
"id": "4bad4940",
"metadata": {},
"outputs": [],
"source": [
@@ -53,7 +53,7 @@
},
{
"cell_type": "markdown",
"id": "16611c7b",
"id": "0543d90e",
"metadata": {},
"source": [
"### 📦 Import the essentials\n",
@@ -64,7 +64,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "875342bb",
"id": "90185344",
"metadata": {},
"outputs": [],
"source": [
@@ -85,7 +85,7 @@
},
{
"cell_type": "markdown",
"id": "b58ac676",
"id": "e6fcf82b",
"metadata": {},
"source": [
"### ⚙️ Initialize the Data Designer interface\n",
@@ -98,7 +98,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "3ce805ad",
"id": "8760c1ef",
"metadata": {},
"outputs": [],
"source": [
@@ -107,7 +107,7 @@
},
{
"cell_type": "markdown",
"id": "50e961ed",
"id": "da9d9f06",
"metadata": {},
"source": [
"### 🎛️ Define model configurations\n",
@@ -124,7 +124,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "1b07a6a5",
"id": "03760d56",
"metadata": {},
"outputs": [],
"source": [
@@ -135,28 +135,26 @@
"MODEL_ID = \"nvidia/nemotron-3-nano-30b-a3b\"\n",
"\n",
"# We choose this alias to be descriptive for our use case.\n",
"MODEL_ALIAS = \"nemotron-nano-v2\"\n",
"\n",
"# This sets reasoning to False for the nemotron-nano-v2 model.\n",
"SYSTEM_PROMPT = \"/no_think\"\n",
"MODEL_ALIAS = \"nemotron-nano-v3\"\n",
"\n",
"model_configs = [\n",
" ModelConfig(\n",
" alias=MODEL_ALIAS,\n",
" model=MODEL_ID,\n",
" provider=MODEL_PROVIDER,\n",
" inference_parameters=ChatCompletionInferenceParams(\n",
" temperature=0.5,\n",
" temperature=1.0,\n",
" top_p=1.0,\n",
" max_tokens=1024,\n",
" max_tokens=2048,\n",
" extra_body={\"chat_template_kwargs\": {\"enable_thinking\": False}},\n",
" ),\n",
" )\n",
"]"
]
},
{
"cell_type": "markdown",
"id": "6d873251",
"id": "a968637c",
"metadata": {},
"source": [
"### 🏗️ Initialize the Data Designer Config Builder\n",
@@ -171,7 +169,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "d45fac13",
"id": "e5768870",
"metadata": {},
"outputs": [],
"source": [
@@ -180,7 +178,7 @@
},
{
"cell_type": "markdown",
"id": "c35b0274",
"id": "d12c1559",
"metadata": {},
"source": [
"## 🎲 Getting started with sampler columns\n",
@@ -197,7 +195,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "14cb9967",
"id": "3c47fbe6",
"metadata": {},
"outputs": [],
"source": [
@@ -206,7 +204,7 @@
},
{
"cell_type": "markdown",
"id": "40945aea",
"id": "b47862c5",
"metadata": {},
"source": [
"Let's start designing our product review dataset by adding product category and subcategory columns.\n"
@@ -215,7 +213,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "a7d87e00",
"id": "6ff2257f",
"metadata": {},
"outputs": [],
"source": [
@@ -296,7 +294,7 @@
},
{
"cell_type": "markdown",
"id": "48699878",
"id": "a26f889e",
"metadata": {},
"source": [
"Next, let's add samplers to generate data related to the customer and their review.\n"
@@ -305,7 +303,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "df84faf3",
"id": "e603d4cc",
"metadata": {},
"outputs": [],
"source": [
@@ -342,7 +340,7 @@
},
{
"cell_type": "markdown",
"id": "8288352d",
"id": "cf5070af",
"metadata": {},
"source": [
"## 🦜 LLM-generated columns\n",
@@ -357,7 +355,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "157919b4",
"id": "775c6fa8",
"metadata": {},
"outputs": [],
"source": [
@@ -370,7 +368,6 @@
" \"on products related to '{{ product_subcategory }}'. The target age range of the ideal customer is \"\n",
" \"{{ target_age_range }} years old. Respond with only the product name, no other text.\"\n",
" ),\n",
" system_prompt=SYSTEM_PROMPT,\n",
" model_alias=MODEL_ALIAS,\n",
" )\n",
")\n",
@@ -382,9 +379,9 @@
" \"You are a customer named {{ customer.first_name }} from {{ customer.city }}, {{ customer.state }}. \"\n",
" \"You are {{ customer.age }} years old and recently purchased a product called {{ product_name }}. \"\n",
" \"Write a review of this product, which you gave a rating of {{ number_of_stars }} stars. \"\n",
" \"The style of the review should be '{{ review_style }}'.\"\n",
" \"The style of the review should be '{{ review_style }}'. \"\n",
" \"Respond with only the review, no other text.\"\n",
" ),\n",
" system_prompt=SYSTEM_PROMPT,\n",
" model_alias=MODEL_ALIAS,\n",
" )\n",
")\n",
@@ -394,7 +391,7 @@
},
{
"cell_type": "markdown",
"id": "009646e4",
"id": "25796666",
"metadata": {},
"source": [
"### 🔁 Iteration is key – preview the dataset!\n",
@@ -411,7 +408,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "a9c90236",
"id": "ba90ee16",
"metadata": {},
"outputs": [],
"source": [
@@ -421,7 +418,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "3cfe180e",
"id": "db9d6f8a",
"metadata": {},
"outputs": [],
"source": [
@@ -432,7 +429,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "65b2f595",
"id": "cb555bd5",
"metadata": {},
"outputs": [],
"source": [
@@ -442,7 +439,7 @@
},
{
"cell_type": "markdown",
"id": "2134fa0f",
"id": "b35ee52b",
"metadata": {},
"source": [
"### 📊 Analyze the generated data\n",
@@ -455,7 +452,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "8a37dd61",
"id": "0d15fb8d",
"metadata": {},
"outputs": [],
"source": [
@@ -465,7 +462,7 @@
},
{
"cell_type": "markdown",
"id": "b715bc3a",
"id": "4fefec9f",
"metadata": {},
"source": [
"### 🆙 Scale up!\n",
@@ -478,7 +475,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "565f03a1",
"id": "395faa2c",
"metadata": {},
"outputs": [],
"source": [
@@ -488,7 +485,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "9d4c91ad",
"id": "65dcd625",
"metadata": {},
"outputs": [],
"source": [
@@ -501,7 +498,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "93c5a082",
"id": "1aef103b",
"metadata": {},
"outputs": [],
"source": [
@@ -513,7 +510,7 @@
},
{
"cell_type": "markdown",
"id": "13f7c942",
"id": "09ec21ba",
"metadata": {},
"source": [
"## ⏭️ Next Steps\n",