docs/notebooks/intro.md
33 additions & 62 deletions
@@ -2,41 +2,6 @@
 
 Welcome to the Data Designer tutorial series! These hands-on notebooks will guide you through the core concepts and features of Data Designer, from basic synthetic data generation to advanced techniques like structured outputs and dataset seeding.
 
-## 📚 Tutorial Series
-
-The tutorials are designed to be completed in sequence, building upon concepts introduced in previous notebooks:
-
-### [1. The Basics](1-the-basics.ipynb)
-
-Learn the fundamentals of Data Designer by generating a simple product review dataset. This notebook covers:
-
-- Setting up the `DataDesigner` interface
-- Configuring models and inference parameters
-- Using built-in samplers (Category, Person, Uniform)
-- Generating LLM text columns with dependencies
-- Understanding the generation workflow
-
-**Start here if you're new to Data Designer!**
-
-### [2. Structured Outputs and Jinja Expressions](2-structured-outputs-and-jinja-expressions.ipynb)
-
-Explore more advanced data generation capabilities:
-
-- Creating structured JSON outputs with schemas
-- Using Jinja expressions for derived columns
-- Combining samplers with structured data
-- Building complex data dependencies
-- Working with nested data structures
-
-### [3. Seeding with an External Dataset](3-seeding-with-a-dataset.ipynb)
-
-Learn how to leverage existing datasets to guide synthetic data generation:
-
-- Loading and using seed datasets
-- Sampling from real data distributions
-- Combining seed data with LLM generation
-- Creating realistic synthetic data based on existing patterns
 For more information, check the [Quick Start](../quick-start.md), [Default Model Settings](../models/default-model-settings.md) and how to [Configure Model Settings Using The CLI](../models/configure-model-settings-with-the-cli.md).
 
-### Recommended Setup
+## 📚 Tutorial Series
+
+The tutorials are designed to be completed in sequence, building upon concepts introduced in previous notebooks:
 
-For the best experience with Data Designer notebooks:
+### [1. The Basics](1-the-basics.ipynb)
 
-1. **Python Version:** Use Python 3.11 or later
-2. **Virtual Environment:** Always use a virtual environment to avoid dependency conflicts
-3. **Jupyter Extensions:** Consider installing JupyterLab for a more modern interface
-4. **API Access:** Ensure you have valid API keys for your chosen LLM provider
+Learn the fundamentals of Data Designer by generating a simple product review dataset. This notebook covers:
+
+- Setting up the `DataDesigner` interface
+- Configuring models and inference parameters
+- Using built-in samplers (Category, Person, Uniform)
+- Generating LLM text columns with dependencies
+- Understanding the generation workflow
+
+**Start here if you're new to Data Designer!**
+
+### [2. Structured Outputs and Jinja Expressions](2-structured-outputs-and-jinja-expressions.ipynb)
+
+Explore more advanced data generation capabilities:
+
+- Creating structured JSON outputs with schemas
+- Using Jinja expressions for derived columns
+- Combining samplers with structured data
+- Building complex data dependencies
+- Working with nested data structures
+
+### [3. Seeding with an External Dataset](3-seeding-with-a-dataset.ipynb)
+
+Learn how to leverage existing datasets to guide synthetic data generation:
+
+- Loading and using seed datasets
+- Sampling from real data distributions
+- Combining seed data with LLM generation
+- Creating realistic synthetic data based on existing patterns
 
 ## 📖 Important Documentation Sections
 
@@ -125,24 +116,4 @@ Quick reference guides for the main configuration objects:
 - **[column_configs](../code_reference/column_configs.md)** - All column configuration types
 - **[config_builder](../code_reference/config_builder.md)** - The `DataDesignerConfigBuilder` API
 - **[data_designer_config](../code_reference/data_designer_config.md)** - Main configuration schema