Merged
Changes from 44 commits
Commits
52 commits
dc041f7
Add generation type to ModelConfig
nabinchha Nov 25, 2025
0d6b830
pass tests
nabinchha Nov 25, 2025
254fd8a
added generate_text_embeddings
nabinchha Nov 25, 2025
1126ea1
tests
nabinchha Nov 25, 2025
744bc8f
remove sensitive=True old artifact no longer needed
nabinchha Nov 25, 2025
b913f8d
Slight refactor
nabinchha Nov 26, 2025
052db7a
slight refactor
nabinchha Nov 26, 2025
5504c8d
Added embedding generator
nabinchha Nov 26, 2025
4b6f877
chunk_separator -> chunk_pattern
nabinchha Nov 26, 2025
04fc0f3
update tests
nabinchha Nov 26, 2025
26d6da1
rename for consistency
nabinchha Nov 26, 2025
6facbd2
Restructure InferenceParameters -> CompletionInferenceParameters, Bas…
nabinchha Nov 26, 2025
2c1b267
Remove purpose from consolidated kwargs
nabinchha Nov 26, 2025
4b1492b
WithModelConfiguration.inference_parameters should should be typed wi…
nabinchha Dec 2, 2025
c445caf
Type as WithModelGeneration
nabinchha Dec 2, 2025
4b8aa2b
Add image generation modality
nabinchha Dec 2, 2025
2c5933f
update return type for generate_kwargs
nabinchha Dec 3, 2025
c6c29d4
make generation_type a field of ModelConfig as opposed to a prop reso…
nabinchha Dec 3, 2025
06a724b
remove regex based chunking from embedding generator
nabinchha Dec 3, 2025
bbb6a83
Merge branch 'main' into nmulepati/feat/support-embedding-generation
nabinchha Dec 8, 2025
b9455d4
Remove image generation for now
nabinchha Dec 8, 2025
e5c0b7a
more tests and updates
nabinchha Dec 9, 2025
6460c6b
column_type_is_llm_generated -> column_type_is_model_generated
nabinchha Dec 9, 2025
e294b40
change set to list: fix flaky tests
nabinchha Dec 9, 2025
4e697ec
CompletionInferenceParameters -> ChatCompletionInferenceParameters fo…
nabinchha Dec 9, 2025
d650398
Update docs
nabinchha Dec 9, 2025
4601e3f
fix deprecation warning originating from cli model settings
nabinchha Dec 9, 2025
65ba5bf
update display of inference parameters in cli list
nabinchha Dec 9, 2025
0917d6e
save prog on inference parameter
nabinchha Dec 10, 2025
1aa74dd
updates for the ocnfig builder
nabinchha Dec 10, 2025
d72b204
update cli readme
nabinchha Dec 10, 2025
4c53f1f
update cli for inference parmeters
nabinchha Dec 10, 2025
bd63f91
Merge branch 'main' into nmulepati/feat/support-embedding-generation
nabinchha Dec 10, 2025
3413799
update inference parameter names
nabinchha Dec 10, 2025
5425df5
flip order of vars
nabinchha Dec 10, 2025
7723764
WithCompletion -> WithChatCompletion
nabinchha Dec 10, 2025
9ae1bb8
specify InferenceParamsT
nabinchha Dec 10, 2025
3aa3326
Update columns.md with EmbeddingColumnConfig info
nabinchha Dec 10, 2025
c73b183
make generation_type a descriminator field in inference params. add c…
nabinchha Dec 10, 2025
6899805
DRY out some stuff in field.py
nabinchha Dec 10, 2025
299479a
Merge branch 'main' into nmulepati/feat/support-embedding-generation
nabinchha Dec 11, 2025
0d61587
Merge branch 'main' into nmulepati/feat/support-embedding-generation
nabinchha Dec 11, 2025
8e91e95
Merge branch 'main' into nmulepati/feat/support-embedding-generation
nabinchha Dec 11, 2025
51dcffa
Update nomenclature. prompt tokens -> input tokens, completion tokens…
nabinchha Dec 11, 2025
7253898
Add nvidia-embedding and openai-embedding to default model configs
nabinchha Dec 12, 2025
0f21576
Merge branch 'main' into nmulepati/feat/support-embedding-generation
nabinchha Dec 12, 2025
9acf600
Fix typo in docs
nabinchha Dec 13, 2025
954d4d0
Merge branch 'main' into nmulepati/feat/support-embedding-generation
nabinchha Dec 13, 2025
c7176b9
Make generate collab notebooks
nabinchha Dec 13, 2025
b47495c
Merge branch 'main' into nmulepati/feat/support-embedding-generation
nabinchha Dec 15, 2025
a68beb2
Merge branch 'main' into nmulepati/feat/support-embedding-generation
nabinchha Dec 15, 2025
bed085f
fine-tune -> adjust
nabinchha Dec 15, 2025
16 changes: 16 additions & 0 deletions docs/concepts/columns.md
@@ -64,6 +64,22 @@ Define scoring rubrics (relevance, accuracy, fluency, helpfulness) and the judge

Use judge columns for data quality filtering (e.g., keep only 4+ rated responses), A/B testing generation strategies, and quality monitoring over time.

### 🧬 Embedding Columns

Embedding columns generate vector embeddings (numerical representations) for text content using embedding models. These embeddings capture semantic meaning, enabling similarity search, clustering, and semantic analysis.

Specify a `target_column` containing text, and Data Designer generates embeddings for that content. The target column can contain either a single text string or a list of text strings in stringified JSON format—in the latter case, embeddings are generated for each text string in the list.

Common use cases:

- **Semantic search**: Generate embeddings for documents, then find similar content by vector similarity
- **Clustering**: Group similar texts based on embedding proximity
- **Recommendation systems**: Match content by semantic similarity
- **Anomaly detection**: Identify outliers in embedding space
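
These downstream tasks all reduce to vector math over the generated embeddings. As a minimal, library-agnostic sketch (the toy 3-dimensional vectors below stand in for real embedding output), cosine similarity ranks a related document above an unrelated one:

```python
from math import sqrt


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" standing in for real model output.
doc = [0.1, 0.3, 0.9]
query = [0.1, 0.25, 0.95]
unrelated = [0.9, 0.1, 0.05]

print(cosine_similarity(doc, query) > cosine_similarity(doc, unrelated))  # True
```

The same comparison scales to real embedding dimensions unchanged; production systems typically delegate it to a vector index rather than computing it pairwise.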

!!! note "Embedding Models"
Embedding columns require an embedding model configured with `EmbeddingInferenceParams`. These models differ from chat completion models—they output vectors rather than text. The generation type is automatically determined by the inference parameters type.

### 🧩 Expression Columns

Expression columns handle simple transformations using **Jinja2 templates**—concatenate first and last names, calculate numerical totals, format date strings. No LLM overhead needed.
@@ -130,6 +130,6 @@ The CLI will show which configuration files exist and ask for confirmation before
## See Also

- **[Model Providers](model-providers.md)**: Learn about the `ModelProvider` class and provider configuration
- **[Model Configurations](model-configs.md)**: Learn about `ModelConfig` and `InferenceParameters`
- **[Model Configurations](model-configs.md)**: Learn about `ModelConfig`
- **[Default Model Settings](default-model-settings.md)**: Pre-configured providers and model settings included with Data Designer
- **[Quick Start Guide](../../quick-start.md)**: Get started with a simple example
147 changes: 147 additions & 0 deletions docs/concepts/models/inference-parameters.md
@@ -0,0 +1,147 @@
# Inference Parameters

Inference parameters control how models generate responses during synthetic data generation. Data Designer provides two types of inference parameters: `ChatCompletionInferenceParams` for text/code/structured generation and `EmbeddingInferenceParams` for embedding generation.

## Overview

When you create a `ModelConfig`, you can specify inference parameters to adjust model behavior. These parameters control aspects like randomness (`temperature`), diversity (`top_p`), token budget (`max_tokens`), and more. Data Designer supports both static values and dynamic distribution-based sampling for certain parameters.

## Chat Completion Inference Parameters

The `ChatCompletionInferenceParams` class controls how models generate text completions (for text, code, and structured data generation). It provides fine-grained control over generation behavior and supports both static values and dynamic distribution-based sampling.

!!! warning "InferenceParameters is Deprecated"
The `InferenceParameters` class is deprecated and will be removed in a future version. Use `ChatCompletionInferenceParams` instead. The old `InferenceParameters` class now shows a deprecation warning when used.

### Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `temperature` | `float` or `Distribution` | No | Controls randomness in generation (0.0 to 2.0). Higher values = more creative/random |
| `top_p` | `float` or `Distribution` | No | Nucleus sampling parameter (0.0 to 1.0). Controls diversity by filtering low-probability tokens |
| `max_tokens` | `int` | No | Maximum number of tokens for the request, including both input and output tokens (≥ 1) |
| `max_parallel_requests` | `int` | No | Maximum concurrent API requests (default: 4, ≥ 1) |
| `timeout` | `int` | No | API request timeout in seconds (≥ 1) |
| `extra_body` | `dict[str, Any]` | No | Additional parameters to include in the API request body |

!!! note "Default Values"
If `temperature`, `top_p`, or `max_tokens` are not provided, the model provider's default values will be used. Different providers and models may have different defaults.
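
For static values, the fields above are passed directly to the constructor. A minimal sketch, assuming the `data_designer` package is installed (the import path matches the other examples on this page):

```python
from data_designer.essentials import ChatCompletionInferenceParams

inference_parameters = ChatCompletionInferenceParams(
    temperature=0.7,  # balanced creativity and coherence
    top_p=0.9,        # moderate nucleus-sampling diversity
    max_tokens=2048,  # total token budget for input plus output
    timeout=60,       # seconds before an API request times out
)
```

Any field left unset falls through to the provider's default, per the note above.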

!!! tip "Controlling Reasoning Effort for GPT-OSS Models"
For gpt-oss models like `gpt-oss-20b` and `gpt-oss-120b`, you can control the reasoning effort using the `extra_body` parameter:

```python
from data_designer.essentials import ChatCompletionInferenceParams

# High reasoning effort (more thorough, slower)
inference_parameters = ChatCompletionInferenceParams(
extra_body={"reasoning_effort": "high"}
)

# Medium reasoning effort (balanced)
inference_parameters = ChatCompletionInferenceParams(
extra_body={"reasoning_effort": "medium"}
)

# Low reasoning effort (faster, less thorough)
inference_parameters = ChatCompletionInferenceParams(
extra_body={"reasoning_effort": "low"}
)
```

### Temperature and Top P Guidelines

- **Temperature**:
- `0.0-0.3`: Highly deterministic, focused outputs (ideal for structured/reasoning tasks)
- `0.4-0.7`: Balanced creativity and coherence (general purpose)
- `0.8-1.0`: Creative, diverse outputs (ideal for creative writing)
- `1.0+`: Highly random and experimental

- **Top P**:
- `0.1-0.5`: Very focused, only most likely tokens
- `0.6-0.9`: Balanced diversity
- `0.95-1.0`: Maximum diversity, including less likely tokens

!!! tip "Adjusting Temperature and Top P Together"
When tuning both parameters simultaneously, consider these combinations:

- **For deterministic/structured outputs**: Low temperature (`0.0-0.3`) + moderate-to-high top_p (`0.8-0.95`)
- The low temperature ensures focus, while top_p allows some token diversity
- **For balanced generation**: Moderate temperature (`0.5-0.7`) + high top_p (`0.9-0.95`)
- This is a good starting point for most use cases
- **For creative outputs**: Higher temperature (`0.8-1.0`) + high top_p (`0.95-1.0`)
- Both parameters work together to maximize diversity

**Avoid**: Setting both very low (overly restrictive) or adjusting both dramatically at once. When experimenting, adjust one parameter at a time to understand its individual effect.
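
Expressed as code, the first two combinations above might look like this (a sketch, assuming the `data_designer` package is installed; the exact values are illustrative):

```python
from data_designer.essentials import ChatCompletionInferenceParams

# Deterministic/structured outputs: low temperature, moderate-to-high top_p
structured = ChatCompletionInferenceParams(temperature=0.2, top_p=0.9)

# Balanced generation: a good default starting point for most use cases
balanced = ChatCompletionInferenceParams(temperature=0.6, top_p=0.95)
```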

## Distribution-Based Inference Parameters

For `temperature` and `top_p` in `ChatCompletionInferenceParams`, you can specify distributions instead of fixed values. This allows Data Designer to sample different values for each generation request, introducing controlled variability into your synthetic data.

### Uniform Distribution

Samples values uniformly between a low and high bound:

```python
from data_designer.essentials import (
ChatCompletionInferenceParams,
UniformDistribution,
UniformDistributionParams,
)

inference_params = ChatCompletionInferenceParams(
temperature=UniformDistribution(
params=UniformDistributionParams(low=0.7, high=1.0)
),
)
```

### Manual Distribution

Samples from a discrete set of values with optional weights:

```python
from data_designer.essentials import (
ChatCompletionInferenceParams,
ManualDistribution,
ManualDistributionParams,
)

# Equal probability for each value
inference_params = ChatCompletionInferenceParams(
temperature=ManualDistribution(
params=ManualDistributionParams(values=[0.5, 0.7, 0.9])
),
)

# Weighted probabilities (normalized automatically)
inference_params = ChatCompletionInferenceParams(
top_p=ManualDistribution(
params=ManualDistributionParams(
values=[0.8, 0.9, 0.95],
weights=[0.2, 0.5, 0.3] # 20%, 50%, 30% probability
)
),
)
```

## Embedding Inference Parameters

The `EmbeddingInferenceParams` class controls how models generate embeddings. This is used when working with embedding models for tasks like semantic search or similarity analysis.

### Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `encoding_format` | `Literal["float", "base64"]` | No | Format of the embedding encoding (default: "float") |
| `dimensions` | `int` | No | Number of dimensions for the embedding |
| `max_parallel_requests` | `int` | No | Maximum concurrent API requests (default: 4, ≥ 1) |
| `timeout` | `int` | No | API request timeout in seconds (≥ 1) |
| `extra_body` | `dict[str, Any]` | No | Additional parameters to include in the API request body |
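
As with chat completion parameters, these fields are set on the class directly. A minimal sketch (assuming `EmbeddingInferenceParams` is exported from `data_designer.essentials` like its chat-completion counterpart):

```python
from data_designer.essentials import EmbeddingInferenceParams

embedding_params = EmbeddingInferenceParams(
    encoding_format="float",  # "float" (default) or "base64"
    dimensions=1024,          # honored only by models with configurable dimensions
    max_parallel_requests=4,  # the default
)
```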


## See Also

- **[Model Configurations](model-configs.md)**: Learn about configuring model settings
- **[Model Providers](model-providers.md)**: Learn about configuring model providers
- **[Default Model Settings](default-model-settings.md)**: Pre-configured model settings included with Data Designer