feat: support native embedding generation #106
Merged
Commits (52, all by nabinchha):

- dc041f7 Add generation type to ModelConfig
- 0d6b830 pass tests
- 254fd8a added generate_text_embeddings
- 1126ea1 tests
- 744bc8f remove sensitive=True old artifact no longer needed
- b913f8d Slight refactor
- 052db7a slight refactor
- 5504c8d Added embedding generator
- 4b6f877 chunk_separator -> chunk_pattern
- 04fc0f3 update tests
- 26d6da1 rename for consistency
- 6facbd2 Restructure InferenceParameters -> CompletionInferenceParameters, Bas…
- 2c1b267 Remove purpose from consolidated kwargs
- 4b1492b WithModelConfiguration.inference_parameters should should be typed wi…
- c445caf Type as WithModelGeneration
- 4b8aa2b Add image generation modality
- 2c5933f update return type for generate_kwargs
- c6c29d4 make generation_type a field of ModelConfig as opposed to a prop reso…
- 06a724b remove regex based chunking from embedding generator
- bbb6a83 Merge branch 'main' into nmulepati/feat/support-embedding-generation
- b9455d4 Remove image generation for now
- e5c0b7a more tests and updates
- 6460c6b column_type_is_llm_generated -> column_type_is_model_generated
- e294b40 change set to list: fix flaky tests
- 4e697ec CompletionInferenceParameters -> ChatCompletionInferenceParameters fo…
- d650398 Update docs
- 4601e3f fix deprecation warning originating from cli model settings
- 65ba5bf update display of inference parameters in cli list
- 0917d6e save prog on inference parameter
- 1aa74dd updates for the ocnfig builder
- d72b204 update cli readme
- 4c53f1f update cli for inference parmeters
- bd63f91 Merge branch 'main' into nmulepati/feat/support-embedding-generation
- 3413799 update inference parameter names
- 5425df5 flip order of vars
- 7723764 WithCompletion -> WithChatCompletion
- 9ae1bb8 specify InferenceParamsT
- 3aa3326 Update columns.md with EmbeddingColumnConfig info
- c73b183 make generation_type a descriminator field in inference params. add c…
- 6899805 DRY out some stuff in field.py
- 299479a Merge branch 'main' into nmulepati/feat/support-embedding-generation
- 0d61587 Merge branch 'main' into nmulepati/feat/support-embedding-generation
- 8e91e95 Merge branch 'main' into nmulepati/feat/support-embedding-generation
- 51dcffa Update nomenclature. prompt tokens -> input tokens, completion tokens…
- 7253898 Add nvidia-embedding and openai-embedding to default model configs
- 0f21576 Merge branch 'main' into nmulepati/feat/support-embedding-generation
- 9acf600 Fix typo in docs
- 954d4d0 Merge branch 'main' into nmulepati/feat/support-embedding-generation
- c7176b9 Make generate collab notebooks
- b47495c Merge branch 'main' into nmulepati/feat/support-embedding-generation
- a68beb2 Merge branch 'main' into nmulepati/feat/support-embedding-generation
- bed085f fine-tune -> adjust
# Inference Parameters

Inference parameters control how models generate responses during synthetic data generation. Data Designer provides two types of inference parameters: `ChatCompletionInferenceParams` for text/code/structured generation and `EmbeddingInferenceParams` for embedding generation.

## Overview

When you create a `ModelConfig`, you can specify inference parameters to adjust model behavior. These parameters control aspects such as randomness (`temperature`), diversity (`top_p`), the token budget for a request (`max_tokens`), and more. Data Designer supports both static values and dynamic distribution-based sampling for certain parameters.
## Chat Completion Inference Parameters

The `ChatCompletionInferenceParams` class controls how models generate text completions (for text, code, and structured data generation). It provides fine-grained control over generation behavior and supports both static values and dynamic distribution-based sampling.

!!! warning "InferenceParameters is Deprecated"
    The `InferenceParameters` class is deprecated and will be removed in a future version. Use `ChatCompletionInferenceParams` instead. The old `InferenceParameters` class now shows a deprecation warning when used.

### Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `temperature` | `float` or `Distribution` | No | Controls randomness in generation (0.0 to 2.0). Higher values = more creative/random |
| `top_p` | `float` or `Distribution` | No | Nucleus sampling parameter (0.0 to 1.0). Controls diversity by filtering low-probability tokens |
| `max_tokens` | `int` | No | Maximum number of tokens for the request, including both input and output tokens (≥ 1) |
| `max_parallel_requests` | `int` | No | Maximum concurrent API requests (default: 4, ≥ 1) |
| `timeout` | `int` | No | API request timeout in seconds (≥ 1) |
| `extra_body` | `dict[str, Any]` | No | Additional parameters to include in the API request body |

!!! note "Default Values"
    If `temperature`, `top_p`, or `max_tokens` are not provided, the model provider's default values will be used. Different providers and models may have different defaults.
!!! tip "Controlling Reasoning Effort for GPT-OSS Models"
    For gpt-oss models like `gpt-oss-20b` and `gpt-oss-120b`, you can control the reasoning effort using the `extra_body` parameter:

    ```python
    from data_designer.essentials import ChatCompletionInferenceParams

    # High reasoning effort (more thorough, slower)
    inference_parameters = ChatCompletionInferenceParams(
        extra_body={"reasoning_effort": "high"}
    )

    # Medium reasoning effort (balanced)
    inference_parameters = ChatCompletionInferenceParams(
        extra_body={"reasoning_effort": "medium"}
    )

    # Low reasoning effort (faster, less thorough)
    inference_parameters = ChatCompletionInferenceParams(
        extra_body={"reasoning_effort": "low"}
    )
    ```
### Temperature and Top P Guidelines

- **Temperature**:
    - `0.0-0.3`: Highly deterministic, focused outputs (ideal for structured/reasoning tasks)
    - `0.4-0.7`: Balanced creativity and coherence (general purpose)
    - `0.8-1.0`: Creative, diverse outputs (ideal for creative writing)
    - `1.0+`: Highly random and experimental

- **Top P**:
    - `0.1-0.5`: Very focused, only most likely tokens
    - `0.6-0.9`: Balanced diversity
    - `0.95-1.0`: Maximum diversity, including less likely tokens

!!! tip "Adjusting Temperature and Top P Together"
    When tuning both parameters simultaneously, consider these combinations:

    - **For deterministic/structured outputs**: Low temperature (`0.0-0.3`) + moderate-to-high top_p (`0.8-0.95`)
        - The low temperature ensures focus, while top_p allows some token diversity
    - **For balanced generation**: Moderate temperature (`0.5-0.7`) + high top_p (`0.9-0.95`)
        - This is a good starting point for most use cases
    - **For creative outputs**: Higher temperature (`0.8-1.0`) + high top_p (`0.95-1.0`)
        - Both parameters work together to maximize diversity

    **Avoid**: Setting both very low (overly restrictive) or adjusting both dramatically at once. When experimenting, adjust one parameter at a time to understand its individual effect.
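The suggested combinations can be captured as simple presets. The helper and preset names below are hypothetical (not part of Data Designer's API), and the concrete values are just one reasonable pick from each recommended range:

```python
# Hypothetical presets mirroring the combinations above; values are
# illustrative choices from within each recommended range.
PRESETS: dict[str, dict[str, float]] = {
    "deterministic": {"temperature": 0.2, "top_p": 0.9},   # low temp + moderate-to-high top_p
    "balanced": {"temperature": 0.6, "top_p": 0.95},       # moderate temp + high top_p
    "creative": {"temperature": 0.9, "top_p": 0.98},       # higher temp + high top_p
}


def sampling_preset(style: str) -> dict[str, float]:
    """Return a temperature/top_p pair for one of the suggested combinations."""
    return dict(PRESETS[style])


print(sampling_preset("balanced"))  # {'temperature': 0.6, 'top_p': 0.95}
```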
## Distribution-Based Inference Parameters

For `temperature` and `top_p` in `ChatCompletionInferenceParams`, you can specify distributions instead of fixed values. This allows Data Designer to sample different values for each generation request, introducing controlled variability into your synthetic data.

### Uniform Distribution

Samples values uniformly between a low and high bound:

```python
from data_designer.essentials import (
    ChatCompletionInferenceParams,
    UniformDistribution,
    UniformDistributionParams,
)

inference_params = ChatCompletionInferenceParams(
    temperature=UniformDistribution(
        params=UniformDistributionParams(low=0.7, high=1.0)
    ),
)
```
### Manual Distribution

Samples from a discrete set of values with optional weights:

```python
from data_designer.essentials import (
    ChatCompletionInferenceParams,
    ManualDistribution,
    ManualDistributionParams,
)

# Equal probability for each value
inference_params = ChatCompletionInferenceParams(
    temperature=ManualDistribution(
        params=ManualDistributionParams(values=[0.5, 0.7, 0.9])
    ),
)

# Weighted probabilities (normalized automatically)
inference_params = ChatCompletionInferenceParams(
    top_p=ManualDistribution(
        params=ManualDistributionParams(
            values=[0.8, 0.9, 0.95],
            weights=[0.2, 0.5, 0.3]  # 20%, 50%, 30% probability
        )
    ),
)
```
## Embedding Inference Parameters

The `EmbeddingInferenceParams` class controls how models generate embeddings. This is used when working with embedding models for tasks like semantic search or similarity analysis.

### Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `encoding_format` | `Literal["float", "base64"]` | No | Format of the embedding encoding (default: "float") |
| `dimensions` | `int` | No | Number of dimensions for the embedding |
| `max_parallel_requests` | `int` | No | Maximum concurrent API requests (default: 4, ≥ 1) |
| `timeout` | `int` | No | API request timeout in seconds (≥ 1) |
| `extra_body` | `dict[str, Any]` | No | Additional parameters to include in the API request body |
## See Also

- **[Model Configurations](model-configs.md)**: Learn about configuring model settings
- **[Model Providers](model-providers.md)**: Learn about configuring model providers
- **[Default Model Settings](default-model-settings.md)**: Pre-configured model settings included with Data Designer