Commit 2e50a27

cleaning up some slop
1 parent 8f1cac1 commit 2e50a27

3 files changed: +26 -130 lines changed

Lines changed: 20 additions & 36 deletions
@@ -1,6 +1,6 @@
 # data-designer-config
 
-Configuration layer for NVIDIA DataDesigner synthetic data generation framework.
+Configuration layer for NeMo Data Designer synthetic data generation framework.
 
 This package provides the configuration API for defining synthetic data generation pipelines. It's a lightweight dependency that can be used standalone for configuration management.
 
@@ -10,55 +10,39 @@ This package provides the configuration API for defining synthetic data generati
 pip install data-designer-config
 ```
 
-## Features
-
-- Configuration builder API (`DataDesignerConfigBuilder`)
-- Column configuration types (Sampler, LLM, Expression, Validation, etc.)
-- Model configurations with inference parameters
-- Seed dataset configuration
-- Constraint system for data generation
-- Plugin system for extensibility
-
 ## Usage
 
 ```python
-from data_designer.config import DataDesignerConfigBuilder, ModelConfig
+import data_designer.config as dd
 
-# Create configuration builder
-builder = DataDesignerConfigBuilder(
+# Initialize config builder with model config(s)
+config_builder = dd.DataDesignerConfigBuilder(
     model_configs=[
-        ModelConfig(
+        dd.ModelConfig(
             alias="my-model",
             model="meta/llama-3-70b-instruct",
-            inference_parameters={"temperature": 0.7}
-        )
+            inference_parameters={"temperature": 0.7},
+        ),
     ]
 )
 
 # Add columns
-builder.add_column(
-    name="user_id",
-    sampler_type="uuid",
-    column_type="sampler"
+config_builder.add_column(
+    dd.SamplerColumnConfig(
+        name="user_id",
+        sampler_type=dd.SamplerType.UUID,
+    )
 )
-
-builder.add_column(
-    name="description",
-    column_type="llm-text",
-    prompt="Write a product description",
-    model_alias="my-model"
+config_builder.add_column(
+    dd.LLMTextColumnConfig(
+        name="description",
+        prompt="Write a product description",
+        model_alias="my-model",
+    )
 )
 
 # Build configuration
-config = builder.build()
+config = config_builder.build()
 ```
 
-## Documentation
-
-- [Full Documentation](https://nvidia.github.io/DataDesigner/)
-- [Configuration Guide](https://nvidia.github.io/DataDesigner/configuration/)
-- [API Reference](https://nvidia.github.io/DataDesigner/api/config/)
-
-## License
-
-Apache-2.0 - Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES
+See main [README.md](../../README.md) for more information.
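
Note on the Usage change above: the example moves from passing loose keyword arguments (`column_type="sampler"`, `sampler_type="uuid"`) to passing one typed config object per column. The sketch below illustrates that pattern only. It does not import `data_designer`; every class in it is a hypothetical stand-in that borrows names from the diff, and the real package's signatures and validation may differ.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple, Union


class SamplerType(str, Enum):
    """Stand-in enum (assumption): the real package may define other members."""
    UUID = "uuid"


@dataclass(frozen=True)
class SamplerColumnConfig:
    """Stand-in for a sampler column: the column kind is implied by the class."""
    name: str
    sampler_type: SamplerType


@dataclass(frozen=True)
class LLMTextColumnConfig:
    """Stand-in for an LLM text column."""
    name: str
    prompt: str
    model_alias: str


ColumnConfig = Union[SamplerColumnConfig, LLMTextColumnConfig]


@dataclass
class ConfigBuilder:
    """Hypothetical builder that accepts only typed column configs."""
    columns: List[ColumnConfig] = field(default_factory=list)

    def add_column(self, column: ColumnConfig) -> "ConfigBuilder":
        # Reject strings or dicts so mistakes surface when the column is added.
        if not isinstance(column, (SamplerColumnConfig, LLMTextColumnConfig)):
            raise TypeError(f"expected a column config, got {type(column).__name__}")
        self.columns.append(column)
        return self

    def build(self) -> Tuple[ColumnConfig, ...]:
        # Return an immutable snapshot of the columns added so far.
        return tuple(self.columns)


builder = ConfigBuilder()
builder.add_column(SamplerColumnConfig(name="user_id", sampler_type=SamplerType.UUID))
builder.add_column(
    LLMTextColumnConfig(
        name="description",
        prompt="Write a product description",
        model_alias="my-model",
    )
)
config = builder.build()
```

With typed objects, a misspelled field name or an unknown sampler type fails when the column object is constructed, rather than travelling through the builder as an unchecked string.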
Lines changed: 3 additions & 38 deletions
@@ -1,8 +1,8 @@
 # data-designer-engine
 
-Generation engine for NVIDIA DataDesigner synthetic data generation framework.
+Generation engine for NeMo Data Designer synthetic data generation framework.
 
-This package contains the execution engine that powers DataDesigner. It depends on `data-designer-config` and includes heavy dependencies like pandas, numpy, and LLM integration via litellm.
+This package contains the execution engine that powers Data Designer. It depends on `data-designer-config` and includes heavy dependencies like pandas, numpy, and LLM integration via litellm.
 
 ## Installation
 
@@ -12,39 +12,4 @@ pip install data-designer-engine
 
 This automatically installs `data-designer-config` as a dependency.
 
-## Features
-
-- DAG-based dataset generation orchestration
-- LLM integration via LiteLLM (supports 100+ providers)
-- Sophisticated sampling generators (Person, Entity, etc.)
-- Column validators (Python, SQL, Code, Remote)
-- Dataset profiling and analysis
-- Artifact storage and management
-
-## Usage
-
-```python
-from data_designer.config import DataDesignerConfig
-from data_designer.engine import compile_data_designer_config
-from data_designer.engine.dataset_builders import ColumnWiseDatasetBuilder
-
-# Assuming you have a config from data-designer-config
-config = DataDesignerConfig(...)
-
-# Compile configuration
-compiled_config = compile_data_designer_config(config)
-
-# Create builder and generate data
-builder = ColumnWiseDatasetBuilder(compiled_config, resource_provider)
-result = builder.build(num_records=100)
-```
-
-## Documentation
-
-- [Full Documentation](https://nvidia.github.io/DataDesigner/)
-- [Engine Architecture](https://nvidia.github.io/DataDesigner/architecture/)
-- [API Reference](https://nvidia.github.io/DataDesigner/api/engine/)
-
-## License
-
-Apache-2.0 - Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES
+See main [README.md](../../README.md) for more information.

packages/data-designer/README.md

Lines changed: 3 additions & 56 deletions
@@ -1,8 +1,8 @@
 # data-designer
 
-Complete NVIDIA DataDesigner framework for synthetic data generation.
+Complete NeMo Data Designer framework for synthetic data generation.
 
-This is the full installation including the CLI, interface layer, and all dependencies. For lightweight installations, consider `data-designer-config` (config only) or `data-designer-engine` (config + engine).
+This is the full installation including the CLI, interface layer, and all dependencies.
 
 ## Installation
 
@@ -15,57 +15,4 @@ This installs all three packages:
 - `data-designer-engine` - Generation engine
 - `data-designer` - CLI and interface
 
-## Quick Start
-
-```python
-from data_designer import DataDesigner
-
-# Initialize
-dd = DataDesigner()
-
-# Create configuration
-builder = dd.create_config_builder(
-    model_configs=[
-        dd.create_model_config(
-            alias="nvidia-text",
-            model="meta/llama-3-70b-instruct"
-        )
-    ]
-)
-
-# Configure data generation
-builder.add_column(name="id", sampler_type="uuid", column_type="sampler")
-builder.add_column(
-    name="review",
-    column_type="llm-text",
-    prompt="Write a product review",
-    model_alias="nvidia-text"
-)
-
-# Generate data
-result = dd.generate(config=builder.build(), num_records=100)
-print(result.dataset)
-```
-
-## CLI Usage
-
-```bash
-# Interactive configuration
-data-designer
-
-# List available models
-data-designer models list
-
-# Configure providers
-data-designer providers add nvidia --api-key YOUR_KEY
-```
-
-## Documentation
-
-- [Full Documentation](https://nvidia.github.io/DataDesigner/)
-- [Tutorials](https://nvidia.github.io/DataDesigner/tutorials/)
-- [Recipes](https://nvidia.github.io/DataDesigner/recipes/)
-
-## License
-
-Apache-2.0 - Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES
+See main [README.md](../../README.md) for more information.
