Commit bc2ed96

feat: new cli for model provider and config management (#35)
1 parent 55c21ef commit bc2ed96

56 files changed: +5806 -350 lines changed

pyproject.toml

Lines changed: 5 additions & 0 deletions
@@ -29,8 +29,10 @@ dependencies = [
     "pygments>=2.19.2",
     "pyyaml>=6.0.1",
     "python-json-logger==2.0.7",
+    "prompt-toolkit>=3.0.0",
     "requests<3,>=2.32.2",
     "rich>=13.7.1",
+    "typer>=0.12.0",
     "anyascii>=0.3.3,<1.0",
     "boto3==1.35.74",
     "datasets>=4.0.0",
@@ -52,6 +54,9 @@ dependencies = [
     "ruff==0.12.3",
 ]
 
+[project.scripts]
+data-designer = "data_designer.cli:main"
+
 [dependency-groups]
 dev = [
     "jsonpath-ng==1.5.3",

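The `[project.scripts]` entry added above maps the `data-designer` command to a `module:attribute` string. Installers generate a small wrapper that imports the module and calls the named attribute. The sketch below shows roughly how such a string is resolved; `resolve_entry_point` is a hypothetical helper written for illustration, and it is demonstrated against a standard-library target so that it runs without the package installed.

```python
from importlib import import_module
from typing import Any


def resolve_entry_point(spec: str) -> Any:
    """Resolve a 'package.module:attribute' string to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    obj: Any = import_module(module_name)
    # The attribute part may be dotted (e.g. "Class.method"), so walk each segment.
    for attr in filter(None, attr_path.split(".")):
        obj = getattr(obj, attr)
    return obj


# Stdlib target used for the demo; the spec from the diff is "data_designer.cli:main".
dumps = resolve_entry_point("json:dumps")
print(dumps({"ok": True}))  # {"ok": true}
```
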
src/data_designer/cli/README.md

Lines changed: 236 additions & 0 deletions
@@ -0,0 +1,236 @@
# 🎨 NeMo Data Designer CLI

This directory contains the Command-Line Interface (CLI) for configuring model providers and model configurations used in Data Designer.

## Overview

The CLI provides an interactive interface for managing:
- **Model Providers**: LLM API endpoints (NVIDIA, OpenAI, Anthropic, custom providers)
- **Model Configs**: Specific model configurations with inference parameters

Configuration files are stored in `~/.data-designer/` by default and can be referenced by Data Designer workflows.

## Architecture

The CLI follows a **layered architecture** pattern, separating concerns into distinct layers:

```
┌─────────────────────────────────────────────────────────────┐
│                         Commands                            │
│  Entry points for CLI commands (list, providers, models)    │
└─────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────┐
│                         Controllers                         │
│  Orchestrate user workflows and coordinate between layers   │
└─────────────────────────────────────────────────────────────┘

           ┌───────────────────┴───────────────────┐
           ▼                                       ▼
  ┌──────────────────┐                    ┌──────────────────┐
  │     Services     │                    │      Forms       │
  │  Business logic  │                    │  Interactive UI  │
  └──────────────────┘                    └──────────────────┘


  ┌──────────────────┐
  │   Repositories   │
  │  Data persistence│
  └──────────────────┘
```

### Layer Responsibilities

#### 1. **Commands** (`commands/`)
- **Purpose**: Define CLI command entry points using Typer
- **Responsibilities**:
  - Parse command-line arguments and options
  - Initialize controllers with appropriate configuration
  - Handle top-level error reporting
- **Files**:
  - `list.py`: List current configurations
  - `models.py`: Configure models
  - `providers.py`: Configure providers
  - `reset.py`: Reset/delete configurations

#### 2. **Controllers** (`controllers/`)
- **Purpose**: Orchestrate user workflows and coordinate between services, forms, and UI
- **Responsibilities**:
  - Implement the main workflow logic (add, update, delete, etc.)
  - Coordinate between services and interactive forms
  - Handle user navigation and session state
  - Manage associated resource deletion (e.g., deleting models when provider is deleted)
- **Files**:
  - `model_controller.py`: Orchestrates model configuration workflows
  - `provider_controller.py`: Orchestrates provider configuration workflows

**Key Features**:
- **Associated Resource Management**: When deleting a provider, the controller checks for associated models and prompts the user to delete them together
- **Interactive Navigation**: Supports add/update/delete/delete_all operations with user-friendly menus
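
The associated-resource behavior described under Key Features can be sketched as follows. This is a minimal illustration, not the controller from this commit: `InMemoryStore` and `delete_provider` are hypothetical names, the store stands in for the provider and model services, and the choice to cancel the whole deletion when the user declines is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the provider and model services."""

    providers: Set[str] = field(default_factory=set)
    models: Dict[str, str] = field(default_factory=dict)  # model alias -> provider name


def delete_provider(store: InMemoryStore, name: str, confirm: Callable[[str], bool]) -> bool:
    """Delete a provider together with its models; return False if the user declines."""
    if name not in store.providers:
        raise KeyError(f"Unknown provider: {name}")

    associated = sorted(alias for alias, provider in store.models.items() if provider == name)
    if associated:
        question = f"Also delete {len(associated)} associated model(s): {', '.join(associated)}?"
        # Assumption: declining the prompt cancels the whole deletion.
        if not confirm(question):
            return False

    for alias in associated:
        del store.models[alias]
    store.providers.remove(name)
    return True


store = InMemoryStore(
    providers={"nvidia", "openai"},
    models={"llama3-70b": "nvidia", "gpt-4": "openai"},
)
print(delete_provider(store, "nvidia", confirm=lambda question: True))  # True
print(store.models)  # {'gpt-4': 'openai'}
```

Passing the confirmation prompt in as a callable keeps the deletion logic testable without a terminal, which is one reason to separate controllers from the UI helpers.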

#### 3. **Services** (`services/`)
- **Purpose**: Implement business logic and enforce domain rules
- **Responsibilities**:
  - Validate business rules (e.g., unique names, required fields)
  - Implement CRUD operations with validation
  - Coordinate between multiple repositories when needed
  - Handle default management (e.g., default provider selection)
- **Files**:
  - `model_service.py`: Model configuration business logic
  - `provider_service.py`: Provider business logic

**Key Methods**:
- `list_all()`: Get all configured items
- `get_by_*()`: Retrieve specific items
- `add()`: Add new item with validation
- `update()`: Update existing item
- `delete()`: Delete single item
- `delete_by_aliases()`: Batch delete (models only)
- `find_by_provider()`: Find models by provider (models only)
- `set_default()`, `get_default()`: Manage default provider (providers only)
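
To make the service responsibilities concrete, here is a minimal in-memory sketch of a provider service with unique-name validation and default management. The class and field names are hypothetical and the real service persists through a repository; the fallback to the first provider when no default is set mirrors the `list` command in this commit.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Provider:
    name: str
    endpoint: str


@dataclass
class ProviderService:
    """Hypothetical in-memory service; a real one would read and write through a repository."""

    providers: List[Provider] = field(default_factory=list)
    default: Optional[str] = None

    def list_all(self) -> List[Provider]:
        return list(self.providers)

    def get_by_name(self, name: str) -> Optional[Provider]:
        return next((p for p in self.providers if p.name == name), None)

    def add(self, provider: Provider) -> None:
        # Business rules: required fields and unique names.
        if not provider.name or not provider.endpoint:
            raise ValueError("name and endpoint are required")
        if self.get_by_name(provider.name) is not None:
            raise ValueError(f"Provider '{provider.name}' already exists")
        self.providers.append(provider)

    def delete(self, name: str) -> None:
        if self.get_by_name(name) is None:
            raise ValueError(f"Provider '{name}' not found")
        self.providers = [p for p in self.providers if p.name != name]
        if self.default == name:
            self.default = None  # get_default() then falls back to the first provider

    def set_default(self, name: str) -> None:
        if self.get_by_name(name) is None:
            raise ValueError(f"Provider '{name}' not found")
        self.default = name

    def get_default(self) -> Optional[str]:
        if self.default:
            return self.default
        return self.providers[0].name if self.providers else None


service = ProviderService()
service.add(Provider("nvidia", "https://integrate.api.nvidia.com/v1"))
service.add(Provider("openai", "https://api.openai.com/v1"))
service.set_default("openai")
service.delete("openai")
print(service.get_default())  # nvidia
```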

#### 4. **Repositories** (`repositories/`)
- **Purpose**: Handle data persistence (YAML file I/O)
- **Responsibilities**:
  - Load configuration from YAML files
  - Save configuration to YAML files
  - Check file existence
  - Delete configuration files
- **Files**:
  - `base.py`: Abstract base repository with common operations
  - `model_repository.py`: Model configuration persistence
  - `provider_repository.py`: Provider persistence

**Base Repository Pattern**:
```python
class ConfigRepository(ABC, Generic[T]):
    def load(self) -> T | None: ...
    def save(self, config: T) -> None: ...
    def exists(self) -> bool: ...
    def delete(self) -> None: ...
```
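
A concrete subclass of this pattern might look like the sketch below. It is an illustration under stated assumptions rather than the code in `base.py`: JSON from the standard library is swapped in for YAML so the example runs without extra dependencies, and the `config_file` property, the `JsonDictRepository` class, and the `example.json` file name are hypothetical. Taking the configuration directory in the constructor follows how `ProviderRepository(DATA_DESIGNER_HOME_DIR)` is called elsewhere in this commit.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict, Generic, Optional, TypeVar

T = TypeVar("T")


class ConfigRepository(ABC, Generic[T]):
    """Base repository: subclasses decide the file name and the (de)serialization."""

    def __init__(self, config_dir: Path) -> None:
        self.config_dir = config_dir

    @property
    @abstractmethod
    def config_file(self) -> Path: ...

    @abstractmethod
    def load(self) -> Optional[T]: ...

    @abstractmethod
    def save(self, config: T) -> None: ...

    def exists(self) -> bool:
        return self.config_file.exists()

    def delete(self) -> None:
        self.config_file.unlink(missing_ok=True)


class JsonDictRepository(ConfigRepository[Dict[str, Any]]):
    """Hypothetical repository that stores a plain dict as JSON (stand-in for YAML)."""

    @property
    def config_file(self) -> Path:
        return self.config_dir / "example.json"

    def load(self) -> Optional[Dict[str, Any]]:
        if not self.exists():
            return None  # a missing file means "not configured yet"
        return json.loads(self.config_file.read_text(encoding="utf-8"))

    def save(self, config: Dict[str, Any]) -> None:
        self.config_dir.mkdir(parents=True, exist_ok=True)
        self.config_file.write_text(json.dumps(config, indent=2), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    repo = JsonDictRepository(Path(tmp) / "config")
    print(repo.load())  # None
    repo.save({"default": "nvidia"})
    print(repo.load())  # {'default': 'nvidia'}
    repo.delete()
    print(repo.exists())  # False
```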

#### 5. **Forms** (`forms/`)
- **Purpose**: Interactive form-based data collection from users
- **Responsibilities**:
  - Define form fields with validation
  - Collect user input interactively
  - Support navigation (back, cancel)
  - Build configuration objects from form data
- **Files**:
  - `builder.py`: Abstract form builder base
  - `field.py`: Form field types (TextField, SelectField, NumericField)
  - `form.py`: Form container and prompt orchestration
  - `model_builder.py`: Interactive model configuration builder
  - `provider_builder.py`: Interactive provider configuration builder

**Form Features**:
- Field-level validation
- Auto-completion support
- History navigation (arrow keys)
- Default value handling
- Back navigation support
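
Field-level validation and default handling could be modeled along these lines. The `TextField` name comes from the file list above, but its attributes, the `parse` method, and the `validate_endpoint` helper are assumptions made for this sketch; the actual field classes in `field.py` may be shaped differently.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A validator returns an error message, or None when the value is acceptable.
Validator = Callable[[str], Optional[str]]


@dataclass
class TextField:
    """Hypothetical text field with a default value and an optional validator."""

    name: str
    prompt: str
    default: Optional[str] = None
    required: bool = True
    validator: Optional[Validator] = None

    def parse(self, raw: str) -> str:
        """Apply the default to empty input, then run the required and validator checks."""
        value = raw.strip() or (self.default or "")
        if self.required and not value:
            raise ValueError(f"{self.name} is required")
        if value and self.validator:
            error = self.validator(value)
            if error:
                raise ValueError(error)
        return value


def validate_endpoint(value: str) -> Optional[str]:
    if not value.startswith(("http://", "https://")):
        return "Endpoint must start with http:// or https://"
    return None


endpoint = TextField(
    name="endpoint",
    prompt="API endpoint",
    default="https://integrate.api.nvidia.com/v1",
    validator=validate_endpoint,
)
print(endpoint.parse(""))  # https://integrate.api.nvidia.com/v1
```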

#### 6. **UI Utilities** (`ui.py`)
- **Purpose**: User interface utilities for terminal output and input
- **Responsibilities**:
  - Interactive menus with arrow key navigation
  - Text input prompts with validation
  - Confirmation dialogs
  - Styled output (success, error, warning, info)
  - Configuration preview displays
- **Key Functions**:
  - `select_with_arrows()`: Interactive arrow-key menu
  - `prompt_text_input()`: Text input with validation and completion
  - `confirm_action()`: Yes/no confirmation
  - `print_*()`: Styled console output
  - `display_config_preview()`: YAML preview with syntax highlighting


## Configuration Files

The CLI manages two YAML configuration files:

### `~/.data-designer/model_providers.yaml`

Stores model provider configurations (API endpoints):

```yaml
providers:
  - name: nvidia
    endpoint: https://integrate.api.nvidia.com/v1
    provider_type: openai
    api_key: NVIDIA_API_KEY
  - name: openai
    endpoint: https://api.openai.com/v1
    provider_type: openai
    api_key: OPENAI_API_KEY
default: nvidia
```

### `~/.data-designer/model_configs.yaml`

Stores model configurations:

```yaml
model_configs:
  - alias: llama3-70b
    model: meta/llama-3.1-70b-instruct
    provider: nvidia
    inference_parameters:
      temperature: 0.7
      top_p: 0.9
      max_tokens: 2048
      max_parallel_requests: 4
  - alias: gpt-4
    model: gpt-4-turbo
    provider: openai
    inference_parameters:
      temperature: 0.8
      top_p: 0.95
      max_tokens: 4096
```
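
The two files are linked by provider name. The sketch below shows one way to resolve the provider for a model config, working on plain dictionaries of the shape a YAML loader would return for the examples above. The helper names are hypothetical, and treating a missing `provider` as "use the default provider" is an assumption based on the `(default)` placeholder shown by the `list` command.

```python
from typing import Any, Dict

# Same content as the YAML examples above, written as already-parsed dictionaries.
providers_cfg: Dict[str, Any] = {
    "providers": [
        {"name": "nvidia", "endpoint": "https://integrate.api.nvidia.com/v1",
         "provider_type": "openai", "api_key": "NVIDIA_API_KEY"},
        {"name": "openai", "endpoint": "https://api.openai.com/v1",
         "provider_type": "openai", "api_key": "OPENAI_API_KEY"},
    ],
    "default": "nvidia",
}


def default_provider_name(cfg: Dict[str, Any]) -> str:
    """Use the explicit default, else fall back to the first listed provider."""
    return cfg.get("default") or cfg["providers"][0]["name"]


def provider_for(model_cfg: Dict[str, Any], cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Find the provider entry a model config refers to (assumed default fallback)."""
    name = model_cfg.get("provider") or default_provider_name(cfg)
    for provider in cfg["providers"]:
        if provider["name"] == name:
            return provider
    raise KeyError(f"Model '{model_cfg['alias']}' references unknown provider '{name}'")


gpt4 = {"alias": "gpt-4", "model": "gpt-4-turbo", "provider": "openai"}
no_provider = {"alias": "llama3-70b", "model": "meta/llama-3.1-70b-instruct"}

print(provider_for(gpt4, providers_cfg)["endpoint"])         # https://api.openai.com/v1
print(provider_for(no_provider, providers_cfg)["endpoint"])  # https://integrate.api.nvidia.com/v1
```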

## Usage Examples

### Configure Providers

```bash
# Interactive provider configuration
data-designer config providers

# Options:
# - Add a new provider (predefined: nvidia, openai, anthropic, or custom)
# - Update an existing provider
# - Delete a provider (with associated model cleanup)
# - Delete all providers
# - Change default provider
```

### Configure Models

```bash
# Interactive model configuration
data-designer config models

# Options:
# - Add a new model
# - Update an existing model
# - Delete a model
# - Delete all models
```

### List Configurations

```bash
# Display current configurations
data-designer config list
```

### Reset Configurations

```bash
# Delete configuration files (with confirmation)
data-designer config reset
```
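
The commands above share a two-level shape: a `config` group with `providers`, `models`, `list`, and `reset` beneath it. The project builds this with Typer; the sketch below reproduces only the command layout using `argparse` from the standard library as a stand-in, so it runs without extra dependencies. The help strings and the `build_parser` name are illustrative.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring `data-designer config <action>` (argparse stand-in for Typer)."""
    parser = argparse.ArgumentParser(prog="data-designer")
    groups = parser.add_subparsers(dest="group", required=True)

    config = groups.add_parser("config", help="Manage configuration files")
    actions = config.add_subparsers(dest="action", required=True)
    for name, help_text in [
        ("providers", "Configure model providers"),
        ("models", "Configure models"),
        ("list", "Display current configurations"),
        ("reset", "Delete configuration files"),
    ]:
        actions.add_parser(name, help=help_text)
    return parser


args = build_parser().parse_args(["config", "list"])
print(args.group, args.action)  # config list
```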

src/data_designer/cli/__init__.py

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

from data_designer.cli.main import app, main

__all__ = ["app", "main"]

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

Lines changed: 130 additions & 0 deletions
@@ -0,0 +1,130 @@
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

from rich.table import Table

from data_designer.cli.repositories.model_repository import ModelRepository
from data_designer.cli.repositories.provider_repository import ProviderRepository
from data_designer.cli.ui import console, print_error, print_header, print_info, print_warning
from data_designer.config.utils.constants import DATA_DESIGNER_HOME_DIR, NordColor


def list_command() -> None:
    """List current Data Designer configurations.

    Returns:
        None
    """
    # Determine config directory
    print_header("Data Designer Configurations")
    print_info(f"Configuration directory: {DATA_DESIGNER_HOME_DIR}")
    console.print()

    # Display providers
    display_providers(ProviderRepository(DATA_DESIGNER_HOME_DIR))
    display_models(ModelRepository(DATA_DESIGNER_HOME_DIR))


def display_providers(provider_repo: ProviderRepository) -> None:
    """Load and display model providers.

    Args:
        provider_repo: Provider repository

    Returns:
        None
    """
    try:
        provider_registry = provider_repo.load()

        if not provider_registry:
            print_warning("Providers have not been configured. Run 'data-designer config providers' to configure them.")
            console.print()
            return

        # Display as table
        table = Table(title="Model Providers", border_style=NordColor.NORD8.value)
        table.add_column("Name", style=NordColor.NORD14.value, no_wrap=True)
        table.add_column("Endpoint", style=NordColor.NORD4.value)
        table.add_column("Type", style=NordColor.NORD9.value, no_wrap=True)
        table.add_column("API Key", style=NordColor.NORD7.value)
        table.add_column("Default", style=NordColor.NORD13.value, justify="center")

        default_name = provider_registry.default or provider_registry.providers[0].name

        for provider in provider_registry.providers:
            is_default = "✓" if provider.name == default_name else ""
            api_key_display = provider.api_key or "(not set)"

            # Mask actual API keys (keep env var names visible)
            if provider.api_key and not provider.api_key.isupper():
                api_key_display = "***" + provider.api_key[-4:] if len(provider.api_key) > 4 else "***"

            table.add_row(
                provider.name,
                provider.endpoint,
                provider.provider_type,
                api_key_display,
                is_default,
            )

        console.print(table)
        console.print()
    except Exception as e:
        print_error(f"Error loading provider configuration: {e}")
        console.print()


def display_models(model_repo: ModelRepository) -> None:
    """Load and display model configurations.

    Args:
        model_repo: Model repository

    Returns:
        None
    """
    try:
        registry = model_repo.load()

        if not registry:
            print_warning("Models have not been configured. Run 'data-designer config models' to configure them.")
            console.print()
            return

        # Display as table
        table = Table(title="Model Configurations", border_style=NordColor.NORD8.value)
        table.add_column("Alias", style=NordColor.NORD14.value, no_wrap=True)
        table.add_column("Model ID", style=NordColor.NORD4.value)
        table.add_column("Provider", style=NordColor.NORD9.value, no_wrap=True)
        table.add_column("Temperature", style=NordColor.NORD15.value, justify="right")
        table.add_column("Top P", style=NordColor.NORD15.value, justify="right")
        table.add_column("Max Tokens", style=NordColor.NORD15.value, justify="right")

        for mc in registry.model_configs:
            # Handle distribution-based parameters
            temp_display = (
                f"{mc.inference_parameters.temperature:.2f}"
                if isinstance(mc.inference_parameters.temperature, (int, float))
                else "dist"
            )
            top_p_display = (
                f"{mc.inference_parameters.top_p:.2f}"
                if isinstance(mc.inference_parameters.top_p, (int, float))
                else "dist"
            )

            table.add_row(
                mc.alias,
                mc.model,
                mc.provider or "(default)",
                temp_display,
                top_p_display,
                str(mc.inference_parameters.max_tokens) if mc.inference_parameters.max_tokens else "(none)",
            )

        console.print(table)
        console.print()
    except Exception as e:
        print_error(f"Error loading model configuration: {e}")
        console.print()
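
The API key masking rule in `display_providers` is easier to reason about when pulled out on its own. The sketch below restates that logic as a standalone function; `mask_api_key` is a hypothetical name, and the behavior is copied from the diff rather than extended. One consequence of the `isupper()` heuristic is that a literal key made up only of uppercase letters and digits would be treated as an environment variable name and shown in full.

```python
from typing import Optional


def mask_api_key(api_key: Optional[str]) -> str:
    """Mirror the display rule from the diff: show env var names, mask literal keys."""
    if not api_key:
        return "(not set)"
    if api_key.isupper():
        # Treated as an environment variable name such as NVIDIA_API_KEY.
        return api_key
    # Literal key: keep only the last four characters when it is long enough.
    return "***" + api_key[-4:] if len(api_key) > 4 else "***"


print(mask_api_key("NVIDIA_API_KEY"))    # NVIDIA_API_KEY
print(mask_api_key("sk-test-abcd1234"))  # ***1234
print(mask_api_key("abc"))               # ***
print(mask_api_key(None))                # (not set)
```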
