
Commit 297276b

dittops and claude committed
feat: Add classifications API and enhance embeddings with new parameters
- Add Classifications resource for text classification with label-score results
- Extend Embeddings with modality, priority, chunking, cache_options parameters
- Add ClassifyResponse, ClassifyLabelScore, ClassifyUsage models
- Add text and chunk_text fields to EmbeddingData for chunking support
- Add comprehensive API documentation in docs/ folder
- Add examples for embeddings and classifications in inference_example.py
- Add unit tests for all new functionality
- Update README with Inference API section and links to documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent cbf594b commit 297276b

15 files changed: +2685 −14 lines changed

README.md

Lines changed: 152 additions & 0 deletions
@@ -5,11 +5,22 @@ Official Python SDK for the BudAI Foundry Platform. Build, manage, and execute D
## Features

- **Python SDK** - Full-featured client library for the BudAI Foundry API
- **OpenAI-Compatible Inference** - Chat completions, embeddings, and classifications
- **CLI Tool** - Command-line interface for pipeline operations
- **Pipeline DSL** - Pythonic way to define DAG pipelines
- **Async Support** - Both sync and async clients available
- **Type Safety** - Full type hints and Pydantic models

## Documentation

- [Quick Start Guide](docs/quickstart.md)
- [Configuration & Authentication](docs/configuration.md)
- **API Reference**
  - [Chat Completions](docs/api/chat.md)
  - [Embeddings](docs/api/embeddings.md)
  - [Classifications](docs/api/classifications.md)
  - [Models](docs/api/models.md)

## Installation

```bash

@@ -254,6 +265,147 @@ action = client.actions.get("log")

print(f"Parameters: {action.params}")
```

---

## Inference API

The SDK provides OpenAI-compatible inference endpoints for chat, embeddings, and classifications.

> See [examples/inference_example.py](examples/inference_example.py) for complete working examples.

### Chat Completions

Create chat completions with streaming support. [Full documentation](docs/api/chat.md)

```python
from bud import BudClient

client = BudClient(api_key="your-api-key")

# Basic chat completion
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    temperature=0.7,
    max_tokens=100,
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Count to 5"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
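When consuming a stream, it is often useful to reassemble the deltas into the full message client-side. A minimal sketch, using stand-in chunk objects shaped like the ones iterated in the streaming example above (only the `choices[0].delta.content` field is assumed):

```python
from types import SimpleNamespace

def collect_stream(stream):
    """Concatenate the delta.content of every chunk into one string."""
    parts = []
    for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:  # a final chunk's delta.content is typically None
            parts.append(content)
    return "".join(parts)

# Stand-in chunks mimicking the shape used in the streaming loop above
fake_stream = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ["1 ", "2 ", "3", None]
]
print(collect_stream(fake_stream))  # → 1 2 3
```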
307+
308+
### Embeddings
309+
310+
Create text, image, or audio embeddings with chunking and caching support. [Full documentation](docs/api/embeddings.md)
311+
312+
```python
313+
# Basic embedding
314+
response = client.embeddings.create(
315+
model="bge-m3",
316+
input="Hello, world!"
317+
)
318+
print(f"Dimensions: {len(response.data[0].embedding)}")
319+
320+
# Batch embeddings
321+
response = client.embeddings.create(
322+
model="bge-m3",
323+
input=["First text", "Second text", "Third text"]
324+
)
325+
326+
# With caching
327+
response = client.embeddings.create(
328+
model="bge-m3",
329+
input="Frequently requested text",
330+
cache_options={"enabled": "on", "max_age_s": 3600}
331+
)
332+
333+
# With chunking for long documents
334+
response = client.embeddings.create(
335+
model="bge-m3",
336+
input="Very long document...",
337+
chunking={"strategy": "sentence", "chunk_size": 512}
338+
)
339+
```
340+
341+
**Embedding Parameters:**
342+
343+
| Parameter | Type | Description |
344+
|-----------|------|-------------|
345+
| `model` | `str` | Model ID (required) |
346+
| `input` | `str \| list[str]` | Text to embed (required) |
347+
| `encoding_format` | `str` | `"float"` or `"base64"` |
348+
| `modality` | `str` | `"text"`, `"image"`, or `"audio"` |
349+
| `dimensions` | `int` | Output dimensions (0 = full) |
350+
| `priority` | `str` | `"high"`, `"normal"`, or `"low"` |
351+
| `include_input` | `bool` | Return original text in response |
352+
| `chunking` | `dict` | Chunking configuration |
353+
| `cache_options` | `dict` | Cache settings |
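With `encoding_format="base64"`, the embedding arrives as a base64 string rather than a list of floats. A minimal decoding sketch, assuming the server packs float32 values little-endian as OpenAI-compatible APIs commonly do (an assumption -- verify against docs/api/embeddings.md):

```python
import base64
import struct

def decode_embedding(b64: str) -> list[float]:
    """Decode a base64-encoded float32 vector into a list of floats."""
    raw = base64.b64decode(b64)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", raw))

# Round-trip demo with a locally packed vector
packed = base64.b64encode(struct.pack("<3f", 0.5, -1.0, 2.0)).decode()
print(decode_embedding(packed))  # → [0.5, -1.0, 2.0]
```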
### Classifications

Classify text using deployed classifier models. [Full documentation](docs/api/classifications.md)

```python
# Single classification
response = client.classifications.create(
    model="finbert",
    input=["The stock market rallied today with strong gains."]
)

for label_score in response.data[0]:
    print(f"{label_score.label}: {label_score.score:.2%}")
# Output: positive: 92.84%, neutral: 5.06%, negative: 2.10%

# Batch classification
response = client.classifications.create(
    model="finbert",
    input=[
        "Company reports record profits.",
        "Market crash leads to losses.",
        "Trading volume steady today."
    ],
    priority="high"
)

for i, result in enumerate(response.data):
    top = max(result, key=lambda x: x.score)
    print(f"Text {i+1}: {top.label} ({top.score:.1%})")
```

**Classification Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `input` | `list[str]` | Texts to classify (required) |
| `model` | `str` | Classifier model ID |
| `raw_scores` | `bool` | Return raw scores vs normalized |
| `priority` | `str` | `"high"`, `"normal"`, or `"low"` |
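When `raw_scores=True` returns unnormalized scores, a probability distribution can be recovered with a softmax. This sketch assumes the raw scores are logits, which is an assumption -- check docs/api/classifications.md for the exact semantics:

```python
import math

def softmax(scores: list[float]) -> list[float]:
    """Convert raw scores (assumed logits) into probabilities summing to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.0])
print([round(p, 3) for p in probs])  # → [0.665, 0.245, 0.09]
```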
### List Models

```python
# List all available models
models = client.models.list()
for model in models.data:
    print(f"{model.id} - {model.owned_by}")

# Get specific model info
model = client.models.retrieve("gpt-4")
```

---

## Pipeline DSL

Define pipelines using Python:
