Official Python SDK for the BudAI Foundry Platform. Build, manage, and execute DAG pipelines.

## Features

- **Python SDK** - Full-featured client library for the BudAI Foundry API
- **OpenAI-Compatible Inference** - Chat completions, embeddings, and classifications
- **CLI Tool** - Command-line interface for pipeline operations
- **Pipeline DSL** - Pythonic way to define DAG pipelines
- **Async Support** - Both sync and async clients available
- **Type Safety** - Full type hints and Pydantic models

## Documentation

- [Quick Start Guide](docs/quickstart.md)
- [Configuration & Authentication](docs/configuration.md)
- **API Reference**
  - [Chat Completions](docs/api/chat.md)
  - [Embeddings](docs/api/embeddings.md)
  - [Classifications](docs/api/classifications.md)
  - [Models](docs/api/models.md)

## Installation

```bash
action = client.actions.get("log")
print(f"Parameters: {action.params}")
```

---

## Inference API

The SDK provides OpenAI-compatible inference endpoints for chat, embeddings, and classifications.

> See [examples/inference_example.py](examples/inference_example.py) for complete working examples.

### Chat Completions

Create chat completions, with optional streaming. [Full documentation](docs/api/chat.md)

```python
from bud import BudClient

client = BudClient(api_key="your-api-key")

# Basic chat completion
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    temperature=0.7,
    max_tokens=100,
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Count to 5"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

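The streaming loop only prints non-empty deltas; assembling them into a full message uses the same check. A self-contained sketch of that accumulation logic, with `types.SimpleNamespace` stand-ins for the SDK's chunk objects (the chunk shape is inferred from the loop above; the data is mock, no network involved):

```python
from types import SimpleNamespace


def make_chunk(text):
    """Build a mock streaming chunk shaped like chunk.choices[0].delta.content."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# A finished stream typically ends with an empty delta.
stream = [make_chunk("1 "), make_chunk("2 "), make_chunk("3"), make_chunk(None)]

parts = []
for chunk in stream:
    if chunk.choices[0].delta.content:
        parts.append(chunk.choices[0].delta.content)

full_text = "".join(parts)
print(full_text)  # 1 2 3
```
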
### Embeddings

Create text, image, or audio embeddings with chunking and caching support. [Full documentation](docs/api/embeddings.md)

```python
# Basic embedding
response = client.embeddings.create(
    model="bge-m3",
    input="Hello, world!"
)
print(f"Dimensions: {len(response.data[0].embedding)}")

# Batch embeddings
response = client.embeddings.create(
    model="bge-m3",
    input=["First text", "Second text", "Third text"]
)

# With caching
response = client.embeddings.create(
    model="bge-m3",
    input="Frequently requested text",
    cache_options={"enabled": "on", "max_age_s": 3600}
)

# With chunking for long documents
response = client.embeddings.create(
    model="bge-m3",
    input="Very long document...",
    chunking={"strategy": "sentence", "chunk_size": 512}
)
```

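The `chunking` option splits long input server-side before embedding. As a rough local illustration of what a sentence strategy with a character budget might do (this sketch is not the service's actual algorithm, and the naive period split is for demonstration only):

```python
def chunk_sentences(text, chunk_size):
    """Greedy sentence packing: add whole sentences until chunk_size chars."""
    sentences = [s.strip() + "." for s in text.split(".") if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > chunk_size:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


doc = "First sentence here. Second one follows. A third wraps things up."
chunks = chunk_sentences(doc, 40)
print(chunks)
```

Each chunk then becomes one embedding vector in the response, in order.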
**Embedding Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `model` | `str` | Model ID (required) |
| `input` | `str \| list[str]` | Text to embed (required) |
| `encoding_format` | `str` | `"float"` or `"base64"` |
| `modality` | `str` | `"text"`, `"image"`, or `"audio"` |
| `dimensions` | `int` | Output dimensions (0 = full) |
| `priority` | `str` | `"high"`, `"normal"`, or `"low"` |
| `include_input` | `bool` | Return original text in response |
| `chunking` | `dict` | Chunking configuration |
| `cache_options` | `dict` | Cache settings |

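A typical next step with batch embeddings is comparing them. A minimal cosine-similarity helper, with hard-coded vectors standing in for `response.data[i].embedding`:

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Stand-ins for embeddings returned by a batch call.
vec_a = [0.1, 0.3, 0.5]
vec_b = [0.2, 0.1, 0.4]
print(f"similarity: {cosine_similarity(vec_a, vec_b):.4f}")  # similarity: 0.9221
```

Real `bge-m3` vectors are much longer, but the computation is identical per pair.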
### Classifications

Classify text using deployed classifier models. [Full documentation](docs/api/classifications.md)

```python
# Single classification
response = client.classifications.create(
    model="finbert",
    input=["The stock market rallied today with strong gains."]
)

for label_score in response.data[0]:
    print(f"{label_score.label}: {label_score.score:.2%}")
# Output: positive: 92.84%, neutral: 5.06%, negative: 2.10%

# Batch classification
response = client.classifications.create(
    model="finbert",
    input=[
        "Company reports record profits.",
        "Market crash leads to losses.",
        "Trading volume steady today."
    ],
    priority="high"
)

for i, result in enumerate(response.data):
    top = max(result, key=lambda x: x.score)
    print(f"Text {i+1}: {top.label} ({top.score:.1%})")
```

**Classification Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `input` | `list[str]` | Texts to classify (required) |
| `model` | `str` | Classifier model ID |
| `raw_scores` | `bool` | Return raw scores instead of normalized |
| `priority` | `str` | `"high"`, `"normal"`, or `"low"` |

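With `raw_scores` off, the scores in the earlier examples sum to 100%, which is consistent with a softmax normalization. A sketch of that conversion (which normalization the service actually applies is an assumption here, not stated by the API):

```python
import math


def softmax(scores):
    """Convert raw classifier logits into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


raw = [2.1, -0.4, 0.3]  # illustrative logits for three labels
probs = softmax(raw)
print([round(p, 3) for p in probs])  # [0.802, 0.066, 0.133]
```
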
### List Models

```python
# List all available models
models = client.models.list()
for model in models.data:
    print(f"{model.id} - {model.owned_by}")

# Get specific model info
model = client.models.retrieve("gpt-4")
```

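A common follow-up to `client.models.list()` is grouping the listing client-side. A sketch using mock entries with the same `id` and `owned_by` fields printed above (real model objects may carry more fields; the owner names are invented):

```python
from types import SimpleNamespace

# Mock entries shaped like the model objects in the listing loop above.
models = [
    SimpleNamespace(id="gpt-4", owned_by="org-a"),
    SimpleNamespace(id="bge-m3", owned_by="org-b"),
    SimpleNamespace(id="finbert", owned_by="org-b"),
]

by_owner = {}
for model in models:
    by_owner.setdefault(model.owned_by, []).append(model.id)

print(by_owner)  # {'org-a': ['gpt-4'], 'org-b': ['bge-m3', 'finbert']}
```
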
---

## Pipeline DSL

Define pipelines using Python: