Releases: cloudera/CAI_AMP_Synthetic_Data_Studio
v1.0.4
Release SDS Version 1.0.4
Key Features
- Configurable Concurrency Parameters
Added max_concurrent_topics parameter to SynthesisRequest (1-100, default: 5)
Added max_workers parameter to EvaluationRequest (1-100, default: 4)
Allows users to optimize performance based on infrastructure
- Service Architecture Refactor
Split synthesis services:
synthesis_service.py - Freeform data generation only
synthesis_legacy_service.py - SFT and Custom Workflow
Split evaluator services:
evaluator_service.py - Freeform evaluation only
evaluator_legacy_service.py - SFT and Custom Workflow evaluation
Improved code maintainability and separation of concerns
Backward compatible - all existing endpoints work as before
This is a minor release that exposes the number of concurrent threads as a configurable parameter in the backend API, intended for advanced users.
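As a rough illustration of the concurrency parameters above, the sketch below shows how a bounded max_concurrent_topics value could cap a thread pool. It is not the project's actual code: SynthesisRequestSketch, generate_for_topic, and run_synthesis are hypothetical names, and only the parameter name, its 1-100 range, and its default of 5 come from these notes.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class SynthesisRequestSketch:
    """Hypothetical stand-in for SynthesisRequest; only the new field is modeled."""

    topics: list[str]
    max_concurrent_topics: int = 5  # range 1-100 and default 5, per the release notes

    def __post_init__(self) -> None:
        # Reject values outside the documented range.
        if not 1 <= self.max_concurrent_topics <= 100:
            raise ValueError("max_concurrent_topics must be between 1 and 100")


def generate_for_topic(topic: str) -> str:
    # Placeholder for the real per-topic generation call (assumption).
    return f"generated:{topic}"


def run_synthesis(request: SynthesisRequestSketch) -> list[str]:
    # The pool size caps how many topics are processed at the same time.
    with ThreadPoolExecutor(max_workers=request.max_concurrent_topics) as pool:
        # Executor.map returns results in the same order as the input topics.
        return list(pool.map(generate_for_topic, request.topics))
```

A lower value reduces pressure on a rate-limited model endpoint; a higher value may shorten wall-clock time when the infrastructure can absorb the load. The same pattern would apply to max_workers on EvaluationRequest (default 4).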
SDS release version 1.0.3
Release Notes
Template-Driven Onboarding
A redesigned home page now features ready-to-use templates with concise descriptions, so new users can generate their first dataset in just a few clicks.
Unified Freeform Workflow
We collapsed multiple paths into a single Freeform experience that covers all previous options—including Supervised Finetuning (for both code-generation and text-to-SQL templates) and Custom Data Augmentation through seed uploads—making project setup faster and less error-prone.
Resilient Large-Data Runs
Long-running jobs now save progress incrementally. If a run is interrupted, the partial output is preserved. Progress on partial data can also be viewed on the home page in the Completed Rows column.
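One way to picture incremental saving is an append-only output file that is flushed after every row, so whatever was written before an interruption stays on disk. The sketch below is an assumption about the general technique, not the SDS implementation; the JSONL format and the names run_with_checkpoints and count_completed_rows are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def run_with_checkpoints(
    topics: Iterable[str],
    generate: Callable[[str], dict],
    out_path: Path,
) -> int:
    """Append each generated row to a JSONL file as soon as it is produced."""
    completed = 0
    with out_path.open("a", encoding="utf-8") as fh:
        for topic in topics:
            row = generate(topic)  # may raise if the run is interrupted
            fh.write(json.dumps(row) + "\n")
            fh.flush()  # hand the row to the OS before starting the next one
            completed += 1
    return completed


def count_completed_rows(out_path: Path) -> int:
    """Count rows saved so far, similar in spirit to a Completed Rows column."""
    if not out_path.exists():
        return 0
    with out_path.open(encoding="utf-8") as fh:
        return sum(1 for line in fh if line.strip())
```

If generation fails partway through, the rows already written remain readable, and count_completed_rows reports how far the run got.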
Unstructured Document Ingestion
Freeform now accepts unstructured documents (PDFs), enabling users to synthesize data directly from internal content with no extra preprocessing steps.
Home-Page Performance Boost
Backend optimizations cut initial data-load times, so the dashboard and template gallery appear noticeably faster, even on slower networks.
SDS Release version 1.0.2
🌟 Key Feature Updates
✅ Seed Instruction Upload via JSON (List Format)
- Users can now upload multiple seed instructions as a list within a JSON file.
- This replaces the earlier manual entry process, significantly reducing time and effort, especially for large or complex datasets.
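These notes do not spell out the exact file schema, so the following is only a sketch that assumes the simplest reading of "list format": a top-level JSON array of instruction strings. The function name load_seed_instructions is hypothetical.

```python
import json


def load_seed_instructions(raw: str) -> list[str]:
    """Parse a JSON document assumed to be a top-level list of instruction strings."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a JSON list of seed instructions")
    seeds = []
    for item in data:
        # Each entry is assumed to be a non-empty string (assumption, not a documented rule).
        if not isinstance(item, str) or not item.strip():
            raise ValueError("each seed instruction must be a non-empty string")
        seeds.append(item.strip())
    return seeds
```

Under that assumption, a file containing `["Write a SQL query for monthly revenue", "Explain a LEFT JOIN"]` would yield two seed instructions.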
Dynamic Rendering of Seed Instructions
- Uploaded JSON seed instructions are automatically rendered in the UI.
- Users can review the pre-filled data immediately after upload.
Flexible Seed Management
- Users retain full control over seed data post-upload.
- They can add new instructions directly from the UI.
Enhanced Prompt Generation with AI Assistance
- New AI Assistant icon added for intelligent prompt generation based on examples and instructions.
- New Restore icon allows reverting to previously generated or edited prompts.
- Provides greater flexibility and encourages iterative prompt refinement.
Schema-aware Prompt Synthesis
- Prompts generated by AI are aligned with the schema of uploaded examples.
- Ensures better contextual relevance and structural consistency of the output.
Stability Improvements and Bug Fixes
- Fixed intermittent errors during example uploads.
- Enhancements result in improved reliability and a smoother user experience.