Releases: databricks-solutions/project-0xfffff
v1.8.0 - Judge Tuning UI Improvements & MLflow Sync Fixes
Features
Auto-Evaluation Feature
LLM judge evaluation runs automatically in background when annotation phase begins
Auto-derives judge prompts from rubric questions
UI displays evaluation results and supports re-evaluation after alignment
Stores auto_evaluation_model to ensure re-evaluation uses same model
Binary Scale Support
Fixed annotation phase UI to show Pass/Fail instead of star ratings for binary rubrics
Fixed MLflow log_feedback calls to log 0/1 for binary rubrics instead of always logging 3
Parse per-question judge_type from rubric questions for correct type detection
Fixed re-evaluation to use correct judge type from rubric
Re-Evaluation Capabilities
Re-evaluate now loads registered judge with episodic/semantic memory from alignment
Fixed prompt version selection (V2 scores now correctly stored against V2)
Fixed spinner to stop when re-evaluation job completes
This enables direct comparison of the pre-alignment and post-alignment judges
UI/UX Improvements
Fixed SME vs Participant annotation labels using user_role field
Removed verbose "No raw text response" warnings from logs
Added GPT-5.2 and Claude Opus 4.5 to model dropdowns
v1.7.3
What's New in v1.7.3
Installation
Download project-with-build.zip with pre-built files.
New Features
- Time-based SQLite Backup: Changed SQLite UC volume backup from write-count-based (every 50 writes) to time-based (every 10 minutes by default)
  - New environment variable: `SQLITE_BACKUP_INTERVAL_MINUTES` (default: 10, set to 0 to disable)
  - Background timer runs in a daemon thread, starts on app startup, stops gracefully on shutdown
  - Replaces the deprecated `SQLITE_BACKUP_AFTER_OPS` environment variable
Improvements
- TraceViewer UI Improvements: Enhanced trace viewing experience
- Concurrent Annotation Tests: Added comprehensive test suite for concurrent annotation scenarios
- E2E Tests: Added e2e test scenarios for the last trace in the annotation phase
Technical Changes
- Added `start_backup_timer()` and `stop_backup_timer()` functions to sqlite_rescue module
- `record_write_operation()` is now a no-op (kept for backward compatibility)
- Updated `get_rescue_status()` to return `backup_interval_minutes` and `backup_timer_running`
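The timer behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: it reuses the function names from the list for readability, but the signatures, the `backup_fn` callback, and the `backup_interval_seconds()` helper are assumptions.

```python
import os
import threading

_stop_event = threading.Event()
_timer_thread = None


def backup_interval_seconds(env=os.environ):
    """Read SQLITE_BACKUP_INTERVAL_MINUTES (default 10, 0 disables) as seconds."""
    minutes = float(env.get("SQLITE_BACKUP_INTERVAL_MINUTES", "10"))
    return max(minutes, 0.0) * 60.0


def start_backup_timer(backup_fn, interval_seconds):
    """Call backup_fn every interval_seconds from a daemon thread.

    backup_fn is a hypothetical callback that performs one backup.
    Returns False and starts nothing when the interval is 0 (disabled).
    """
    global _timer_thread
    if interval_seconds <= 0:
        return False
    if _timer_thread is not None and _timer_thread.is_alive():
        return True  # already running

    _stop_event.clear()

    def _loop():
        # Event.wait returns False on timeout and True once stop is requested.
        while not _stop_event.wait(interval_seconds):
            backup_fn()

    _timer_thread = threading.Thread(target=_loop, daemon=True, name="sqlite-backup")
    _timer_thread.start()
    return True


def stop_backup_timer(timeout=5.0):
    """Ask the loop to exit and wait briefly for the thread to finish."""
    global _timer_thread
    _stop_event.set()
    if _timer_thread is not None:
        _timer_thread.join(timeout)
        _timer_thread = None
```

Waiting on a `threading.Event` rather than sleeping lets shutdown interrupt the interval immediately, which is one way to get the graceful stop the notes mention.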
Configuration
env:
  - name: SQLITE_VOLUME_PATH
    valueFrom: db_backup_volume
  - name: SQLITE_BACKUP_INTERVAL_MINUTES
    value: "10" # Backup every 10 minutes (default: 10, 0 to disable)
v1.7.2 - Multi-judge MLflow sync, improved JSON viewer
What's New
Installation
Download project-with-build.zip which includes pre-built frontend assets.
Features
- Multi-judge MLflow sync - Background processing for faster rubric operations
- Rubric CRUD improvements - Instant create/delete (bypasses slow proxy)
- Annotation validation - SMEs must rate all criteria before clicking Next
- Binary scale display - Results Review shows Pass/Fail (0/1) for binary questions
- SmartJsonRenderer - Handles arbitrary JSON schemas with key labels and value blocks
Fixes
- JSON parsing for strings with unescaped newlines
- Race conditions in rubric operations removed
- MLflow resync now non-blocking
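The release does not say how the unescaped-newline parsing fix was implemented. As a rough illustration of the problem, the sketch below uses Python's standard `json` module, whose `strict=False` option accepts raw control characters inside strings. The `parse_lenient` helper is hypothetical.

```python
import json


def parse_lenient(raw):
    """Parse JSON, falling back to a mode that tolerates raw control
    characters (such as literal newlines) inside string values."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # strict=False allows characters in the 0-31 range inside strings.
        return json.loads(raw, strict=False)
```

Trying the strict parse first keeps well-formed input on the normal path and only relaxes the rules when needed.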
Files Changed
- 13 files changed, 1012 insertions(+), 310 deletions(-)
v1.7.1 - SQLite Rescue & Databricks Apps Improvements
Installation
Download project-with-build.zip which includes pre-built frontend assets.
New Features
SQLite Rescue Module (`server/sqlite_rescue.py`)
Comprehensive backup/restore system using Databricks SDK Files API for Unity Catalog volume operations:
- Startup restore from Unity Catalog volumes
- Shutdown backup for data persistence
- Periodic background backups based on write operation counts
Databricks Apps Integration (`app.yaml`)
Updated deployment configuration with:
- Environment variables for SQLite rescue
- Auto-app creation support
- Root Node.js build support
Improvements
Build Pipeline
- Added root `package.json` for Databricks Apps Node.js build environment
- Optimized npm install execution in the ui-build step
Deployment Automation (`justfile`)
- Updated deploy recipe to use Databricks CLI directly instead of custom scripts
Documentation (`specs/BUILD_AND_DEPLOY_SPEC.md`)
- Added comprehensive Databricks Apps deployment section
- Authentication configuration guide
- SQLite rescue configuration details
- Troubleshooting guide
v1.7.0 - Smart JSON Renderer & UI Improvements
Installation
Download project-with-build.zip which includes pre-built frontend assets.
Features
- Smart JSON renderer for trace viewer - auto-detects and formats markdown, URLs, nested JSON
- Collapsible sections for complex JSON data with expand/collapse
- Clickable URLs in trace display
- Improved rubric description formatting with bullet points (splits on ' - ')
- Expand/collapse for long rubric descriptions (shows first 2 items)
- Newline preservation in rubric text
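The rubric description formatting above (split on ' - ', show the first 2 items when collapsed) can be sketched as below. This is an illustrative Python version with hypothetical helper names. The actual UI code is not shown in these notes and may differ.

```python
def split_rubric_description(text, separator=" - "):
    """Split a rubric description into bullet items on the separator."""
    return [part.strip() for part in text.split(separator) if part.strip()]


def preview_items(items, limit=2):
    """Return the items shown when collapsed and the count of hidden items."""
    return items[:limit], max(len(items) - limit, 0)
```

Because the separator includes the surrounding spaces, hyphenated words such as "well-formed" are not split.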
Fixes
- Top-level JSON fields now expanded by default for better visibility
- E2E test updates for smart JSON renderer
Notes
- JSONPath is optional - smart renderer works without configuration
- If JSONPath is configured, it extracts specific values before rendering
v1.6.3 - Save Reliability & UI Fixes
Installation
Download project-with-build.zip which includes pre-built frontend assets.
Changes
Save Reliability Improvements
- Add failed save queue with auto-retry for annotation phase
- Add failed save queue with auto-retry for discovery phase
- Navigation debouncing (300ms) to prevent rapid clicks overwhelming backend
- Visual indicator showing pending saves with manual retry button
- Beforeunload warning when there are unsaved annotations/findings
UI Fixes
- Fix rubric question delete to update UI immediately without refresh
- Restore error handling in annotation submission endpoint
Technical Details
- Auto-retry runs every 5 seconds with up to 10 attempts
- Failed saves are queued and retried automatically
- Users can manually trigger retry by clicking the pending saves indicator
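The retry behaviour described above can be sketched as a small queue. This is a simplified Python illustration, not the project's implementation: the class and method names are hypothetical, and the timer that would call `retry_once` every 5 seconds is left out.

```python
from dataclasses import dataclass, field

RETRY_INTERVAL_SECONDS = 5  # from the release notes; the scheduler is not shown
MAX_ATTEMPTS = 10           # from the release notes


@dataclass
class PendingSave:
    payload: dict
    attempts: int = 0


@dataclass
class FailedSaveQueue:
    pending: list = field(default_factory=list)
    abandoned: list = field(default_factory=list)

    def enqueue(self, payload):
        """Queue a save that failed on its first try."""
        self.pending.append(PendingSave(payload))

    def retry_once(self, save_fn):
        """Try every pending save once; return how many are still pending."""
        still_pending = []
        for item in self.pending:
            item.attempts += 1
            try:
                save_fn(item.payload)
            except Exception:
                if item.attempts >= MAX_ATTEMPTS:
                    self.abandoned.append(item)  # give up after the cap
                else:
                    still_pending.append(item)
        self.pending = still_pending
        return len(self.pending)
```

A manual retry, like clicking the pending saves indicator, would simply call `retry_once` outside the regular schedule.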
v1.6.2 - Patch Release
v1.6.2 Patch Release
Changes
- Bug fixes and improvements
- Updated build artifacts
Assets
project-with-build.zip - Full project with pre-built frontend assets
v1.6.1 - Patch Release
v1.6.1 Patch Release
Changes
- Minor bug fixes and improvements
- Updated build artifacts
Assets
project-with-build.zip - Full project with pre-built frontend assets
v1.6.0 - JSONPath TraceViewer & Spec Coverage
What's New in v1.6.0
Installation
Download project-with-build.zip which includes pre-built frontend assets.
Features
- JSONPath Display Customization for TraceViewer: Configurable JSON path display for trace input/output
- Automated Spec Coverage Tracking: Framework-specific markers for test coverage
- Spec-Driven Development Guardrails: Coding agent support with structured skills
Improvements
- Builder updates for withRealApi and test on annotation traces
- Tag all tests with spec coverage markers
Full Changelog
See commit history for detailed changes.
v1.5.0: Coding Specs, Multi-Workshop Support, CSV Upload into MLflow, Multiple Judge Tuning
🎉 Release v1.5.0
📦 Installation
Download project-with-build.zip which includes pre-built frontend assets.
Coding agent support
- Structured specs definition: better support for Claude/AI coding assistants to understand patterns
- Enables AI coding agents to quickly find relevant documentation when working on specific features
Toggle Randomization:
- Facilitators can turn on trace randomization in the discovery and annotation phases so that participants/SMEs see the same set of traces in a different order
- Default setting is OFF.
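One way to give every participant the same set of traces in a different order is a per-user seeded shuffle. The sketch below assumes that approach. The release does not say how the order is actually derived, and `ordered_traces` is a hypothetical helper.

```python
import random


def ordered_traces(trace_ids, user_id, randomize=False):
    """Return the traces in the order a given user should see them.

    With randomize off (the default), the original order is kept.
    With it on, the order is shuffled with a generator seeded by the
    user id, so each user gets a stable, user-specific order.
    """
    if not randomize:
        return list(trace_ids)
    shuffled = list(trace_ids)  # copy so the input is not mutated
    random.Random(str(user_id)).shuffle(shuffled)
    return shuffled
```

Seeding by user keeps the order stable across page reloads without storing it anywhere.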
🏢 Multi-Workshop Support
- Isolated Workshop Data: One facilitator can create separate workshops with isolated data
- Workshop Selection: Participants/SMEs can select which workshop to join from login page
- Persistent Storage: All workshop data saved in the same database with proper isolation
📤 CSV Upload Improvements
- Dual Upload Options: Choose between importing directly into Discovery or logging to MLflow as traces
- MLflow Config Persistence: Databricks host, token, and experiment ID remembered in localStorage
- Quote Cleanup: Automatic removal of double quotes from imported CSV and MLflow trace content
- Separate Paths: CSV-to-MLflow and CSV-to-Discovery are now independent operations
⚖️ Judge Tuning Enhancements
- Per-Question Persistence: Judge evaluation results saved per rubric question
- Automatic Restore: Switch between judges without losing evaluation data (24-hour retention)
- Explicit Judge Type: Frontend passes judge_type to backend for accurate MLflow evaluation
🔄 Phase Reset Features
- Reset Annotation: New button for facilitators to reconfigure annotation phase
- Compact Toggles: Randomization toggles streamlined to single-line format
- Reset Discovery: Improved with comprehensive data clearing
🔧 Bug Fixes & Improvements
- SQLite Concurrency: WAL mode enabled on every connection for reliable multi-user access
- Last Trace Save Fix: Complete button now properly saves final trace/annotation
- Enhanced Retry Logic: Automatic 3-retry mechanism for database saves
- Progress Tracking: Fixed issue where last trace wasn't marked as completed
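The SQLite concurrency and retry items above can be sketched with Python's standard `sqlite3` module. This is an illustration under assumptions: the helper names are hypothetical, "3-retry" is read here as three total attempts, and the backoff delay is invented for the example.

```python
import sqlite3
import time


def connect(db_path):
    """Open a connection and enable WAL journaling on it."""
    conn = sqlite3.connect(db_path, timeout=5.0)
    conn.execute("PRAGMA journal_mode=WAL")
    return conn


def save_with_retry(db_path, sql, params=(), retries=3, delay=0.1):
    """Run one write statement, retrying on sqlite3.OperationalError.

    Returns the attempt number that succeeded; re-raises the last
    error once all attempts are used up.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            conn = connect(db_path)
            try:
                with conn:  # commits on success, rolls back on error
                    conn.execute(sql, params)
                return attempt
            finally:
                conn.close()
        except sqlite3.OperationalError as exc:
            last_error = exc
            if attempt < retries:
                time.sleep(delay * attempt)  # simple linear backoff
    raise last_error
```

WAL mode is stored in the database file once set, so issuing the pragma on every connection mainly acts as a safeguard for freshly created or restored files.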