Releases: databricks-solutions/project-0xfffff

v1.8.0 - Judge Tuning UI Improvements & MLflow Sync Fixes

03 Feb 01:11
v1.8.0
efa260f

Features

Auto-Evaluation Feature

LLM judge evaluation runs automatically in background when annotation phase begins
Auto-derives judge prompts from rubric questions
UI displays evaluation results and supports re-evaluation after alignment
Stores auto_evaluation_model to ensure re-evaluation uses same model

Binary Scale Support

Fixed annotation phase UI to show Pass/Fail instead of star ratings for binary rubrics
Fixed log_feedback so that MLflow receives 0/1 for binary rubrics instead of always receiving 3
Parse per-question judge_type from rubric questions for correct type detection
Fixed re-evaluation to use correct judge type from rubric
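
As an illustration of the binary logging fix above, the sketch below maps an annotation to the numeric value that would be logged as feedback. The function name `feedback_value`, the `"binary"` judge type string, and the accepted pass labels are all assumptions made for this example; the release notes do not show the actual implementation.

```python
def feedback_value(judge_type, rating):
    """Map an annotation to the numeric value logged as feedback (hypothetical helper)."""
    if judge_type == "binary":
        # Pass/Fail rubric: log 1 for a pass and 0 for anything else.
        return 1 if rating in (True, 1, "pass", "Pass") else 0
    # Any other rubric type (for example a star rating): log the rating itself.
    return int(rating)
```

The point of the branch is that the value depends on the per-question judge type rather than on a fixed default.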

Re-Evaluation Capabilities

Re-evaluate now loads registered judge with episodic/semantic memory from alignment
Fixed prompt version selection (V2 scores now correctly stored against V2)
Fixed spinner to stop when re-evaluation job completes
Enables direct comparison of the judge before and after alignment

UI/UX Improvements

Fixed SME vs Participant annotation labels using user_role field
Removed verbose "No raw text response" warnings from logs
Added GPT-5.2 and Claude Opus 4.5 to model dropdowns

v1.7.3

02 Feb 15:19
v1.7.3
20cb61e

What's New in v1.7.3

Installation

Download project-with-build.zip with pre-built files.

New Features

  • Time-based SQLite Backup: Changed SQLite UC volume backup from write-count-based (every 50 writes) to time-based (every 10 minutes by default)
    • New environment variable: SQLITE_BACKUP_INTERVAL_MINUTES (default: 10, set to 0 to disable)
    • Background timer runs in daemon thread, starts on app startup, stops gracefully on shutdown
    • Replaces the deprecated SQLITE_BACKUP_AFTER_OPS environment variable

Improvements

  • TraceViewer UI Improvements: Enhanced trace viewing experience
  • Concurrent Annotation Tests: Added comprehensive test suite for concurrent annotation scenarios
  • E2E Tests: Added end-to-end test scenarios for annotating the last trace

Technical Changes

  • Added start_backup_timer() and stop_backup_timer() functions to sqlite_rescue module
  • record_write_operation() is now a no-op (kept for backward compatibility)
  • Updated get_rescue_status() to return backup_interval_minutes and backup_timer_running
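
The time-based backup described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the function names and the environment variable come from the release notes, but the signatures, the `backup_fn` callback, and the `interval_minutes` override are assumptions made for the example.

```python
import os
import threading

_stop_event = threading.Event()
_timer_thread = None


def backup_interval_minutes():
    """Read the interval from the environment (default 10; 0 disables backups)."""
    return float(os.environ.get("SQLITE_BACKUP_INTERVAL_MINUTES", "10"))


def start_backup_timer(backup_fn, interval_minutes=None):
    """Start a daemon thread that calls backup_fn once per interval.

    backup_fn and interval_minutes are hypothetical parameters for this sketch.
    Returns False if backups are disabled or a timer is already running.
    """
    global _timer_thread
    minutes = backup_interval_minutes() if interval_minutes is None else interval_minutes
    if minutes <= 0 or (_timer_thread is not None and _timer_thread.is_alive()):
        return False
    _stop_event.clear()

    def _loop():
        # Event.wait returns False on timeout (run a backup) and True once
        # the stop event is set (leave the loop).
        while not _stop_event.wait(minutes * 60):
            backup_fn()

    _timer_thread = threading.Thread(target=_loop, daemon=True, name="sqlite-backup")
    _timer_thread.start()
    return True


def stop_backup_timer(timeout=5.0):
    """Signal the timer thread to stop and wait briefly for it to exit."""
    global _timer_thread
    _stop_event.set()
    if _timer_thread is not None:
        _timer_thread.join(timeout)
    _timer_thread = None
```

Waiting on an event rather than sleeping lets shutdown interrupt the wait immediately, which is one way to get the graceful stop the notes mention.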

Configuration

env:
  - name: SQLITE_VOLUME_PATH
    valueFrom: db_backup_volume
  - name: SQLITE_BACKUP_INTERVAL_MINUTES
    value: "10"  # Backup every 10 minutes (default: 10, 0 to disable)

v1.7.2 - Multi-judge MLflow sync, improved JSON viewer

30 Jan 01:53
v1.7.2
1ac017a

What's New

Installation

Download project-with-build.zip which includes pre-built frontend assets.

Features

  • Multi-judge MLflow sync - Background processing for faster rubric operations
  • Rubric CRUD improvements - Instant create/delete (bypasses slow proxy)
  • Annotation validation - SMEs must rate all criteria before clicking Next
  • Binary scale display - Results Review shows Pass/Fail (0/1) for binary questions
  • SmartJsonRenderer - Handles arbitrary JSON schemas with key labels and value blocks

Fixes

  • JSON parsing for strings with unescaped newlines
  • Race conditions in rubric operations removed
  • MLflow resync now non-blocking
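
For the unescaped-newline fix listed above, one way to tolerate such strings in Python is the `strict=False` option of the standard `json` module, which accepts control characters inside string values. This sketch is only an assumption about how lenient parsing could be done; the notes do not say where or how the project implements the fix, and `parse_lenient` is a hypothetical name.

```python
import json


def parse_lenient(raw):
    """Parse JSON, falling back to non-strict mode for raw newlines in strings."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # strict=False allows control characters such as a literal newline
        # inside a JSON string value.
        return json.loads(raw, strict=False)
```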

Files Changed

  • 13 files changed, 1012 insertions(+), 310 deletions(-)

v1.7.1 - SQLite Rescue & Databricks Apps Improvements

27 Jan 20:16
v1.7.1
282f0ad

Installation

Download project-with-build.zip which includes pre-built frontend assets.

New Features

SQLite Rescue Module (`server/sqlite_rescue.py`)

Comprehensive backup/restore system using Databricks SDK Files API for Unity Catalog volume operations:

  • Startup restore from Unity Catalog volumes
  • Shutdown backup for data persistence
  • Periodic background backups based on write operation counts

Databricks Apps Integration (`app.yaml`)

Updated deployment configuration with:

  • Environment variables for SQLite rescue
  • Auto-app creation support
  • Root Node.js build support

Improvements

Build Pipeline

  • Added root `package.json` for Databricks Apps Node.js build environment
  • Optimized npm install execution in the ui-build step

Deployment Automation (`justfile`)

  • Updated deploy recipe to use Databricks CLI directly instead of custom scripts

Documentation (`specs/BUILD_AND_DEPLOY_SPEC.md`)

  • Added comprehensive Databricks Apps deployment section
  • Authentication configuration guide
  • SQLite rescue configuration details
  • Troubleshooting guide

v1.7.0 - Smart JSON Renderer & UI Improvements

27 Jan 19:55
v1.7.0
05fb6fd

Installation

Download project-with-build.zip which includes pre-built frontend assets.

Features

  • Smart JSON renderer for trace viewer - auto-detects and formats markdown, URLs, nested JSON
  • Collapsible sections for complex JSON data with expand/collapse
  • Clickable URLs in trace display
  • Improved rubric description formatting with bullet points (splits on ' - ')
  • Expand/collapse for long rubric descriptions (shows first 2 items)
  • Newline preservation in rubric text
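
The rubric formatting rules above (split on ' - ', show the first 2 items when collapsed) can be sketched as below. The function names are hypothetical, and the logic is written in Python purely for illustration; the actual rendering presumably lives in the frontend.

```python
def split_rubric_description(text):
    """Split a rubric description into bullet items on the ' - ' separator."""
    return [part.strip() for part in text.split(" - ") if part.strip()]


def collapsed_preview(text, limit=2):
    """Return the items shown while collapsed and the count of hidden items."""
    items = split_rubric_description(text)
    return items[:limit], max(0, len(items) - limit)
```

A description such as "Accurate - Concise - Polite" would show two bullets when collapsed, with one more behind the expand control.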

Fixes

  • Top-level JSON fields now expanded by default for better visibility
  • E2E test updates for smart JSON renderer

Notes

  • JSONPath is optional - smart renderer works without configuration
  • If JSONPath is configured, it extracts specific values before rendering

v1.6.3 - Save Reliability & UI Fixes

27 Jan 02:19
v1.6.3
88684b4

Installation

Download project-with-build.zip which includes pre-built frontend assets.

Changes

Save Reliability Improvements

  • Add failed save queue with auto-retry for annotation phase
  • Add failed save queue with auto-retry for discovery phase
  • Navigation debouncing (300ms) to prevent rapid clicks overwhelming backend
  • Visual indicator showing pending saves with manual retry button
  • Beforeunload warning when there are unsaved annotations/findings

UI Fixes

  • Fix rubric question delete to update UI immediately without refresh
  • Restore error handling in annotation submission endpoint

Technical Details

  • Auto-retry runs every 5 seconds with up to 10 attempts
  • Failed saves are queued and retried automatically
  • Users can manually trigger retry by clicking the pending saves indicator
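
The retry behavior described above can be modeled as a small queue. The sketch below is an assumption-laden illustration in Python: the class and method names are hypothetical, and scheduling (calling `retry_pending` every 5 seconds or on a manual click) is left to the caller.

```python
from collections import deque
from dataclasses import dataclass

RETRY_INTERVAL_SECONDS = 5  # how often a caller would invoke retry_pending()
MAX_ATTEMPTS = 10


@dataclass
class PendingSave:
    payload: dict
    attempts: int = 0


class FailedSaveQueue:
    """Holds saves that failed and retries them until they succeed or give up."""

    def __init__(self, save_fn, max_attempts=MAX_ATTEMPTS):
        self.save_fn = save_fn          # hypothetical callable that performs the save
        self.max_attempts = max_attempts
        self.pending = deque()
        self.dropped = []               # saves that exhausted all attempts

    def enqueue(self, payload):
        self.pending.append(PendingSave(payload))

    def retry_pending(self):
        """Run one retry pass over the queue; return how many saves remain."""
        for _ in range(len(self.pending)):
            item = self.pending.popleft()
            item.attempts += 1
            try:
                self.save_fn(item.payload)
            except Exception:
                if item.attempts < self.max_attempts:
                    self.pending.append(item)
                else:
                    self.dropped.append(item)
        return len(self.pending)
```

The return value of `retry_pending` is the kind of count a pending-saves indicator could display.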

v1.6.2 - Patch Release

21 Jan 21:45
8c8193b

v1.6.2 Patch Release

Changes

  • Bug fixes and improvements
  • Updated build artifacts

Assets

  • project-with-build.zip - Full project with pre-built frontend assets

v1.6.1 - Patch Release

21 Jan 20:01
e655822

v1.6.1 Patch Release

Changes

  • Minor bug fixes and improvements
  • Updated build artifacts

Assets

  • project-with-build.zip - Full project with pre-built frontend assets

v1.6.0 - JSONPath TraceViewer & Spec Coverage

21 Jan 15:14
25e0e95

What's New in v1.6.0

Installation

Download project-with-build.zip which includes pre-built frontend assets.

Features

  • JSONPath Display Customization for TraceViewer: Configurable JSON path display for trace input/output
  • Automated Spec Coverage Tracking: Framework-specific markers for test coverage
  • Spec-Driven Development Guardrails: Coding agent support with structured skills

Improvements

  • Builder updates for withRealApi and test on annotation traces
  • Tag all tests with spec coverage markers

Full Changelog

See commit history for detailed changes.

v1.5.0 - Coding Specs, Multi-Workshop Support, CSV Upload into MLflow, Multiple Judge Tuning

17 Jan 14:53
v1.5.0
a01c4f5

🎉 Release v1.5.0

📦 Installation

Download project-with-build.zip which includes pre-built frontend assets.

Coding agent support

  • Structured spec definitions: help Claude and other AI coding assistants understand project patterns
  • Faster documentation lookup: AI coding agents can quickly find relevant documentation when working on specific features

Toggle Randomization

  • Facilitators can turn on trace randomization in the discovery and annotation phases so that participants/SMEs see the same set of traces in different orders
  • Default setting is OFF.

🏢 Multi-Workshop Support

  • Isolated Workshop Data: One facilitator can create separate workshops with isolated data
  • Workshop Selection: Participants/SMEs can select which workshop to join from login page
  • Persistent Storage: All workshop data saved in the same database with proper isolation

📤 CSV Upload Improvements

  • Dual Upload Options: Choose between importing directly into Discovery or logging to MLflow as traces
  • MLflow Config Persistence: Databricks host, token, and experiment ID remembered in localStorage
  • Quote Cleanup: Automatic removal of double quotes from imported CSV and MLflow trace content
  • Separate Paths: CSV-to-MLflow and CSV-to-Discovery are now independent operations

⚖️ Judge Tuning Enhancements

  • Per-Question Persistence: Judge evaluation results saved per rubric question
  • Automatic Restore: Switch between judges without losing evaluation data (24-hour retention)
  • Explicit Judge Type: Frontend passes judge_type to backend for accurate MLflow evaluation

🔄 Phase Reset Features

  • Reset Annotation: New button for facilitators to reconfigure annotation phase
  • Compact Toggles: Randomization toggles streamlined to single-line format
  • Reset Discovery: Improved with comprehensive data clearing

🔧 Bug Fixes & Improvements

  • SQLite Concurrency: WAL mode enabled on every connection for reliable multi-user access
  • Last Trace Save Fix: Complete button now properly saves final trace/annotation
  • Enhanced Retry Logic: Automatic 3-retry mechanism for database saves
  • Progress Tracking: Fixed issue where last trace wasn't marked as completed
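
The SQLite concurrency and retry items above can be illustrated with the standard `sqlite3` module. This is a sketch under assumptions: the helper names, the 30-second busy timeout, the backoff delay, and the reading of "3-retry" as three attempts in total are all choices made for the example, not details from the release.

```python
import sqlite3
import time


def connect(path):
    """Open a connection and enable WAL mode on it."""
    conn = sqlite3.connect(path, timeout=30)
    # WAL lets readers proceed while a writer is active.
    conn.execute("PRAGMA journal_mode=WAL")
    return conn


def save_with_retry(path, sql, params=(), retries=3, delay=0.1):
    """Execute a write, retrying on OperationalError (for example a locked database).

    Returns the attempt number that succeeded; re-raises after the last failure.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            conn = connect(path)
            try:
                with conn:  # commits on success, rolls back on error
                    conn.execute(sql, params)
                return attempt
            finally:
                conn.close()
        except sqlite3.OperationalError as exc:
            last_error = exc
            time.sleep(delay * attempt)  # simple linear backoff
    raise last_error
```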