@ditadi ditadi commented Dec 9, 2025

Persistent Cache with Lakebase Support

Summary

This PR introduces a persistent caching layer for Databricks AppKit using Lakebase (PostgreSQL-compatible) as the storage backend, with automatic fallback to in-memory caching when persistent storage is unavailable.

Key Changes

  • New LakebaseConnector: PostgreSQL connector with connection pooling, credential rotation, and automatic retry logic
  • Refactored CacheManager: Pluggable storage backends supporting both in-memory and persistent (Lakebase) storage
  • Full OpenTelemetry integration: Observability for cache hits/misses and database queries
  • Comprehensive test coverage: more than 2,500 lines of new tests

Features

Cache Manager

  • Singleton pattern with getInstanceSync() for easy access after AppKit initialization
  • getOrExecute() pattern for cache-aside with automatic deduplication of concurrent requests
  • Configurable TTL and LRU eviction for both storage backends
  • Probabilistic cleanup with rate limiting to prevent database overload
  • Graceful degradation: falls back to in-memory storage when Lakebase is unavailable
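
To illustrate the getOrExecute() bullet above, here is a minimal sketch of cache-aside with deduplication of concurrent requests. The class name CacheAsideSketch, the method signature, and the in-memory Map store are assumptions made for illustration; this is not the CacheManager code from this PR.

```typescript
// Hypothetical sketch: names and signatures are illustrative only.
type Entry = { value: unknown; expiresAt: number };

class CacheAsideSketch {
  private store = new Map<string, Entry>();
  private inFlight = new Map<string, Promise<unknown>>();
  private ttlSeconds: number;

  constructor(ttlSeconds = 3600) {
    this.ttlSeconds = ttlSeconds;
  }

  async getOrExecute<T>(key: string, fn: () => Promise<T>): Promise<T> {
    // 1. Serve from cache if the entry exists and has not expired.
    const hit = this.store.get(key);
    if (hit && hit.expiresAt > Date.now()) {
      return hit.value as T;
    }

    // 2. If another caller is already computing this key, share its promise.
    const pending = this.inFlight.get(key);
    if (pending) {
      return pending as Promise<T>;
    }

    // 3. Otherwise run the loader once and store the result.
    const promise = fn()
      .then((value) => {
        this.store.set(key, {
          value,
          expiresAt: Date.now() + this.ttlSeconds * 1000,
        });
        return value;
      })
      .finally(() => {
        // Clear the in-flight marker on success and on failure.
        this.inFlight.delete(key);
      });

    this.inFlight.set(key, promise);
    return promise;
  }
}
```

Two simultaneous calls with the same key run the loader once and both receive its result. A failed loader is not cached, so the next call retries it.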

Lakebase Connector

  • Connection pooling with configurable pool size
  • Automatic credential rotation before token expiry
  • Transient error retry with exponential backoff
  • Connection string support for flexible configuration
  • Health checks for monitoring
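
The retry bullet above can be sketched as follows. This is a generic exponential backoff helper, not the connector's implementation: the option names, the helper names, and the set of error codes treated as transient are all assumptions.

```typescript
// Hypothetical sketch: option names and the transient-code list are assumptions.
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
}

// Example codes often treated as transient: Node socket errors and
// PostgreSQL SQLSTATE values (57P01 admin_shutdown, 08006 connection_failure).
const TRANSIENT_CODES = new Set(["ECONNRESET", "ETIMEDOUT", "57P01", "08006"]);

function isTransient(err: unknown): boolean {
  const code = (err as { code?: unknown } | null)?.code;
  return typeof code === "string" && TRANSIENT_CODES.has(code);
}

// attempt is 1-based: base, 2*base, 4*base, ... capped at maxDelayMs.
function backoffDelay(attempt: number, opts: RetryOptions): number {
  return Math.min(opts.baseDelayMs * 2 ** (attempt - 1), opts.maxDelayMs);
}

async function withRetry<T>(
  fn: () => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on non-transient errors or when attempts are exhausted.
      if (attempt >= opts.maxAttempts || !isTransient(err)) throw err;
      const delay = backoffDelay(attempt, opts);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

A production version would likely add random jitter to the delay so that many clients do not retry in lockstep.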

Configuration Options

interface CacheConfig {
  enabled?: boolean;              // Enable/disable cache
  ttl?: number;                   // Time to live in seconds (default: 3600)
  maxSize?: number;               // Max entries for in-memory (default: 1000)
  maxBytes?: number;              // Max bytes for persistent (default: 256MB)
  maxEntryBytes?: number;         // Max bytes per entry (default: 10MB)
  persistentCache?: boolean;      // Use Lakebase (default: true)
  strictPersistence?: boolean;    // Disable cache if Lakebase unavailable
  cleanupProbability?: number;    // Cleanup trigger probability (default: 0.01)
}
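
As a rough illustration of how cleanupProbability could combine with rate limiting, the sketch below gates cleanup on both a random draw and a minimum interval since the last run. The CleanupGate class, the minIntervalMs parameter and its value, and the injectable random source are assumptions; the PR description does not specify how the rate limit is configured.

```typescript
// Hypothetical sketch: CleanupGate and minIntervalMs are not from the PR.
class CleanupGate {
  private lastCleanupAt = Number.NEGATIVE_INFINITY;
  private probability: number;
  private minIntervalMs: number;
  private random: () => number;

  constructor(
    probability = 0.01, // matches the documented cleanupProbability default
    minIntervalMs = 60_000, // assumed value, for illustration only
    random: () => number = Math.random,
  ) {
    this.probability = probability;
    this.minIntervalMs = minIntervalMs;
    this.random = random;
  }

  // Returns true when the caller should run a cleanup pass now.
  shouldCleanup(now: number = Date.now()): boolean {
    // Rate limit: never clean up more often than minIntervalMs.
    if (now - this.lastCleanupAt < this.minIntervalMs) return false;
    // Probabilistic trigger: only a fraction of operations start a cleanup.
    if (this.random() >= this.probability) return false;
    this.lastCleanupAt = now;
    return true;
  }
}
```

With the default probability of 0.01, roughly one in a hundred cache operations would attempt a cleanup, and the interval check bounds how often those attempts reach the database.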

Architecture

┌─────────────────────────────────────────────────────────┐
│                     CacheManager                         │
│  - Singleton instance                                    │
│  - getOrExecute() with deduplication                    │
│  - Probabilistic cleanup with rate limiting             │
└─────────────────────────┬───────────────────────────────┘
                          │
            ┌─────────────┴─────────────┐
            │                           │
   ┌────────▼────────┐       ┌─────────▼─────────┐
   │  InMemoryStorage │       │ PersistentStorage  │
   │  - LRU eviction  │       │  - Lakebase/PG     │
   │  - Map-based     │       │  - LRU eviction    │
   └──────────────────┘       │  - Size limits     │
                              └─────────┬──────────┘
                                        │
                              ┌─────────▼──────────┐
                              │  LakebaseConnector  │
                              │  - Connection pool  │
                              │  - Credential mgmt  │
                              │  - Retry logic      │
                              └────────────────────┘
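
The "Map-based" and "LRU eviction" notes on InMemoryStorage can be illustrated with a short sketch that relies on the insertion order of a JavaScript Map. The LruMapSketch class below is hypothetical and omits TTL handling; it is not the InMemoryStorage code from this PR.

```typescript
// Hypothetical sketch: LRU eviction on top of Map insertion order.
class LruMapSketch<V> {
  private entries = new Map<string, V>();
  private maxSize: number;

  constructor(maxSize = 1000) {
    this.maxSize = maxSize;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so the key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    // Delete first so an update also refreshes the key's position.
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Because a Map iterates keys in insertion order, deleting and re-inserting a key on every access keeps the least recently used entry at the front, so eviction takes constant time.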

Test Plan

  • Unit tests for CacheManager (34 tests)
  • Unit tests for InMemoryStorage (18 tests)
  • Unit tests for PersistentStorage (27 tests)
  • Unit tests for LakebaseConnector (32 tests)
  • Integration with existing Analytics plugin tests
  • Fallback behavior when persistent storage unavailable
  • Error handling and recovery paths
  • Cleanup rate limiting

Breaking Changes

None. The cache is automatically initialized by AppKit with sensible defaults.

Environment Variables

For persistent cache (Lakebase):

PGHOST=your-host.databricks.com
PGDATABASE=your-database
PGAPPNAME=your-app-name
PGPORT=5432          # optional, default: 5432
PGSSLMODE=require    # optional, default: require

These are the environment variables expected when you add a Lakebase resource to a Databricks app.

Or via connection string:

const connector = new LakebaseConnector({
  connectionString: 'postgresql://host:5432/database?appName=myapp'
});
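
For the environment-variable route listed above, a configuration reader might look like the sketch below. The readPgEnv function and the PgEnvConfig shape are hypothetical. Treating PGHOST, PGDATABASE, and PGAPPNAME as required is an assumption inferred from the other two variables being marked optional.

```typescript
// Hypothetical sketch: not the connector's actual configuration code.
interface PgEnvConfig {
  host: string;
  database: string;
  appName: string;
  port: number;
  sslMode: string;
}

function readPgEnv(env: Record<string, string | undefined>): PgEnvConfig {
  // Assumed required, since only PGPORT and PGSSLMODE are marked optional.
  const required = ["PGHOST", "PGDATABASE", "PGAPPNAME"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(
      `Missing required environment variables: ${missing.join(", ")}`,
    );
  }

  // Optional values fall back to the documented defaults.
  const port = Number(env["PGPORT"] ?? "5432");
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`Invalid PGPORT: ${env["PGPORT"]}`);
  }

  return {
    host: env["PGHOST"] as string,
    database: env["PGDATABASE"] as string,
    appName: env["PGAPPNAME"] as string,
    port,
    sslMode: env["PGSSLMODE"] ?? "require",
  };
}
```

In a Node.js app this would typically be called as readPgEnv(process.env), failing fast at startup when a required variable is absent.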

Dependencies Added

  • pg (^8.16.3) - PostgreSQL client
  • @types/pg (^8.15.6) - TypeScript definitions

@ditadi ditadi requested review from a team, MarioCadenas and fjakobs December 9, 2025 12:21
@ditadi ditadi requested a review from fjakobs December 10, 2025 12:30
@ditadi ditadi merged commit 2e6efad into main Dec 10, 2025
3 checks passed
@MarioCadenas MarioCadenas deleted the feat/lakebase-cache branch December 19, 2025 17:44