Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.


[0.1.0] - 2026-02-02

Initial Release

Provider-agnostic Text-to-Video middleware with async polling, retry logic, and comprehensive error handling.

Features

Provider Architecture

  • Provider = Backend = Contracting Party = One DPA
    • Clean separation: One provider represents one backend with one Data Processing Agreement
    • OpenAISoraProvider - OpenAI Sora (Sora 2, Sora 2 Pro)
    • GoogleVeoProvider - Google Veo via Vertex AI (Veo 2, Veo 3, Veo 3.1 + fast variants)

OpenAI Sora Provider

  • Sora 2 (sora-2) - Standard text-to-video generation
    • Durations: 4, 8, 12 seconds
    • Resolutions: 720p, 1080p
    • Audio generation included
    • Image-to-video via referenceImage (multipart upload)
  • Sora 2 Pro (sora-2-pro) - Higher quality generation
    • Durations: 10, 15, 25 seconds
    • Resolutions: 720p, 1080p
    • Audio generation included
    • Image-to-video via referenceImage
  • Async pattern: POST /v1/videos → poll GET /v1/videos/{id} → download GET /v1/videos/{id}/content

Google Veo Provider

  • Veo 2 (veo-2.0-generate-001) - Stable text-to-video
  • Veo 3 (veo-3.0-generate-001) - Native audio generation
  • Veo 3 Fast (veo-3.0-fast-generate-001) - Lower cost variant
  • Veo 3.1 (veo-3.1-generate-001) - 4K support
  • Veo 3.1 Fast (veo-3.1-fast-generate-001) - 4K + lower cost
  • Async pattern: POST predictLongRunning → poll operation → get video URL
  • Veo returns videos as base64 data (decoded to Buffer automatically); use storageUri in providerOptions to write directly to GCS instead
  • OAuth2 via Application Default Credentials with automatic token caching

Async Polling

  • Both providers use asynchronous job submission + polling
  • Configurable polling behavior via PollingOptions:
    {
      intervalMs: 10000,       // Start at 10s
      maxIntervalMs: 30000,    // Cap at 30s
      backoffMultiplier: 1.5,  // Increase by 1.5x each poll
      timeoutMs: 600000,       // Give up after 10 minutes
    }
  • Optional onProgress callback for real-time status updates
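
To make the defaults above concrete, the following sketch computes the sequence of waits they imply. The `pollingSchedule` helper is hypothetical and not part of the package API; the field names are taken from the `PollingOptions` example. It assumes the timeout is measured against the accumulated wait time only, ignoring the duration of the status requests themselves.

```typescript
// Field names mirror the PollingOptions example in this changelog.
interface PollingOptions {
  intervalMs: number;
  maxIntervalMs: number;
  backoffMultiplier: number;
  timeoutMs: number;
}

// Hypothetical helper: returns the wait (in ms) before each poll
// until the timeout budget is used up.
function pollingSchedule(opts: PollingOptions): number[] {
  const waits: number[] = [];
  let interval = opts.intervalMs;
  let elapsed = 0;
  while (elapsed + interval <= opts.timeoutMs) {
    waits.push(interval);
    elapsed += interval;
    // Grow the interval, but never beyond the configured cap.
    interval = Math.min(interval * opts.backoffMultiplier, opts.maxIntervalMs);
  }
  return waits;
}

// With the defaults shown above: 10000, 15000, 22500, then 30000 for
// every later poll, 21 polls in total within the 10-minute budget.
const schedule = pollingSchedule({
  intervalMs: 10000,
  maxIntervalMs: 30000,
  backoffMultiplier: 1.5,
  timeoutMs: 600000,
});
```

The cap matters more than the multiplier here: after three polls every wait is 30 seconds, so a long generation is checked roughly twice a minute.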

Image-to-Video

  • OpenAI Sora: referenceImage becomes first frame (multipart upload)
  • Google Veo: referenceImage as first frame, optional lastFrameImage for keyframe interpolation

Video Extension (Google Veo only)

  • extend() method to continue an existing video
  • Accepts video as Buffer or provider-specific ID
  • Up to 20 consecutive extensions (7 seconds each)
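
As a quick worked example of the extension limit, the sketch below computes the total clip length after a number of extensions. The helper name and the 8-second base clip are illustrative assumptions; only the limit of 20 extensions and the 7 seconds per extension come from the notes above.

```typescript
const MAX_EXTENSIONS = 20;        // from the changelog
const SECONDS_PER_EXTENSION = 7;  // from the changelog

// Hypothetical helper: total duration after extending a base clip.
function extendedDurationSec(baseSec: number, extensions: number): number {
  const isWholeNumber = Math.floor(extensions) === extensions;
  if (!isWholeNumber || extensions < 0 || extensions > MAX_EXTENSIONS) {
    throw new RangeError(
      `extensions must be an integer between 0 and ${MAX_EXTENSIONS}`
    );
  }
  return baseSec + extensions * SECONDS_PER_EXTENSION;
}

// An assumed 8-second base clip, extended the maximum number of times:
// 8 + 20 * 7 = 148 seconds.
const longestClipSec = extendedDurationSec(8, MAX_EXTENSIONS);
```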

Auto-Download

  • downloadToBuffer: true downloads generated video into a Buffer
  • Default: returns URL only (no download, lower memory usage)

Retry Logic

  • Exponential backoff with jitter for transient errors
  • Retryable: 429, 408, 500, 502, 503, 504, network timeouts, ECONNRESET, ECONNREFUSED
  • Not retried: 400, 401, 403, and other client errors not listed above
  • Configurable via RetryOptions:
    {
      maxRetries: 3,
      delayMs: 1000,
      backoffMultiplier: 2.0,
      maxDelayMs: 30000,
      jitter: true,
    }
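
The retry rules above can be sketched as two small pure functions. This is an illustration rather than the package's implementation: `isRetryableStatus` and `retryDelayMs` are hypothetical names, and the jitter strategy is an assumption ("full jitter", a random value between 0 and the backoff delay), since the changelog does not say which variant is used.

```typescript
// Field names mirror the RetryOptions example in this changelog.
interface RetryOptions {
  maxRetries: number;
  delayMs: number;
  backoffMultiplier: number;
  maxDelayMs: number;
  jitter: boolean;
}

// HTTP statuses listed as retryable in the changelog.
function isRetryableStatus(status: number): boolean {
  switch (status) {
    case 408:
    case 429:
    case 500:
    case 502:
    case 503:
    case 504:
      return true;
    default:
      return false;
  }
}

// Delay before retry number `attempt` (0-based).
// `random` is injectable so the function can be tested deterministically.
function retryDelayMs(
  attempt: number,
  opts: RetryOptions,
  random: () => number = Math.random
): number {
  const backoff = opts.delayMs * opts.backoffMultiplier ** attempt;
  const capped = Math.min(backoff, opts.maxDelayMs);
  // Assumption: full jitter. Other variants add only a small random offset.
  return opts.jitter ? Math.floor(random() * capped) : capped;
}

const defaults: RetryOptions = {
  maxRetries: 3,
  delayMs: 1000,
  backoffMultiplier: 2.0,
  maxDelayMs: 30000,
  jitter: true,
};
```

Without jitter, the defaults give delays of 1000, 2000, and 4000 ms for the three retries. Jitter spreads those out so that many clients hitting the same rate limit do not retry in lockstep.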

Error Handling

  • Typed error class hierarchy extending TTVError:
    • InvalidConfigError - Configuration/validation errors
    • QuotaExceededError - Rate limits (429)
    • ProviderUnavailableError - Service unavailable (5xx)
    • GenerationFailedError - Generation failures
    • NetworkError - Network issues
    • CapabilityNotSupportedError - Model doesn't support feature
    • PollingTimeoutError - Video generation timed out
    • ContentModeratedError - Content blocked by safety filters
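
A typed hierarchy like this lets callers branch on `instanceof` instead of parsing messages. The sketch below shows only two of the subclasses, and the constructor signatures, default messages, and both helper functions are assumptions made for illustration; the real classes may carry additional fields such as an error code. It also assumes an ES2015 or later compile target, where subclassing `Error` preserves the prototype chain.

```typescript
class TTVError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "TTVError";
  }
}

class QuotaExceededError extends TTVError {
  constructor(message = "Rate limit exceeded (429)") {
    super(message);
    this.name = "QuotaExceededError";
  }
}

class ProviderUnavailableError extends TTVError {
  constructor(message = "Provider unavailable (5xx)") {
    super(message);
    this.name = "ProviderUnavailableError";
  }
}

// Hypothetical helper: map an HTTP status to the matching error class,
// following the 429 and 5xx associations given in the changelog.
function errorFromStatus(status: number): TTVError {
  if (status === 429) return new QuotaExceededError();
  if (status >= 500 && status <= 599) return new ProviderUnavailableError();
  return new TTVError(`Request failed with status ${status}`);
}

// Hypothetical caller-side check built on the hierarchy.
function isWorthRetrying(err: unknown): boolean {
  return (
    err instanceof QuotaExceededError ||
    err instanceof ProviderUnavailableError
  );
}
```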

Dry Mode

  • dry: true skips API calls, returns placeholder response
  • Validates requests and logs to debug files (if enabled)
  • No costs during development

Logging

  • Configurable log levels: debug, info, warn, error, silent
  • Set via TTV_LOG_LEVEL env var or setLogLevel() API
  • Markdown debug logging via TTVDebugger (enable with DEBUG_TTV_REQUESTS=true)
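
The five levels form an ordered scale, which a logger can use to decide whether a message is emitted. The following is a minimal sketch of that idea, not the package's logger: `parseLogLevel` and `shouldLog` are hypothetical names, and falling back to `info` for a missing or unknown value is an assumption.

```typescript
// Ordered from most to least verbose; "silent" suppresses everything.
const LEVELS = ["debug", "info", "warn", "error", "silent"] as const;
type LogLevel = (typeof LEVELS)[number];

// Parse a raw value such as the contents of TTV_LOG_LEVEL.
// Assumption: unknown or missing values fall back to "info".
function parseLogLevel(
  raw: string | undefined,
  fallback: LogLevel = "info"
): LogLevel {
  const value = (raw ?? "").trim().toLowerCase();
  const index = (LEVELS as readonly string[]).indexOf(value);
  return index === -1 ? fallback : LEVELS[index];
}

// A message is emitted when its level is at or above the configured level.
function shouldLog(
  messageLevel: Exclude<LogLevel, "silent">,
  configured: LogLevel
): boolean {
  return LEVELS.indexOf(messageLevel) >= LEVELS.indexOf(configured);
}
```

For example, with the level set to `warn`, `shouldLog("info", "warn")` is false and `shouldLog("error", "warn")` is true.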

Type Safety

  • Full TypeScript support with comprehensive type definitions:
    • TTVRequest, TTVResponse, TTVVideo
    • TTVExtendRequest for video extension
    • TTVProgress, TTVProgressCallback for polling updates
    • ModelInfo, TTVCapabilities
    • RetryOptions, PollingOptions
    • TTVUsage, TTVBilling
    • TTVProvider enum, TTVErrorCode type

Project Structure

src/
├── index.ts
└── middleware/
    ├── types/
    │   └── index.ts                    # All type definitions
    └── services/
        └── ttv/
            ├── ttv.service.ts          # Service orchestrator
            ├── providers/
            │   ├── base-ttv-provider.ts    # Abstract base, error classes, retry
            │   ├── openai-sora-provider.ts
            │   └── google-veo-provider.ts
            ├── utils/
            │   ├── polling.utils.ts        # Async polling with backoff
            │   └── debug-ttv.utils.ts      # Markdown debug logging
            └── assets/
                └── placeholder-video.ts    # Dry-mode placeholder

Configuration

Environment Variables

# Default provider
TTV_DEFAULT_PROVIDER=openai-sora

# Logging
TTV_LOG_LEVEL=info

# OpenAI Sora
OPENAI_API_KEY=sk-...

# Google Veo (Vertex AI)
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_APPLICATION_CREDENTIALS=./service-account.json
GOOGLE_CLOUD_REGION=us-central1
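
A startup check along the following lines can report missing variables before the first request is made. This is a sketch under stated assumptions: `missingEnvVars` is a hypothetical helper, the provider id `google-veo` is a guess by analogy with `openai-sora`, and treating all three Google variables as required is an assumption (Application Default Credentials or a default region may make some of them optional in practice).

```typescript
type Env = Record<string, string | undefined>;

// "openai-sora" appears in the changelog; "google-veo" is an assumed id.
type ProviderId = "openai-sora" | "google-veo";

// Assumption: variables each provider needs, based on the list above.
const REQUIRED_VARS: Record<ProviderId, readonly string[]> = {
  "openai-sora": ["OPENAI_API_KEY"],
  "google-veo": [
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_APPLICATION_CREDENTIALS",
    "GOOGLE_CLOUD_REGION",
  ],
};

// Returns the names of required variables that are unset or empty.
function missingEnvVars(provider: ProviderId, env: Env): string[] {
  return REQUIRED_VARS[provider].filter((name) => !env[name]);
}
```

In a Node.js application, `process.env` would be passed as the `env` argument; taking it as a parameter keeps the check easy to test.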

Peer Dependencies (optional)

  • openai >= 4.0.0 (for OpenAI Sora)
  • google-auth-library (for Google Veo)

Known Limitations

  • No unit tests yet - Initial release focuses on architecture and provider implementations
  • Sora Remix is not exposed as a dedicated method (use providerOptions)
  • Veo regions limited to us-central1 for most models
  • Video extension only supported by Google Veo

Compatibility

Node.js

  • Minimum: 18.0.0
  • Recommended: 20.x or later

TypeScript

  • Minimum: 5.0.0
  • Recommended: 5.3.x or later
