Frameverse is a multimodal movie search platform. It turns a full-length film into a structured, searchable scene index so users can describe a moment, motive, dialogue fragment, or visual situation and jump to the right timestamp.
Public entry points:
- https://frameverse.ru/ - web application
- https://frameverse.ru/api/v0 - API
This repository is a monorepo with two application packages:
- packages/client - the web application for browsing movies, tasks, scenes, and search results
- packages/server - the backend API and background processing pipeline
The repository also includes infrastructure configuration for local and server
deployment through Docker Compose, Dokploy, and Traefik.
External infrastructure such as S3, PostgreSQL, and pgvector is
provisioned separately and is not started by this repository's
docker-compose.yml.
Frameverse preprocesses a movie through five stages:
- ASR - speech-to-text transcription with time-aligned segments
- SBD - scene boundary detection
- SBE - scene artifact extraction: clips, keyframes, and transcript slices
- ANN - scene annotation using visual and textual context
- EMB - embedding generation for semantic retrieval
After the pipeline finishes, a movie becomes a structured search asset with scenes, frames, transcripts, annotations, and vectors ready for semantic search.
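The stage ordering above can be sketched as a minimal sequential pipeline. This is an illustration only: the stage names follow the list above, but the function body is a placeholder, not the actual implementation (in the real system each stage runs as a Temporal-orchestrated job):

```python
from enum import Enum


class Stage(str, Enum):
    """The five preprocessing stages, in execution order."""

    ASR = "asr"  # speech-to-text with time-aligned segments
    SBD = "sbd"  # scene boundary detection
    SBE = "sbe"  # scene artifact extraction: clips, keyframes, transcript slices
    ANN = "ann"  # scene annotation from visual and textual context
    EMB = "emb"  # embedding generation for semantic retrieval


def run_pipeline(movie_id: str) -> dict[str, str]:
    """Run each stage in definition order; later stages build on earlier outputs."""
    artifacts: dict[str, str] = {}
    for stage in Stage:  # Enum iteration preserves definition order
        # Placeholder body: stands in for the real stage logic.
        artifacts[stage.value] = f"{stage.value} artifacts for {movie_id}"
    return artifacts
```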
Frameverse combines three kinds of information during retrieval:
- Transcript data - audio information converted into searchable text
- Annotation data - visual understanding represented as text
- Visual data - scene imagery encoded as visual embeddings
This combination makes it possible to search by spoken content, described scene meaning, and purely visual similarity.
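One common way to combine such channels is weighted late fusion over per-channel similarities. The sketch below assumes this approach for illustration; the channel weights and the fusion strategy itself are assumptions, not the documented retrieval logic:

```python
import math

# Hypothetical channel weights; the real system's fusion strategy may differ.
WEIGHTS = {"transcript": 0.4, "annotation": 0.4, "visual": 0.2}


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def fused_score(
    query: dict[str, list[float]],
    scene: dict[str, list[float]],
    weights: dict[str, float] = WEIGHTS,
) -> float:
    """Weighted sum of per-channel cosine similarities."""
    return sum(w * cosine(query[ch], scene[ch]) for ch, w in weights.items())
```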
packages/
client/ # frontend application
server/ # API, workers, and pipeline logic
docker-compose.yml
Backend and worker package for movie ingestion, orchestration, indexing, and retrieval.
Main technologies:
- Python 3.12
- Litestar for the HTTP API
- OpenAPI with Scalar for API schema and docs
- Temporal for workflow orchestration and background jobs
- Langfuse for tracing and prompt management
- Advanced Alchemy, Pydantic, and pydantic-settings
- asyncpg and pgvector for PostgreSQL access and vector search
- aioboto3 for S3-compatible object storage
- SceneDetect and ffmpeg for video processing
- uvicorn for serving the API
- ruff and pytest for linting and tests
Runtime responsibilities:
- accept uploads and manage movie metadata
- run the preprocessing pipeline
- expose movie, scene, frame, transcript, task, and search endpoints
- orchestrate long-running jobs through Temporal workers
- store and retrieve structured scene data for search
Server architecture is organized into three levels:
- Protocol - abstract interfaces for external systems, including method contracts, input parameters, and return types
- Adapter - provider-specific implementations of those protocols
- Service - business logic that connects adapters, storage, and pipeline steps
This keeps the backend modular, so model providers and external tools can be swapped without rewriting the core business flow.
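The three levels can be sketched with `typing.Protocol`. The class names below are hypothetical and the adapter is a fake used only to show the shape of the layering, not the actual server code:

```python
from typing import Protocol


class EmbeddingProvider(Protocol):
    """Protocol level: the contract an embedding backend must satisfy."""

    def embed(self, text: str) -> list[float]: ...


class FakeEmbeddingAdapter:
    """Adapter level: a stand-in provider implementation for illustration."""

    def embed(self, text: str) -> list[float]:
        # Trivial placeholder "embedding": one dimension, the text length.
        return [float(len(text))]


class SceneSearchService:
    """Service level: business logic that depends only on the protocol,
    so any conforming adapter can be swapped in."""

    def __init__(self, embedder: EmbeddingProvider) -> None:
        self._embedder = embedder

    def vector_for(self, text: str) -> list[float]:
        return self._embedder.embed(text)
```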
Frontend package for operating the platform and exploring indexed movies.
Main technologies:
- React 19
- TanStack Start
- TanStack Router
- TanStack Query
- Vite
- TypeScript
- Tailwind CSS 4
- Radix UI
- Biome, Prettier, and Vitest
Runtime responsibilities:
- provide the operator-facing UI
- display movies, scenes, and processing tasks
- call the backend API
- surface semantic search results and playback navigation
The current setup uses:
- Qwen/Qwen3-VL-32B-Instruct for scene annotation
- Qwen/Qwen3-Embedding-8B for text embeddings
- nvidia/llama-nemotron-embed-vl-1b-v2 for visual embeddings
The system is model-agnostic by design. These models can be replaced with other hosted or self-hosted alternatives depending on quality, latency, and deployment constraints.
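One way to keep model choices swappable is to read model identifiers from deployment configuration. The stdlib sketch below illustrates the idea with hypothetical environment variable names; the actual server reads its configuration through pydantic-settings:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Model identifiers, overridable per deployment.

    The environment variable names are hypothetical; defaults match the
    models listed above.
    """

    annotation_model: str = os.getenv(
        "ANNOTATION_MODEL", "Qwen/Qwen3-VL-32B-Instruct"
    )
    text_embedding_model: str = os.getenv(
        "TEXT_EMBEDDING_MODEL", "Qwen/Qwen3-Embedding-8B"
    )
    visual_embedding_model: str = os.getenv(
        "VISUAL_EMBEDDING_MODEL", "nvidia/llama-nemotron-embed-vl-1b-v2"
    )
```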
Use this option to run the repository in its intended multi-service form.
```
docker compose up --build
```

docker-compose.yml defines:

- client
- server
- worker
- temporal
- temporal-ui - Temporal bootstrap helpers for database setup and namespace creation
It does not provision S3, PostgreSQL, or pgvector. These dependencies are
expected to be available as external services.
This setup is designed to work behind Traefik and on a Dokploy host, using
the external dokploy-network.
Requirements:
- Node.js 20.19+ or 22.12+
- pnpm 10
Install and start:
```
cd packages/client
pnpm install
pnpm dev
```

Other useful commands:

```
pnpm build
pnpm preview
pnpm lint
pnpm fmt
```

Requirements:
- Python 3.12
- uv
- external dependencies required by the backend environment:
  - PostgreSQL with pgvector
  - S3-compatible object storage
  - Temporal
  - model provider credentials
  - Langfuse project credentials
Install dependencies:
```
cd packages/server
uv sync
```

Run the API:

```
uv run uvicorn src.main:app --reload
```

Run the Temporal worker:

```
uv run python -m src.workers.run
```

This repository is prepared for containerized deployment:

- Docker Compose defines the multi-service topology
- Dokploy is the intended deployment environment
- Traefik routes the public web app, API, and Temporal UI
- S3, PostgreSQL, and pgvector are provisioned outside this repository's docker-compose.yml
- the backend exposes OpenAPI schema/docs
- Temporal handles orchestration of long-running pipeline jobs
- Langfuse provides tracing and prompt management for LLM-powered steps
Frameverse is built for search beyond keywords. Instead of indexing a movie as a single media file, it indexes the movie as scenes enriched with transcript, visual context, annotations, and embeddings, making semantic video retrieval practical.
