AI-powered semantic search engine for video content: find any moment, instantly.
ChronoView transforms any video (a lecture, meeting, tutorial, or conference talk) into a fully searchable knowledge base using multimodal AI. Type a natural language query and jump to the exact timestamp where that moment occurs. No scrubbing. No guessing. No re-watching hours of content.
- 500+ hours of video are uploaded every minute globally
- There is no "Ctrl+F" for video content
- Students waste hours scrubbing lecture recordings
- Enterprises lose $37B/year to unsearchable meeting recordings
- Existing tools match keywords, not meaning
ChronoView processes three parallel data streams from every video:
| Stream | Model | Output |
|---|---|---|
| Speech | OpenAI Whisper | Timestamped transcripts |
| Visual | Vision Transformer (ViT) | Scene embeddings |
| On-screen text | Tesseract OCR | Slide & code text |
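As a concrete sketch of the speech stream, the snippet below shows the segment shape that the open-source `whisper` package returns from `transcribe()` and how segments become timestamped index entries (the sample segments here are hand-written, not real model output):

```python
# Hand-written sample of Whisper's output shape; the real pipeline would run:
#   result = whisper.load_model("base").transcribe("lecture.mp4")
#   segments = result["segments"]
segments = [
    {"start": 0.0, "end": 4.2, "text": " Welcome to lecture four."},
    {"start": 4.2, "end": 9.8, "text": " Today we cover gradient descent."},
]

def to_timestamp(seconds: float) -> str:
    """Render seconds in the MM:SS form used by search results."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"

# Each transcript segment becomes one searchable, timestamped entry
index_entries = [
    {"timestamp": to_timestamp(seg["start"]), "text": seg["text"].strip()}
    for seg in segments
]
print(index_entries[1])  # {'timestamp': '00:04', 'text': 'Today we cover gradient descent.'}
```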
All three streams are fused by a CLIP-inspired contrastive learning model into a unified semantic embedding, which is stored in a FAISS vector database for millisecond-speed retrieval.
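To make the retrieval step concrete, here is a toy stand-in for that search using plain NumPy in place of FAISS (an `IndexFlatIP` over unit-normalized vectors performs the same inner-product ranking at scale; the vectors below are random placeholders, not real embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(v):
    """Scale vectors to unit length so dot product equals cosine similarity."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# 1000 indexed video moments in a 512-dim embedding space (random stand-ins)
segment_vecs = normalize(rng.normal(size=(1000, 512)))

# A query embedding that happens to lie near moment 42
query_vec = normalize(segment_vecs[42] + 0.1 * rng.normal(size=512))

scores = segment_vecs @ query_vec      # cosine similarities
top_k = np.argsort(-scores)[:5]        # indices of the best-matching moments
print(top_k[0])  # 42
```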
- Semantic Search: natural language query → exact timestamp
- Direct Q&A: AI-generated answers extracted from the video
- Auto-Chapters: AI-generated, titled navigation segments
- Multilingual Search: query in any language
- Shareable Timestamp Links: share exact video moments
- Engagement Heatmaps: analytics on which segments are searched most
- Highlight Reel Export: compile relevant segments into a short clip
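As an illustration, a shareable timestamp link can be as simple as encoding the moment as a query parameter; a minimal sketch (the `chronoview.app` domain and `?v=...&t=...` format are hypothetical):

```python
from urllib.parse import urlencode

def share_link(video_id: str, seconds: int,
               base: str = "https://chronoview.app/watch") -> str:
    """Build a link that opens the video at an exact offset in seconds."""
    return f"{base}?{urlencode({'v': video_id, 't': seconds})}"

print(share_link("cs229_lecture4", 872))
# https://chronoview.app/watch?v=cs229_lecture4&t=872
```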
```
Video Input
     │
     ▼
FFmpeg (segment splitting · keyframe extraction · audio strip)
     │
     ├──────────────────────┬──────────────────────┐
     ▼                      ▼                      ▼
Whisper ASR           Tesseract OCR            ViT Model
(speech → text)    (slide & code text)     (scene encoding)
     │                      │                      │
     └──────────────────────┴──────────────────────┘
     │
     ▼
CLIP Fusion Model (PyTorch)
Unified semantic embedding space
     │
     ▼
FAISS / ChromaDB
Vector similarity index
     │
     ▼
FastAPI Backend (REST API)
     │
     ▼
React Dashboard + Video Player
```
| Layer | Technology |
|---|---|
| Video Processing | FFmpeg |
| Speech Recognition | OpenAI Whisper |
| Scene Understanding | Vision Transformer (ViT) |
| OCR | Tesseract |
| Semantic Fusion | CLIP (PyTorch + HuggingFace) |
| Vector Search | FAISS / ChromaDB |
| Backend API | FastAPI |
| Frontend | React / Streamlit |
| Analytics | Plotly |
| Storage | AWS S3 / Google Cloud Storage |
| Containerization | Docker |
- Python 3.10+
- NVIDIA GPU (RTX 3060 or higher recommended)
- CUDA 11.8+
- Node.js 18+ (for the React frontend)
- FFmpeg installed on the system

```bash
# Clone the repository
git clone https://github.com/yourusername/chronoview.git
cd chronoview

# Create a virtual environment
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install frontend dependencies
cd frontend
npm install
cd ..
```

```bash
# Copy the environment template
cp .env.example .env
```

Add your keys in `.env`:

```
OPENAI_WHISPER_MODEL=base
HUGGINGFACE_TOKEN=your_token_here
AWS_ACCESS_KEY=your_key_here
AWS_SECRET_KEY=your_secret_here
```

```bash
# Step 1: index a video
python pipeline/index_video.py --input your_video.mp4

# Step 2: start the backend
uvicorn app.main:app --reload --port 8000

# Step 3: start the frontend
cd frontend && npm run dev
```

Open http://localhost:3000 in your browser.
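With the backend up, the search endpoint can also be queried programmatically. A stdlib-only sketch (the helper names are hypothetical; the request and response shapes follow the API example later in this README):

```python
import json
import urllib.request

def build_search_request(query: str, video_id: str, top_k: int = 5,
                         base: str = "http://localhost:8000") -> urllib.request.Request:
    """Assemble the POST /api/search request."""
    payload = json.dumps({"query": query, "video_id": video_id, "top_k": top_k})
    return urllib.request.Request(
        f"{base}/api/search",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def best_timestamp(response: dict) -> str:
    """Pick the timestamp of the highest-confidence result."""
    return max(response["results"], key=lambda r: r["confidence"])["timestamp"]

# To actually send it (requires the backend from step 2 to be running):
#   with urllib.request.urlopen(build_search_request(
#           "explain gradient descent", "cs229_lecture4")) as resp:
#       print(best_timestamp(json.load(resp)))
```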
```
chronoview/
│
├── pipeline/
│   ├── ingest.py        # FFmpeg video segmentation
│   ├── transcribe.py    # Whisper ASR
│   ├── ocr.py           # Tesseract OCR
│   ├── vision.py        # ViT scene encoding
│   ├── fuse.py          # CLIP fusion model
│   └── index.py         # FAISS vector indexing
│
├── app/
│   ├── main.py          # FastAPI entry point
│   ├── search.py        # Query embedding + retrieval
│   ├── qa.py            # Direct Q&A generation
│   └── analytics.py     # Heatmap + usage analytics
│
├── frontend/
│   ├── src/
│   │   ├── pages/       # Home, Results, Library, Analytics
│   │   └── components/  # SearchBar, ResultCard, VideoPlayer
│   └── package.json
│
├── models/              # Saved model checkpoints
├── tests/               # Unit and integration tests
├── docker-compose.yml
├── requirements.txt
└── README.md
```
```http
POST /api/search
Content-Type: application/json

{
  "query": "explain gradient descent",
  "video_id": "cs229_lecture4",
  "top_k": 5
}
```

Response:

```json
{
  "query": "explain gradient descent",
  "ai_answer": "Gradient descent minimizes loss by...",
  "results": [
    {
      "timestamp": "14:32",
      "title": "Gradient descent intuition",
      "snippet": "...learning rate alpha controls step size...",
      "confidence": 0.97,
      "sources": ["audio", "slide"]
    }
  ]
}
```

| # | Paper | Venue |
|---|---|---|
| 1 | Radford et al. β CLIP (2021) | ICML 2021 |
| 2 | Radford et al. β Whisper (2022) | arXiv:2212.04356 |
| 3 | Dosovitskiy et al. β ViT (2020) | ICLR 2021 |
| 4 | Johnson et al. β FAISS (2019) | IEEE Trans. Big Data |
| 5 | Liu et al. β Video Moment Localization (2023) | ACM Computing Surveys |
| Sector | Use Case |
|---|---|
| Education | Students search lecture recordings by concept |
| Enterprise | Teams retrieve decisions from meeting archives |
| Research | Scientists index conference talks and webinars |
| Developers | Search coding tutorials for exact implementations |
| Accessibility | Semantic index for hearing-impaired users |
- Multimodal pipeline (Whisper + ViT + OCR)
- CLIP-based semantic fusion
- FAISS vector indexing
- FastAPI search endpoint
- React dashboard
- Cross-video search across entire libraries
- Highlight reel export
- Mobile app
- Enterprise SSO integration
- Fine-tuned domain-specific embedding model
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
```bash
# Run tests
pytest tests/

# Format code
black pipeline/ app/
```

MIT License: see LICENSE for details.
Built by Tanmay for a hackathon: "Skip to the Good Part."
ChronoView does for video what Google did for the web.