Gemini Video Analyzer

A web application for analyzing videos with Google's Gemini 2.0 Flash API. It demonstrates the three supported methods of video input (File API upload, inline data, and YouTube URLs) and showcases Gemini's video understanding capabilities.

Features

Three Video Input Methods

File API Upload - For large files (>20MB) and videos you want to reuse
- Upload videos up to 100MB
- Reuse across multiple requests
- Files are stored by Gemini for 48 hours
Inline Data - For quick analysis of small videos (<20MB)
- Direct upload without File API
- Faster for small clips
- Base64 encoding
YouTube URLs - Analyze YouTube videos directly
- No download required
- Preview feature (free tier: up to 8 hours of YouTube video per day)
- Public videos only

Video Analysis Capabilities

Summarization - Get concise summaries of video content
Quiz Generation - Create quizzes with answer keys
Transcription - Audio transcription with timestamps
Visual Descriptions - Detailed descriptions of visual content
Timestamp Queries - Ask about specific moments (e.g., "What happens at 01:30?")
Action Items - Extract key takeaways and tasks

Advanced Features

Video Clipping - Analyze specific segments using start/end offsets
Custom FPS - Adjust frame sampling rate (default: 1 FPS)
- Lower FPS for static videos (lectures)
- Higher FPS for fast-action content
Prompt Templates - Pre-built prompts for common use cases
File Management - View and delete uploaded files

Tech Stack

Frontend: React 18 + Vite
Backend: Node.js + Express
AI: Google Gemini 2.0 Flash API
File Upload: Multer
Styling: Custom CSS with gradient design

Prerequisites

Node.js 18 or higher
Google Gemini API key (available from Google AI Studio)

Installation

Clone or navigate to the repository
```
cd "ACT III - Gemini Video"
```
Install dependencies
```
npm install
```
Set up environment variables

Create a .env file in the root directory:
```
cp .env.example .env
```
Edit .env and add your Gemini API key:
```
GEMINI_API_KEY=your_api_key_here
PORT=3001
```
Start the application
```
npm run dev
```
This will start:
- Frontend dev server on http://localhost:3000
- Backend API server on http://localhost:3001
Open your browser

Navigate to http://localhost:3000

Usage Guide

File API Upload

Select the "File Upload (File API)" tab
Drag and drop a video or click to browse
Enter your analysis prompt or use a template
(Optional) Configure advanced options:
- Start/End offsets for clipping
- Custom FPS for frame sampling
Click "Analyze Video"

Inline Data Processing

Select the "Inline Data (<20MB)" tab
Upload a video file smaller than 20MB
Enter your prompt
Click "Analyze Video"

YouTube URL Analysis

Select the "YouTube URL" tab
Paste a YouTube URL (public videos only)
Enter your analysis prompt
(Optional) Set start/end offsets to analyze specific segments
Click "Analyze Video"

Prompt Templates

The app includes 6 pre-built prompt templates:

Summarize - Get a 3-5 sentence summary
Create Quiz - Generate 5 questions with answers
Transcribe - Get audio transcription with timestamps
Timestamps - Query specific moments in the video
Detailed Analysis - Comprehensive visual and audio analysis
Action Items - Extract key takeaways

Advanced Options

Video Clipping

Analyze specific segments by setting offsets:

Format: "40s" or "1m20s"
Example: Start at "1m30s", End at "3m"
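
The offset format above can be checked before a request is sent. The following is a minimal sketch, assuming only the "40s" / "1m20s" shapes shown here; `offsetToSeconds` and `isValidClip` are hypothetical helper names, not part of the app.

```javascript
// Hypothetical helper: converts an offset such as "40s" or "1m20s" to seconds.
function offsetToSeconds(offset) {
  const match = /^(?:(\d+)m)?(?:(\d+)s)?$/.exec(offset.trim());
  // Reject empty strings and anything outside the "XmYs" shape.
  if (!match || (match[1] === undefined && match[2] === undefined)) {
    throw new Error(`Invalid offset: "${offset}"`);
  }
  const minutes = Number(match[1] ?? 0);
  const seconds = Number(match[2] ?? 0);
  return minutes * 60 + seconds;
}

// Hypothetical helper: a clip is usable only if it ends after it starts.
function isValidClip(startOffset, endOffset) {
  return offsetToSeconds(endOffset) > offsetToSeconds(startOffset);
}

console.log(offsetToSeconds("1m30s")); // 90
console.log(isValidClip("1m30s", "3m")); // true
```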

Custom FPS

Adjust frame sampling rate (default: 1 FPS):

Lower FPS (< 1): Better for static videos like lectures
Higher FPS (> 1): Better for fast-action sequences
Range: 0.1 to 10 FPS
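
One way to keep user input inside the range above is to clamp it. This is a sketch only: `normalizeFps` is a hypothetical helper, and falling back to the default for blank or non-numeric input is an assumption rather than the app's documented behavior.

```javascript
const MIN_FPS = 0.1;
const MAX_FPS = 10;
const DEFAULT_FPS = 1;

// Hypothetical helper: returns a usable FPS value for any form input.
function normalizeFps(input) {
  // Blank input falls back to the default sampling rate (assumption).
  if (input === undefined || input === null || input === "") return DEFAULT_FPS;
  const fps = Number(input);
  if (!Number.isFinite(fps)) return DEFAULT_FPS;
  // Clamp into the 0.1 to 10 range listed above.
  return Math.min(MAX_FPS, Math.max(MIN_FPS, fps));
}

console.log(normalizeFps("0.5")); // 0.5
console.log(normalizeFps(30)); // 10
```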

Timestamp Queries

Reference specific moments using MM:SS format:

"What happens at 01:30 and 02:45?"
"Describe the scene at 00:15"
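
If timestamps come from a player position in seconds, they can be converted to MM:SS before being placed in a prompt. A small sketch, with `toTimestamp` and `buildTimestampPrompt` as hypothetical helper names:

```javascript
// Hypothetical helper: formats a number of seconds as MM:SS.
function toTimestamp(totalSeconds) {
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = Math.floor(totalSeconds % 60);
  return `${String(minutes).padStart(2, "0")}:${String(seconds).padStart(2, "0")}`;
}

// Hypothetical helper: builds a timestamp query from player positions.
function buildTimestampPrompt(secondsList) {
  const stamps = secondsList.map(toTimestamp);
  return `What happens at ${stamps.join(" and ")}?`;
}

console.log(buildTimestampPrompt([90, 165]));
// What happens at 01:30 and 02:45?
```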

API Endpoints

POST /api/analyze-upload

Upload and analyze a video file using the File API.

Body: FormData

video: Video file
prompt: Analysis prompt
startOffset: (optional) Start time
endOffset: (optional) End time
fps: (optional) Frames per second
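
A client could assemble this request roughly as follows. This is a sketch that assumes the field names listed above and the default backend port from the installation steps; `buildUploadForm` is a hypothetical helper, and `FormData` and `Blob` are the globals available in Node.js 18+ and in browsers.

```javascript
// Hypothetical helper: builds the multipart body for /api/analyze-upload.
function buildUploadForm(videoBlob, fileName, prompt, options = {}) {
  const form = new FormData();
  form.append("video", videoBlob, fileName);
  form.append("prompt", prompt);
  // Optional fields are only sent when the caller provides them.
  for (const key of ["startOffset", "endOffset", "fps"]) {
    if (options[key] !== undefined) form.append(key, String(options[key]));
  }
  return form;
}

// Usage (requires the backend to be running on port 3001):
// const res = await fetch("http://localhost:3001/api/analyze-upload", {
//   method: "POST",
//   body: buildUploadForm(blob, "clip.mp4", "Summarize this video", { fps: 0.5 }),
// });
```

The `Content-Type` header is left unset on purpose so that `fetch` can add the multipart boundary itself.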

POST /api/analyze-inline

Analyze a small video with inline data.

Body: JSON

{
  "videoData": "base64_encoded_video",
  "mimeType": "video/mp4",
  "prompt": "Your prompt",
  "fps": 1
}
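
The `videoData` field holds the file contents as a Base64 string. A minimal Node.js sketch of building this body from a `Buffer`, with `buildInlineBody` as a hypothetical helper name:

```javascript
// Hypothetical helper: builds the JSON body for /api/analyze-inline.
function buildInlineBody(videoBuffer, mimeType, prompt, fps = 1) {
  return {
    // Buffer#toString("base64") produces the Base64 text the endpoint expects.
    videoData: videoBuffer.toString("base64"),
    mimeType,
    prompt,
    fps,
  };
}

// Usage in Node.js (the file path is a placeholder):
// const fs = require("node:fs");
// const body = buildInlineBody(fs.readFileSync("clip.mp4"), "video/mp4", "Your prompt");
// await fetch("http://localhost:3001/api/analyze-inline", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(body),
// });
```

Base64 grows the payload by roughly a third, so the JSON body is noticeably larger than the video file itself.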

POST /api/analyze-youtube

Analyze a YouTube video.

Body: JSON

{
  "youtubeUrl": "https://youtube.com/watch?v=...",
  "prompt": "Your prompt",
  "startOffset": "40s",
  "endOffset": "80s"
}
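
A request with this body could be prepared as in the sketch below. `buildYouTubeRequest` is a hypothetical helper; it assumes the offsets may simply be omitted when no clipping is wanted.

```javascript
// Hypothetical helper: builds fetch options for /api/analyze-youtube.
function buildYouTubeRequest(youtubeUrl, prompt, startOffset, endOffset) {
  const body = { youtubeUrl, prompt };
  // Offsets are optional; leave them out to analyze the whole video.
  if (startOffset) body.startOffset = startOffset;
  if (endOffset) body.endOffset = endOffset;
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Usage (requires the backend to be running on port 3001):
// const options = buildYouTubeRequest(videoUrl, "Your prompt", "40s", "80s");
// const res = await fetch("http://localhost:3001/api/analyze-youtube", options);
```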

GET /api/health

Check server status and API configuration.

GET /api/files

List all uploaded files in the Gemini File API.

DELETE /api/files/:name

Delete a specific file from the Gemini File API.

Supported Video Formats

MP4 (video/mp4)
MPEG (video/mpeg)
MOV (video/mov)
AVI (video/avi)
FLV (video/x-flv)
MPG (video/mpg)
WEBM (video/webm)
WMV (video/wmv)
3GPP (video/3gpp)
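
When a MIME type has to be derived from a file name, a lookup table built from the list above is one option. In this sketch the MIME types come from the list, while the extension-to-type mapping (including the `3gp` alias) and the `mimeTypeFor` helper are assumptions.

```javascript
// MIME types from the list above; the extension keys are an assumption.
const VIDEO_MIME_TYPES = {
  mp4: "video/mp4",
  mpeg: "video/mpeg",
  mov: "video/mov",
  avi: "video/avi",
  flv: "video/x-flv",
  mpg: "video/mpg",
  webm: "video/webm",
  wmv: "video/wmv",
  "3gp": "video/3gpp",
  "3gpp": "video/3gpp",
};

// Hypothetical helper: returns the MIME type, or null for unsupported files.
function mimeTypeFor(fileName) {
  const dot = fileName.lastIndexOf(".");
  if (dot === -1) return null;
  const ext = fileName.slice(dot + 1).toLowerCase();
  return Object.hasOwn(VIDEO_MIME_TYPES, ext) ? VIDEO_MIME_TYPES[ext] : null;
}

console.log(mimeTypeFor("lecture.MP4")); // video/mp4
console.log(mimeTypeFor("notes.txt")); // null
```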

Technical Details

Token Consumption

Default Resolution: ~300 tokens/second of video
Low Resolution: ~100 tokens/second of video
Calculation (default resolution): (258 tokens/frame × 1 FPS) + (32 tokens/second of audio) = 290 tokens/second, before metadata
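
The calculation can be turned into a rough estimator. This sketch uses the per-frame and audio figures quoted above for default resolution and ignores metadata and prompt tokens; `estimateVideoTokens` is a hypothetical helper.

```javascript
const TOKENS_PER_FRAME = 258; // default resolution, per the figures above
const AUDIO_TOKENS_PER_SECOND = 32;

// Hypothetical helper: approximate token cost of a video at a given FPS.
function estimateVideoTokens(durationSeconds, fps = 1) {
  const perSecond = TOKENS_PER_FRAME * fps + AUDIO_TOKENS_PER_SECOND;
  return Math.round(perSecond * durationSeconds);
}

console.log(estimateVideoTokens(60)); // 17400
console.log(estimateVideoTokens(60, 0.5)); // 9660
```

Halving the frame rate does not halve the cost, because the audio tokens are unaffected by FPS.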

Context Windows

2M Context: Up to 2 hours (default) or 6 hours (low resolution)
1M Context: Up to 1 hour (default) or 3 hours (low resolution)
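
These limits follow approximately from dividing the context window by the per-second token rate. A back-of-the-envelope sketch, assuming the rounded rates of ~300 and ~100 tokens per second from this README and ignoring prompt and response tokens; `maxVideoSeconds` is a hypothetical helper.

```javascript
const TOKENS_PER_SECOND_DEFAULT = 300; // approximate, default resolution
const TOKENS_PER_SECOND_LOW = 100; // approximate, low resolution

// Hypothetical helper: longest video (in seconds) that fits a context window.
function maxVideoSeconds(contextTokens, lowResolution = false) {
  const rate = lowResolution ? TOKENS_PER_SECOND_LOW : TOKENS_PER_SECOND_DEFAULT;
  return Math.floor(contextTokens / rate);
}

console.log(maxVideoSeconds(1_000_000)); // 3333 (about 55 minutes)
console.log(maxVideoSeconds(2_000_000, true)); // 20000 (about 5.5 hours)
```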

Frame Sampling

Default: 1 frame per second (1 FPS)
Customizable via videoMetadata.fps
Timestamps added every second
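
As an illustration of where `videoMetadata.fps` sits in a request, the sketch below builds a video part and a `contents` array as plain objects. The `fileData`, `fileUri`, and `videoMetadata` field names reflect the Gemini API's request shape as understood here and should be checked against the official documentation; `buildVideoPart`, `buildContents`, and the example URI are hypothetical.

```javascript
// Hypothetical helper: a video part with optional clipping and FPS settings.
function buildVideoPart(fileUri, mimeType, { startOffset, endOffset, fps } = {}) {
  const part = { fileData: { fileUri, mimeType } };
  const videoMetadata = {};
  if (startOffset) videoMetadata.startOffset = startOffset;
  if (endOffset) videoMetadata.endOffset = endOffset;
  if (fps !== undefined) videoMetadata.fps = fps;
  // Only attach videoMetadata when at least one option was set.
  if (Object.keys(videoMetadata).length > 0) part.videoMetadata = videoMetadata;
  return part;
}

// Hypothetical helper: the text prompt is placed after the video part.
function buildContents(videoPart, prompt) {
  return [{ role: "user", parts: [videoPart, { text: prompt }] }];
}

// "files/example-id" is a placeholder, not a real file URI.
const part = buildVideoPart("files/example-id", "video/mp4", { fps: 0.5 });
console.log(JSON.stringify(buildContents(part, "Summarize this lecture"), null, 2));
```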

Best Practices

File Size
- Use File API for videos >20MB
- Use Inline Data for quick analysis of small files
Request Structure and Video Length
- Place the text prompt after the video part in each request
- Check the video length against the model's context window limits
Frame Rate
- Use lower FPS (<1) for static content (lectures, presentations)
- Use higher FPS (>1) for action sequences
- Be aware: Fast action may lose detail at 1 FPS
Clipping
- Use start/end offsets to analyze specific segments
- Reduces token consumption for long videos

Troubleshooting

"Gemini API not configured"

Check that your .env file exists
Verify GEMINI_API_KEY is set correctly
Restart the server after changing .env
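
A quick way to confirm the key is visible to the process is a check like the one below. This is a sketch, not the server's actual logic: `checkGeminiConfig` is a hypothetical helper, and treating the placeholder values from this README as "not configured" is an assumption.

```javascript
// Placeholder values used in this README's .env examples.
const PLACEHOLDER_KEYS = ["your_api_key_here", "your_gemini_api_key_here"];

// Hypothetical helper: reports whether GEMINI_API_KEY looks usable.
function checkGeminiConfig(env = process.env) {
  const key = (env.GEMINI_API_KEY ?? "").trim();
  if (key === "") {
    return { configured: false, reason: "GEMINI_API_KEY is missing or empty" };
  }
  if (PLACEHOLDER_KEYS.includes(key)) {
    return { configured: false, reason: "GEMINI_API_KEY is still the placeholder value" };
  }
  return { configured: true };
}

console.log(checkGeminiConfig());
```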

"File too large" error

Use File API Upload tab for files >20MB
Check file size before inline processing
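
The size check can be done on the client before any upload starts. This sketch assumes the 20MB threshold means 20 × 1024 × 1024 bytes and that files at or above it should go to the File API endpoint; `chooseEndpoint` is a hypothetical helper.

```javascript
// 20MB threshold from this README; the binary interpretation is an assumption.
const INLINE_LIMIT_BYTES = 20 * 1024 * 1024;

// Hypothetical helper: picks the endpoint that suits the file size.
function chooseEndpoint(sizeBytes) {
  return sizeBytes < INLINE_LIMIT_BYTES
    ? "/api/analyze-inline"
    : "/api/analyze-upload";
}

console.log(chooseEndpoint(5 * 1024 * 1024)); // /api/analyze-inline
console.log(chooseEndpoint(50 * 1024 * 1024)); // /api/analyze-upload
```

In the browser, `file.size` on a `File` object gives the value to pass in.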

"Video processing failed"

Verify video format is supported
Check file isn't corrupted
Try a different video

YouTube URL errors

Ensure video is public (not private/unlisted)
Check URL format is correct
The free tier is limited to 8 hours of YouTube video per day
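
A rough format check can catch malformed URLs before they reach the API. The sketch below only recognizes `youtube.com/watch?v=...` and `youtu.be/...` links, which is an assumption about which shapes matter; it cannot tell whether a video is public. `isYouTubeUrl` is a hypothetical helper.

```javascript
// Hypothetical helper: loose check for common YouTube URL shapes.
function isYouTubeUrl(value) {
  let url;
  try {
    url = new URL(value);
  } catch {
    return false; // not a parseable URL at all
  }
  const host = url.hostname.replace(/^www\./, "");
  if (host === "youtu.be") return url.pathname.length > 1;
  if (host === "youtube.com") {
    return url.pathname === "/watch" && url.searchParams.has("v");
  }
  return false;
}

console.log(isYouTubeUrl("https://www.youtube.com/watch?v=abc123")); // true
console.log(isYouTubeUrl("https://example.com/watch?v=abc123")); // false
```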

Project Structure

ACT III - Gemini Video/
├── server/
│   └── index.js          # Express backend
├── src/
│   ├── App.jsx           # React main component
│   ├── main.jsx          # React entry point
│   └── index.css         # Styles
├── uploads/              # Temporary video uploads
├── package.json          # Dependencies
├── vite.config.js        # Vite configuration
├── .env                  # Environment variables (create this)
└── README.md             # This file

Environment Variables

GEMINI_API_KEY=your_gemini_api_key_here
PORT=3001

Contributing

Feel free to submit issues and enhancement requests!

License

MIT

Resources

Acknowledgments

Built with Google Gemini 2.0 Flash API - showcasing multimodal AI capabilities for video understanding.
