A web application for analyzing videos with Google's Gemini 2.0 Flash model through the Gemini API. The app demonstrates all three video input methods (File API upload, inline data, and YouTube URLs) and showcases Gemini's video understanding capabilities.
- File API Upload - For large files (>20MB) and videos you want to reuse
  - Upload videos up to 100MB
  - Reuse across multiple requests
  - Files are kept by the Gemini File API for 48 hours, then deleted automatically
- Inline Data - For quick analysis of small videos (<20MB)
  - Direct upload without File API
  - Faster for small clips
  - Base64 encoding
- YouTube URLs - Analyze YouTube videos directly
  - No download required
  - Preview feature (free tier: 8 hours/day)
  - Public videos only
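The size thresholds above can be turned into a small decision helper. The following is only a sketch: `chooseInputMethod` and its `reuse` option are hypothetical names, 1MB is assumed to mean 1024 × 1024 bytes, and a file of exactly 20MB is assumed to go inline.

```javascript
// Thresholds come from the list above; the byte definition of "MB" is an assumption.
const MB = 1024 * 1024;
const INLINE_LIMIT_BYTES = 20 * MB;
const UPLOAD_LIMIT_BYTES = 100 * MB;

// Hypothetical helper: returns "inline" or "file-api", or throws for invalid or oversized input.
function chooseInputMethod(sizeBytes, { reuse = false } = {}) {
  if (!Number.isFinite(sizeBytes) || sizeBytes <= 0) {
    throw new RangeError("sizeBytes must be a positive number");
  }
  if (sizeBytes > UPLOAD_LIMIT_BYTES) {
    throw new RangeError("video exceeds the 100MB upload limit");
  }
  // Videos that will be reused go through the File API regardless of size.
  if (reuse || sizeBytes > INLINE_LIMIT_BYTES) {
    return "file-api";
  }
  return "inline";
}
```

For example, `chooseInputMethod(5 * MB)` returns `"inline"`, while `chooseInputMethod(50 * MB)` returns `"file-api"`.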
- Summarization - Get concise summaries of video content
- Quiz Generation - Create quizzes with answer keys
- Transcription - Audio transcription with timestamps
- Visual Descriptions - Detailed descriptions of visual content
- Timestamp Queries - Ask about specific moments (e.g., "What happens at 01:30?")
- Action Items - Extract key takeaways and tasks
- Video Clipping - Analyze specific segments using start/end offsets
- Custom FPS - Adjust frame sampling rate (default: 1 FPS)
  - Lower FPS for static videos (lectures)
  - Higher FPS for fast-action content
- Prompt Templates - Pre-built prompts for common use cases
- File Management - View and delete uploaded files
- Frontend: React 18 + Vite
- Backend: Node.js + Express
- AI: Google Gemini 2.0 Flash API
- File Upload: Multer
- Styling: Custom CSS with gradient design
- Node.js 18 or higher
- Google Gemini API key (available from Google AI Studio)
- Clone or navigate to the repository
  cd "ACT III - Gemini Video"
- Install dependencies
  npm install
- Set up environment variables
  Create a .env file in the root directory:
  cp .env.example .env
  Edit .env and add your Gemini API key:
  GEMINI_API_KEY=your_api_key_here
  PORT=3001
- Start the application
  npm run dev
  This will start:
  - Frontend dev server on http://localhost:3000
  - Backend API server on http://localhost:3001
- Open your browser
  Navigate to http://localhost:3000
- Select the "File Upload (File API)" tab
- Drag and drop a video or click to browse
- Enter your analysis prompt or use a template
- (Optional) Configure advanced options:
  - Start/End offsets for clipping
  - Custom FPS for frame sampling
- Click "Analyze Video"
- Select the "Inline Data (<20MB)" tab
- Upload a video file smaller than 20MB
- Enter your prompt
- Click "Analyze Video"
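Behind these steps, the inline method sends the video as base64 text in a JSON body. A minimal sketch of building that body is shown here; `buildInlineRequest` is a hypothetical helper, the field names follow the JSON body documented in the API section of this README, and the 20MB check assumes 1MB = 1024 × 1024 bytes.

```javascript
// Assumed byte value of the 20MB inline limit.
const MAX_INLINE_BYTES = 20 * 1024 * 1024;

// Hypothetical helper: turns a Node.js Buffer into the JSON body for inline analysis.
function buildInlineRequest(videoBuffer, mimeType, prompt, fps = 1) {
  if (!Buffer.isBuffer(videoBuffer)) {
    throw new TypeError("videoBuffer must be a Buffer");
  }
  if (videoBuffer.length >= MAX_INLINE_BYTES) {
    throw new RangeError("video is too large for inline data; use the File API instead");
  }
  return {
    videoData: videoBuffer.toString("base64"), // base64 grows the payload by roughly one third
    mimeType,
    prompt,
    fps,
  };
}
```

In the app, the buffer could come from `fs.readFileSync("clip.mp4")`, and the returned object would be passed through `JSON.stringify` before being sent.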
- Select the "YouTube URL" tab
- Paste a YouTube URL (public videos only)
- Enter your analysis prompt
- (Optional) Set start/end offsets to analyze specific segments
- Click "Analyze Video"
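The YouTube steps above map onto a small JSON body. The sketch below validates the URL before building it; `buildYoutubeRequest` and the accepted host list are assumptions made for illustration, and the check cannot tell whether a video is actually public.

```javascript
// Assumed set of hostnames treated as YouTube; adjust to match what the backend accepts.
const YOUTUBE_HOSTS = new Set(["youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"]);

// Hypothetical helper: validates the URL shape and returns the request body.
function buildYoutubeRequest(youtubeUrl, prompt, { startOffset, endOffset } = {}) {
  const url = new URL(youtubeUrl); // throws a TypeError on malformed input
  if (!YOUTUBE_HOSTS.has(url.hostname)) {
    throw new Error("not a YouTube URL: " + url.hostname);
  }
  const body = { youtubeUrl: url.toString(), prompt };
  // Offsets are optional, so they are only included when provided.
  if (startOffset) body.startOffset = startOffset;
  if (endOffset) body.endOffset = endOffset;
  return body;
}
```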
The app includes 6 pre-built prompt templates:
- Summarize - Get a 3-5 sentence summary
- Create Quiz - Generate 5 questions with answers
- Transcribe - Get audio transcription with timestamps
- Timestamps - Query specific moments in the video
- Detailed Analysis - Comprehensive visual and audio analysis
- Action Items - Extract key takeaways
Analyze specific segments by setting offsets:
- Format: "40s" or "1m20s"
- Example: Start at "1m30s", End at "3m"
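To check offsets before sending them, the "40s" / "1m20s" format can be parsed into seconds. This is a sketch with hypothetical helper names; it accepts only whole minutes and seconds, which is an assumption about the format rather than a documented rule.

```javascript
// Hypothetical helper: converts "40s", "3m", or "1m20s" into a number of seconds.
function parseOffsetSeconds(offset) {
  if (typeof offset !== "string") {
    throw new TypeError("offset must be a string");
  }
  const match = /^(?:(\d+)m)?(?:(\d+)s)?$/.exec(offset.trim());
  // The pattern also matches an empty string, so require at least one component.
  if (!match || (match[1] === undefined && match[2] === undefined)) {
    throw new Error("invalid offset: " + offset);
  }
  return Number(match[1] ?? 0) * 60 + Number(match[2] ?? 0);
}

// Hypothetical helper: length of a clip in seconds; rejects an end that is not after the start.
function clipDurationSeconds(startOffset, endOffset) {
  const start = parseOffsetSeconds(startOffset);
  const end = parseOffsetSeconds(endOffset);
  if (end <= start) {
    throw new RangeError("endOffset must be later than startOffset");
  }
  return end - start;
}
```

With the example above, `clipDurationSeconds("1m30s", "3m")` gives 90 seconds.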
Adjust frame sampling rate (default: 1 FPS):
- Lower FPS (< 1): Better for static videos like lectures
- Higher FPS (> 1): Better for fast-action sequences
- Range: 0.1 to 10 FPS
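One way to enforce the 0.1 to 10 range is to clamp the value before it is sent. The sketch below uses hypothetical helpers; the returned object mirrors the `videoMetadata.fps` field mentioned elsewhere in this README, and falling back to 1 FPS for non-numeric input is an assumption.

```javascript
const MIN_FPS = 0.1;
const MAX_FPS = 10;
const DEFAULT_FPS = 1;

// Hypothetical helper: coerces user input to a number and keeps it inside the supported range.
function clampFps(fps) {
  const value = Number(fps);
  if (!Number.isFinite(value)) {
    return DEFAULT_FPS; // assumed fallback for missing or non-numeric input
  }
  return Math.min(MAX_FPS, Math.max(MIN_FPS, value));
}

// Hypothetical helper: builds a metadata object carrying the sampling rate.
function buildFpsMetadata(fps) {
  return { fps: clampFps(fps) };
}
```

A lecture recording might use `buildFpsMetadata(0.5)`, while a sports clip might use `buildFpsMetadata(5)`.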
Reference specific moments using MM:SS format:
- "What happens at 01:30 and 02:45?"
- "Describe the scene at 00:15"
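When timestamps come from code rather than from typed text, they need to be rendered in MM:SS form first. A small sketch with hypothetical helper names follows; the prompt wording simply copies the first example above.

```javascript
// Hypothetical helper: formats a number of seconds as zero-padded MM:SS.
function formatTimestamp(totalSeconds) {
  if (!Number.isFinite(totalSeconds) || totalSeconds < 0) {
    throw new RangeError("totalSeconds must be a non-negative number");
  }
  const whole = Math.floor(totalSeconds);
  const minutes = String(Math.floor(whole / 60)).padStart(2, "0");
  const seconds = String(whole % 60).padStart(2, "0");
  return minutes + ":" + seconds;
}

// Hypothetical helper: builds a timestamp query for one or more moments.
function buildTimestampPrompt(secondsList) {
  const stamps = secondsList.map(formatTimestamp);
  return "What happens at " + stamps.join(" and ") + "?";
}
```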
Upload and analyze a video file using File API.
Body: FormData
- video: Video file
- prompt: Analysis prompt
- startOffset: (optional) Start time
- endOffset: (optional) End time
- fps: (optional) Frames per second
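A client could assemble this FormData body roughly as follows. This is a sketch only: `buildUploadForm` is a hypothetical helper, it relies on the global `FormData` and `Blob` available in browsers and in Node.js 18+, and the endpoint path is not shown here, so the `UPLOAD_URL` in the usage comment is a placeholder.

```javascript
// Hypothetical helper: builds the multipart body with the fields listed above.
function buildUploadForm(videoBlob, filename, prompt, options = {}) {
  const form = new FormData();
  form.append("video", videoBlob, filename);
  form.append("prompt", prompt);
  // Optional fields are only added when the caller supplied a value.
  for (const key of ["startOffset", "endOffset", "fps"]) {
    const value = options[key];
    if (value !== undefined && value !== "") {
      form.append(key, String(value));
    }
  }
  return form;
}

// Usage (UPLOAD_URL is a placeholder for the upload endpoint):
// const form = buildUploadForm(file, file.name, "Summarize this video", { fps: 2 });
// const response = await fetch(UPLOAD_URL, { method: "POST", body: form });
```

Leaving the `Content-Type` header unset lets `fetch` add the multipart boundary itself.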
Analyze a small video with inline data.
Body: JSON
{
  "videoData": "base64_encoded_video",
  "mimeType": "video/mp4",
  "prompt": "Your prompt",
  "fps": 1
}

Analyze a YouTube video.
Body: JSON
{
  "youtubeUrl": "https://youtube.com/watch?v=...",
  "prompt": "Your prompt",
  "startOffset": "40s",
  "endOffset": "80s"
}

Check server status and API configuration.
List all uploaded files in Gemini File API.
Delete a specific file from Gemini File API.
- MP4 (video/mp4)
- MPEG (video/mpeg)
- MOV (video/mov)
- AVI (video/avi)
- FLV (video/x-flv)
- MPG (video/mpg)
- WEBM (video/webm)
- WMV (video/wmv)
- 3GPP (video/3gpp)
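A lookup table built from this list can reject unsupported files before upload. In the sketch below, the MIME types are taken from the list above, while the file extensions (including `.3gp`) and the helper name `mimeTypeForFile` are assumptions.

```javascript
// Extension-to-MIME mapping; the extensions themselves are assumed, not documented above.
const VIDEO_MIME_TYPES = new Map([
  ["mp4", "video/mp4"],
  ["mpeg", "video/mpeg"],
  ["mov", "video/mov"],
  ["avi", "video/avi"],
  ["flv", "video/x-flv"],
  ["mpg", "video/mpg"],
  ["webm", "video/webm"],
  ["wmv", "video/wmv"],
  ["3gp", "video/3gpp"],
  ["3gpp", "video/3gpp"],
]);

// Hypothetical helper: returns the MIME type for a filename, or null if unsupported.
function mimeTypeForFile(filename) {
  const dot = filename.lastIndexOf(".");
  if (dot < 0) {
    return null; // no extension at all
  }
  const extension = filename.slice(dot + 1).toLowerCase();
  return VIDEO_MIME_TYPES.get(extension) ?? null;
}
```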
- Default Resolution: ~300 tokens/second of video
- Low Resolution: ~100 tokens/second of video
- Calculation: (258 tokens/frame × 1 FPS) + (32 tokens/second audio)
- 2M Context: Up to 2 hours (default) or 6 hours (low resolution)
- 1M Context: Up to 1 hour (default) or 3 hours (low resolution)
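The calculation above can be used to estimate whether a video fits a context window before sending it. The sketch below applies the default-resolution formula only; the helper names are hypothetical, and the estimate ignores the tokens used by the prompt and the response.

```javascript
// Figures from the calculation above (default resolution).
const TOKENS_PER_FRAME = 258;
const AUDIO_TOKENS_PER_SECOND = 32;

// Hypothetical helper: approximate token count for a video of the given length.
function estimateVideoTokens(durationSeconds, fps = 1) {
  const tokensPerSecond = TOKENS_PER_FRAME * fps + AUDIO_TOKENS_PER_SECOND;
  return Math.ceil(tokensPerSecond * durationSeconds);
}

// Hypothetical helper: true when the estimate fits within the given context size.
function fitsContext(durationSeconds, contextTokens, fps = 1) {
  return estimateVideoTokens(durationSeconds, fps) <= contextTokens;
}
```

At 1 FPS this works out to 290 tokens per second, so a 60-second clip is about 17,400 tokens and a 30-minute video about 522,000.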
- Default: 1 frame per second (1 FPS)
- Customizable via videoMetadata.fps
- Timestamps added every second
- File Size
  - Use File API for videos >20MB
  - Use Inline Data for quick analysis of small files
- Video Length
  - Place text prompts AFTER video parts in requests
  - Consider video length vs. context window limits
- Frame Rate
  - Use lower FPS (<1) for static content (lectures, presentations)
  - Use higher FPS (>1) for action sequences
  - Be aware: Fast action may lose detail at 1 FPS
- Clipping
  - Use start/end offsets to analyze specific segments
  - Reduces token consumption for long videos
- Check that your .env file exists
- Verify GEMINI_API_KEY is set correctly
- Restart the server after changing .env
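These checks can also be automated at startup. The following is a sketch, not the app's actual startup code: `checkConfig` is a hypothetical helper, it inspects an environment object that has already been loaded (how the app loads .env is not described here), and it treats the placeholder values from this README as unset.

```javascript
// Placeholder values that appear in this README's examples.
const PLACEHOLDER_KEYS = new Set(["your_api_key_here", "your_gemini_api_key_here"]);

// Hypothetical helper: reports configuration problems instead of failing later at request time.
function checkConfig(env) {
  const problems = [];
  const key = (env.GEMINI_API_KEY ?? "").trim();
  if (key === "") {
    problems.push("GEMINI_API_KEY is not set");
  } else if (PLACEHOLDER_KEYS.has(key)) {
    problems.push("GEMINI_API_KEY still holds the placeholder value");
  }
  // PORT falls back to 3001, the backend port used in this README.
  const port = Number(env.PORT ?? 3001);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    problems.push("PORT must be an integer between 1 and 65535");
  }
  return { ok: problems.length === 0, port, problems };
}
```

Calling `checkConfig(process.env)` when the server starts and logging `problems` would surface a missing key immediately.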
- Use File API Upload tab for files >20MB
- Check file size before inline processing
- Verify video format is supported
- Check that the file isn't corrupted
- Try a different video
- Ensure video is public (not private/unlisted)
- Check URL format is correct
- The free tier is limited to 8 hours of YouTube video per day
ACT III - Gemini Video/
├── server/
│ └── index.js # Express backend
├── src/
│ ├── App.jsx # React main component
│ ├── main.jsx # React entry point
│ └── index.css # Styles
├── uploads/ # Temporary video uploads
├── package.json # Dependencies
├── vite.config.js # Vite configuration
├── .env # Environment variables (create this)
└── README.md # This file
GEMINI_API_KEY=your_gemini_api_key_here
PORT=3001

Feel free to submit issues and enhancement requests!
MIT
Built with Google Gemini 2.0 Flash API - showcasing multimodal AI capabilities for video understanding.