This documentation was created with the help of Amazon Q Developer.
Build a serverless solution that processes video content and makes it searchable using natural language. This solution extracts meaningful information from both audio and video, allowing you to find specific moments using simple queries. The app stores all vector information in Amazon Aurora PostgreSQL with pgvector, enabling combined semantic searches across visual and audio content. This project uses four AWS CDK stacks to create a complete video processing and search solution.
This project implements a scalable and modular architecture for processing audio/video content using:
- Amazon Elastic Container Service (ECS) to handle compute-intensive video processing
- Amazon Aurora PostgreSQL with pgvector to store vectors and enable similarity search
- Amazon Transcribe for speech-to-text conversion
- AWS Step Functions for workflow management
- Amazon Bedrock for generating embeddings
The workflow begins when you upload a video to an Amazon S3 bucket, which triggers an AWS Step Functions workflow that orchestrates parallel processing streams.
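As a rough sketch of that trigger, an S3-notification Lambda can start the Step Functions execution. The environment variable name, event parsing helper, and handler wiring below are illustrative assumptions, not this project's actual code:

```python
import json
import os
import urllib.parse


def parse_s3_event(event: dict) -> dict:
    """Extract the bucket and object key from an S3 event notification."""
    record = event["Records"][0]["s3"]
    return {
        "bucket": record["bucket"]["name"],
        # Object keys arrive URL-encoded in S3 events (spaces become '+').
        "key": urllib.parse.unquote_plus(record["object"]["key"]),
    }


def handler(event: dict, context=None) -> dict:
    """Start the video-processing state machine for the uploaded object."""
    import boto3  # deferred so the pure helper above works without AWS access

    video = parse_s3_event(event)
    sfn = boto3.client("stepfunctions")
    response = sfn.start_execution(
        stateMachineArn=os.environ["STATE_MACHINE_ARN"],
        input=json.dumps(video),
    )
    return {"executionArn": response["executionArn"], **video}
```

The handler passes the bucket and key straight into the execution input, so every downstream step knows which video it is processing.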
The architecture splits into two main processing branches that work simultaneously:
- Visual Processing Pipeline: Video → Frame extraction → Image embeddings
  - Uses an Amazon ECS task running FFmpeg to extract frames at 1 FPS
  - Processes these frames through Amazon Bedrock's Titan multimodal model to generate image embeddings
- Audio Processing Pipeline: Audio → Transcribe → Text Chunks → Embeddings
  - Employs Amazon Transcribe to convert speech to text
  - Segments the transcribed text semantically while maintaining temporal context
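The audio branch's chunking step can be sketched as a function that groups timestamped transcript segments by size while recording each chunk's time span. The input shape, function name, and size threshold below are illustrative assumptions:

```python
def chunk_transcript(items, max_chars=200):
    """Group (start_seconds, text) transcript segments into chunks that
    stay near max_chars while keeping each chunk's start/end timestamps,
    so search hits can be mapped back to a moment in the video."""
    chunks, parts, start = [], [], None
    for ts, text in items:
        if start is None:
            start = ts
        parts.append(text)
        # Count joined length (segments plus single spaces between them).
        if sum(len(p) for p in parts) + len(parts) - 1 >= max_chars:
            chunks.append({"start": start, "end": ts, "text": " ".join(parts)})
            parts, start = [], None
    if parts:
        chunks.append({"start": start, "end": items[-1][0], "text": " ".join(parts)})
    return chunks
```

Keeping the start/end timestamps on every chunk is what preserves the "temporal context": a text match can point to the second of video it came from.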
An AWS Lambda function serves as the convergence point, processing both the extracted frames and transcriptions to generate the corresponding embeddings. All this vectorized information is then stored in Amazon Aurora Serverless PostgreSQL using pgvector, enabling combined semantic searches of visual and audio content.
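A minimal sketch of that convergence step follows, assuming the Titan image-embedding model ID and a hypothetical `frames` table with a `vector` column; none of these names are taken from this project's code:

```python
def embed_frame(image_b64: str) -> list:
    """Generate an image embedding via Bedrock (model ID is an assumption)."""
    import json
    import boto3  # deferred so the pure helper below works without AWS access

    client = boto3.client("bedrock-runtime")
    resp = client.invoke_model(
        modelId="amazon.titan-embed-image-v1",
        body=json.dumps({"inputImage": image_b64}),
    )
    return json.loads(resp["body"].read())["embedding"]


def to_pgvector_literal(embedding) -> str:
    """Serialize a float list into pgvector's text input format, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def store_frame_embedding(conn, source_key, timestamp, embedding):
    """Insert one frame embedding; assumes a table like
    CREATE TABLE frames (source text, ts real, embedding vector(1024))."""
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO frames (source, ts, embedding) VALUES (%s, %s, %s::vector)",
            (source_key, timestamp, to_pgvector_literal(embedding)),
        )
    conn.commit()
```

The `%s::vector` cast lets a plain string parameter be stored as a pgvector value, which keeps the insert compatible with any PostgreSQL driver.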
Before you begin, ensure you have:
- AWS CLI configured with appropriate permissions
- AWS CDK installed (`npm install -g aws-cdk`)
- Python 3.8 or higher
- Docker installed (for local development and testing)
- Access to Amazon Bedrock (request access if needed)
This solution uses several AWS services that will incur costs:
- Amazon Bedrock Pricing
- AWS Lambda Pricing
- Amazon Aurora Pricing
- Amazon S3 Pricing
- Amazon ECS Pricing
- Amazon Transcribe Pricing
```
.
├── 01-ecs-cluster/            # ECS cluster infrastructure
├── 02-aurora-pg-vector/       # Aurora PostgreSQL with pgvector setup
│   ├── aurora_postgres/       # Aurora database configuration
│   └── lambdas/               # Database initialization Lambda functions
├── 03-audio-video-workflow/   # Main processing pipeline
│   ├── container/             # Docker container for video processing
│   ├── databases/             # Database interaction layer
│   ├── lambdas/               # Lambda functions for pipeline steps
│   │   └── code/              # Lambda function implementations
│   └── workflows/             # Step Functions workflow definitions
└── 04-retrieval/              # Retrieval functionality
    └── test-retrieval/        # Test scripts and notebooks
```
Clone the repository:

```bash
git clone https://github.com/build-on-aws/langchain-embeddings
cd container-video-embeddings
```

Create and activate a virtual environment:

```bash
python3 -m venv .venv

# For Linux/macOS
source .venv/bin/activate
# For Windows
.venv\Scripts\activate.bat
```

Install dependencies:

```bash
pip install -r 04-retrieval/requirements.txt
```

Deploy the Amazon ECS cluster:

```bash
cd 01-ecs-cluster
cdk deploy
```
Deploy Amazon Aurora PostgreSQL:

```bash
cd ../02-aurora-pg-vector
cdk deploy
```

Deploy the AWS Step Functions workflow:

```bash
cd ../03-audio-video-workflow
cdk deploy
```
Deploy the retrieval workflow:

```bash
cd ../04-retrieval
cdk deploy
```
Navigate to the test environment:

```bash
cd ../04-retrieval/test-retrieval/
```

Upload a video to your input bucket to trigger the pipeline:

```bash
aws s3 cp your-video.mp4 s3://your-input-bucket/
```

The workflow will:
- Extract audio and start transcription
- Process video frames and generate embeddings
- Store results in Aurora PostgreSQL
- Open the notebook `01_query_audio_video_embeddings.ipynb`
- Try the API with `02_test_webhook.ipynb`
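To illustrate what a retrieval query does under the hood: pgvector's `<=>` operator computes cosine distance, and a similarity search is an `ORDER BY` on that distance. The table and column names below are assumptions for illustration, not this project's schema:

```python
def cosine_distance(a, b):
    """What pgvector's <=> operator computes: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    return 1.0 - dot / (norm_a * norm_b)


def nearest_frames_sql(k: int = 5) -> str:
    """Build a cosine-distance search over a hypothetical frames table;
    the query-embedding vector is passed as a bound parameter."""
    return (
        "SELECT source, ts, embedding <=> %s::vector AS distance "
        "FROM frames ORDER BY distance LIMIT {}".format(int(k))
    )
```

Because the query embedding comes from the same Titan models as the stored vectors, a natural-language question lands near the frames and transcript chunks that answer it.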
- Deployment failures: Check CloudWatch Logs for specific error messages
- Missing permissions: Ensure your AWS account has access to all required services
- Bedrock access: Verify you have access to the Bedrock models used in this project
- Database connection issues: Check security groups and network ACLs
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.
This README was generated and improved with Amazon Q CLI.

