Name	Name	Last commit message	Last commit date
parent directory ..
01-ecs-cluster	01-ecs-cluster
02-aurora-pg-vector	02-aurora-pg-vector
03-audio-video-workflow	03-audio-video-workflow
04-retrieval	04-retrieval
image	image
.gitignore	.gitignore
.gitignore copy	.gitignore copy
CODE_OF_CONDUCT.md	CODE_OF_CONDUCT.md
CONTRIBUTING.md	CONTRIBUTING.md
LICENSE	LICENSE
README.md	README.md

Ask Your Video: Audio/Video Processing Pipeline with Vector Search

This documentation was created with the help of Generating documentation with Amazon Q Developer

Build a serverless solution that processes video content and makes it searchable using natural language. This solution extracts meaningful information from both audio and video, allowing you to find specific moments using simple queries. The app stores all vector information in Amazon Aurora PostgreSQL with pgvector, enabling combined semantic searches across visual and audio content. This project uses four AWS CDK stacks to create a complete video processing and search solution.

Architecture Overview

This project implements a scalable and modular architecture for processing audio/video content using:

Amazon Elastic Container Service (ECS) to handle compute-intensive video processing
Amazon Aurora PostgreSQL with pgvector to store vectors and enable similarity search
Amazon Transcribe for speech-to-text conversion
AWS Step Functions for workflow management
Amazon Bedrock for generating embeddings

How It Works

The workflow begins when you upload a video to an Amazon S3 bucket, which triggers an AWS Step Functions workflow that orchestrates parallel processing streams.

The architecture splits into two main processing branches that work simultaneously:

Visual Processing Pipeline: Video → Frame extraction → Image embeddings
- Uses an Amazon ECS task running FFmpeg to extract frames at 1 FPS
- Processes these frames through Amazon Bedrock's Titan multimodal model to generate image embeddings
Audio Processing Pipeline: Audio → Transcribe → Text Chunks → Embeddings
- Employs Amazon Transcribe to convert speech to text
- Segments the transcribed text semantically while maintaining temporal context

An AWS Lambda function serves as the convergence point, processing both the extracted frames and transcriptions to generate the corresponding embeddings. All this vectorized information is then stored in Amazon Aurora Serverless PostgreSQL using pgvector, enabling combined semantic searches of visual and audio content.

Prerequisites

Before you begin, ensure you have:

AWS CLI configured with appropriate permissions
AWS CDK installed (npm install -g aws-cdk)
Python 3.8 or higher
Docker installed (for local development and testing)
Access to Amazon Bedrock (request access if needed)

💰 Cost Considerations

This solution uses several AWS services that will incur costs:

App Structure

. ├── 01-ecs-cluster/ # ECS cluster infrastructure ├── 02-aurora-pg-vector/ # Aurora PostgreSQL with pgvector setup │ ├── aurora_postgres/ # Aurora database configuration │ └── lambdas/ # Database initialization Lambda functions ├── 03-audio-video-workflow/ # Main processing pipeline │ ├── container/ # Docker container for video processing │ ├── databases/ # Database interaction layer │ ├── lambdas/ # Lambda functions for pipeline steps │ │ └── code/ # Lambda function implementations │ └── workflows/ # Step Functions workflow definitions └── 04-retrieval/ # Retrieval functionality └── test-retrieval/ # Test scripts and notebooks

🚀 Deployment Guide

1. Clone the repository:

git clone https://github.com/build-on-aws/langchain-embeddings
cd container-video-embeddings

2. Set up the environment:

Create a virtual environment:

python3 -m venv .venv

Activate the virtual environment:

# For Linux/macOS
source .venv/bin/activate

# For Windows
.venv\Scripts\activate.bat

Install dependencies:

pip install -r 04-retrieval/requirements.txt

3. Deploy the infrastructure stacks:

Deploy the Amazon ECS cluster:

``bash cd 01-ecs-cluster cdk deploy


Deploy Amazon Aurora PostgreSQL:

``bash
cd ../02-aurora-pg-vector
cdk deploy

Deploy AWS Step Functions workflow: ``bash cd ../03-audio-video-workflow cdk deploy


Deploy retrieval workflow:
``bash
cd ../04-retrieval
cdk deploy

Testing the Application

Navigate to the test environment:

cd ../04-retrieval/test-retrieval/

1. Upload a video file to the input S3 bucket:

aws s3 cp your-video.mp4 s3://your-input-bucket/

2. The pipeline will automatically:

Extract audio and start transcription
Process video frames and generate embeddings
Store results in Aurora PostgreSQL

3. Query the results:

Open the notebook 01_query_audio_video_embeddings.ipynb
Try the API with 02_test_webhook.ipynb

Troubleshooting

Deployment failures: Check CloudWatch Logs for specific error messages
Missing permissions: Ensure your AWS account has access to all required services
Bedrock access: Verify you have access to the Bedrock models used in this project
Database connection issues: Check security groups and network ACLs

🇻🇪🇨🇱 ¡Gracias!

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.

This README was generated and improved with Amazon Q CLI.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Ask Your Video: Audio/Video Processing Pipeline with Vector Search

Architecture Overview

How It Works

Prerequisites

💰 Cost Considerations

App Structure

🚀 Deployment Guide

1. Clone the repository:

2. Set up the environment:

3. Deploy the infrastructure stacks:

Testing the Application

1. Upload a video file to the input S3 bucket:

2. The pipeline will automatically:

3. Query the results:

Troubleshooting

🇻🇪🇨🇱 ¡Gracias!

Security

License

FilesExpand file tree

container-video-embeddings

Directory actions

More options

Directory actions

More options

Latest commit

History

container-video-embeddings

Folders and files

parent directory

README.md

Ask Your Video: Audio/Video Processing Pipeline with Vector Search

Architecture Overview

How It Works

Prerequisites

💰 Cost Considerations

App Structure

🚀 Deployment Guide

1. Clone the repository:

2. Set up the environment:

3. Deploy the infrastructure stacks:

Testing the Application

1. Upload a video file to the input S3 bucket:

2. The pipeline will automatically:

3. Query the results:

Troubleshooting

🇻🇪🇨🇱 ¡Gracias!

Security

License