A serverless document processing solution that deploys Docling on AWS Lambda for intelligent document parsing, extraction, and analysis at scale.
This project demonstrates how to deploy IBM's Docling document AI on AWS Lambda, enabling you to:
- Process PDFs, Word documents, PowerPoint files, and more
- Extract text, tables, and images with AI precision
- Scale automatically from zero to thousands of documents
- Pay only for what you use with serverless architecture
- Deploy in minutes with our step-by-step guide
- Automate CI/CD with GitHub Actions for seamless updates
Before starting, make sure you have:
- Python 3.13+
- Docker
- AWS CLI
- Git
- An AWS account with billing enabled
- AWS CLI configured with credentials:
aws configure
- Required AWS permissions:
- Lambda functions (create, update, invoke)
- ECR repositories (create, push images)
- IAM roles (create execution roles)
- CloudWatch logs (for monitoring)
# Clone the repository
git clone https://github.com/nikhil/aws-serverless-docling.git
cd aws-serverless-docling
# Build the Lambda container for AMD64 architecture
docker buildx build --platform linux/amd64 --provenance=false -f docling/Dockerfile -t aws-serverless-docling:latest .
# Test locally (optional but recommended)
docker run --platform linux/amd64 -p 9000:8080 aws-serverless-docling:latest
# In another terminal, test the function:
curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" \
-d '{"body": "{\"presignedUrl\": \"s3://your-bucket-name/document.pdf\"}"}'
You have two deployment options: automated deployment via GitHub Actions, or manual deployment with the AWS CLI.
For automated deployment with GitHub Actions:
- Fork this repository to your GitHub account
- Set up GitHub Secrets in your repository settings:
  - AWS_ACCESS_KEY_ID: Your AWS access key
  - AWS_SECRET_ACCESS_KEY: Your AWS secret key
  - AWS_REGION: us-east-1 (or your preferred region)
  - ECR_DOCLING_REPOSITORY: aws-serverless-docling
- Create the ECR repository first:
  aws ecr create-repository --repository-name aws-serverless-docling --region us-east-1
- Push to the iqa branch to trigger deployment:
  git checkout -b iqa
  git push origin iqa
The GitHub Actions workflow will automatically:
- Build the Docker image
- Push to ECR
- Deploy to Lambda (if configured)
For manual deployment with the AWS CLI:
# Set your AWS account ID and region
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export AWS_REGION=us-east-1
export REPO_NAME=aws-serverless-docling
# Create ECR repository (mutable type for version updates)
aws ecr create-repository \
--repository-name $REPO_NAME \
--image-scanning-configuration scanOnPush=true \
--image-tag-mutability MUTABLE \
--region $AWS_REGION
# Login to ECR
aws ecr get-login-password --region $AWS_REGION | \
docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com
# Tag and push image to ECR
docker tag aws-serverless-docling:latest \
$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/$REPO_NAME:latest
docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/$REPO_NAME:latest
# Create IAM role for Lambda execution
aws iam create-role \
--role-name lambda-docling-execution-role \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}'
# Attach basic execution policy
aws iam attach-role-policy \
--role-name lambda-docling-execution-role \
--policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
# Create Lambda function with recommended settings
aws lambda create-function \
--function-name aws-serverless-docling \
--package-type Image \
--code ImageUri=$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/$REPO_NAME:latest \
--role arn:aws:iam::$AWS_ACCOUNT_ID:role/lambda-docling-execution-role \
--timeout 180 \
--memory-size 3008 \
--description "Serverless document processing with Docling"
aws-serverless-docling/
├── docling/                  # Main application code
│   ├── Dockerfile            # Container definition
│   ├── lambda_function.py    # Lambda handler
│   ├── requirements.txt      # Python dependencies
│   └── utils/                # Helper functions
├── .github/workflows/        # CI/CD workflows
│   └── iqa_release.yml       # Automated deployment
├── tests/                    # Test files
│   ├── test_lambda.py        # Unit tests
│   └── test_integration.py   # Integration tests
├── README.md                 # This documentation
└── .env.example              # Environment variables template
The project includes an automated CI/CD pipeline using GitHub Actions:
- Triggered on: push to the iqa branch with changes in the docling/ directory
- Path filtering: only builds when Docling code changes
- Platform: builds for the AMD64 architecture
- ECR integration: automatically pushes to Amazon ECR
- Tagging: uses the commit SHA and latest tags
- Fork the repository
- Add GitHub Secrets:
  - AWS_ACCESS_KEY_ID
  - AWS_SECRET_ACCESS_KEY
  - AWS_REGION
  - ECR_DOCLING_REPOSITORY
- Create the ECR repository:
  aws ecr create-repository --repository-name aws-serverless-docling
- Push to trigger deployment:
  git checkout -b iqa
  # Make changes to the docling/ directory
  git add .
  git commit -m "Update docling implementation"
  git push origin iqa
# Test the deployed function
aws lambda invoke \
--function-name aws-serverless-docling \
--payload '{"presignedUrl": "s3://your-bucket/document.pdf"}' \
response.json
# Check the response
cat response.json
import boto3
import json
# Initialize Lambda client
lambda_client = boto3.client('lambda', region_name='us-east-1')
# Invoke function
response = lambda_client.invoke(
FunctionName='aws-serverless-docling',
Payload=json.dumps({
'presignedUrl': 's3://your-bucket/document.pdf'
})
)
# Parse response
result = json.loads(response['Payload'].read())
print(f"Processing result: {result}")
For HTTP API access, you can integrate with API Gateway:
# Example HTTP request after API Gateway setup
curl -X POST https://your-api-gateway-url/process \
-H "Content-Type: application/json" \
-d '{"presignedUrl": "s3://your-bucket/document.pdf"}'
Set these in your Lambda function configuration:
# Lambda environment variables
TORCH_HOME=/tmp/torch
LOG_LEVEL=INFO
MAX_DOCUMENT_SIZE=50000000 # 50MB in bytes
TIMEOUT_SECONDS=180 # 3 minutes
ENABLE_TABLE_EXTRACTION=true
ENABLE_IMAGE_EXTRACTION=true
OUTPUT_FORMAT=json # json, markdown, or text
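As a sketch of how the handler might consume these variables, the helper below reads each one with a safe default. The load_config function is illustrative only, not the project's actual code.
import os

def load_config() -> dict:
    """Read the Lambda environment variables listed above, falling back to defaults."""
    return {
        "torch_home": os.environ.get("TORCH_HOME", "/tmp/torch"),
        "log_level": os.environ.get("LOG_LEVEL", "INFO"),
        "max_document_size": int(os.environ.get("MAX_DOCUMENT_SIZE", 50_000_000)),
        "timeout_seconds": int(os.environ.get("TIMEOUT_SECONDS", 180)),
        "enable_table_extraction": os.environ.get("ENABLE_TABLE_EXTRACTION", "true").lower() == "true",
        "enable_image_extraction": os.environ.get("ENABLE_IMAGE_EXTRACTION", "true").lower() == "true",
        "output_format": os.environ.get("OUTPUT_FORMAT", "json"),  # json, markdown, or text
    }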
Recommended configuration for optimal performance:
| Setting | Value | Reason |
|---|---|---|
| Memory | 3008 MB | Docling requires significant RAM |
| Timeout | 3 minutes (180s) | Document processing time |
| Storage | 10 GB | For temporary file processing |
| Architecture | x86_64 | Better compatibility |
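The same recommendations can be applied programmatically with boto3 once the function exists. A sketch; the ephemeral storage size corresponds to the 10 GB row in the table above.
import boto3

lambda_client = boto3.client("lambda", region_name="us-east-1")

# Apply the recommended memory, timeout, and ephemeral storage settings
lambda_client.update_function_configuration(
    FunctionName="aws-serverless-docling",
    MemorySize=3008,                   # MB: Docling requires significant RAM
    Timeout=180,                       # seconds
    EphemeralStorage={"Size": 10240},  # MB of /tmp storage (10 GB)
)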
1. Memory Errors
Error: Runtime exited with error: signal: killed
Solution: Increase Lambda memory allocation to 3008 MB
2. Timeout Issues
Task timed out after X seconds
Solution: Increase timeout to 180+ seconds for document processing
3. Docker Build Issues
Platform mismatch error
Solution: Use the --platform linux/amd64 flag when building
4. Cold Start Performance
- First invocation may take 30+ seconds
- Consider provisioned concurrency for production
- Implement warm-up strategies for consistent performance (see the sketch after this list)
5. CI/CD Pipeline Issues
Error: Could not assume role
Solution: Check AWS credentials in GitHub Secrets
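For the cold-start issue (item 4 above), one simple warm-up strategy is a scheduled ping invocation that the handler short-circuits. A boto3 sketch; the {"warmup": true} event key is an assumption and would need a matching early-return branch in the handler.
import json

import boto3

lambda_client = boto3.client("lambda", region_name="us-east-1")

def warm_up() -> None:
    """Fire an asynchronous ping so a container is already warm for real traffic."""
    lambda_client.invoke(
        FunctionName="aws-serverless-docling",
        InvocationType="Event",                # asynchronous, returns immediately
        Payload=json.dumps({"warmup": True}),  # hypothetical warm-up event shape
    )

if __name__ == "__main__":
    warm_up()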
# Check Lambda logs
aws logs filter-log-events \
--log-group-name /aws/lambda/aws-serverless-docling \
--start-time $(date -d '1 hour ago' +%s)000
# Update function configuration
aws lambda update-function-configuration \
--function-name aws-serverless-docling \
--memory-size 3008 \
--timeout 300
# Check GitHub Actions logs
# Go to Actions tab in your GitHub repository
# Build and test locally
docker buildx build --platform linux/amd64 --provenance=false -f docling/Dockerfile -t test-docling .
docker run --platform linux/amd64 -p 9000:8080 test-docling
# Test with sample document
curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" \
-d '{"body": "{\"presignedUrl\": \"s3://test-bucket/sample.pdf\"}"}'
# Test deployed Lambda function
aws lambda invoke \
--function-name aws-serverless-docling \
--payload '{"test": true}' \
test-response.json
- Right-size memory allocation based on your document types
- Use appropriate timeout settings
- Implement request batching for multiple documents (see the sketch after this list)
- Monitor and optimize cold start frequency
- Use provisioned concurrency for production workloads
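One way to batch documents is to fan out concurrent invocations from the client side. The sketch below reuses the invocation pattern from the boto3 example earlier; the bucket names and worker count are placeholders.
import json
from concurrent.futures import ThreadPoolExecutor

import boto3

lambda_client = boto3.client("lambda", region_name="us-east-1")

def process_document(presigned_url: str) -> dict:
    """Invoke the Docling function for a single document and return its parsed result."""
    response = lambda_client.invoke(
        FunctionName="aws-serverless-docling",
        Payload=json.dumps({"presignedUrl": presigned_url}),
    )
    return json.loads(response["Payload"].read())

def process_batch(urls: list[str], max_workers: int = 8) -> list[dict]:
    """Fan a batch of documents out across concurrent Lambda invocations."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_document, urls))

if __name__ == "__main__":
    results = process_batch([
        "s3://your-bucket/doc-1.pdf",
        "s3://your-bucket/doc-2.pdf",
    ])
    print(results)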
We welcome contributions! Here's how to get started:
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes in the docling/ directory
- Test locally using Docker
- Push to your fork: git push origin feature/amazing-feature
- Create a Pull Request
- Changes to the docling/ directory trigger automated builds
- The iqa branch is used for integration testing
- Pull requests are automatically tested
- Follow Python PEP 8 style guide
- Add unit tests for new features
- Update README for significant changes
- Test Docker builds locally before submitting
- Ensure CI/CD pipeline passes
This project is licensed under the MIT License - see the LICENSE file for details.
MIT License
Copyright (c) 2024 Nikhil
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
- IBM Research for creating Docling
- AWS Lambda Team for serverless infrastructure
- Docker Community for containerization tools
- GitHub Actions for CI/CD capabilities
- Open Source Contributors who make projects like this possible