A modular AWS CDK implementation of the GenAI Intelligent Document Processing (IDP) Accelerator, designed to transform unstructured documents into structured data at scale using AWS's latest AI/ML services.
This project repackages the GenAI Intelligent Document Processing Accelerator as a set of composable AWS CDK packages, allowing more flexible deployment, customization, and integration.
- @cdklabs/genai-idp: Core building blocks for document processing infrastructure
- @cdklabs/genai-idp-bda-processor: Pattern 1 implementation using Amazon Bedrock Data Automation
- @cdklabs/genai-idp-bedrock-llm-processor: Pattern 2 implementation for custom extraction using Amazon Bedrock models
- @cdklabs/genai-idp-sagemaker-udop-processor: Pattern 3 implementation for specialized document processing using an Amazon SageMaker endpoint
- sample-bda-lending: Complete Pattern 1 implementation for processing lending documents using Amazon Bedrock Data Automation
- sample-bedrock: Pattern 2 demonstration using custom extraction with Amazon Bedrock foundation models
- sample-sagemaker-udop-rvl-cdip: Pattern 3 implementation using a fine-tuned Hugging Face RVL-CDIP model on Amazon SageMaker
- Modular CDK Architecture: Organized as reusable CDK constructs that can be composed into complete solutions
- Multiple Processing Patterns: Pre-built document processing patterns for different use cases
- Serverless Design: Built on AWS Lambda, Step Functions, SQS, and other serverless technologies
- AI-Powered Document Processing: Leverages Amazon Bedrock, Textract, and other AWS AI services
- Web User Interface: Optional secure web interface for document tracking and management
- Document Knowledge Base: Query processed documents using natural language
- NVM (Node Version Manager)
- yarn for node package management
- Docker CLI (can be Docker Desktop or Rancher Desktop)
- rsync for copying assets to packages
- Python for building Python GenAI IDP distributable packages
- .NET SDK for building .NET GenAI IDP distributable packages
- AWS CLI configured with appropriate credentials
- AWS CDK CLI (npm install -g aws-cdk)
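Before building, it can help to confirm that the tools listed above are on your PATH. The following is a minimal sketch, not part of the project itself: the helper name missing_tools is hypothetical, and the command names (python3, dotnet, aws, cdk) are assumptions about how each prerequisite is installed on your machine. NVM is left out because it is normally loaded as a shell function rather than installed as an executable.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not found on PATH,
# one per line. Prints nothing when every command is available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Command names are assumptions; adjust them to match your installation
# (for example, python instead of python3).
missing=$(missing_tools yarn docker rsync python3 dotnet aws cdk)

if [ -n "$missing" ]; then
  echo "Missing prerequisites:"
  echo "$missing"
else
  echo "All checked prerequisites were found."
fi
```

The script only reports what it finds. It does not install anything and does not verify versions or AWS credentials.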
- Set up the correct Node.js version using NVM:
    # Install the required Node.js version specified in .nvmrc
    nvm install
    # Use the project's Node.js version
    nvm use
- Install Yarn globally (if not already installed):
    npm i -g yarn
- Install project dependencies:
    yarn install
- Ensure Docker is running and rsync is available.
- (Re)scaffold the project:
    yarn projen
- Build the packages:
    yarn build
  Note: The first run might take a while.
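The steps above can be chained into a single script. This is a sketch under stated assumptions rather than a script shipped with the project: the run helper and the DRY_RUN variable are hypothetical, docker info is used here as one way to check that the Docker daemon is reachable, and the NVM location (NVM_DIR, falling back to $HOME/.nvm) is assumed to be the default install path. By default the script only prints the commands, so you can review them first.

```shell
#!/bin/sh
# Dry run by default: print each step instead of executing it.
# Run with DRY_RUN=0 to actually execute the steps.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command, then execute it unless in dry-run
# mode. Stops the script at the first failing step.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@" || exit 1
  fi
}

# nvm is a shell function, so load it when really executing.
# The path below assumes the default nvm install location.
if [ "$DRY_RUN" = "0" ] && [ -s "${NVM_DIR:-$HOME/.nvm}/nvm.sh" ]; then
  . "${NVM_DIR:-$HOME/.nvm}/nvm.sh"
fi

run nvm install          # Node.js version from .nvmrc
run nvm use
run npm i -g yarn
run yarn install
run docker info          # fails if the Docker daemon is not reachable
run command -v rsync     # fails if rsync is not installed
run yarn projen
run yarn build
```

Run it from the repository root so that nvm can find .nvmrc. With DRY_RUN=0 the script exits at the first step that fails, which makes it easier to see which prerequisite or build stage needs attention.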
This project is licensed under the terms specified in the LICENSE file.
We welcome contributions! Please see our Contributing Guidelines for details on how to get started, development workflow, and coding standards.