Skip to content

neomartinez30/bedrock-agent-txt2sql

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

321 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Amazon Bedrock Agent for Text-to-SQLPublic Using Amazon Athena with Streamlit + FDA Drug Lookup via Public API + Knowledgebase

Overview

This repository provides an end-to-end solution for building an Amazon Bedrock Agent. The agent is an assistant for virtual nurses that can:

  • Translate natural language queries into SQL to retrieve patient and procedure data from Amazon Athena (using S3 as the data source).
  • Simultaneously look up drug information from the public FDA drug database via a Lambda function.
  • Simultaneously retrieves telehealth triage instructions from Schmitt Thmoson Protocols stored in a knowledgebase
  • Orchestrate all three actions to provide comprehensive answers to health profesionals queries.

The solution is fully serverless, leveraging AWS CloudFormation for infrastructure, AWS Lambda for compute, Amazon Athena for SQL queries, and Amazon Bedrock for orchestration and LLM-powered reasoning.


Solution Architecture

  1. Amazon S3: Stores CSV data for Athena queries (see S3data/health/health.csv).
  2. AWS Glue: Catalogs S3 data for Athena (see athena-queries.txt for table DDL).
  3. Amazon Athena: Executes SQL queries generated by the agent.
  4. AWS Lambda:
    • AthenaQueryLambda: Executes SQL queries on Athena and returns results.
    • FDADrugLookupLambda: Calls the public FDA drug API and returns drug information.
  5. Amazon Bedrock Agent:
    • Orchestrates Athena, Knowledgebase and FDA lookups using two action groups.
    • Uses OpenAPI schemas to define action group interfaces.
    • Employs a custom orchestration prompt for reasoning and tool selection.

Prerequisites

  • An AWS account with permissions to deploy CloudFormation stacks, create Lambda functions, IAM roles, S3 buckets, Athena databases, and Bedrock agents.
  • Access to Amazon Bedrock models (e.g., Anthropic Claude 3 Haiku).
  • AWS CLI and AWS SAM CLI (optional, for local development).

Quick Start

You can use these templates and modify them for your usecase by following every step below in order. Even though these steps show exactly what to modify, you should still review each file in the repo thoroughly to make sure every component is named and configured to your liking. If you want to deply this solution as is, simply follow the steps below and skip any "optional" steps.

1. Clone the Repository

git clone https://github.com/neomartinez30/bedrock-agent-txt2sql.git
cd bedrock-agent-txt2sql

2. Prepare Your Data

  • (Optional) Review and update the Athena table DDL in athena-queries.txt if your schema differs.
  • Update the s3 location in your athena-queries.txt file. The S3 bucket should be the same as the bucket in Location of query result in Athena. go to Athena > Query Editor > Settings > Location of query result.
  • (Optional) Update the Unique identifier in the athena-schema.json file. I am using the patients "name" as the identifier but anything can be used such as an ID number, etc. This field is right underneath the "properties" field in the json request.
  • Replace the CSV file in the S3data folder with your data. Note: make sure to also update the url path to this csv in the cfn/1-athena-glue-s3-template.yaml file. Go to the file, press crtl-F and search for "urls".
  • Update the s3 file path in the funtion/lambda_function.py file with the same S3 path used above.

3. Deploy Infrastructure

  • Edit cfn/1-athena-glue-s3-template.yaml with the following:
    • (Optional) Update the column names in the GlueTableHealth to match your csv columns.
    • (Optional) Update the FDADrugLookupLambda to any api function call you wish to use. Note: if you do update this lambda, make sure to also update its corresponding agent bedrock OpenAPI code.
  • Deploy the Athena/Glue/S3 stack:
    aws cloudformation deploy --template-file cfn/1-athena-glue-s3-template.yaml --stack-name athena-glue-s3-stack --capabilities CAPABILITY_NAMED_IAM
  • Edit cfn/2-bedrock-agent-lambda-template.yaml:
    • When you updload this template in CloudFormation, set the Alias parameter to a unique value for your environment.
    • (Optional) Review every prompt and make sure its appropriate for your usecase. ** make sure the schema in the orchestrator prompt is apporicate for your csv data.
    • (Optional) Press Ctrl-F, search for 'ApiSchema', and then update the Unique identifier. I am using the patients "name" as the identifier, this should match the identifier set for in the athena-schema.json file.
  • Deploy the Bedrock Agent and Lambda stack:
    aws cloudformation deploy --template-file cfn/2-bedrock-agent-lambda-template.yaml --stack-name bedrock-agent-stack --capabilities CAPABILITY_NAMED_IAM
  • Once the second template is deployed. Go to Lake Formation > Data Permissions > Grant > find your Lambda execution role under IAM users and roles > select Named Data Catalog Resources > select your catalog > select your database > select your table > add reading access at minimum.

4. Update Bedrock Agent Configuration

  • In the AWS Console, navigate to Amazon Bedrock > Agents.
  • Confirm that your agent has two action groups:
    • query-athena (for Athena SQL queries)
    • lookup-fda-drug (for FDA drug lookups)
  • Ensure the orchestration prompt and instructions mention both tools.

5. Test the Solution

  • Use the Bedrock Agent test UI or your own client to send natural language queries, e.g.:
    • "How many patients are taking Lipitor and what is Lipitor used for?"
    • "Show me all patients with cancer and tell me about Paracetamol."
  • The agent should:
    • Generate and execute the correct SQL query on Athena.
    • Call the FDA API for drug information.
    • Return a combined, well-structured response.

Components to Update for Your Use

  • S3 Data: Replace S3data/health/health.csv with your own data.
  • Athena Table DDL: Update athena-queries.txt and Athena/Glue stack to match your schema.
  • Lambda Code: Edit function/lambda_function.py (Athena) and the inline code in 2-bedrock-agent-lambda-template.yaml (FDA lookup) if you need custom logic.
  • CloudFormation Parameters: Set the Alias and any other parameters in the templates to unique values for your environment.
  • OpenAPI Schemas: If your data model changes, update the OpenAPI schemas in the CloudFormation template and agent configuration.
  • Agent Instructions/Prompt: Adjust the orchestration prompt in the CloudFormation template to reflect your use case and tools.

Files and Directories

  • S3data/health/health.csv: Example patient data for Athena.
  • athena-queries.txt: Athena DDL and example queries.
  • cfn/1-athena-glue-s3-template.yaml: CloudFormation for S3, Glue, Athena setup.
  • cfn/2-bedrock-agent-lambda-template.yaml: CloudFormation for Lambda functions and Bedrock Agent.
  • function/lambda_function.py: Lambda code for Athena queries.
  • schema/athena-schema.json: OpenAPI schema for Athena action group.
  • README.md: Main project documentation.

Security & Cost

  • Review IAM roles and policies in the CloudFormation templates.
  • Set up AWS Budgets to monitor costs, especially for Bedrock model usage.

License

See LICENSE for details.


Support

For issues or questions, please open an issue in this repository.

About

Use natural language to Generate Amazon Athena SQL queries to fetch data.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

  • Python 100.0%