@adamrefaey adamrefaey commented Apr 3, 2025

Change

This pull request updates the backend module: it adds a new document processing feature, updates dependencies, and enhances configuration and testing.

New Document Processing Feature:

Dependency Updates:

  • backend/package.json: Added new dependencies for AWS SDK clients (@aws-sdk/client-bedrock, @aws-sdk/client-bedrock-runtime, @aws-sdk/client-textract) and updated existing dependencies (@types/multer, @vitest/coverage-v8, vitest).

Configuration Enhancements:

Testing Improvements:

Middleware Adjustments:

Does this PR introduce a breaking change?

No

What needs to be documented once your changes are merged?

Nothing

Additional Comments

No

@adamrefaey adamrefaey self-assigned this Apr 3, 2025
@adamrefaey adamrefaey force-pushed the ADE-152 branch 3 times, most recently from e57b4e4 to ce15609 on April 4, 2025 19:06
@adamrefaey adamrefaey requested a review from Copilot April 8, 2025 20:32

Copilot AI left a comment

Copilot reviewed 20 out of 22 changed files in this pull request and generated no comments.

Files not reviewed (2)
  • backend/package.json: Language not supported
  • package.json: Language not supported
Comments suppressed due to low confidence (3)

backend/src/utils/security.utils.ts:93

  • The error message excludes 'application/pdf', which is defined in MAX_FILE_SIZES. Update the message to include PDF files or restrict allowed MIME types accordingly.
if (!ALLOWED_MIME_TYPES.has(mimeType)) { throw new BadRequestException('Only JPEG, PNG, and HEIC/HEIF images are allowed'); }
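
One way to keep the error message from drifting away from the allow-list is to derive both from a single table. The following is a minimal sketch, not the code in security.utils.ts: the shape of MAX_FILE_SIZES, the labels, the byte limits, and the helper name assertAllowedMimeType are all assumptions made for illustration, and BadRequestException is a local stand-in for the NestJS class so the block runs on its own.

```typescript
// Local stand-in for NestJS's BadRequestException so the sketch is self-contained.
class BadRequestException extends Error {}

// Hypothetical table: labels and byte limits are illustrative assumptions,
// not the values used in security.utils.ts.
const MAX_FILE_SIZES: Record<string, { label: string; maxBytes: number }> = {
  "image/jpeg": { label: "JPEG", maxBytes: 10 * 1024 * 1024 },
  "image/png": { label: "PNG", maxBytes: 10 * 1024 * 1024 },
  "image/heic": { label: "HEIC", maxBytes: 10 * 1024 * 1024 },
  "image/heif": { label: "HEIF", maxBytes: 10 * 1024 * 1024 },
  "application/pdf": { label: "PDF", maxBytes: 20 * 1024 * 1024 },
};

// The allow-list is derived from the size table, so the two cannot disagree.
const ALLOWED_MIME_TYPES = new Set(Object.keys(MAX_FILE_SIZES));

// Hypothetical helper: throws when the MIME type is not in the table.
function assertAllowedMimeType(mimeType: string): void {
  if (!ALLOWED_MIME_TYPES.has(mimeType)) {
    // The message lists every allowed type, including PDF, from the same table.
    const labels = Object.keys(MAX_FILE_SIZES)
      .map((type) => MAX_FILE_SIZES[type].label)
      .join(", ");
    throw new BadRequestException(`Unsupported file type. Allowed types: ${labels}`);
  }
}
```

With this layout, adding or removing a MIME type is a one-line change to the table, and the message follows automatically.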

backend/src/services/document-processor.service.spec.ts:222

  • Since processBatch is an async function that returns a promise, use 'await expect(testService.processBatch([], userId)).rejects.toThrow(BadRequestException)' to correctly test for promise rejections.
expect(() => { testService.processBatch([], userId); }).toThrow(BadRequestException);
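
The reason the synchronous form cannot pass is that a throw inside an async function never surfaces as a synchronous exception; it becomes a rejected promise. The sketch below shows this in plain TypeScript without a test runner. The simplified processBatch, its parameter types, and its error message are assumptions for illustration, and BadRequestException is a local stand-in for the NestJS class.

```typescript
// Local stand-in for NestJS's BadRequestException so the sketch is self-contained.
class BadRequestException extends Error {}

// Hypothetical simplified processBatch: the real signature and body are not shown in this thread.
async function processBatch(files: string[], userId: string): Promise<string[]> {
  if (files.length === 0) {
    throw new BadRequestException(`No files provided for user ${userId}`);
  }
  return files;
}

// Calling the async function inside try/catch does not reach the catch branch:
// the error is captured in the returned promise instead.
let threwSynchronously = false;
let pending: Promise<string[]> = Promise.resolve([]);
try {
  pending = processBatch([], "user-1");
} catch {
  threwSynchronously = true;
}

// The rejection is only observable by awaiting or chaining on the promise,
// which is what an `await expect(...).rejects` style assertion does.
const rejectionMessage: Promise<string> = pending.then(
  () => "resolved",
  (error: Error) => error.message,
);
```

Here threwSynchronously stays false, and rejectionMessage resolves to the error text, which mirrors why the assertion has to be made against the promise.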

backend/src/config/configuration.ts:26

  • The radix provided in parseInt is 20, which might cause unexpected results. It is recommended to use 10 as the radix for proper decimal conversion.
AWS_BEDROCK_REQUESTS_PER_MINUTE: parseInt(process.env.AWS_BEDROCK_REQUESTS_PER_MINUTE || '20', 20),
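
To make the effect concrete: with radix 20 the string '20' is read as 2 * 20 + 0, so the default would come out as 40 rather than 20. The sketch below shows that, plus one possible way to parse such a setting. The helper readPositiveInt and its fallback behaviour are assumptions for illustration, not the code in configuration.ts, and the environment is passed in as a plain record so the block does not depend on Node typings.

```typescript
// With radix 20, "20" is interpreted as 2 * 20 + 0.
const withRadix20 = parseInt("20", 20); // 40
const withRadix10 = parseInt("20", 10); // 20

// Hypothetical helper: read a decimal integer from an environment-style record,
// using the fallback when the value is missing or does not parse to a positive integer.
function readPositiveInt(
  env: Record<string, string | undefined>,
  key: string,
  fallback: number,
): number {
  const parsed = parseInt(env[key] ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Example usage with an in-memory record standing in for process.env.
const requestsPerMinute = readPositiveInt(
  { AWS_BEDROCK_REQUESTS_PER_MINUTE: "30" },
  "AWS_BEDROCK_REQUESTS_PER_MINUTE",
  20,
);
```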

@adamrefaey adamrefaey requested a review from Copilot April 9, 2025 14:24

Copilot AI left a comment

Copilot reviewed 18 out of 20 changed files in this pull request and generated no comments.

Files not reviewed (2)
  • backend/package.json: Language not supported
  • package.json: Language not supported

Copilot AI left a comment

Copilot reviewed 18 out of 19 changed files in this pull request and generated no comments.

Files not reviewed (1)
  • backend/package.json: Language not supported

Test User and others added 18 commits April 9, 2025 16:49
…erage-v8 and vitest, and adjust peer dependencies for compatibility with Node.js 18 and above.
…extraction, adding validation for medical reports and handling of missing information and low confidence scenarios in tests.
…nd consistency in test cases for medical information extraction
…e limiting, and enhance medical information extraction process with better error logging and validation checks.
…ce to use 'anthropic.claude-3-7-sonnet-20250219-v1:0'
…formation extraction, including validation for image types, improved error handling, and updated test cases for various image scenarios.
…upport HEIC/HEIF formats, update error messages, and add new test cases for JPEG and HEIC/HEIF images.
…d model handling and image processing capabilities

Add a controller for testing AwsBedrockService
… mock implementations, enhance image processing tests, and ensure better handling of medical information extraction scenarios.
…r improved readability and consistency, including adjustments to mock implementations and error handling in medical information extraction scenarios.
…orts

- Introduced AwsTextractService for handling interactions with AWS Textract API.
- Added TextractModule to encapsulate Textract service functionality.
- Implemented file validation and rate limiting for document processing.
- Created documentation for AWS Textract integration detailing implementation and error handling.
- Updated package.json and package-lock.json to include AWS Textract dependencies.
- Enhanced security utilities to support PDF file validation.
…uding associated DTOs, module, and tests. This cleanup eliminates unused components related to AWS Bedrock testing, streamlining the codebase.
…ns and improve code formatting. Consolidated controller array into a single line and adjusted middleware exclusion for better readability.
…on and image processing capabilities

- Refactored AwsBedrockService to remove unused dependencies and streamline the model invocation process.
- Updated the mock implementation in app.module.spec.ts to reflect changes in model response handling.
- Enhanced test coverage in aws-bedrock.service.spec.ts by removing outdated tests and improving mock setups for medical information extraction.
- Increased the maximum allowed file size for PDF uploads in security.utils.ts to accommodate larger documents.
…nality

- Increased the document requests per minute limit in backend/src/config/configuration.ts from 10 to 20.
- Imported RateLimiter from security.utils in backend/src/services/aws-textract.service.ts to enhance request management.
- Removed the RateLimiter class definition from aws-textract.service.ts as it is now imported from the utility module.
…e limiting

- Added requestsPerMinute configuration in backend/src/config/configuration.ts to manage API request limits.
- Refactored AwsBedrockService to include methods for initializing the Bedrock client, creating credentials, and configuring model ID and inference profile ARN.
- Implemented a rate limiter to control the number of requests sent to AWS Bedrock, ensuring compliance with usage limits.
- Improved error handling during Bedrock model invocation for better debugging and user feedback.
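
A requests-per-minute limit of this kind is often implemented as a sliding window over recent request timestamps. The sketch below is one such approach under stated assumptions: the class name SlidingWindowRateLimiter, the tryRequest method, and the injectable clock parameter are hypothetical and may not match the RateLimiter in security.utils.ts.

```typescript
// Hypothetical sliding-window limiter; the real RateLimiter may be structured differently.
class SlidingWindowRateLimiter {
  private readonly requests = new Map<string, number[]>();
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number, windowMs: number = 60_000) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true and records the request when the key is under its limit;
  // returns false without recording when the limit has been reached.
  tryRequest(key: string, now: number = Date.now()): boolean {
    const windowStart = now - this.windowMs;
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.requests.get(key) ?? []).filter((t) => t > windowStart);
    if (recent.length >= this.maxRequests) {
      this.requests.set(key, recent);
      return false;
    }
    recent.push(now);
    this.requests.set(key, recent);
    return true;
  }
}

// Example: at most 20 requests per key in any 60-second window.
const bedrockLimiter = new SlidingWindowRateLimiter(20);
const allowed = bedrockLimiter.tryRequest("user-1");
```

Passing the current time as a parameter keeps the limiter easy to unit test without fake timers; a caller would typically translate a false result into a rate-limit error.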
… ARN configuration

- Removed fallback values for model ID and inference profile ARN in backend/src/services/aws-bedrock.service.ts, ensuring that configuration values are directly retrieved from the config service.
- Updated logging to reflect the current configuration state without hardcoded defaults.
…response parsing

- Eliminated metadata properties such as documentType, pageCount, and isLabReport from the ExtractedTextResult interface in backend/src/services/aws-textract.service.ts.
- Updated the parseTextractResponse method to no longer require pageCount as a parameter and removed related logic for determining document type and lab report status.
- Adjusted unit tests in backend/src/services/aws-textract.service.spec.ts to reflect the removal of metadata checks, ensuring tests focus on essential response validation.
- Introduced a private method createTextractClient in backend/src/services/aws-textract.service.ts to streamline the initialization of the AWS Textract client.
- Removed redundant code from the constructor, enhancing readability and maintainability.
- Improved logging for client initialization without exposing sensitive credentials.
- Renamed configService to mockConfigService for clarity in backend/src/services/aws-textract.service.spec.ts.
- Simplified the setup of mock dependencies by directly creating the mockConfigService instance.
- Enhanced readability by removing unnecessary async/await in the beforeEach setup.
- Introduced MedicalDocumentAnalysis interface in backend/src/services/aws-bedrock.service.ts to define the structure of medical analysis results.
- Implemented analyzeMedicalDocument method to analyze medical documents and return structured data, including key medical terms, lab values, and diagnoses.
- Added comprehensive mock responses for various scenarios in backend/src/services/aws-bedrock.service.spec.ts to improve unit test coverage.
- Included validation for response structure and error handling for invalid or empty responses, ensuring robustness in medical document processing.
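
Validating the structure of a model response before returning it could look roughly like the sketch below. The field names (keyMedicalTerms, labValues, diagnoses), the LabValue shape, the function parseAnalysis, and the error messages are assumptions inferred from the commit description, not the actual MedicalDocumentAnalysis definition.

```typescript
// Hypothetical shapes: field names are assumptions based on the description above.
interface LabValue {
  name: string;
  value: string;
  unit: string;
}

interface MedicalDocumentAnalysis {
  keyMedicalTerms: string[];
  labValues: LabValue[];
  diagnoses: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

function isLabValue(value: unknown): value is LabValue {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.name === "string" &&
    typeof candidate.value === "string" &&
    typeof candidate.unit === "string"
  );
}

function isLabValueArray(value: unknown): value is LabValue[] {
  return Array.isArray(value) && value.every(isLabValue);
}

// Parses the raw model output and rejects empty, non-JSON, or mis-shaped responses.
function parseAnalysis(rawResponse: string): MedicalDocumentAnalysis {
  if (rawResponse.trim() === "") {
    throw new Error("Empty model response");
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawResponse);
  } catch {
    throw new Error("Model response is not valid JSON");
  }
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Model response is not a JSON object");
  }
  const { keyMedicalTerms, labValues, diagnoses } = parsed as Record<string, unknown>;
  if (!isStringArray(keyMedicalTerms) || !isLabValueArray(labValues) || !isStringArray(diagnoses)) {
    throw new Error("Model response has an unexpected structure");
  }
  return { keyMedicalTerms, labValues, diagnoses };
}
```

Rebuilding the result from the validated fields, rather than returning the parsed object directly, also drops any extra properties the model may have added.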
…ate limiting

- Updated AwsBedrockService to include user ID as a parameter in analyzeMedicalDocument and generateResponse methods for improved rate limiting.
- Refactored AwsTextractService to replace client IP with user ID in extractText and processBatch methods, ensuring consistent rate limiting across services.
- Enhanced unit tests in aws-bedrock.service.spec.ts and aws-textract.service.spec.ts to validate the new user ID-based rate limiting functionality, including handling of rate limit exceptions.
…king

- Added a cleanupOldEntries method in backend/src/utils/security.utils.ts to remove old entries from the requests map when it exceeds a defined threshold.
- Enhanced the RateLimiter class to maintain efficient tracking of user requests by cleaning up inactive user IDs, ensuring optimal memory usage and performance.
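
The cleanup step described above can be sketched in isolation as follows. The class name RequestTracker, the record method, the trackedUsers getter, and the threshold semantics are assumptions for illustration; only the method name cleanupOldEntries and the idea of pruning once the map exceeds a threshold come from the commit message.

```typescript
// Hypothetical tracker showing only the cleanup behaviour, not the full RateLimiter.
class RequestTracker {
  private readonly requests = new Map<string, number[]>();
  private readonly windowMs: number;
  private readonly cleanupThreshold: number;

  constructor(windowMs: number, cleanupThreshold: number) {
    this.windowMs = windowMs;
    this.cleanupThreshold = cleanupThreshold;
  }

  // Records a request and prunes the map once it tracks more users than the threshold.
  record(userId: string, now: number = Date.now()): void {
    const timestamps = this.requests.get(userId) ?? [];
    timestamps.push(now);
    this.requests.set(userId, timestamps);
    if (this.requests.size > this.cleanupThreshold) {
      this.cleanupOldEntries(now);
    }
  }

  // Removes users with no requests inside the current window and trims the rest.
  private cleanupOldEntries(now: number): void {
    const windowStart = now - this.windowMs;
    this.requests.forEach((timestamps, userId) => {
      const recent = timestamps.filter((t) => t > windowStart);
      if (recent.length === 0) {
        this.requests.delete(userId);
      } else {
        this.requests.set(userId, recent);
      }
    });
  }

  get trackedUsers(): number {
    return this.requests.size;
  }
}
```

Running the cleanup only when the map grows past a threshold keeps the common path cheap while still bounding memory for inactive user IDs.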
…document processing

- Introduced DocumentProcessorModule in backend/src/modules/document-processor.module.ts to encapsulate the document processing logic.
- Implemented DocumentProcessorService in backend/src/services/document-processor.service.ts, integrating AWS Textract for text extraction and AWS Bedrock for medical analysis.
- Added unit tests for DocumentProcessorService in backend/src/services/document-processor.service.spec.ts to ensure functionality and error handling.
- Updated app.module.ts to include DocumentProcessorModule, enhancing the application's capability to process medical documents efficiently.
- Consolidated PDF and image processing into a single method, processDocument, in backend/src/services/aws-textract.service.ts for improved maintainability.
- Updated logging to differentiate between PDF and image processing within the new method.
- Removed redundant code related to separate processing methods for images and PDFs, enhancing code clarity.
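
The two-stage flow described above (text extraction, then medical analysis) can be sketched with the external services hidden behind small interfaces. Everything named here is hypothetical: the interfaces TextExtractor and MedicalAnalyzer, the class DocumentPipeline, the result shape, and the method signatures are illustrative assumptions rather than the actual DocumentProcessorService API, and no AWS SDK calls are made.

```typescript
// Hypothetical minimal interfaces standing in for the Textract and Bedrock services.
interface TextExtractor {
  extractText(file: Uint8Array, mimeType: string, userId: string): Promise<string>;
}

interface MedicalAnalyzer {
  analyzeMedicalDocument(text: string, userId: string): Promise<{ diagnoses: string[] }>;
}

interface ProcessedDocument {
  extractedText: string;
  analysis: { diagnoses: string[] };
}

// Hypothetical orchestrator: runs extraction first, then analysis on the extracted text.
class DocumentPipeline {
  private readonly extractor: TextExtractor;
  private readonly analyzer: MedicalAnalyzer;

  constructor(extractor: TextExtractor, analyzer: MedicalAnalyzer) {
    this.extractor = extractor;
    this.analyzer = analyzer;
  }

  async processDocument(
    file: Uint8Array,
    mimeType: string,
    userId: string,
  ): Promise<ProcessedDocument> {
    if (file.length === 0) {
      throw new Error("File is empty");
    }
    // The same user ID is passed to both stages so each can apply its own rate limit.
    const extractedText = await this.extractor.extractText(file, mimeType, userId);
    const analysis = await this.analyzer.analyzeMedicalDocument(extractedText, userId);
    return { extractedText, analysis };
  }
}
```

Depending on interfaces rather than concrete services makes the orchestration testable with simple stubs, which matches the mock-heavy unit tests mentioned in the commits.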
…t processing

- Introduced DocumentProcessorController in backend/src/controllers/document-processor.controller.ts to handle document upload and processing.
- Implemented endpoints for uploading documents and retrieving a test form, enhancing the document processing functionality.
- Updated backend/README.md to include detailed information about the new endpoints and usage instructions for the medical document processor.
…tractModule

- Removed TextractModule from backend/src/app.module.ts as it is no longer needed.
- Updated providers in app.module.ts to exclude AwsBedrockService.
- Enhanced document-processor.module.ts to export AwsTextractService and AwsBedrockService, ensuring proper service availability for document processing.
- Removed the AWS Textract integration documentation from backend/docs/aws-textract-integration.md as it is no longer relevant.
- Updated import paths in backend/src/app.module.ts and backend/src/app.module.spec.ts to reflect the new directory structure for document processing services.
- Introduced backend/src/document-processor/document-processor.module.ts to encapsulate document processing logic, including AWS Textract and Bedrock services.
- Added backend/src/document-processor/controllers/document-processor.controller.ts to handle document uploads and processing requests.
- Implemented backend/src/document-processor/services/aws-textract.service.ts and backend/src/document-processor/services/aws-bedrock.service.ts for text extraction and medical analysis, respectively.
- Enhanced unit tests for the new services and controller to ensure functionality and error handling.
…lified explanations

- Added PerplexityService to backend/src/document-processor/document-processor.module.ts for generating simplified explanations of medical documents.
- Updated DocumentProcessorService in backend/src/document-processor/services/document-processor.service.ts to include logic for generating simplified explanations during document processing.
- Modified DocumentProcessorController in backend/src/document-processor/controllers/document-processor.controller.ts to return simplified explanations alongside analysis results.
- Enhanced unit tests in backend/src/document-processor/services/document-processor.service.spec.ts to validate the integration of PerplexityService and the new simplified explanation feature.
…; update README.md to streamline document processing instructions.
@adamrefaey adamrefaey changed the title from "[ADE-152] Add AWS Bedrock service with medical information extraction functionality" to "[ADE-152] Add AWS Textract, AWS Bedrock services with medical information extraction functionality" Apr 9, 2025
@adamrefaey adamrefaey changed the title from "[ADE-152] Add AWS Textract, AWS Bedrock services with medical information extraction functionality" to "[ADE-152] Add AWS Textract, AWS Bedrock services with medical information extraction & analysis functionalities" Apr 9, 2025
@adamrefaey adamrefaey merged commit 5c5cbbf into main Apr 9, 2025
1 check passed
