Description
Is your feature request related to a problem? Please describe.
Fine-tuning and customizing blueprints requires a lot of effort. Few-shot prompting could improve results by reusing historical documents as in-context examples.
Describe the solution you'd like
Add a few-shot prompting approach that uses a set of historical document/JSON ground-truth pairs to learn on the fly how to perform the extraction.
Main Tasks:
- Implement either a knowledge base or another similarity function to identify the most similar documents in a corpus. Options:
  - a) S3 vector store with normal embeddings or multi-modal embeddings (or multi-vector embeddings like ColQwen)
  - b) Bedrock Knowledge Base
  - c) FAISS in-memory database
  - d) Simple TF-IDF metrics using the Textract or BDA standard output results
- Implement new Pattern 4: Few-shot context engineering with LLM
  - When extracting information, search for other relevant documents and take the top 3 for few-shot prompting.
  - Run image-to-JSON using the structured output of the LLM, e.g. a Nova model with constrained decoding.
  - Add the top 3 docs to the context of the inference.
  - We can also allow a hook in the Lambda to provide a custom way of identifying relevant documents.
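
To make the tasks above more concrete, here is a minimal sketch of option d) combined with the retrieval hook. It is an illustration under assumptions, not the proposed implementation: the names `tfidf_retriever`, `build_few_shot_prompt`, and the `{"text", "ground_truth"}` corpus layout are hypothetical, plain OCR text stands in for the Textract or BDA output, and the actual model call (image input, structured output) is left out.

```python
import json
import math
import re
from collections import Counter
from typing import Any, Callable, Dict, List

# Hypothetical corpus entry: {"text": <OCR text>, "ground_truth": <expected JSON as dict>}
Example = Dict[str, Any]
# Hook signature (assumption): (query_text, corpus, k) -> k most relevant examples
Retriever = Callable[[str, List[Example], int], List[Example]]


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_retriever(query_text: str, corpus: List[Example], k: int = 3) -> List[Example]:
    """Rank corpus entries by TF-IDF cosine similarity to the query text."""
    docs = [Counter(tokenize(ex["text"])) for ex in corpus]
    n = len(docs)
    # Document frequency: iterating a Counter yields each distinct term once.
    df = Counter(term for doc in docs for term in doc)

    def vectorize(counts: Counter) -> Dict[str, float]:
        # Smoothed IDF so that terms unseen in the corpus do not divide by zero.
        return {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in counts.items()}

    def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query_vec = vectorize(Counter(tokenize(query_text)))
    scores = [cosine(query_vec, vectorize(doc)) for doc in docs]
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [corpus[i] for i in ranked[:k]]


def build_few_shot_prompt(
    query_text: str,
    corpus: List[Example],
    retriever: Retriever = tfidf_retriever,  # swap in a custom hook here
    k: int = 3,
) -> str:
    """Assemble a prompt that shows the top-k document/JSON pairs before the new document."""
    parts = ["Extract the fields from the final document as JSON, following the examples."]
    for i, ex in enumerate(retriever(query_text, corpus, k), start=1):
        parts.append(f"Example {i} document:\n{ex['text']}")
        parts.append(f"Example {i} JSON:\n{json.dumps(ex['ground_truth'], sort_keys=True)}")
    parts.append(f"Document:\n{query_text}\nJSON:")
    return "\n\n".join(parts)
```

Passing the retriever as a parameter is one way the Lambda hook could work: options a) to c) would then be alternative functions with the same signature, and the prompt assembly would stay unchanged.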
 
Describe alternatives you've considered
A fine-tuning approach such as Pattern 3, or the one in https://github.com/aws-samples/sample-for-multi-modal-document-to-json-with-sagemaker-ai