Commit 1287da7

Update to version v1.3.0
### Added

- Support for SageMaker as an LLM provider through SageMaker inference endpoints.
- Ability to deploy both the deployment dashboard and use cases within a VPC, including bringing an existing VPC and allowing the solution to deploy one.
- Option to return and display the source documents that were referenced when generating a response in RAG use cases.
- New model-info API in the deployment dashboard stack which can retrieve available providers, models, and model info. Default parameters are now stored for each model and provider combination and are used to pre-populate values in the wizard.

### Changed

- Refactoring of UI components in the deployment dashboard.
- Switch to poetry for Python package management, replacing requirements.txt files.
- Updates to Node and Python package versions.
2 parents b1e225c + dea21b9 commit 1287da7

File tree

586 files changed (+79243 −40648 lines)


CHANGELOG.md

Lines changed: 15 additions & 0 deletions
@@ -5,6 +5,21 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.3.0] - 2024-02-22
+
+### Added
+
+- Support for SageMaker as an LLM provider through SageMaker inference endpoints.
+- Ability to deploy both the deployment dashboard and use cases within a VPC, including bringing an existing VPC and allowing the solution to deploy one.
+- Option to return and display the source documents that were referenced when generating a response in RAG use cases.
+- New model-info API in the deployment dashboard stack which can retrieve available providers, models, and model info. Default parameters are now stored for each model and provider combination and are used to pre-populate values in the wizard.
+
+### Changed
+
+- Refactoring of UI components in the deployment dashboard.
+- Switch to poetry for Python package management, replacing requirements.txt files.
+- Updates to Node and Python package versions.
+
 ## [1.2.3] - 2024-02-06
 
 ### Fixed

NOTICE.txt

Lines changed: 3 additions & 0 deletions
@@ -44,6 +44,8 @@ This software includes third party software subject to the following copyrights:
 @smithy/types Apache-2.0
 @tabler/icons-react MIT
 @tailwindcss/typography MIT
+@tanstack/react-query MIT
+@tanstack/react-query-devtools MIT
 @testing-library/jest-dom MIT
 @testing-library/react MIT
 @testing-library/user-event MIT
@@ -64,6 +66,7 @@ MarkupSafe BSD-3-Clause
 PyYAML MIT
 SQLAlchemy MIT
 Werkzeug BSD-3-Clause
+ace-builds BSD-3-Clause
 aiohttp Apache-2.0
 aiosignal Apache-2.0
 annotated-types MIT

README.md

Lines changed: 84 additions & 47 deletions
Large diffs are not rendered by default.

deployment/build-s3-dist.sh

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ set -e
 # Check to see if input has been provided:
 if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ] || [ -z "$4" ]; then
     echo "Please provide all required parameters for the build script"
-    echo "For example: ./build-s3-dist.sh solutions trademarked-solution-name v1.2.2 template-bucket-name"
+    echo "For example: ./build-s3-dist.sh solutions trademarked-solution-name v1.3.0 template-bucket-name"
     exit 1
 fi

3 binary image files changed (117 KB, 118 KB, 166 KB) — not rendered.
Lines changed: 25 additions & 0 deletions
# Changing input restrictions of Prompt Template and Text use case chat messages

To change or increase the maximum length of the prompt template, browse to the model-info folder at [source/model-info](../source/model-info) and look for the model configuration file you require. The files are organized by use case (e.g. chat, ragchat), model provider (e.g. bedrock), and model ID (e.g. ai21-j2-mid, llama2-13b-chat-v1).

- For example, if you want to update the configuration for non-RAG Text use cases using the Bedrock Anthropic claude-2 model, navigate to [chat-bedrock-anthropic-claude-v2.json](../source/model-info/chat-bedrock-anthropic-claude-v2.json).

After you've identified the models you want to modify, update the appropriate keys, such as `MaxPromptSize` and `MaxChatMessageSize`, to the new values you wish to use.

- `MaxPromptSize` controls the maximum number of characters Admin and Business users may enter when defining their prompt template.
- `MaxChatMessageSize` controls the maximum number of characters Business users may enter when sending messages through the chat interface.

Once the values have been changed for all models of interest, build and deploy the solution. Instructions can be found in the [README.md](../README.md#deployment).

## Manual updates for experimentation only

> __*Note: use at your own discretion. Manually modifying values maintained by the solution causes drift and may be overwritten by solution upgrades or other forms of stack updates. Use with caution and only with non-production deployments.*__

If you just want to quickly test some changes on an existing deployment and don't want to rebuild the entire solution, you can modify some values directly in DynamoDB to achieve a similar effect.

The values in the model-info files are stored in a DynamoDB table at deployment time; that table is then read by the Deployment dashboard and associated use cases at runtime. To modify these values:

1. Go to the DynamoDB service page in your AWS console and search for a table with a name containing `ModelInfoStorage`.
2. Once you've found the right table for your Deployment dashboard, open the table and explore its items.
3. Find the row for your use case, model provider, and model combination (e.g. UseCase = `Chat`, SortKey = `Bedrock#anthropic.claude-v2`).
4. Edit the default values of interest (e.g. `MaxChatMessageSize` or `MaxPromptSize`).

![Example Edit on DynamoDB](./modify-prompt-input-restrictions-example-ddb.png)
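For reference, the item you are editing follows the key schema from the steps above. The sketch below is hypothetical — the numeric values are illustrative defaults, not the solution's actual stored values:

```json
{
  "UseCase": "Chat",
  "SortKey": "Bedrock#anthropic.claude-v2",
  "MaxPromptSize": 2000,
  "MaxChatMessageSize": 2500
}
```

Editing `MaxPromptSize` or `MaxChatMessageSize` on this item takes effect at runtime without a redeployment, which is what makes this path convenient for quick experiments.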
Lines changed: 48 additions & 0 deletions
# AlexaTM 20B

Sample values for the model are:

<table>

<tr>
<td> Model Id </td> <td> Model Input Schema </td> <td> Model Output JSONPath </td>
</tr>

<tr>
<td> pytorch-textgeneration1-alexa20b </td>
<td>

```json
{
  "text_inputs": "<<prompt>>",
  "num_beams": "<<num_beams>>",
  "no_repeat_ngram_size": "<<no_repeat_ngram_size>>"
}
```

</td>
<td>

```json
$.generated_texts[0]
```

</td>
</tr>

</table>

## Model Payload

The input schemas provided here are inferred from model payloads, with placeholders standing in for the actual values supplied at run time. For example, a sample model payload for the input schema above is:

```json
{
  "text_inputs": "[CLM] My name is Lewis and I like to",
  "num_beams": 5,
  "no_repeat_ngram_size": 2
}
```

Please refer to the model documentation and the SageMaker JumpStart Jupyter notebook to see the most up-to-date supported parameters.
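To illustrate how a schema of this shape becomes the payload shown above, here is a minimal, hypothetical sketch. The `render_payload` helper is not part of the solution's code; it only demonstrates the `<<placeholder>>` substitution convention used by these schema files:

```python
import json
import re


def render_payload(schema: dict, values: dict) -> dict:
    """Replace quoted <<name>> placeholders in a model input schema.

    Illustrative helper only: each string node that is exactly a
    placeholder is swapped for the (typed) runtime value, so numeric
    parameters end up as numbers rather than strings.
    """
    def fill(node):
        if isinstance(node, dict):
            return {k: fill(v) for k, v in node.items()}
        if isinstance(node, str):
            match = re.fullmatch(r"<<(\w+)>>", node)
            if match:
                return values[match.group(1)]
        return node

    return fill(schema)


# The input schema from the table above, as stored in the model-info file.
schema = {
    "text_inputs": "<<prompt>>",
    "num_beams": "<<num_beams>>",
    "no_repeat_ngram_size": "<<no_repeat_ngram_size>>",
}

# Filling it in reproduces the sample payload from the "Model Payload" section.
payload = render_payload(schema, {
    "prompt": "[CLM] My name is Lewis and I like to",
    "num_beams": 5,
    "no_repeat_ngram_size": 2,
})
print(json.dumps(payload))
```

The serialized payload is what would be sent as the request body to the SageMaker inference endpoint.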
Lines changed: 52 additions & 0 deletions
# Mistral 7B Instruct

<table>

<tr>
<td> Model Id </td> <td> Model Input Schema </td> <td> Model Output JSONPath </td>
</tr>

<tr>
<td> huggingface-llm-mistral-7b-instruct </td>
<td>

```json
{
  "inputs": "<<prompt>>",
  "parameters": {
    "temperature": "<<temperature>>",
    "max_new_tokens": "<<max_new_tokens>>",
    "do_sample": "<<do_sample>>"
  }
}
```

</td>
<td>

```json
$[0].generated_text
```

</td>
</tr>

</table>

## Model Payload

The input schemas provided here are inferred from model payloads, with placeholders standing in for the actual values supplied at run time. For example, a sample model payload for the input schema above is:

```json
{
  "inputs": "Write the code to compute factorial of a number in Python.",
  "parameters": {
    "temperature": 0.4,
    "max_new_tokens": 200,
    "do_sample": true
  }
}
```

Please refer to the model documentation and the SageMaker JumpStart Jupyter notebook to see the most up-to-date supported parameters.
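On the output side, the JSONPath `$[0].generated_text` means: take the first element of the returned JSON array and read its `generated_text` field. A minimal sketch, assuming a hypothetical response body of the documented shape (the factorial snippet inside it is made up for illustration):

```python
import json

# Hypothetical response body from the endpoint, shaped like the model's
# documented output: a JSON array of objects with a "generated_text" field.
response_body = (
    '[{"generated_text": "def factorial(n):\\n'
    '    return 1 if n <= 1 else n * factorial(n - 1)"}]'
)

parsed = json.loads(response_body)

# Equivalent to applying the JSONPath $[0].generated_text:
completion = parsed[0]["generated_text"]
print(completion.splitlines()[0])  # prints "def factorial(n):"
```

The same extraction applies to any model whose output JSONPath starts with `$[0]`; models that return a JSON object instead of an array use a `$.`-rooted path, as in the AlexaTM example.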
