add ai concepts section #8087
Merged
Changes from 6 commits

Commits (7):
- 48d11c4 add ai concepts section (dbanksdesign)
- f49a258 Update src/pages/[platform]/ai/concepts/models/index.mdx (dbanksdesign)
- e79f05a Update src/pages/[platform]/ai/concepts/architecture/index.mdx (dbanksdesign)
- 3adc3c9 Update src/pages/[platform]/ai/concepts/architecture/index.mdx (dbanksdesign)
- ca8086e Update src/pages/[platform]/ai/concepts/prompting/index.mdx (dbanksdesign)
- 47fccde adding links to default inference configuration (dbanksdesign)
- 44c542c Update src/pages/[platform]/ai/concepts/architecture/index.mdx (dbanksdesign)
src/pages/[platform]/ai/concepts/architecture/index.mdx
@@ -0,0 +1,52 @@
import { getCustomStaticPath } from "@/utils/getCustomStaticPath";

export const meta = {
  title: "Architecture",
  description:
    "Amplify AI Kit fullstack architecture",
  platforms: [
    "javascript",
    "react-native",
    "angular",
    "nextjs",
    "react",
    "vue",
  ],
};

export const getStaticPaths = async () => {
  return getCustomStaticPath(meta.platforms);
};

export function getStaticProps(context) {
  return {
    props: {
      platform: context.params.platform,
      meta,
    },
  };
}

The Amplify AI kit is built around the idea of routes. An AI route is like an API endpoint for interacting with backend AI functionality. AI routes are configured in an Amplify backend, where you can define the authorization rules, the type of route (generation or conversation), the AI model and its inference configuration (like temperature), the inputs and outputs, and the data the route has access to. There are currently two types of AI routes, both shown in the sketch after this list:

* **Conversation:** An asynchronous, multi-turn API. Conversations and messages are automatically stored in DynamoDB. Examples include chat-based AI experiences and conversational UIs.
* **Generation:** A single synchronous request-response API. A generation route is an AppSync query that generates structured data according to your route definition. Common uses include generating structured data from unstructured input and summarization.

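To make the two route types concrete, here is a minimal schema sketch. The `a.generation()` call mirrors the example later in this PR; `a.conversation()` is shown as the Amplify AI kit documents it elsewhere, and the route names, prompts, arguments, and authorization rules are illustrative assumptions, not part of this diff.

```ts
import { a } from "@aws-amplify/backend";

const schema = a.schema({
  // Conversation route: asynchronous and multi-turn; messages are
  // stored in DynamoDB so the chat can resume across sessions.
  chat: a.conversation({
    aiModel: a.ai.model("Claude 3 Haiku"),
    systemPrompt: "You are a helpful assistant",
  }).authorization((allow) => allow.owner()),

  // Generation route: a single synchronous AppSync query that
  // returns structured data matching the declared return type.
  summarizer: a.generation({
    aiModel: a.ai.model("Claude 3 Haiku"),
    systemPrompt: "Summarize the provided text",
  })
    .arguments({ input: a.string() }) // illustrative input shape
    .returns(a.customType({ summary: a.string() }))
    .authorization((allow) => allow.authenticated()),
});
```
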
## Cloud infrastructure

When you create an AI route with the Amplify AI kit, it uses the following services:

### AWS AppSync
Serverless API layer that authorizes and routes requests from the browser to AWS services.

### Amazon DynamoDB
Serverless database for storing conversation history.

### AWS Lambda
Serverless execution environment for conversations.

### Amazon Bedrock
Serverless foundation models.
src/pages/[platform]/ai/concepts/index.mdx
@@ -0,0 +1,33 @@
import { getChildPageNodes } from '@/utils/getChildPageNodes';
import { getCustomStaticPath } from "@/utils/getCustomStaticPath";

export const meta = {
  title: "Concepts",
  description:
    "Learn about what Amplify AI provisions and get an overview about generative AI concepts and terminology.",
  route: '/[platform]/ai-kit/overview',
  platforms: [
    "javascript",
    "react-native",
    "angular",
    "nextjs",
    "react",
    "vue",
  ],
};

export const getStaticPaths = async () => {
  return getCustomStaticPath(meta.platforms);
};

export function getStaticProps(context) {
  const childPageNodes = getChildPageNodes(meta.route);
  return {
    props: {
      meta,
      childPageNodes
    }
  };
}

<Overview childPageNodes={props.childPageNodes} />
src/pages/[platform]/ai/concepts/inference-configuration/index.mdx (95 additions)
@@ -0,0 +1,95 @@
import { getCustomStaticPath } from "@/utils/getCustomStaticPath";

export const meta = {
  title: "Inference Configuration",
  description:
    "Learn about inference configuration",
  platforms: [
    "javascript",
    "react-native",
    "angular",
    "nextjs",
    "react",
    "vue",
  ],
};

export const getStaticPaths = async () => {
  return getCustomStaticPath(meta.platforms);
};

export function getStaticProps(context) {
  return {
    props: {
      platform: context.params.platform,
      meta,
    },
  };
}

LLMs have parameters that can be configured to change how the model behaves. These are called inference configuration or inference parameters. LLMs work by *predicting* text based on the text input, and that prediction is probabilistic: it can be tweaked by adjusting the inference configuration to allow for more creative or more deterministic outputs. The right configuration depends on your use case.

[Bedrock documentation on inference configuration](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-parameters.html)

<Accordion title='What is inference?'>

Inference is the process of using a model to generate or predict output based on input data. It is what happens when you use a model after it has been trained on a dataset.

</Accordion>

## Setting inference configuration

All generative AI routes in Amplify accept inference configuration as optional parameters. If you do not provide any inference configuration options, Bedrock will use [default ones for that particular model](#default-values).

```ts
a.generation({
  aiModel: a.ai.model("Claude 3 Haiku"),
  systemPrompt: `You are a helpful assistant`,
  inferenceConfiguration: {
    temperature: 0.2,
    topP: 0.2,
    maxTokens: 1000,
  }
})
```

## Definitions

### Temperature

Temperature affects the shape of the probability distribution for the predicted output and influences how likely the model is to select lower-probability outputs. Temperature is usually* a number from 0 to 1, where a lower value steers the model toward higher-probability options. Another way to think about temperature is creativity: a low number (close to zero) produces the least creative and most deterministic response.

\* AI21 Labs Jamba models use a temperature range of 0–2.0

### Top P

Top p refers to the percentage of token candidates the model can choose from for the next token in the response. A lower value shrinks the pool, limiting options to more likely outputs; a higher value grows the pool, allowing lower-probability tokens. (See the sketch below.)

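Temperature and top p are easiest to see in a toy next-token sampler. The sketch below is not how Bedrock is implemented; it is just the standard math: logits are divided by the temperature before the softmax, then sampling is restricted to the smallest set of tokens whose cumulative probability reaches top p.

```ts
// Toy next-token sampler illustrating temperature and top-p (nucleus) sampling.
function sampleToken(
  logits: Map<string, number>, // raw model scores per candidate token
  temperature: number,         // < 1 sharpens, > 1 flattens the distribution
  topP: number                 // keep smallest set with cumulative prob >= topP
): string {
  // Temperature scaling: divide logits by T, then softmax.
  const scaled = [...logits.entries()].map(
    ([tok, l]) => [tok, Math.exp(l / temperature)] as const
  );
  const z = scaled.reduce((s, [, e]) => s + e, 0);
  const probs = scaled
    .map(([tok, e]) => [tok, e / z] as const)
    .sort((a, b) => b[1] - a[1]); // highest probability first

  // Nucleus (top-p) filtering: keep tokens until cumulative prob >= topP.
  const pool: Array<readonly [string, number]> = [];
  let cum = 0;
  for (const [tok, p] of probs) {
    pool.push([tok, p]);
    cum += p;
    if (cum >= topP) break;
  }

  // Renormalize over the pool and draw a weighted sample.
  const poolZ = pool.reduce((s, [, p]) => s + p, 0);
  let r = Math.random() * poolZ;
  for (const [tok, p] of pool) {
    r -= p;
    if (r <= 0) return tok;
  }
  return pool[pool.length - 1][0];
}
```

With a low temperature and low top p, the pool collapses to the single most likely token (deterministic); raising either widens the pool and admits less likely, more "creative" continuations.
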
### Max Tokens

This parameter caps the length of a model's response by limiting the number of tokens it can generate.

## Default values

| Model | Temperature | Top P | Max Tokens |
| ----- | ----------- | ----- | ---------- |
| [AI21 Labs Jamba](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-jamba.html#model-parameters-jamba-request-response) | 1.0* | 0.5 | 4096 |
| [Meta Llama](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-meta.html#model-parameters-meta-request-response) | 0.5 | 0.9 | 512 |
| [Amazon Titan](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-titan-text.html) | 0.7 | 0.9 | 512 |
| [Anthropic Claude](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-anthropic-claude-messages.html#model-parameters-anthropic-claude-messages-request-response) | 1 | 0.999 | 512 |
| [Cohere Command R](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-cohere-command-r-plus.html#model-parameters-cohere-command-request-response) | 0.3 | 0.75 | 512 |
| [Mistral Large](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-mistral-chat-completion.html#model-parameters-mistral-chat-completion-request-response) | 0.7 | 1 | 8192 |

[Bedrock documentation on model default inference configuration](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html)

\* AI21 Labs Jamba models use a temperature range of 0–2.0
src/pages/[platform]/ai/concepts/models/index.mdx
@@ -0,0 +1,135 @@
import { getCustomStaticPath } from "@/utils/getCustomStaticPath";

export const meta = {
  title: "Models",
  description:
    "Learn about foundation models provided by Amazon Bedrock used for generative AI",
  platforms: [
    "javascript",
    "react-native",
    "angular",
    "nextjs",
    "react",
    "vue",
  ],
};

export const getStaticPaths = async () => {
  return getCustomStaticPath(meta.platforms);
};

export function getStaticProps(context) {
  return {
    props: {
      platform: context.params.platform,
      meta,
    },
  };
}

A foundation model is a large, general-purpose machine learning model that has been pre-trained on a vast amount of data. These models are trained in an unsupervised or self-supervised manner, meaning they learn patterns and representations from unlabeled training data without being given specific instructions or labels.

Foundation models are useful because they are general-purpose and powerful enough to take on a range of applications, and you don't need to train them yourself.

Foundation models, of which large language models (LLMs) are a subset, are inherently stateless: they take text or images as input and generate text or images as output, with no memory of previous calls. They are also inherently non-deterministic; providing the same input can generate different output.

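Statelessness is exactly why conversation routes store history: every request has to carry the full transcript. Here is a hedged sketch against the Bedrock Converse API from the AWS SDK; the region, model ID, and messages are illustrative, not taken from this PR.

```ts
import {
  BedrockRuntimeClient,
  ConverseCommand,
  type Message,
} from "@aws-sdk/client-bedrock-runtime";

const client = new BedrockRuntimeClient({ region: "us-east-1" });

// The model keeps no state between calls, so each request must
// include the entire conversation so far, not just the new turn.
const history: Message[] = [
  { role: "user", content: [{ text: "What is a foundation model?" }] },
  { role: "assistant", content: [{ text: "A large pre-trained model..." }] },
  { role: "user", content: [{ text: "How is it trained?" }] }, // new turn
];

const response = await client.send(
  new ConverseCommand({
    modelId: "anthropic.claude-3-haiku-20240307-v1:0",
    messages: history,
  })
);
console.log(response.output?.message?.content?.[0]?.text);
```
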
## Getting model access

Before you can invoke a foundation model on Bedrock, you will need to [request access to the models in the AWS console](https://console.aws.amazon.com/bedrock/home#/modelaccess).

Be sure to request access in the same region you are building your Amplify app in!

## Pricing and Limits

Each foundation model in Amazon Bedrock has its own pricing and throughput limits for on-demand use. On-demand use is serverless: you don't need to provision any AWS resources, and you only pay for what you use. The Amplify AI kit uses on-demand use for Bedrock.

The cost of using foundation models is calculated by token usage. A token is a chunk of data sent as input or generated as output; it is roughly equal to a word, though the exact mapping depends on the model being used. Each foundation model in Bedrock has its own pricing based on input and output tokens used.

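As a worked example of the arithmetic, with clearly made-up per-1K-token rates (not real Bedrock prices; always check the pricing page):

```ts
// Illustrative cost calculation -- the rates below are placeholders,
// not actual Bedrock prices.
const inputTokens = 2_000;        // tokens sent to the model
const outputTokens = 500;         // tokens the model generated
const inputPricePer1K = 0.00025;  // hypothetical $ per 1K input tokens
const outputPricePer1K = 0.00125; // hypothetical $ per 1K output tokens

const cost =
  (inputTokens / 1000) * inputPricePer1K +
  (outputTokens / 1000) * outputPricePer1K;

console.log(`~$${cost.toFixed(6)} for this request`); // ~$0.001125
```
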
When you use the Amplify AI Kit, inference requests are charged to your AWS account based on Bedrock pricing. There is no Amplify markup; you are just using AWS resources in your own account.

Always refer to [Bedrock pricing](https://aws.amazon.com/bedrock/pricing/) for the most up-to-date information on running generative AI with Amplify AI Kit.

## Supported Providers and Models

The Amplify AI Kit uses Bedrock's [Converse API](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html) to provide a unified API across models. Most models differ in how they best handle input and how they format their output. For example, ...

### AI21 Labs
* [Jamba 1.5 Large](https://aws.amazon.com/blogs/aws/jamba-1-5-family-of-models-by-ai21-labs-is-now-available-in-amazon-bedrock/)
* [Jamba 1.5 Mini](https://aws.amazon.com/blogs/aws/jamba-1-5-family-of-models-by-ai21-labs-is-now-available-in-amazon-bedrock/)

### Anthropic
* Claude 3 Haiku
* Claude 3 Sonnet
* Claude 3 Opus
* Claude 3.5 Sonnet

[Anthropic model documentation](https://docs.anthropic.com/en/docs/about-claude/models)

### Cohere
* Command R
* Command R+

### Meta
* Llama 3.1

### Mistral
* Large
* Large 2

The Amplify AI Kit makes use of ["tools"](/[platform]/ai/concepts/tools) for both generation and conversation routes, so the models it supports [must support tool use in the Converse API](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-supported-models-features.html).

Using the Converse API makes it easy to swap different models without having to drastically change how you interact with them.

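As a sketch of why tool support matters, a conversation route can be given a data tool that the model may call during a turn; the tool name, description, and `Post` model reference below are illustrative assumptions (see the tools concepts page for the full treatment):

```ts
const chat = a.conversation({
  aiModel: a.ai.model("Claude 3 Haiku"),
  systemPrompt: "You are a helpful assistant with access to posts",
  tools: [
    // The model can decide to call this tool; Converse API tool use
    // carries the structured call and its result back and forth.
    a.ai.dataTool({
      name: "searchPosts", // hypothetical tool name
      description: "Searches the Post model for relevant posts",
      model: a.ref("Post"), // assumes a Post model exists in the schema
      modelOperation: "list",
    }),
  ],
}).authorization((allow) => allow.owner());
```
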
## Choosing a model

Each model and model provider has its own strengths and weaknesses. We encourage you to try different models for different use cases to find the right fit. Things to consider when choosing a model:

### Context window

Each model has its own context window size: the amount of information you can send to the model, defined by the number of tokens it can receive. FMs are stateless, but conversation routes manage message history, so the context window can continue to grow as you "chat" with a model.

### Latency

Smaller models tend to have lower latency than larger models, but can also sometimes be less powerful.

### Cost

Each model has its own price and throughput.

### Use-case fit

Some models are trained to be better at certain tasks or with certain languages.

Choosing the right model for your use case means balancing latency, cost, and performance.

## Using different models

Using the Amplify AI Kit, you can easily use different models for different functionality in your application. Each AI route definition has an `aiModel` attribute you define in your schema. To use different foundation models in your Amplify AI backend, update the `aiModel` using `a.ai.model()`:

```ts
const schema = a.schema({
  summarizer: a.generation({
    aiModel: a.ai.model("Claude 3 Haiku")
  })
})
```

The `a.ai.model()` function gives you access to friendly names for the Bedrock models. We will keep this function up to date as new models are added to Bedrock. If a new model has not yet been added, you can always use the model ID, which can be found in the Bedrock console or documentation:

```ts
const schema = a.schema({
  summarizer: a.generation({
    aiModel: {
      resourcePath: 'meta.llama3-1-405b-instruct-v1:0'
    }
  })
})
```