**CONTRIBUTING.md** (5 additions, 7 deletions)
@@ -6,7 +6,6 @@ documentation, we greatly value feedback and contributions from our community.

Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
information to effectively respond to your bug report or contribution.

## Reporting Bugs/Feature Requests

We welcome you to use the GitHub issue tracker to report bugs or suggest features.

@@ -19,8 +18,8 @@ reported the issue. Please try to include as much information as you can. Detail

* Any modifications you've made relevant to the bug
* Anything unusual about your environment or deployment

## Contributing via Pull Requests

Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:

1. You are working against the latest source on the *main* branch.

@@ -39,20 +38,19 @@ To send us a pull request, please:

GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
[creating a pull request](https://help.github.com/articles/creating-a-pull-request/).

## Finding contributions to work on

Looking at the existing issues is a great way to find something to contribute on. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start.

## Code of Conduct

This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact

If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue.
**README.md** (55 additions, 6 deletions)
```diff
@@ -1,11 +1,61 @@
-## My Project
+# RAG using LangChain with Amazon Bedrock Titan text and embedding, using the OpenSearch vector engine
 
-TODO: Fill this README out!
+This repository provides sample code for RAG (Retrieval-Augmented Generation) relying on the [Amazon Bedrock](https://aws.amazon.com/bedrock/) [Titan text embedding](https://aws.amazon.com/bedrock/titan/) LLM (Large Language Model) to create text embeddings that are stored in [Amazon OpenSearch](https://aws.amazon.com/opensearch-service/) with [vector engine support](https://aws.amazon.com/about-aws/whats-new/2023/07/vector-engine-amazon-opensearch-serverless-preview/), assisting with prompt engineering so that LLM responses are more accurate.
 
-Be sure to:
+After the embeddings are successfully loaded into OpenSearch, we start querying our LLM using [LangChain](https://www.langchain.com/): we ask questions and retrieve similar embeddings to build a more accurate prompt.
 
-* Change the title in this README
-* Edit your repository description on GitHub
+## Prerequisites
+
+1. This sample was tested on Python 3.11.4.
+2. It is advised to work in a clean environment; use `virtualenv` or any other virtual environment package.
+
+```bash
+pip install virtualenv
+python -m virtualenv venv
+source ./venv/bin/activate
+```
+
+3. Run `./download-beta-sdk.sh` to download the beta SDK for using Amazon Bedrock.
+5. Install [terraform](https://developer.hashicorp.com/terraform/downloads?product_intent=terraform) to create the OpenSearch cluster.
+
+```bash
+brew tap hashicorp/tap
+brew install hashicorp/tap/terraform
+```
+
+## Steps for using this sample code
+
+1. In the first step, we launch an OpenSearch cluster using Terraform.
+
+```bash
+cd ./terraform
+terraform init
+terraform apply -auto-approve
+```
+
+> This cluster configuration is for testing purposes only, as its endpoint is public to simplify the use of this sample code.
+
+2. Now that we have a running OpenSearch cluster with vector engine support, we start uploading the data that will help us with prompt engineering. For this sample we use the [gooaq_pairs](https://huggingface.co/datasets/sentence-transformers/embedding-training-data/resolve/main/gooaq_pairs.jsonl.gz) file from the [Hugging Face](https://huggingface.co) [embedding-training-data](https://huggingface.co/datasets/sentence-transformers/embedding-training-data) dataset: we download it, invoke the Titan embedding model to get text embeddings, and store them in OpenSearch for the next steps.
```
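Step 2 in the new README describes the load flow only in prose; the loading script itself is not part of this diff. The sketch below is a hypothetical illustration of that flow, assuming the GA `boto3` `bedrock-runtime` client (the sample itself relies on the beta SDK downloaded in the prerequisites), the `opensearch-py` package, the `amazon.titan-embed-text-v1` model ID, a placeholder OpenSearch endpoint and index name, and that each line of `gooaq_pairs.jsonl.gz` holds one JSON `[question, answer]` pair.

```python
# Hypothetical sketch of step 2: download gooaq_pairs, embed with Titan, index into OpenSearch.
# Endpoint, credentials, index name, and model ID below are placeholders, not repository values.
import gzip
import json

import boto3
import requests
from opensearchpy import OpenSearch

DATA_URL = ("https://huggingface.co/datasets/sentence-transformers/"
            "embedding-training-data/resolve/main/gooaq_pairs.jsonl.gz")
OPENSEARCH_ENDPOINT = "https://<your-opensearch-endpoint>"  # e.g. the Terraform output
INDEX_NAME = "rag-sample"                                   # hypothetical index name
MODEL_ID = "amazon.titan-embed-text-v1"                     # assumed Titan embedding model ID

bedrock = boto3.client("bedrock-runtime")
client = OpenSearch(hosts=[OPENSEARCH_ENDPOINT], http_auth=("user", "password"), use_ssl=True)

# Create a k-NN index so the stored vectors can be searched by similarity.
if not client.indices.exists(INDEX_NAME):
    client.indices.create(
        INDEX_NAME,
        body={
            "settings": {"index.knn": True},
            "mappings": {
                "properties": {"vector_field": {"type": "knn_vector", "dimension": 1536}}
            },
        },
    )

def embed(text: str) -> list[float]:
    """Return the Titan embedding vector for a piece of text."""
    response = bedrock.invoke_model(
        modelId=MODEL_ID,
        body=json.dumps({"inputText": text}),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())["embedding"]

# Download the gzipped JSONL file and index the first few (question, answer) pairs.
raw = requests.get(DATA_URL, timeout=60).content
for i, line in enumerate(gzip.decompress(raw).splitlines()):
    question, answer = json.loads(line)
    client.index(
        index=INDEX_NAME,
        body={"text": answer, "metadata": {"question": question}, "vector_field": embed(question)},
    )
    if i >= 99:  # keep the sketch small
        break
```

With the vectors indexed under a `knn_vector` field, OpenSearch can return the stored passages whose embeddings are closest to the embedding of an incoming question, which is what the querying step builds on.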
prompt_template="""Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. don't include harmful content