adding aws-simple-rag sample app #1806

zoharsan · 2025-11-07T16:37:22Z

I confirm that this contribution is made under the terms of the license found in the root directory of this repository's source tree and that I have the authority necessary to make this contribution on behalf of its copyright owner.

Copilot

Pull Request Overview

This pull request introduces a new sample application demonstrating Retrieval Augmented Generation (RAG) using AWS Bedrock-hosted LLM models integrated with Vespa. The application showcases how to leverage AWS Bedrock's OpenAI-compatible chat completions API for generation while using Vespa's hybrid search capabilities (combining BM25 text search and vector similarity) for retrieval.

Key Changes:

- New sample application aws-simple-rag with configuration for AWS Bedrock LLM integration
- Hybrid search implementation combining text search (BM25) and vector similarity search using E5 embeddings
- Comprehensive documentation covering AWS Bedrock setup, Vespa deployment, and structured output generation
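
To make the retrieval-plus-generation flow in the overview concrete, here is a minimal sketch of the query parameters such a request might carry. The schema name `passage`, the field `embedding`, the query tensor `e`, and the embedder id `e5_small` come from the files in this PR. The rank-profile name `hybrid`, the search chain id `openai`, and the `format=sse` setting are assumptions for illustration and may not match the sample app.

```python
from urllib.parse import urlencode


def build_rag_query(question: str, hits: int = 5) -> dict:
    """Assemble query parameters for a hybrid (BM25 + vector) RAG request."""
    return {
        # Lexical match on the user text OR approximate nearest-neighbor
        # match against the embedding field.
        "yql": (
            "select * from passage where "
            "userInput(@query) or "
            "({targetHits:10}nearestNeighbor(embedding, e))"
        ),
        "query": question,
        # Embed the query text with the e5_small embedder from services.xml.
        "input.query(e)": "embed(e5_small, @query)",
        "ranking": "hybrid",      # assumed rank-profile name
        "searchChain": "openai",  # assumed id of the chain wrapping RAGSearcher
        "format": "sse",          # assumed: stream generated tokens as events
        "hits": hits,
    }


params = build_rag_query("What is Vespa?")
print(urlencode(params))
```

The sketch only builds and URL-encodes the parameters; sending them to a Vespa endpoint is left out because the endpoint and credentials depend on the deployment.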

Reviewed Changes

Copilot reviewed 5 out of 7 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| aws-simple-rag/services.xml | Configures Vespa services including E5 embedder component, AWS Bedrock OpenAI client with secret management, and RAGSearcher for generation |
| aws-simple-rag/schemas/passage.sd | Defines the passage document schema with text indexing, vector embeddings, and hybrid ranking profile |
| aws-simple-rag/README.md | Provides comprehensive setup instructions for AWS Bedrock configuration, Vespa deployment, and query examples including structured output |
| aws-simple-rag/.vespaignore | Specifies files to exclude from the Vespa application package |
| aws-simple-rag/ext/edit-app-access.png | Screenshot illustrating the secret access configuration in Vespa Cloud console |
| README.md | Adds reference to the new AWS Bedrock RAG sample application in the main repository index |

Copilot · 2025-11-07T16:47:17Z

aws-simple-rag/schemas/passage.sd

+            expression: 2 * closeness(field, embedding) + bm25(text) / 11 
+        }
+        match-features: bm25(text) closeness(field, embedding)


The closeness(field, embedding) function appears to be using incorrect syntax. The closeness function typically takes the field name as a string or just the field name directly, not the literal word "field". This should likely be closeness(embedding) or the correct field reference.

Suggested change

-expression: 2 * closeness(field, embedding) + bm25(text) / 11
-}
-match-features: bm25(text) closeness(field, embedding)
+expression: 2 * closeness(embedding) + bm25(text) / 11
+}
+match-features: bm25(text) closeness(embedding)
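
For context on what this first-phase expression computes, here is a small Python sketch of the same arithmetic. It assumes closeness is `1 / (1 + distance)` and that the `angular` distance metric is the angle between the two vectors in radians, which is how Vespa's rank feature reference describes them. The helper names are hypothetical. Note that the same reference documents the feature as `closeness(dimension, name)`, with `dimension` set to `field` or `label`, so the existing `closeness(field, embedding)` form is worth checking against the documentation before applying the suggestion.

```python
import math


def angular_distance(a: list[float], b: list[float]) -> float:
    """Angle in radians between two vectors (assumed 'angular' metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.acos(cosine)


def closeness(distance: float) -> float:
    """Map a distance to (0, 1]; identical vectors give 1.0."""
    return 1.0 / (1.0 + distance)


def first_phase(bm25: float, query_vec: list[float], doc_vec: list[float]) -> float:
    """Mirror of: 2 * closeness(field, embedding) + bm25(text) / 11."""
    return 2 * closeness(angular_distance(query_vec, doc_vec)) + bm25 / 11


# Identical vectors and a BM25 score of 11 give 2 * 1.0 + 1.0.
print(first_phase(11.0, [1.0, 0.0], [1.0, 0.0]))  # 3.0
```

The constants 2 and 11 come from the schema in this PR; the sketch does not explain why they were chosen.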

Copilot · 2025-11-07T16:47:18Z

aws-simple-rag/schemas/passage.sd

+
+  document passage {
+
+     field id type string {


Inconsistent indentation: the field id block uses irregular spacing with 1 space before field while the field text block uses proper indentation. The opening brace is also on a different line than expected for the id field.

Suggested change

-field id type string {
+field id type string {

Copilot · 2025-11-07T16:47:18Z

aws-simple-rag/services.xml

+        <component id="e5_small" type="hugging-face-embedder">
+            <transformer-model url="https://github.com/vespa-engine/sample-apps/raw/master/examples/model-exporting/model/e5-small-v2-int8.onnx"/>
+            <tokenizer-model url="https://raw.githubusercontent.com/vespa-engine/sample-apps/master/examples/model-exporting/model/tokenizer.json"/>
+            <prepend> <!-- E5 prompt instructions -->
+                <query>query:</query>
+                <document>passage:</document>
+            </prepend>
+        </component>


Inconsistent indentation: the component definition uses 5 spaces while the rest of the file uses standard 2-space indentation. The component should align with other elements at the same level.

Suggested change

-<component id="e5_small" type="hugging-face-embedder">
-<transformer-model url="https://github.com/vespa-engine/sample-apps/raw/master/examples/model-exporting/model/e5-small-v2-int8.onnx"/>
-<tokenizer-model url="https://raw.githubusercontent.com/vespa-engine/sample-apps/master/examples/model-exporting/model/tokenizer.json"/>
-<prepend>
-<query>query:</query>
-<document>passage:</document>
-</prepend>
-</component>
+<component id="e5_small" type="hugging-face-embedder">
+<transformer-model url="https://github.com/vespa-engine/sample-apps/raw/master/examples/model-exporting/model/e5-small-v2-int8.onnx"/>
+<tokenizer-model url="https://raw.githubusercontent.com/vespa-engine/sample-apps/master/examples/model-exporting/model/tokenizer.json"/>
+<prepend>
+<query>query:</query>
+<document>passage:</document>
+</prepend>
+</component>

Copilot · 2025-11-07T16:47:18Z

aws-simple-rag/README.md

+# Retrieval Augmented Generation (RAG) in Vespa using AWS BedRock models
+
+This sample application demonstrates an end-to-end Retrieval Augmented
+Generation application in Vespa, leveraging [AWS BedRock](https://aws.amazon.com/bedrock/) hosted models.


Incorrect capitalization: "BedRock" should be "Bedrock" (lowercase 'r'). The official AWS product name is "Amazon Bedrock" not "BedRock".

Suggested change

-# Retrieval Augmented Generation (RAG) in Vespa using AWS BedRock models
-
-This sample application demonstrates an end-to-end Retrieval Augmented
-Generation application in Vespa, leveraging [AWS BedRock](https://aws.amazon.com/bedrock/) hosted models.
+# Retrieval Augmented Generation (RAG) in Vespa using AWS Bedrock models
+
+This sample application demonstrates an end-to-end Retrieval Augmented
+Generation application in Vespa, leveraging [AWS Bedrock](https://aws.amazon.com/bedrock/) hosted models.

Copilot · 2025-11-07T16:47:19Z

aws-simple-rag/README.md

+# Retrieval Augmented Generation (RAG) in Vespa using AWS BedRock models
+
+This sample application demonstrates an end-to-end Retrieval Augmented
+Generation application in Vespa, leveraging [AWS BedRock](https://aws.amazon.com/bedrock/) hosted models.


Incorrect capitalization: "BedRock" should be "Bedrock" (lowercase 'r'). The official AWS product name is "Amazon Bedrock" not "BedRock".

Suggested change

-# Retrieval Augmented Generation (RAG) in Vespa using AWS BedRock models
-
-This sample application demonstrates an end-to-end Retrieval Augmented
-Generation application in Vespa, leveraging [AWS BedRock](https://aws.amazon.com/bedrock/) hosted models.
+# Retrieval Augmented Generation (RAG) in Vespa using AWS Bedrock models
+
+This sample application demonstrates an end-to-end Retrieval Augmented
+Generation application in Vespa, leveraging [AWS Bedrock](https://aws.amazon.com/bedrock/) hosted models.

Copilot · 2025-11-07T16:47:19Z

aws-simple-rag/README.md

+
+```xml
+<secrets>
+    <bedrock-api-key vault=">my-vault-name>" name="<my-secret-name>"/>


The placeholder in the vault attribute opens with > instead of <. The value should be vault="<my-vault-name>" instead of vault=">my-vault-name>" to match the pattern used for <my-secret-name>.

Suggested change

-<bedrock-api-key vault=">my-vault-name>" name="<my-secret-name>"/>
+<bedrock-api-key vault="<my-vault-name>" name="<my-secret-name>"/>

Copilot · 2025-11-07T16:47:20Z

aws-simple-rag/schemas/passage.sd

+            indexing: input text  | embed e5_small | attribute | index
+            attribute {
+              distance-metric: angular
+            }
+            index {
+                    hnsw {
+                          max-links-per-node: 32
+                          neighbors-to-explore-at-insert: 100
+                    }
+    }


Inconsistent indentation in the embedding field definition. The field content uses excessive indentation (12 spaces) instead of the standard 2 or 4 spaces used elsewhere in the schema.

Suggested change

-indexing: input text | embed e5_small | attribute | index
-attribute {
-distance-metric: angular
-}
-index {
-hnsw {
-max-links-per-node: 32
-neighbors-to-explore-at-insert: 100
-}
-}
+indexing: input text | embed e5_small | attribute | index
+attribute {
+distance-metric: angular
+}
+index {
+hnsw {
+max-links-per-node: 32
+neighbors-to-explore-at-insert: 100
+}
+}

Copilot · 2025-11-07T16:47:20Z

aws-simple-rag/schemas/passage.sd

+        inputs {
+          query(e) tensor<bfloat16>(x[384])
+        }
+        first-phase {
+            expression: 2 * closeness(field, embedding) + bm25(text) / 11 
+        }
+        match-features: bm25(text) closeness(field, embedding)
+    }


Inconsistent indentation in the rank-profile definition. Lines use 8 spaces for indentation instead of the standard 2 or 4 spaces used in the document field definitions.

Suggested change

-inputs {
-query(e) tensor<bfloat16>(x[384])
-}
-first-phase {
-expression: 2 * closeness(field, embedding) + bm25(text) / 11
-}
-match-features: bm25(text) closeness(field, embedding)
-}
+inputs {
+query(e) tensor<bfloat16>(x[384])
+}
+first-phase {
+expression: 2 * closeness(field, embedding) + bm25(text) / 11
+}
+match-features: bm25(text) closeness(field, embedding)
+}

Copilot · 2025-11-07T16:47:20Z

aws-simple-rag/services.xml

+
+    <!-- Setup the client to Bedrock -->
+    <component id="openai" class="ai.vespa.llm.clients.OpenAI">
+      <config name = "ai.vespa.llm.clients.llm-client">


Extra space in the config name attribute. Should be name="ai.vespa.llm.clients.llm-client" instead of name = "ai.vespa.llm.clients.llm-client" to follow standard XML formatting conventions.

Suggested change

-<config name = "ai.vespa.llm.clients.llm-client">
+<config name="ai.vespa.llm.clients.llm-client">
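
Since the comment frames this as a formatting convention, a quick check can confirm that the change is cosmetic. The sketch below parses both spellings with Python's standard `xml.etree.ElementTree` and compares the resulting attributes. The elements are written as self-closing here only so that each string is a complete document on its own.

```python
import xml.etree.ElementTree as ET

# Same attribute, with and without whitespace around the equals sign.
spaced = '<config name = "ai.vespa.llm.clients.llm-client"/>'
compact = '<config name="ai.vespa.llm.clients.llm-client"/>'

spaced_attrs = ET.fromstring(spaced).attrib
compact_attrs = ET.fromstring(compact).attrib

# Both forms are well-formed XML and yield identical attribute values.
print(spaced_attrs == compact_attrs)  # True
```

Either spelling should therefore behave the same to an XML parser; the suggestion affects readability and consistency only.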

Copilot · 2025-11-07T16:47:20Z

aws-simple-rag/README.md

+
+### Set-up an AWS Bedrock API Key
+
+Create an [AWS BedRock API key](https://docs.aws.amazon.com/bedrock/latest/userguide/api-keys.html).


Incorrect capitalization: "BedRock" should be "Bedrock" (lowercase 'r'). The official AWS product name is "Amazon Bedrock" not "BedRock".

adding aws-simple-rag sample app

da56cc7

zoharsan requested review from Copilot and kkraune November 7, 2025 16:37

zoharsan temporarily deployed to Vespa Cloud CD November 7, 2025 16:40 — with GitHub Actions Inactive

Copilot AI reviewed Nov 7, 2025

zoharsan closed this Nov 7, 2025
