Commit 29258d4

synch and some updates to custom rag agent

1 parent 6844b46 commit 29258d4

File tree

7 files changed: +148 −2 lines changed

ai/gen-ai-agents/README.md

Lines changed: 2 additions & 0 deletions
@@ -18,6 +18,7 @@ Oracle’s Generative AI Agents is a fully managed service that combines the pow
 ## Reusable Assets Overview
 - [HCM agent created by partner Conneqtion Group which contains agents to connect to Fusion HCM, Expense and many others](https://www.youtube.com/watch?v=OhZcWx_H_tQ)
 - [Finance analytics agent created by our partner TPX impact](https://bit.ly/genai4analyst)
+- [Custom RAG agent, based on LangGraph](./custom-rag-agent)

 # Useful Links

@@ -38,3 +39,4 @@ Copyright (c) 2025 Oracle and/or its affiliates.
 Licensed under the Universal Permissive License (UPL), Version 1.0.

 See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details.
+
Lines changed: 49 additions & 0 deletions

@@ -0,0 +1,49 @@
![UI](images/ui_image.png)

# Custom RAG agent

This repository contains the code for the development of a **custom RAG agent**, based on **OCI Generative AI**, the **Oracle 23AI** Vector Store, and **LangGraph**.

**Author**: L. Saetta
**Last updated**: 09/09/2025

## Design and implementation
* The agent is implemented using **LangGraph**.
* Vector Search is implemented, using LangChain, on top of Oracle 23AI.
* A **reranker** can be used to refine the search.

### Design decisions:
* Every node of the graph has a dedicated Python class (a **Runnable**, e.g. QueryRewriter).
* The **reranker** is implemented using an LLM. As an alternative, it is easy to plug in, for example, the Cohere reranker.
* The agent is integrated with **OCI APM** for **Observability**; the integration uses **py-zipkin**.
* The UI is implemented using **Streamlit**.
* **Semantic Search** is also exposed as an [MCP server](./mcp_semantic_search_with_iam.py).
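As a rough illustration of the node-per-class design above, a minimal LangGraph-style node might look like the following sketch. The class, state fields, and rewriting logic are illustrative stand-ins, not code from the repository:

```python
# Minimal sketch: one graph node per Python class, as described above.
# Class and field names are illustrative, not taken from the repository.
from typing import TypedDict


class State(TypedDict):
    question: str
    rewritten_question: str


class QueryRewriter:
    """Graph node: rewrites the user question for better retrieval."""

    def __init__(self, llm=None):
        self.llm = llm  # an LLM client could be injected here

    def __call__(self, state: State) -> dict:
        question = state["question"]
        # In the real agent an LLM call would rewrite the question;
        # here whitespace normalization acts as a stand-in.
        rewritten = " ".join(question.split())
        return {"rewritten_question": rewritten}


# Nodes like this are then registered on a LangGraph StateGraph, e.g.:
#   graph.add_node("rewrite", QueryRewriter(llm))
```

Keeping each step in its own callable class is what makes it easy to insert, swap, or remove steps later.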
### Streaming:
* Support for streaming events from the agent: as soon as a step is completed (Vector Search, Reranking, ...) the UI is updated. For example, links to the documentation chunks are displayed before the final answer is ready.
* Streaming of the final answer.
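The per-step streaming described above can be sketched as a generator that yields an event as soon as each step completes; the UI consumes events as they arrive. Step names and payloads below are illustrative stand-ins, not the agent's actual events:

```python
# Sketch of the streaming pattern: each agent step yields an event as soon
# as it completes, so the UI can update incrementally before the final answer.
# Step names and payloads are illustrative, not the repository's actual events.
from typing import Iterator, Tuple


def run_agent_steps(question: str) -> Iterator[Tuple[str, object]]:
    """Yield (step_name, payload) events as the agent progresses."""
    chunks = [f"chunk for: {question}"]   # stand-in for Vector Search
    yield ("vector_search", chunks)
    reranked = list(reversed(chunks))     # stand-in for Reranking
    yield ("rerank", reranked)
    yield ("answer", "final answer")      # stand-in for answer generation


# A Streamlit UI would consume the events roughly like this:
events = {name: payload for name, payload in run_agent_steps("What is RAG?")}
```

With LangGraph, the equivalent events would come from the compiled graph's streaming interface rather than a hand-written generator.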
### MCP support:
(07/2025) I have added an implementation of an **MCP** server that exposes the Semantic Search feature. Security can be handled in two ways:
* custom: generate the **JWT token** using the **PyJWT** library
* **OCI**: generate the JWT token using **OCI IAM**
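A minimal sketch of the custom option with PyJWT follows. The secret, claims, and expiry are placeholders, not the values used by the server:

```python
# Sketch of the "custom" option: issuing and verifying a JWT with PyJWT.
# The secret, claims, and TTL are illustrative placeholders.
import time

import jwt  # PyJWT

SECRET = "replace-with-a-real-secret"


def issue_token(subject: str, ttl_seconds: int = 3600) -> str:
    """Issue a signed token for an MCP client."""
    now = int(time.time())
    claims = {"sub": subject, "iat": now, "exp": now + ttl_seconds}
    return jwt.encode(claims, SECRET, algorithm="HS256")


def verify_token(token: str) -> dict:
    """Return the claims; raises jwt.InvalidTokenError if invalid or expired."""
    return jwt.decode(token, SECRET, algorithms=["HS256"])


token = issue_token("mcp-client")
claims = verify_token(token)
```

The OCI IAM option replaces `issue_token` with a token request to the identity domain, but verification on the server side follows the same decode-and-validate shape.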
## Status
It is **WIP**.

## References
* [Integration with OCI APM](https://luigi-saetta.medium.com/enhancing-observability-in-rag-solutions-with-oracle-cloud-6f93b2675f40)

## Advantages of the Agentic approach
One of the primary advantages of the agentic approach is its **modularity**.
Customer requirements often surpass the simplicity of typical Retrieval-Augmented Generation (RAG) demonstrations. Implementing a framework like **LangGraph** necessitates organizing code into a modular sequence of steps, facilitating the seamless integration of additional features at appropriate places.

For example, to ensure that final responses do not disclose Personally Identifiable Information (PII) present in the knowledge base, one can simply append a node at the end of the graph. This node would process the generated answer, detect any PII, and anonymize it accordingly.
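A hypothetical version of such a PII node, using simple regular expressions: the patterns and state keys are illustrative only, and a real implementation would use a proper PII detector rather than two regexes:

```python
# Sketch of a PII-anonymization node appended at the end of the graph.
# Patterns and state keys are illustrative, not the repository's code.
import re

PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


class AnonymizePII:
    """Final graph node: masks PII in the generated answer."""

    def __call__(self, state: dict) -> dict:
        answer = state["answer"]
        for label, pattern in PII_PATTERNS.items():
            answer = pattern.sub(f"[{label} removed]", answer)
        return {"answer": answer}


node = AnonymizePII()
out = node({"answer": "Contact john.doe@example.com for details."})
# out["answer"] == "Contact [email removed] for details."
```

Because the node only reads and rewrites the `answer` field of the state, adding it does not disturb the rest of the graph.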
## Configuration
* Use Python 3.11.
* Install the packages listed in requirements.txt.
* Create your config_private.py using the template provided.
* For the MCP server: create a confidential application in OCI IAM.

ai/gen-ai-agents/custom-rag-agent/assistant_ui_langgraph.py

Lines changed: 3 additions & 0 deletions
@@ -158,6 +158,9 @@ def register_feedback():
     "Select the Chat Model",
     config.MODEL_LIST,
 )
+
+st.sidebar.text_input(label="Embed Model", value=config.EMBED_MODEL_ID, disabled=True)
+
 st.session_state.enable_reranker = st.sidebar.checkbox(
     "Enable Reranker", value=True, disabled=False
 )

ai/gen-ai-agents/custom-rag-agent/config.py

Lines changed: 8 additions & 1 deletion
@@ -33,8 +33,11 @@
 # added this to distinguish between Cohere and REST NVIDIA models
 # can be OCI or NVIDIA
 EMBED_MODEL_TYPE = "OCI"
+# EMBED_MODEL_TYPE = "NVIDIA"
 EMBED_MODEL_ID = "cohere.embed-multilingual-v3.0"
 # EMBED_MODEL_ID = "cohere.embed-multilingual-image-v3.0"
+
+# this one needs to specify the dimension, default is 1536
 # EMBED_MODEL_ID = "cohere.embed-v4.0"

 # to support NVIDIA NIM

@@ -64,21 +67,25 @@
     "xai.grok-4",
     "openai.gpt-4.1",
     "openai.gpt-4o",
+    "openai.gpt-5",
     "meta.llama-3.3-70b-instruct",
     "cohere.command-a-03-2025",
 ]
 else:
     MODEL_LIST = [
         "meta.llama-3.3-70b-instruct",
         "cohere.command-a-03-2025",
+        "openai.gpt-4.1",
+        "openai.gpt-4o",
+        "openai.gpt-5",
     ]

 ENABLE_USER_FEEDBACK = True

 # semantic search
 TOP_K = 6
 # COLLECTION_LIST = ["BOOKS", "CNAF"]
-COLLECTION_LIST = ["BOOKS", "BOOKS2", "AMPLIFON", "AMPLIFON_EXT"]
+COLLECTION_LIST = ["BOOKS", "NVIDIA_BOOKS2"]
 DEFAULT_COLLECTION = "BOOKS"

ai/gen-ai-agents/custom-rag-agent/oci_models.py

Lines changed: 5 additions & 1 deletion
@@ -20,7 +20,7 @@
 This is a part of a demo showing how to implement an advanced
 RAG solution as a LangGraph agent.

-modifiued to support xAI and OpenAI models through Langchain
+modified to support xAI and OpenAI models through Langchain

 Warnings:
 This module is in development, may change in future versions.

@@ -50,7 +50,9 @@
 ALLOWED_EMBED_MODELS_TYPE = {"OCI", "NVIDIA"}

+# for gpt5, since max tokens is not supported
 MODELS_WITHOUT_KWARGS = {
+    "openai.gpt-5",
     "openai.gpt-4o-search-preview",
     "openai.gpt-4o-search-preview-2025-03-11",
 }

@@ -126,6 +128,8 @@ def get_embedding_model(model_type="OCI"):
         api_url=NVIDIA_EMBED_MODEL_URL, model=EMBED_MODEL_ID
     )

+    logger.info("Embedding model is: %s", EMBED_MODEL_ID)
+
     return embed_model
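As a sketch of how a set like `MODELS_WITHOUT_KWARGS` can be used, generation kwargs can be gated per model, since some models (such as gpt-5 in this commit) reject `max_tokens`. The helper name and default values below are illustrative, not the module's actual code:

```python
# Sketch: gate generation kwargs on a per-model basis. Some models reject
# common kwargs such as max_tokens, so they get an empty kwargs dict.
# Helper name and defaults are illustrative, not the repository's code.
MODELS_WITHOUT_KWARGS = {
    "openai.gpt-5",
    "openai.gpt-4o-search-preview",
    "openai.gpt-4o-search-preview-2025-03-11",
}


def build_model_kwargs(model_id: str, temperature: float = 0.1,
                       max_tokens: int = 1024) -> dict:
    """Return kwargs for the chat model, omitting those some models reject."""
    if model_id in MODELS_WITHOUT_KWARGS:
        return {}
    return {"temperature": temperature, "max_tokens": max_tokens}


build_model_kwargs("openai.gpt-5")   # -> {}
build_model_kwargs("openai.gpt-4o")  # -> {"temperature": 0.1, "max_tokens": 1024}
```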

Lines changed: 81 additions & 0 deletions

@@ -0,0 +1,81 @@
"""
File name: transport.py
Author: Luigi Saetta
Date last modified: 2025-03-31
Python Version: 3.11

Description:
    This module provides the HTTP transport support for the integration with OCI APM.

Usage:
    Import this module into other scripts to use its functions.
    Example:
        ...

License:
    This code is released under the MIT License.

Notes:
    This is a part of a demo showing how to implement an advanced
    RAG solution as a LangGraph agent.

Warnings:
    This module is in development, may change in future versions.
"""

import requests

from utils import get_console_logger

# changed to handle ENABLE_TRACING from UI
import config
from config_private import APM_PUBLIC_KEY


logger = get_console_logger()


def http_transport(encoded_span):
    """
    Sends encoded tracing data to OCI APM using py-zipkin.

    Args:
        encoded_span (bytes): The encoded span data to send.

    Returns:
        requests.Response or None: The response from the APM service
        or None if tracing is disabled.
    """
    try:
        # Load config inside the function to avoid global dependency issues
        base_url = config.APM_BASE_URL
        content_type = config.APM_CONTENT_TYPE

        # Validate configuration
        if not base_url:
            raise ValueError("APM base URL is not configured")
        if not APM_PUBLIC_KEY:
            raise ValueError("APM public key is missing")

        # If tracing is disabled, do nothing
        if not config.ENABLE_TRACING:
            logger.info("Tracing is disabled. No data sent to APM.")
            return None

        # Construct endpoint dynamically
        apm_url = (
            f"{base_url}/observations/public-span"
            f"?dataFormat=zipkin&dataFormatVersion=2&dataKey={APM_PUBLIC_KEY}"
        )

        response = requests.post(
            apm_url,
            data=encoded_span,
            headers={"Content-Type": content_type},
            timeout=30,
        )
        response.raise_for_status()  # Raise exception for HTTP errors

        return response
    except requests.RequestException as e:
        logger.error("Failed to send span to APM: %s", str(e))
        return None
    except Exception as e:
        logger.error("Unexpected error in http_transport: %s", str(e))
        return None
