|
| 1 | + |
| 2 | + |
| 3 | +# Custom RAG agent |
| 4 | +This repository contains the code for a **custom RAG agent** built on **OCI Generative AI**, the **Oracle 23AI** Vector Store, and **LangGraph**. |
| 5 | + |
| 6 | +**Author**: L. Saetta |
| 7 | +**Last updated**: 09/09/2025 |
| 8 | + |
| 9 | +## Design and implementation |
| 10 | +* The agent is implemented using **LangGraph** |
| 11 | +* Vector Search is implemented with **LangChain** on top of Oracle 23AI |
| 12 | +* A **reranker** can be used to refine the search |
| 13 | + |
| 14 | +### Design decisions: |
| 15 | +* Every node of the graph has a dedicated Python class (a **Runnable**, such as QueryRewriter) |
| 16 | +* The **reranker** is implemented using an LLM. Alternatively, it is easy to plug in a different one, for example the Cohere reranker |
| 17 | +* The agent is integrated with **OCI APM** for **observability**; the integration uses **py-zipkin** |
| 18 | +* The UI is implemented using **Streamlit** |
| 19 | +* **Semantic Search** is also exposed as an [MCP server](./mcp_semantic_search_with_iam.py) |
| 20 | + |
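The "one class per node" decision can be illustrated with a minimal, standard-library-only sketch. It only mirrors the `invoke(input, config)` shape of a LangChain Runnable and does not import LangChain or LangGraph. The state keys (`user_request`, `standalone_question`, `retriever_docs`), the `KeywordSearch` class, and the `run_nodes` helper are hypothetical names chosen for illustration, not the repository's actual code.

```python
from typing import Any, Callable, Optional

State = dict[str, Any]


class QueryRewriter:
    """Hypothetical node: rewrites the user request before retrieval."""

    def __init__(self, rewrite_fn: Callable[[str], str]) -> None:
        # In a real agent this would wrap an LLM call; here it is injected.
        self.rewrite_fn = rewrite_fn

    def invoke(self, state: State, config: Optional[dict] = None) -> State:
        # Return only the keys this node updates.
        return {"standalone_question": self.rewrite_fn(state["user_request"])}


class KeywordSearch:
    """Hypothetical stand-in for the vector search node (plain keyword match)."""

    def __init__(self, docs: list[str]) -> None:
        self.docs = docs

    def invoke(self, state: State, config: Optional[dict] = None) -> State:
        words = state["standalone_question"].lower().split()
        hits = [d for d in self.docs if any(w in d.lower() for w in words)]
        return {"retriever_docs": hits}


def run_nodes(nodes: list, state: State) -> State:
    # Apply each node in order, merging its partial update into the state.
    for node in nodes:
        state = {**state, **node.invoke(state)}
    return state


nodes = [QueryRewriter(str.strip), KeywordSearch(["LangGraph basics", "Streamlit UI"])]
result = run_nodes(nodes, {"user_request": "  langgraph  "})
```

Because each node only reads and returns state keys, a node can be replaced (for instance, a different reranker) without touching the others.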
| 21 | +### Streaming: |
| 22 | +* Support for streaming events from the agent: as soon as a step is completed (Vector Search, Reranking, ...) the UI is updated. |
| 23 | +For example, links to the documentation chunks are displayed before the final answer is ready. |
| 24 | +* Streaming of the final answer. |
| 25 | + |
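The step-by-step streaming idea can be sketched with a plain Python generator: each step's partial result is yielded as soon as it is available, so a consumer such as the UI can show the retrieved links before the answer exists. This is a sketch under assumptions, not the agent's actual streaming code; `stream_steps`, the step names, and the state keys are hypothetical, and LangGraph's own streaming API is not used here.

```python
from typing import Any, Callable, Iterator

Step = tuple[str, Callable[[dict], dict]]


def stream_steps(steps: list[Step], state: dict[str, Any]) -> Iterator[tuple[str, dict]]:
    """Run the steps in order, yielding (step_name, update) after each one."""
    for name, fn in steps:
        update = fn(state)
        state = {**state, **update}
        # The caller receives this event before the next step starts.
        yield name, update


steps: list[Step] = [
    ("Search", lambda s: {"links": ["doc1.pdf#p3"]}),
    ("Answer", lambda s: {"answer": f"Based on {len(s['links'])} chunk(s)"}),
]

events = stream_steps(steps, {"question": "What is LangGraph?"})
first_event = next(events)   # available before the "Answer" step has run
second_event = next(events)
```

A UI loop would simply iterate over `stream_steps(...)` and refresh the relevant widget for each event.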
| 26 | +### MCP support: |
| 27 | +(07/2025) I have added an implementation of an **MCP** server that exposes the Semantic Search feature. |
| 28 | +Security can be handled in two ways: |
| 29 | +* custom: generate the **JWT token** using the library **PyJWT** |
| 30 | +* **OCI**: generate the JWT token using **OCI IAM** |
| 31 | + |
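To make the "custom" option more concrete, the sketch below shows what signing and verifying an HS256 JWT involves, using only the Python standard library. It is an illustration of the token format, not the repository's implementation: the repository uses **PyJWT** for this, and the algorithm (HS256), the shared secret, and the claim names used here are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature with an HMAC-SHA256 signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [_b64url(json.dumps(p, separators=(",", ":")).encode()) for p in (header, claims)]
    signing_input = ".".join(parts)
    sig = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(sig)}"


def verify_hs256(token: str, secret: bytes) -> dict:
    """Check the signature and the optional `exp` claim; return the claims."""
    signing_input, _, sig = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig)):
        raise ValueError("invalid signature")
    claims = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if claims.get("exp", float("inf")) < time.time():
        raise ValueError("token expired")
    return claims


# Hypothetical secret and claims, for illustration only.
token = sign_hs256({"sub": "mcp-client", "exp": time.time() + 60}, b"demo-secret")
claims = verify_hs256(token, b"demo-secret")
```

In practice a maintained library such as PyJWT should do this work; the sketch only shows why the server can reject a token signed with a different secret.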
| 32 | +## Status |
| 33 | +This project is a **work in progress (WIP)**. |
| 34 | + |
| 35 | +## References |
| 36 | +* [Integration with OCI APM](https://luigi-saetta.medium.com/enhancing-observability-in-rag-solutions-with-oracle-cloud-6f93b2675f40) |
| 37 | + |
| 38 | +## Advantages of the Agentic approach |
| 39 | +One of the primary advantages of the agentic approach is its **modularity**. |
| 40 | +Customer requirements often go beyond what typical Retrieval-Augmented Generation (RAG) demos cover. Adopting a framework like **LangGraph** requires organizing the code into a modular sequence of steps, which makes it easier to add features at the appropriate places. |
| 41 | + |
| 42 | +For example, to ensure that final responses do not disclose Personally Identifiable Information (PII) present in the knowledge base, one can simply append a node at the end of the graph. This node would process the generated answer, detect any PII, and anonymize it accordingly. |
| 43 | + |
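The PII example above could look roughly like the following node function. It is a minimal sketch under assumptions: the state key `final_answer` is hypothetical, and the two regular expressions only catch simple e-mail addresses and phone numbers, so a real deployment would need a proper PII detection service. In LangGraph, a function like this would typically be registered as the last node of the graph before the end.

```python
import re

# Deliberately simple patterns, for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def anonymize_pii(state: dict) -> dict:
    """Hypothetical final node: mask e-mails and phone numbers in the answer."""
    answer = state["final_answer"]
    answer = EMAIL_RE.sub("[EMAIL]", answer)
    answer = PHONE_RE.sub("[PHONE]", answer)
    # Return only the key this node updates.
    return {"final_answer": answer}


state = {"final_answer": "Contact Mario at mario.rossi@example.com or +39 06 1234 5678."}
cleaned = anonymize_pii(state)
```

Because the node only rewrites the generated answer, none of the earlier nodes need to change.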
| 44 | +## Configuration |
| 45 | +* Use Python 3.11 |
| 46 | +* Install the dependencies listed in `requirements.txt` |
| 47 | +* Create your `config_private.py` using the template provided |
| 48 | +* For the MCP server: create a confidential application in OCI IAM |
| 49 | + |