84 changes: 84 additions & 0 deletions ai/gen-ai-agents/mcp-oci-integration/README.md
@@ -0,0 +1,84 @@
# MCP Oracle OCI integrations
This repository contains code and examples to help in the following tasks:
* **Develop** MCP servers in **Python**
* **Run** MCP servers on **Oracle OCI**
* **Integrate** MCP servers with **AI Agents**
* **Integrate** MCP servers with other **OCI resources** (ADB, Select AI, ...)
* **Integrate** MCP Servers running on OCI with AI Assistants like **ChatGPT**, Claude.ai, MS Copilot
* **Integrate** MCP Servers with OCI **APM** for **Observability**

![MCP console](./images/mcp_cli.png)

## What is MCP?
**MCP (Model Context Protocol)** is an **open-source standard** that lets AI models (e.g. LLMs or agents) connect bidirectionally with external tools, data sources, and services via a unified interface.

It replaces the “N×M” integration problem (where each AI × data source requires custom code) with one standard protocol.

MCP supports **dynamic discovery** of available tools and context, enabling:
* AI Assistants to access relevant information available in an enterprise knowledge base.
* Agents to reason and chain actions across disparate systems.

It’s quickly gaining traction: major players like OpenAI, Google DeepMind, and Oracle are adopting it to make AI systems more composable and interoperable.

In today’s landscape of agentic AI, MCP is critical because it allows models to act meaningfully in real-world systems rather than remaining isolated black boxes.

## Develop MCP Servers in Python
The easiest way is to use the [FastMCP](https://gofastmcp.com/getting-started/welcome) library.

**Examples**:
* In [Minimal MCP Server](./minimal_mcp_server.py) you'll find a **good, minimal example** of a server exposing two tools, with the option to protect them using [JWT](https://www.jwt.io/introduction#what-is-json-web-token).

If you want to start with **something simpler**, have a look at [how to start developing MCP](./how_to_start_mcp.md), which covers a simpler setup without JWT support.
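
To give an idea of the shape of such a server, here is a minimal, hypothetical sketch built with FastMCP (the tool names and the port are illustrative, not taken from this repository; see the files above for the actual examples):

```python
# Minimal, hypothetical FastMCP server sketch (tool names and port are illustrative).
from fastmcp import FastMCP

mcp = FastMCP("demo-server")


@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


@mcp.tool()
def greet(name: str) -> str:
    """Return a greeting for the given name."""
    return f"Hello, {name}!"


if __name__ == "__main__":
    # streamable-http is the same transport configured in config.py
    mcp.run(transport="streamable-http", host="0.0.0.0", port=9000)
```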

## How to test
If you want to quickly test the MCP server you developed (or the minimal example provided here) you can use the [Streamlit UI](./ui_mcp_agent.py).

In the Streamlit application, you can:
* Specify the URL of the MCP server (default is in [mcp_servers_config.py](./mcp_servers_config.py))
* Select one of the models available in OCI Generative AI
* Ask questions that are answered using the tools exposed by the MCP server.

The file [llm_with_mcp.py](./llm_with_mcp.py) contains the complete implementation of the **tool-calling** loop.
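
If you want to exercise the server without the UI or the LLM, a few lines of plain Python are enough. Below is a hypothetical sketch using the FastMCP client (the URL, path, and tool name are assumptions; adapt them to your server):

```python
# Hypothetical sketch: list and call tools on a running MCP server with the FastMCP client.
import asyncio

from fastmcp import Client


async def main():
    # URL of the locally running MCP server (streamable-http transport)
    async with Client("http://localhost:9000/mcp") as client:
        # discover the tools exposed by the server
        tools = await client.list_tools()
        print("Available tools:", [t.name for t in tools])

        # call one of them (tool name and arguments are illustrative)
        result = await client.call_tool("add", {"a": 2, "b": 3})
        print("Result:", result)


if __name__ == "__main__":
    asyncio.run(main())
```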

## Semantic Search
This repository also contains a **complete MCP server** implementing **Semantic Search** on top of **Oracle 23AI**.
To use it, you only need to:
* Load the documents into the Oracle DB
* Put the DB connection configuration in config_private.py.

The code is available [here](./mcp_semantic_search_with_iam.py).

Access to Oracle 23AI Vector Search goes through the **new** [langchain-oci integration library](https://github.com/oracle/langchain-oracle).
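
To give an idea of the retrieval flow, here is a hedged sketch using the older `langchain_community` class names; the actual server uses the newer langchain-oci / langchain-oracle packages, so the imports may differ:

```python
# Hedged sketch of the semantic-search flow (class names from langchain_community;
# the repository's server uses the newer langchain-oci / langchain-oracle packages).
import oracledb
from langchain_community.embeddings import OCIGenAIEmbeddings
from langchain_community.vectorstores.oraclevs import OracleVS
from langchain_community.vectorstores.utils import DistanceStrategy

from config import EMBED_MODEL_ID, SERVICE_ENDPOINT, DEFAULT_COLLECTION, TOP_K
from config_private import CONNECT_ARGS, COMPARTMENT_ID

# wallet-based connection to the Oracle 23AI database
connection = oracledb.connect(**CONNECT_ARGS)

embeddings = OCIGenAIEmbeddings(
    model_id=EMBED_MODEL_ID,
    service_endpoint=SERVICE_ENDPOINT,
    compartment_id=COMPARTMENT_ID,
)

# the collection is the table holding the text chunks and their embeddings
vector_store = OracleVS(
    client=connection,
    embedding_function=embeddings,
    table_name=DEFAULT_COLLECTION,
    distance_strategy=DistanceStrategy.COSINE,
)

# top-k semantic search
for doc in vector_store.similarity_search("What is an MCP server?", k=TOP_K):
    print(doc.page_content[:200])
```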

## Adding security
If you want to put your **MCP** server in production, you need to add security at several levels.

Just to mention a few important points:
* You don't want to expose the MCP server directly over the Internet
* Communication with the MCP server must be encrypted (e.g., using TLS)
* You want to authenticate and authorize the clients

Using **OCI services**, there are several things you can do to get the right level of security:
* You can put an **OCI API Gateway** in front, using it for TLS termination
* You can enable authentication using **JWT** tokens
* You can use **OCI IAM** to generate **JWT** tokens
* You can use OCI network security

More details in a dedicated page.
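
As a flavour of the JWT part, here is a minimal, illustrative validation sketch with PyJWT, based on the values in `config.py` and `config_private_template.py`; it is not the repository's implementation:

```python
# Minimal, illustrative JWT validation sketch with PyJWT (pip install pyjwt).
# It uses the HS256 shared secret from the private config; with OCI IAM you would
# typically switch to RS256 and validate against the identity domain's JWKS endpoint.
import jwt  # PyJWT

from config import ISSUER, AUDIENCE
from config_private import JWT_SECRET, JWT_ALGORITHM


def verify_token(token: str) -> dict:
    """Decode and validate a bearer token; raises jwt.PyJWTError on failure."""
    return jwt.decode(
        token,
        JWT_SECRET,
        algorithms=[JWT_ALGORITHM],
        issuer=ISSUER,
        audience=AUDIENCE,
    )
```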

## Integrate MCP Semantic Search with ChatGPT
If you deploy the [MCP Semantic Search](./mcp_semantic_search_with_iam.py) server, you can test the integration with **ChatGPT** in **Developer Mode**: the server provides a **search** tool compliant with the **OpenAI** specs.

Soon, we'll add a server fully compliant with the **OpenAI** specifications, which can be integrated with **Deep Research**. Such a server must implement two methods (**search** and **fetch**), strictly following the OpenAI specs.

An initial implementation is available [here](./mcp_deep_research_with_iam.py).

Details are available [here](./integrate_chatgpt.md).
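
The expected shape of such a server is roughly the following (a hypothetical sketch only; check the OpenAI documentation for the exact result schema):

```python
# Hypothetical sketch of the two tools expected by the OpenAI / Deep Research MCP spec;
# the result schema shown here is indicative, not authoritative.
from fastmcp import FastMCP

mcp = FastMCP("semantic-search")


@mcp.tool()
def search(query: str) -> dict:
    """Return a list of matching documents (id, title, url) for the query."""
    # in the real server this would query the 23AI vector store
    return {
        "results": [
            {"id": "doc-1", "title": "Sample chunk", "url": "https://example.com/doc-1"}
        ]
    }


@mcp.tool()
def fetch(id: str) -> dict:
    """Return the full content of a single document, given its id."""
    return {
        "id": id,
        "title": "Sample chunk",
        "text": "Full text of the document...",
        "url": "https://example.com/doc-1",
    }


if __name__ == "__main__":
    mcp.run(transport="streamable-http", host="0.0.0.0", port=9000)
```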

## Integrate OCI ADB Select AI
Another option is to use an MCP server to integrate OCI **SelectAI** into ChatGPT or other assistants supporting MCP.
This way, you can run full **Text2SQL** queries over your database schema, and the AI assistant can then process the retrieved data.

An example is available [here](./mcp_selectai.py).
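
A rough sketch of how a tool could forward a natural-language question to Select AI; it relies on the documented `DBMS_CLOUD_AI.GENERATE` entry point, and the config names match `config.py` and the private config template:

```python
# Rough sketch: forward a natural-language question to Select AI via DBMS_CLOUD_AI.GENERATE.
import oracledb

from config import SELECT_AI_PROFILE
from config_private import CONNECT_ARGS


def ask_select_ai(question: str, action: str = "runsql") -> str:
    """Run a Text2SQL question through Select AI and return the result."""
    with oracledb.connect(**CONNECT_ARGS) as conn:
        with conn.cursor() as cur:
            result = cur.callfunc(
                "DBMS_CLOUD_AI.GENERATE",
                oracledb.DB_TYPE_CLOB,
                keyword_parameters={
                    "prompt": question,
                    "profile_name": SELECT_AI_PROFILE,
                    "action": action,
                },
            )
            return result.read() if result is not None else ""
```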


6 changes: 6 additions & 0 deletions ai/gen-ai-agents/mcp-oci-integration/check_code.sh
@@ -0,0 +1,6 @@
#!/bin/bash

# format code
black *.py

# check
pylint *.py

107 changes: 107 additions & 0 deletions ai/gen-ai-agents/mcp-oci-integration/config.py
@@ -0,0 +1,107 @@
"""
File name: config.py
Author: Luigi Saetta
Date last modified: 2025-07-02
Python Version: 3.11

Description:
This module provides general configurations


Usage:
Import this module into other scripts to use its functions.
Example:
import config

License:
This code is released under the MIT License.

Notes:
This is a part of a demo showing how to implement an advanced
RAG solution as a LangGraph agent.

Warnings:
This module is in development, may change in future versions.
"""

DEBUG = False

# type of OCI auth
AUTH = "API_KEY"

# embeddings
# added this to distinguish between Cohere and REST NVIDIA models
# can be OCI or NVIDIA
EMBED_MODEL_TYPE = "OCI"
# EMBED_MODEL_TYPE = "NVIDIA"
EMBED_MODEL_ID = "cohere.embed-multilingual-v3.0"

# this one needs to specify the dimension, default is 1536
# EMBED_MODEL_ID = "cohere.embed-v4.0"
# used only for NVIDIA models
NVIDIA_EMBED_MODEL_URL = ""


# LLM
# this is the default model
LLM_MODEL_ID = "meta.llama-3.3-70b-instruct"
TEMPERATURE = 0.1
MAX_TOKENS = 4000

# OCI general
# REGION = "eu-frankfurt-1"
REGION = "us-chicago-1"
SERVICE_ENDPOINT = f"https://inference.generativeai.{REGION}.oci.oraclecloud.com"

if REGION == "us-chicago-1":
    # for now only available in chicago region
    MODEL_LIST = [
        "xai.grok-3",
        "xai.grok-4",
        "openai.gpt-4.1",
        "openai.gpt-4o",
        "openai.gpt-5",
        "meta.llama-3.3-70b-instruct",
        "cohere.command-a-03-2025",
    ]
else:
    MODEL_LIST = [
        "openai.gpt-4.1",
        "openai.gpt-5",
        "meta.llama-3.3-70b-instruct",
        "cohere.command-a-03-2025",
    ]

# semantic search
TOP_K = 6
COLLECTION_LIST = ["BOOKS", "NVIDIA_BOOKS2"]
DEFAULT_COLLECTION = "BOOKS"


# history management (put -1 if you want to disable trimming)
# consider that we have (human, ai) pairs, so use an even value (e.g. 6)
MAX_MSGS_IN_HISTORY = 6

# reranking enabled or disabled from UI

# for loading
CHUNK_SIZE = 4000
CHUNK_OVERLAP = 100

# for MCP server
TRANSPORT = "streamable-http"
# bind to all interfaces
HOST = "0.0.0.0"
PORT = 9000

# with this we can toggle JWT token auth
ENABLE_JWT_TOKEN = False
# for JWT token with OCI
# put your domain URL here
IAM_BASE_URL = "https://idcs-930d7b2ea2cb46049963ecba3049f509.identity.oraclecloud.com"
# these are used during the verification of the token
ISSUER = "https://identity.oraclecloud.com/"
AUDIENCE = ["urn:opc:lbaas:logicalguid=idcs-930d7b2ea2cb46049963ecba3049f509"]

# for Select AI
SELECT_AI_PROFILE = "OCI_GENERATIVE_AI_PROFILE"
33 changes: 33 additions & 0 deletions ai/gen-ai-agents/mcp-oci-integration/config_private_template.py
@@ -0,0 +1,33 @@
"""
Private config
"""

#
VECTOR_DB_USER = "your-db-user"
VECTOR_DB_PWD = "your-db-pwd"

VECTOR_WALLET_PWD = "wallet-pwd"
VECTOR_DSN = "your-db-dsn"
VECTOR_WALLET_DIR = "/Users/xxx/yyy"

CONNECT_ARGS = {
    "user": VECTOR_DB_USER,
    "password": VECTOR_DB_PWD,
    "dsn": VECTOR_DSN,
    "config_dir": VECTOR_WALLET_DIR,
    "wallet_location": VECTOR_WALLET_DIR,
    "wallet_password": VECTOR_WALLET_PWD,
}

COMPARTMENT_ID = "ocid1.compartment.oc1.your-compartment-ocid"

# to add JWT to MCP server
JWT_SECRET = "secret"
# using this in the demo to keep it simpler.
# In production you should switch to RS256 and use a key pair
JWT_ALGORITHM = "HS256"

# if using IAM to generate JWT token
OCI_CLIENT_ID = "client-id"
# the OCID of the secret in the vault
SECRET_OCID = "ocid1.vaultsecret.oc1.eu-frankfurt-1.secret-ocid"
123 changes: 123 additions & 0 deletions ai/gen-ai-agents/mcp-oci-integration/custom_rest_embeddings.py
@@ -0,0 +1,123 @@
"""
Custom class to support embedding models deployed using NVIDIA NIM.

License: MIT
"""

from typing import List
from langchain_core.embeddings import Embeddings
import requests
from utils import get_console_logger

# list of allowed values for dims, input_type and truncate params
ALLOWED_DIMS = {384, 512, 768, 1024, 2048}
ALLOWED_INPUT_TYPES = {"passage", "query"}
ALLOWED_TRUNCATE_VALUES = {"NONE", "START", "END"}

# list of models with tunable dimensions
MATRIOSKA_MODELS = {"nvidia/llama-3.2-nv-embedqa-1b-v2"}

logger = get_console_logger()


class CustomRESTEmbeddings(Embeddings):
    """
    Custom class to wrap an embedding model with REST interface from NVIDIA NIM

    see:
    https://docs.api.nvidia.com/nim/reference/nvidia-llama-3_2-nv-embedqa-1b-v2-infer
    """

    def __init__(self, api_url: str, model: str, batch_size: int = 10, dimensions=2048):
        """
        Init

        as of now, no security
        args:
            api_url: the endpoint
            model: the model id string
            batch_size
            dimensions: dim of the embedding vector
        """
        self.api_url = api_url
        self.model = model
        self.batch_size = batch_size

        if self.model in MATRIOSKA_MODELS:
            self.dimensions = dimensions
        else:
            # changing dimensions is not supported
            self.dimensions = None

        # Validation at init time
        if self.dimensions is not None and self.dimensions not in ALLOWED_DIMS:
            raise ValueError(
                f"Invalid dimensions {self.dimensions!r}: must be one of {sorted(ALLOWED_DIMS)}"
            )

    def embed_documents(
        self,
        texts: List[str],
        # must be passage and not document
        input_type: str = "passage",
        truncate: str = "NONE",
    ) -> List[List[float]]:
        """
        Embed a list of documents using batching.
        """
        # normalize
        truncate = truncate.upper()

        logger.info("Calling NVIDIA embeddings, embed_documents...")

        if input_type not in ALLOWED_INPUT_TYPES:
            raise ValueError(
                f"Invalid value for input_type: must be one of {ALLOWED_INPUT_TYPES}"
            )
        if truncate not in ALLOWED_TRUNCATE_VALUES:
            raise ValueError(
                f"Invalid value for truncate: must be one of {ALLOWED_TRUNCATE_VALUES}"
            )

        all_embeddings: List[List[float]] = []

        for i in range(0, len(texts), self.batch_size):
            batch = texts[i : i + self.batch_size]
            # process a single batch
            if self.model in MATRIOSKA_MODELS:
                json_request = {
                    "model": self.model,
                    "input": batch,
                    "input_type": input_type,
                    "truncate": truncate,
                    "dimensions": self.dimensions,
                }
            else:
                # for non-Matrioska models the dimension cannot be changed,
                # so the "dimensions" field is not sent
                json_request = {
                    "model": self.model,
                    "input": batch,
                    "input_type": input_type,
                    "truncate": truncate,
                }

            resp = requests.post(
                self.api_url,
                json=json_request,
                timeout=30,
            )
            resp.raise_for_status()
            data = resp.json().get("data", [])

            if len(data) != len(batch):
                raise ValueError(f"Expected {len(batch)} embeddings, got {len(data)}")
            all_embeddings.extend(item["embedding"] for item in data)

        return all_embeddings

    def embed_query(self, text: str) -> List[float]:
        """
        Embed the query (a str)
        """
        logger.info("Calling NVIDIA embeddings, embed_query...")

        return self.embed_documents([text], input_type="query")[0]