84 changes: 84 additions & 0 deletions ai/gen-ai-agents/mcp-oci-integration/README.md
@@ -0,0 +1,84 @@
# MCP Oracle OCI integrations
This repository contains code and examples to help in the following tasks:
* **Develop** MCP servers in **Python**
* **Run** MCP servers on **Oracle OCI**
* **Integrate** MCP servers with **AI Agents**
* **Integrate** MCP servers with other **OCI resources** (ADB, Select AI, ...)
* **Integrate** MCP Servers running on OCI with AI Assistants like **ChatGPT**, Claude.ai, MS Copilot
* **Integrate** MCP Servers with OCI **APM** for **Observability**

![MCP console](./images/mcp_cli.png)

## What is MCP?
**MCP (Model Context Protocol)** is an **open-source standard** that lets AI models (e.g. LLMs or agents) connect bidirectionally with external tools, data sources, and services via a unified interface.

It replaces the “N×M” integration problem (where each AI × data source requires custom code) with one standard protocol.

MCP supports **dynamic discovery** of available tools and context, enabling:
* AI Assistants to access relevant information available in an enterprise knowledge base.
* Agents to reason and chain actions across disparate systems.

It’s quickly gaining traction: major players like OpenAI, Google DeepMind, and Oracle are adopting it to make AI systems more composable and interoperable.

In today’s landscape of agentic AI, MCP is critical because it allows models to act meaningfully in real-world systems rather than remaining isolated black boxes.

## Develop MCP Servers in Python
The easiest way is to use the [FastMCP](https://gofastmcp.com/getting-started/welcome) library.

**Examples**:
* In [Minimal MCP Server](./minimal_mcp_server.py) you'll find a **good, minimal example** of a server exposing two tools, with the option to protect them using [JWT](https://www.jwt.io/introduction#what-is-json-web-token).

If you want to start with **something simpler**, have a look at [how to start developing MCP](./how_to_start_mcp.md), which covers a simpler setup without JWT support.
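
To give an idea of the shape of such a server, here is a minimal, hypothetical sketch built with FastMCP (the tool names and the port are illustrative, not taken from this repository; see the files above for the actual examples):

```python
# Minimal, hypothetical FastMCP server sketch (tool names and port are illustrative).
from fastmcp import FastMCP

mcp = FastMCP("demo-server")


@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


@mcp.tool()
def greet(name: str) -> str:
    """Return a greeting for the given name."""
    return f"Hello, {name}!"


if __name__ == "__main__":
    # streamable-http is the same transport configured in config.py
    mcp.run(transport="streamable-http", host="0.0.0.0", port=9000)
```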

## How to test
If you want to quickly test the MCP server you developed (or the minimal example provided here) you can use the [Streamlit UI](./ui_mcp_agent.py).

In the Streamlit application, you can:
* Specify the URL of the MCP server (default is in [mcp_servers_config.py](./mcp_servers_config.py))
* Select one of the models available in OCI Generative AI
* Ask questions that are answered using the tools exposed by the MCP server.

The file [llm_with_mcp.py](./llm_with_mcp.py) contains the complete implementation of the **tool-calling** loop.
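
If you want to exercise the server without the UI or the LLM, a few lines of plain Python are enough. Below is a hypothetical sketch using the FastMCP client (the URL, path, and tool name are assumptions; adapt them to your server):

```python
# Hypothetical sketch: list and call tools on a running MCP server with the FastMCP client.
import asyncio

from fastmcp import Client


async def main():
    # URL of the locally running MCP server (streamable-http transport)
    async with Client("http://localhost:9000/mcp") as client:
        # discover the tools exposed by the server
        tools = await client.list_tools()
        print("Available tools:", [t.name for t in tools])

        # call one of them (tool name and arguments are illustrative)
        result = await client.call_tool("add", {"a": 2, "b": 3})
        print("Result:", result)


if __name__ == "__main__":
    asyncio.run(main())
```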

## Semantic Search
This repository also contains a **complete MCP server** implementing **Semantic Search** on top of **Oracle 23AI**.
To use it, you only need to:
* Load the documents into the Oracle DB
* Put the DB connection configuration in config_private.py.

The code is available [here](./mcp_semantic_search_with_iam.py).

Access to Oracle 23AI Vector Search goes through the **new** [langchain-oci integration library](https://github.com/oracle/langchain-oracle).
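
To give an idea of the retrieval flow, here is a hedged sketch using the older `langchain_community` class names; the actual server uses the newer langchain-oci / langchain-oracle packages, so the imports may differ:

```python
# Hedged sketch of the semantic-search flow (class names from langchain_community;
# the repository's server uses the newer langchain-oci / langchain-oracle packages).
import oracledb
from langchain_community.embeddings import OCIGenAIEmbeddings
from langchain_community.vectorstores.oraclevs import OracleVS
from langchain_community.vectorstores.utils import DistanceStrategy

from config import EMBED_MODEL_ID, SERVICE_ENDPOINT, DEFAULT_COLLECTION, TOP_K
from config_private import CONNECT_ARGS, COMPARTMENT_ID

# wallet-based connection to the Oracle 23AI database
connection = oracledb.connect(**CONNECT_ARGS)

embeddings = OCIGenAIEmbeddings(
    model_id=EMBED_MODEL_ID,
    service_endpoint=SERVICE_ENDPOINT,
    compartment_id=COMPARTMENT_ID,
)

# the collection is the table holding the text chunks and their embeddings
vector_store = OracleVS(
    client=connection,
    embedding_function=embeddings,
    table_name=DEFAULT_COLLECTION,
    distance_strategy=DistanceStrategy.COSINE,
)

# top-k semantic search
for doc in vector_store.similarity_search("What is an MCP server?", k=TOP_K):
    print(doc.page_content[:200])
```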

## Adding security
If you want to put your **MCP** server in production, you need to add security at several levels.

Just to mention a few important points:
* You don't want to expose the MCP server directly over the Internet
* Communication with the MCP server must be encrypted (e.g., using TLS)
* You want to authenticate and authorize the clients

Using **OCI services**, there are several things you can do to get the right level of security:
* You can put an **OCI API Gateway** in front, using it for TLS termination
* You can enable authentication using **JWT** tokens
* You can use **OCI IAM** to generate **JWT** tokens
* You can use OCI network security

More details in a dedicated page.
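
As a flavour of the JWT part, here is a minimal, illustrative validation sketch with PyJWT, based on the values in `config.py` and `config_private_template.py`; it is not the repository's implementation:

```python
# Minimal, illustrative JWT validation sketch with PyJWT (pip install pyjwt).
# It uses the HS256 shared secret from the private config; with OCI IAM you would
# typically switch to RS256 and validate against the identity domain's JWKS endpoint.
import jwt  # PyJWT

from config import ISSUER, AUDIENCE
from config_private import JWT_SECRET, JWT_ALGORITHM


def verify_token(token: str) -> dict:
    """Decode and validate a bearer token; raises jwt.PyJWTError on failure."""
    return jwt.decode(
        token,
        JWT_SECRET,
        algorithms=[JWT_ALGORITHM],
        issuer=ISSUER,
        audience=AUDIENCE,
    )
```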

## Integrate MCP Semantic Search with ChatGPT
If you deploy the [MCP Semantic Search](./mcp_semantic_search_with_iam.py) server, you can test the integration with **ChatGPT** in **Developer Mode**: the server provides a **search** tool compliant with the **OpenAI** specs.

Soon, we'll add a server fully compliant with the **OpenAI** specifications, which can be integrated with **Deep Research**. Such a server must implement two methods (**search** and **fetch**), strictly following the OpenAI specs.

An initial implementation is available [here](./mcp_deep_research_with_iam.py).

Details are available [here](./integrate_chatgpt.md).
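
The expected shape of such a server is roughly the following (a hypothetical sketch only; check the OpenAI documentation for the exact result schema):

```python
# Hypothetical sketch of the two tools expected by the OpenAI / Deep Research MCP spec;
# the result schema shown here is indicative, not authoritative.
from fastmcp import FastMCP

mcp = FastMCP("semantic-search")


@mcp.tool()
def search(query: str) -> dict:
    """Return a list of matching documents (id, title, url) for the query."""
    # in the real server this would query the 23AI vector store
    return {
        "results": [
            {"id": "doc-1", "title": "Sample chunk", "url": "https://example.com/doc-1"}
        ]
    }


@mcp.tool()
def fetch(id: str) -> dict:
    """Return the full content of a single document, given its id."""
    return {
        "id": id,
        "title": "Sample chunk",
        "text": "Full text of the document...",
        "url": "https://example.com/doc-1",
    }


if __name__ == "__main__":
    mcp.run(transport="streamable-http", host="0.0.0.0", port=9000)
```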

## Integrate OCI ADB Select AI
Another option is to use an MCP server to integrate OCI **SelectAI** into ChatGPT or other assistants supporting MCP.
This way, you can run full **Text2SQL** queries over your database schema, and the AI assistant can then process the retrieved data.

An example is available [here](./mcp_selectai.py).
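
A rough sketch of how a tool could forward a natural-language question to Select AI; it relies on the documented `DBMS_CLOUD_AI.GENERATE` entry point, and the config names match `config.py` and the private config template:

```python
# Rough sketch: forward a natural-language question to Select AI via DBMS_CLOUD_AI.GENERATE.
import oracledb

from config import SELECT_AI_PROFILE
from config_private import CONNECT_ARGS


def ask_select_ai(question: str, action: str = "runsql") -> str:
    """Run a Text2SQL question through Select AI and return the result."""
    with oracledb.connect(**CONNECT_ARGS) as conn:
        with conn.cursor() as cur:
            result = cur.callfunc(
                "DBMS_CLOUD_AI.GENERATE",
                oracledb.DB_TYPE_CLOB,
                keyword_parameters={
                    "prompt": question,
                    "profile_name": SELECT_AI_PROFILE,
                    "action": action,
                },
            )
            return result.read() if result is not None else ""
```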


6 changes: 6 additions & 0 deletions ai/gen-ai-agents/mcp-oci-integration/check_code.sh
@@ -0,0 +1,6 @@
#!/bin/bash

# format code
black *.py

# check
pylint *.py

107 changes: 107 additions & 0 deletions ai/gen-ai-agents/mcp-oci-integration/config.py
@@ -0,0 +1,107 @@
"""
File name: config.py
Author: Luigi Saetta
Date last modified: 2025-07-02
Python Version: 3.11

Description:
This module provides general configurations


Usage:
Import this module into other scripts to use its functions.
Example:
import config

License:
This code is released under the MIT License.

Notes:
This is a part of a demo showing how to implement an advanced
RAG solution as a LangGraph agent.

Warnings:
This module is in development, may change in future versions.
"""

DEBUG = False

# type of OCI auth
AUTH = "API_KEY"

# embeddings
# added this to distinguish between Cohere and REST NVIDIA models
# can be OCI or NVIDIA
EMBED_MODEL_TYPE = "OCI"
# EMBED_MODEL_TYPE = "NVIDIA"
EMBED_MODEL_ID = "cohere.embed-multilingual-v3.0"

# this one needs to specify the dimension, default is 1536
# EMBED_MODEL_ID = "cohere.embed-v4.0"
# used only for NVIDIA models
NVIDIA_EMBED_MODEL_URL = ""


# LLM
# this is the default model
LLM_MODEL_ID = "meta.llama-3.3-70b-instruct"
TEMPERATURE = 0.1
MAX_TOKENS = 4000

# OCI general
# REGION = "eu-frankfurt-1"
REGION = "us-chicago-1"
SERVICE_ENDPOINT = f"https://inference.generativeai.{REGION}.oci.oraclecloud.com"

if REGION == "us-chicago-1":
    # for now only available in chicago region
    MODEL_LIST = [
        "xai.grok-3",
        "xai.grok-4",
        "openai.gpt-4.1",
        "openai.gpt-4o",
        "openai.gpt-5",
        "meta.llama-3.3-70b-instruct",
        "cohere.command-a-03-2025",
    ]
else:
    MODEL_LIST = [
        "openai.gpt-4.1",
        "openai.gpt-5",
        "meta.llama-3.3-70b-instruct",
        "cohere.command-a-03-2025",
    ]

# semantic search
TOP_K = 6
COLLECTION_LIST = ["BOOKS", "NVIDIA_BOOKS2"]
DEFAULT_COLLECTION = "BOOKS"


# history management (put -1 if you want to disable trimming)
# consider that we have (human, ai) pairs, so use an even value (e.g. 6)
MAX_MSGS_IN_HISTORY = 6

# reranking enabled or disabled from UI

# for loading
CHUNK_SIZE = 4000
CHUNK_OVERLAP = 100

# for MCP server
TRANSPORT = "streamable-http"
# bind to all interfaces
HOST = "0.0.0.0"
PORT = 9000

# with this we can toggle JWT token auth
ENABLE_JWT_TOKEN = False
# for JWT token with OCI
# put your domain URL here
IAM_BASE_URL = "https://idcs-930d7b2ea2cb46049963ecba3049f509.identity.oraclecloud.com"
# these are used during the verification of the token
ISSUER = "https://identity.oraclecloud.com/"
AUDIENCE = ["urn:opc:lbaas:logicalguid=idcs-930d7b2ea2cb46049963ecba3049f509"]

# for Select AI
SELECT_AI_PROFILE = "OCI_GENERATIVE_AI_PROFILE"
33 changes: 33 additions & 0 deletions ai/gen-ai-agents/mcp-oci-integration/config_private_template.py
@@ -0,0 +1,33 @@
"""
Private config
"""

#
VECTOR_DB_USER = "your-db-user"
VECTOR_DB_PWD = "your-db-pwd"

VECTOR_WALLET_PWD = "wallet-pwd"
VECTOR_DSN = "your-db-dsn"
VECTOR_WALLET_DIR = "/Users/xxx/yyy"

CONNECT_ARGS = {
    "user": VECTOR_DB_USER,
    "password": VECTOR_DB_PWD,
    "dsn": VECTOR_DSN,
    "config_dir": VECTOR_WALLET_DIR,
    "wallet_location": VECTOR_WALLET_DIR,
    "wallet_password": VECTOR_WALLET_PWD,
}

COMPARTMENT_ID = "ocid1.compartment.oc1.your-compartment-ocid"

# to add JWT to MCP server
JWT_SECRET = "secret"
# using this in the demo to keep it simpler.
# In production you should switch to RS256 and use a key pair
JWT_ALGORITHM = "HS256"

# if using IAM to generate JWT token
OCI_CLIENT_ID = "client-id"
# the OCID of the secret in the vault
SECRET_OCID = "ocid1.vaultsecret.oc1.eu-frankfurt-1.secret-ocid"
123 changes: 123 additions & 0 deletions ai/gen-ai-agents/mcp-oci-integration/custom_rest_embeddings.py
@@ -0,0 +1,123 @@
"""
Custom class to support embedding models deployed using NVIDIA NIM.

License: MIT
"""

from typing import List
from langchain_core.embeddings import Embeddings
import requests
from utils import get_console_logger

# list of allowed values for dims, input_type and truncate params
ALLOWED_DIMS = {384, 512, 768, 1024, 2048}
ALLOWED_INPUT_TYPES = {"passage", "query"}
ALLOWED_TRUNCATE_VALUES = {"NONE", "START", "END"}

# list of models with tunable dimensions
MATRIOSKA_MODELS = {"nvidia/llama-3.2-nv-embedqa-1b-v2"}

logger = get_console_logger()


class CustomRESTEmbeddings(Embeddings):
    """
    Custom class to wrap an embedding model with REST interface from NVIDIA NIM

    see:
    https://docs.api.nvidia.com/nim/reference/nvidia-llama-3_2-nv-embedqa-1b-v2-infer
    """

    def __init__(self, api_url: str, model: str, batch_size: int = 10, dimensions=2048):
        """
        Init

        as of now, no security
        args:
            api_url: the endpoint
            model: the model id string
            batch_size
            dimensions: dim of the embedding vector
        """
        self.api_url = api_url
        self.model = model
        self.batch_size = batch_size

        if self.model in MATRIOSKA_MODELS:
            self.dimensions = dimensions
        else:
            # changing dimensions is not supported
            self.dimensions = None

        # Validation at init time
        if self.dimensions is not None and self.dimensions not in ALLOWED_DIMS:
            raise ValueError(
                f"Invalid dimensions {self.dimensions!r}: must be one of {sorted(ALLOWED_DIMS)}"
            )

    def embed_documents(
        self,
        texts: List[str],
        # must be passage and not document
        input_type: str = "passage",
        truncate: str = "NONE",
    ) -> List[List[float]]:
        """
        Embed a list of documents using batching.
        """
        # normalize
        truncate = truncate.upper()

        logger.info("Calling NVIDIA embeddings, embed_documents...")

        if input_type not in ALLOWED_INPUT_TYPES:
            raise ValueError(
                f"Invalid value for input_type: must be one of {ALLOWED_INPUT_TYPES}"
            )
        if truncate not in ALLOWED_TRUNCATE_VALUES:
            raise ValueError(
                f"Invalid value for truncate: must be one of {ALLOWED_TRUNCATE_VALUES}"
            )

        all_embeddings: List[List[float]] = []

        for i in range(0, len(texts), self.batch_size):
            batch = texts[i : i + self.batch_size]
            # process a single batch
            if self.model in MATRIOSKA_MODELS:
                json_request = {
                    "model": self.model,
                    "input": batch,
                    "input_type": input_type,
                    "truncate": truncate,
                    "dimensions": self.dimensions,
                }
            else:
                # for non-Matrioska models the dimension cannot be changed,
                # so the "dimensions" field is not sent
                json_request = {
                    "model": self.model,
                    "input": batch,
                    "input_type": input_type,
                    "truncate": truncate,
                }

            resp = requests.post(
                self.api_url,
                json=json_request,
                timeout=30,
            )
            resp.raise_for_status()
            data = resp.json().get("data", [])

            if len(data) != len(batch):
                raise ValueError(f"Expected {len(batch)} embeddings, got {len(data)}")
            all_embeddings.extend(item["embedding"] for item in data)

        return all_embeddings

    def embed_query(self, text: str) -> List[float]:
        """
        Embed the query (a str)
        """
        logger.info("Calling NVIDIA embeddings, embed_query...")

        return self.embed_documents([text], input_type="query")[0]