
Commit 50b208c

Add OCI LangChain support for hosted Nemotron workflows
Signed-off-by: Federico Kamelhar <federico.kamelhar@oracle.com>
1 parent f51c41c commit 50b208c

File tree

19 files changed (+1586, -818 lines)


docs/source/build-workflows/llms/index.md

Lines changed: 41 additions & 0 deletions
@@ -28,6 +28,7 @@ NVIDIA NeMo Agent Toolkit supports the following LLM providers:
 | [OpenAI](https://openai.com) | `openai` | OpenAI API |
 | [AWS Bedrock](https://aws.amazon.com/bedrock/) | `aws_bedrock` | AWS Bedrock API |
 | [Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/quickstart) | `azure_openai` | Azure OpenAI API |
+| [OCI Generative AI](https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm) | `oci` | OCI Generative AI via the OCI SDK-backed `langchain-oci` integration |
 | [LiteLLM](https://github.com/BerriAI/litellm) | `litellm` | LiteLLM API |
 | [Hugging Face](https://huggingface.co) | `huggingface` | Hugging Face API |
 | [Hugging Face Inference](https://huggingface.co/docs/api-inference) | `huggingface_inference` | Hugging Face Inference API, Endpoints, and TGI |
@@ -52,6 +53,15 @@ llms:
   azure_openai_llm:
     _type: azure_openai
     azure_deployment: gpt-4o-mini
+  oci_llm:
+    _type: oci
+    model_name: nvidia/Llama-3.1-Nemotron-Nano-8B-v1
+    endpoint: https://inference.generativeai.us-chicago-1.oci.oraclecloud.com
+    compartment_id: ocid1.compartment.oc1..example
+    auth_type: API_KEY
+    auth_profile: DEFAULT
+    auth_file_location: ~/.oci/config
+    provider: meta
   litellm_llm:
     _type: litellm
     model_name: gpt-4o
@@ -118,6 +128,37 @@ The AWS Bedrock LLM provider is defined by the {py:class}`~nat.llm.aws_bedrock_l
 * `credentials_profile_name` - The credentials profile name to use for the model
 * `max_retries` - The maximum number of retries for the request
 
+### OCI Generative AI
+
+You can use the following fields to configure the OCI Generative AI LLM provider:
+
+* `endpoint` - The OCI Generative AI regional or dedicated endpoint URL
+* `compartment_id` - The OCI compartment OCID used for inference requests
+* `auth_type` - OCI SDK auth mode such as `API_KEY`, `SECURITY_TOKEN`, `INSTANCE_PRINCIPAL`, or `RESOURCE_PRINCIPAL`
+* `auth_profile` - OCI config profile name for file-backed auth
+* `auth_file_location` - Path to the OCI config file
+* `provider` - Optional provider override such as `meta`, `google`, `cohere`, or `openai`
+
+The OCI Generative AI LLM provider is defined by the {py:class}`~nat.llm.oci_llm.OCIModelConfig` class.
+
+* `model_name` - The name of the model to use
+* `endpoint` - The OCI Generative AI endpoint URL
+* `compartment_id` - OCI compartment OCID
+* `auth_type` - OCI SDK auth type
+* `auth_profile` - OCI profile name for file-backed auth
+* `auth_file_location` - Path to the OCI config file
+* `provider` - Optional OCI provider override such as `meta`, `google`, `cohere`, or `openai`
+* `temperature` - The temperature to use for the model
+* `top_p` - The top-p value to use for the model
+* `max_tokens` - The maximum number of tokens to generate
+* `seed` - The seed to use for the model
+* `max_retries` - The maximum number of retries for the request
+* `request_timeout` - HTTP request timeout in seconds
+
+:::{note}
+This provider targets OCI Generative AI through the OCI SDK-backed `langchain-oci` path and does not enable the Responses API.
+:::
+
 ### Azure OpenAI
 
 You can use the following environment variables to configure the Azure OpenAI LLM provider:

docs/source/components/integrations/index.md

Lines changed: 2 additions & 1 deletion
@@ -23,4 +23,5 @@ limitations under the License.
 ./frameworks.md
 ./a2a.md
 AWS Bedrock <./integrating-aws-bedrock-models.md>
-```
+OCI Generative AI <./integrating-oci-generative-ai-models.md>
+```
docs/source/components/integrations/integrating-oci-generative-ai-models.md

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
+<!--
+SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+SPDX-License-Identifier: Apache-2.0
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# NVIDIA NeMo Agent Toolkit OCI Integration
+
+The NeMo Agent Toolkit supports integration with multiple [LLM](../../build-workflows/llms/index.md) providers, including OCI Generative AI. The `oci` provider uses OCI SDK authentication and is designed for OCI Generative AI model and endpoint access. For workflow parity with the AWS Bedrock path, the toolkit also includes a LangChain wrapper built on `langchain-oci`.
+
+To view the full list of supported LLM providers, run `nat info components -t llm_provider`.
+
+## Configuration
+
+### Prerequisites
+Before integrating OCI, ensure you have:
+
+- access to OCI Generative AI in the target region
+- a valid OCI auth method such as `API_KEY`, `SECURITY_TOKEN`, `INSTANCE_PRINCIPAL`, or `RESOURCE_PRINCIPAL`
+- the target compartment OCID
+- the Generative AI service endpoint for the region or a custom endpoint URL
+
+Common deployment patterns include:
+
+- OCI Generative AI regional endpoints
+- custom OCI Generative AI endpoints
+- OCI-hosted inference for NVIDIA Nemotron used as a live integration target
+
+### Example Configuration
+Add the OCI LLM configuration to your workflow config file:
+
+```yaml
+llms:
+  oci_llm:
+    _type: oci
+    model_name: nvidia/Llama-3.1-Nemotron-Nano-8B-v1
+    endpoint: https://inference.generativeai.us-chicago-1.oci.oraclecloud.com
+    compartment_id: ocid1.compartment.oc1..example
+    auth_type: API_KEY
+    auth_profile: DEFAULT
+    temperature: 0.0
+    max_tokens: 1024
+    top_p: 1.0
+    request_timeout: 60
+```
+
+### Configurable Options
+* `model_name`: The name of the OCI-hosted model to use (required)
+* `endpoint`: The OCI Generative AI service endpoint or custom endpoint URL
+* `compartment_id`: OCI compartment OCID
+* `auth_type`: OCI SDK auth type
+* `auth_profile`: OCI profile name for file-backed auth
+* `auth_file_location`: Path to the OCI config file
+* `provider`: Optional OCI provider override such as `meta`, `google`, `cohere`, or `openai`
+* `temperature`: Controls randomness in the output (0.0 to 1.0)
+* `max_tokens`: Maximum number of tokens to generate
+* `top_p`: Top-p sampling parameter (0.0 to 1.0)
+* `seed`: Optional random seed
+* `max_retries`: Maximum number of retries for the request
+* `request_timeout`: HTTP request timeout in seconds
+
+### Limitations
+* This provider targets OCI Generative AI through the OCI SDK-backed `langchain-oci` path.
+* The Responses API is not enabled for this provider in the current release.
+
+## Nemotron on OCI
+
+A common OCI deployment pattern is NVIDIA Nemotron hosted on OCI and exposed through an OpenAI-compatible route. In that setup, the toolkit can validate live integration behavior against the OCI-hosted Nemotron endpoint while the official provider and LangChain wrapper cover the OCI Generative AI path.
+
+## Usage
+Reference the OCI LLM in your configuration:
+
+```yaml
+llms:
+  oci_llm:
+    _type: oci
+    model_name: nvidia/Llama-3.1-Nemotron-Nano-8B-v1
+    endpoint: https://inference.generativeai.us-chicago-1.oci.oraclecloud.com
+    compartment_id: ocid1.compartment.oc1..example
+    auth_profile: DEFAULT
+```
+
+## Troubleshooting
+* `401 Unauthorized`: verify the OCI profile, signer, and IAM permissions for Generative AI.
+* `404 Not Found`: confirm the regional endpoint or custom endpoint URL is correct.
+* `Connection errors`: verify OCI networking and regional endpoint reachability.
+* `Tool calling issues`: verify the served model supports tool calling and that the serving stack is configured for it.
docs/source/conf.py

Lines changed: 2 additions & 0 deletions
@@ -379,6 +379,8 @@ def _build_api_tree() -> Path:
         '/extend/custom-components/gated-fields.html',
     'extend/integrating-aws-bedrock-models':
         '/components/integrations/integrating-aws-bedrock-models.html',
+    'extend/integrating-oci-generative-ai-models':
+        '/components/integrations/integrating-oci-generative-ai-models.html',
     'extend/memory':
         '/extend/custom-components/memory.html',
     'extend/object-store':
docs/source/get-started/installation.md

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ The following [LLM](../build-workflows/llms/index.md) API providers are supported:
 - OpenAI
 - AWS Bedrock
 - Azure OpenAI
+- OCI Generative AI
 
 ## Packages
examples/frameworks/agno_personal_finance/pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ classifiers = ["Programming Language :: Python"]
 [tool.setuptools_dynamic_dependencies]
 dependencies = [
     "nvidia-nat[agno,test] == {version}",
-    "openai~=1.106",
+    "openai>=1.106,<3.0.0",
 ]
 
 [tool.uv.sources]
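This pin change (repeated in the other `pyproject.toml` files below) widens the allowed `openai` range: the compatible-release specifier `~=1.106` is equivalent to `>=1.106, <2.0` and so excludes the openai 2.x line, while `>=1.106,<3.0.0` admits both 1.x and 2.x. A rough stdlib sketch of the two specifiers (real resolvers use `packaging.specifiers.SpecifierSet`, and this ignores pre-release segments):

```python
# Approximate the two version specifiers with plain tuple comparison.
def ver(s: str) -> tuple[int, ...]:
    return tuple(int(p) for p in s.split("."))

def in_old_pin(v: str) -> bool:
    # openai~=1.106  is equivalent to  >=1.106, <2.0
    return ver("1.106") <= ver(v) < ver("2.0")

def in_new_pin(v: str) -> bool:
    # openai>=1.106,<3.0.0
    return ver("1.106") <= ver(v) < ver("3.0.0")

for v in ["1.106.0", "1.200.0", "2.5.1"]:
    print(v, in_old_pin(v), in_new_pin(v))
# 1.106.0 True True
# 1.200.0 True True
# 2.5.1 False True
```

The practical effect: installs can now pick up the openai 2.x SDK, which the old pin blocked.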

examples/frameworks/multi_frameworks/pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ dependencies = [
     "beautifulsoup4~=4.13",
     "markdown-it-py~=3.0",
     "nvidia-haystack~=0.3.0",
-    "openai~=1.106",
+    "openai>=1.106,<3.0.0",
 ]
 
 [tool.uv.sources]

packages/nvidia_nat_agno/pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -57,7 +57,7 @@ dependencies = [
     "agno>=1.2.3,<2.0.0",
     "google-search-results>=2.4.2,<3.0.0",
     "litellm~=1.74",
-    "openai~=1.106",
+    "openai>=1.106,<3.0.0",
 ]
 
 [tool.setuptools_dynamic_dependencies.optional-dependencies]
packages/nvidia_nat_core/src/nat/llm/oci_llm.py

Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
+# SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from collections.abc import AsyncIterator
+
+from pydantic import AliasChoices
+from pydantic import ConfigDict
+from pydantic import Field
+
+from nat.builder.builder import Builder
+from nat.builder.llm import LLMProviderInfo
+from nat.cli.register_workflow import register_llm_provider
+from nat.data_models.llm import LLMBaseConfig
+from nat.data_models.optimizable import OptimizableField
+from nat.data_models.optimizable import OptimizableMixin
+from nat.data_models.optimizable import SearchSpace
+from nat.data_models.retry_mixin import RetryMixin
+from nat.data_models.ssl_verification_mixin import SSLVerificationMixin
+from nat.data_models.thinking_mixin import ThinkingMixin
+
+class OCIModelConfig(LLMBaseConfig, RetryMixin, OptimizableMixin, ThinkingMixin, SSLVerificationMixin, name="oci"):
+    """OCI Generative AI LLM provider."""
+
+    model_config = ConfigDict(protected_namespaces=(), extra="allow")
+
+    endpoint: str | None = Field(
+        default=None,
+        validation_alias=AliasChoices("endpoint", "service_endpoint", "base_url"),
+        description="OCI Generative AI service endpoint URL.",
+    )
+    compartment_id: str | None = Field(default=None, description="OCI compartment OCID for Generative AI requests.")
+    auth_type: str = Field(default="API_KEY",
+                           description="OCI SDK authentication type: API_KEY, SECURITY_TOKEN, INSTANCE_PRINCIPAL, "
+                           "or RESOURCE_PRINCIPAL.")
+    auth_profile: str = Field(default="DEFAULT",
+                              description="OCI config profile to use for API_KEY or SECURITY_TOKEN auth.")
+    auth_file_location: str = Field(default="~/.oci/config",
+                                    description="Path to the OCI config file used for SDK authentication.")
+    model_name: str = OptimizableField(validation_alias=AliasChoices("model_name", "model"),
+                                       serialization_alias="model",
+                                       description="The OCI Generative AI model ID.")
+    provider: str | None = Field(default=None,
+                                 description="Optional OCI provider override such as cohere, google, meta, or openai.")
+    context_size: int | None = Field(
+        default=1024,
+        gt=0,
+        description="The maximum number of tokens available for input.",
+    )
+    seed: int | None = Field(default=None, description="Random seed to set for generation.")
+    max_retries: int = Field(default=10, description="The max number of retries for the request.")
+    max_tokens: int | None = Field(default=None, gt=0, description="Maximum number of output tokens.")
+    temperature: float | None = OptimizableField(
+        default=None,
+        ge=0.0,
+        description="Sampling temperature to control randomness in the output.",
+        space=SearchSpace(high=0.9, low=0.1, step=0.2))
+    top_p: float | None = OptimizableField(default=None,
+                                           ge=0.0,
+                                           le=1.0,
+                                           description="Top-p for distribution sampling.",
+                                           space=SearchSpace(high=1.0, low=0.5, step=0.1))
+    request_timeout: float | None = Field(default=None, gt=0.0, description="HTTP request timeout in seconds.")
+
+
+@register_llm_provider(config_type=OCIModelConfig)
+async def oci_llm(config: OCIModelConfig, _builder: Builder) -> AsyncIterator[LLMProviderInfo]:
+    """Yield provider metadata for an OCI Generative AI model.
+
+    Args:
+        config: OCI model configuration.
+        _builder: Builder instance.
+
+    Yields:
+        LLMProviderInfo describing the configured OCI model.
+    """
+
+    yield LLMProviderInfo(config=config, description="An OCI Generative AI model for use with an LLM client.")

packages/nvidia_nat_core/src/nat/llm/register.py

Lines changed: 1 addition & 0 deletions
@@ -27,4 +27,5 @@
 from . import huggingface_llm
 from . import litellm_llm
 from . import nim_llm
+from . import oci_llm
 from . import openai_llm
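The `register.py` change works by import side effect: importing `oci_llm` runs its `@register_llm_provider` decorator, which records the provider in a global registry. A minimal sketch of that pattern (the registry and decorator here are illustrative, not the toolkit's internals):

```python
# Illustrative registration-on-import pattern: the decorator runs when the
# module defining the provider is imported, populating a shared registry.
PROVIDERS: dict[str, object] = {}


def register_llm_provider(config_type: type):
    def decorator(fn):
        # Key the registry by the config's declared provider name.
        PROVIDERS[getattr(config_type, "name", config_type.__name__)] = fn
        return fn
    return decorator


class OCIModelConfig:  # stand-in config carrying the provider name
    name = "oci"


@register_llm_provider(config_type=OCIModelConfig)  # runs at import time
def oci_llm(config):
    return f"provider for {config.name}"


print(sorted(PROVIDERS))  # ['oci']
```

This is why adding one `from . import oci_llm` line is the entire wiring step: the act of importing the module is what makes `nat info components -t llm_provider` see the new provider.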

0 commit comments
