
Conversation

@dkennetzoracle

What does this PR do?

Adds OCI GenAI PaaS models for the OpenAI chat completion endpoints.

Test Plan

In an OCI tenancy with access to GenAI PaaS, perform the following steps:

  1. Ensure you have IAM policies in place to use the service (check the docs included in this PR).
  2. For local development, set up the OCI CLI and configure it with your region, tenancy, and auth (see the OCI CLI configuration docs).
  3. Once configured, go through the llama-stack setup and run llama-stack (which uses config-based auth), for example:
OCI_AUTH_TYPE=config_file OCI_CLI_PROFILE=CHICAGO OCI_REGION=us-chicago-1 OCI_COMPARTMENT_OCID=ocid1.compartment.oc1..aaaaaaaa5...5a llama stack run oci
  4. Once the server is running, hit the models endpoint to list models:
curl http://localhost:8321/v1/models | jq
...
{
  "identifier": "meta.llama-4-scout-17b-16e-instruct",
  "provider_resource_id": "ocid1.generativeaimodel.oc1.us-chicago-1.am...q",
  "provider_id": "oci",
  "type": "model",
  "metadata": {
    "display_name": "meta.llama-4-scout-17b-16e-instruct",
    "capabilities": [
      "CHAT"
    ],
    "oci_model_id": "ocid1.generativeaimodel.oc1.us-chicago-1.a...q"
  },
  "model_type": "llm"
},
   ...
  5. Use the "display_name" field to reference the model in a /chat/completions request (a Python equivalent is sketched after this list):
# Streaming result
curl -X POST http://localhost:8321/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta.llama-4-scout-17b-16e-instruct",
    "stream": true,
    "temperature": 0.9,
    "messages": [
      {
        "role": "system",
        "content": "You are a funny comedian. You can be crass."
      },
      {
        "role": "user",
        "content": "Tell me a funny joke about programming."
      }
    ]
  }'

# Non-streaming result
curl -X POST http://localhost:8321/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta.llama-4-scout-17b-16e-instruct",
    "stream": false,
    "temperature": 0.9,
    "messages": [
      {
        "role": "system",
        "content": "You are a funny comedian. You can be crass."
      },
      {
        "role": "user",
        "content": "Tell me a funny joke about programming."
      }
    ]
  }'
  6. Try out other models from the /models endpoint.
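
For reference, here is a minimal Python sketch equivalent to the curl calls in step 5, using the openai client against the local server from step 3. The base URL, placeholder API key, and model name come from this test plan; adjust them for your setup.

# Sketch: same streaming request as the curl example, via the openai client.
from openai import OpenAI

# llama-stack serves an OpenAI-compatible API under /v1. A local server does
# not check the key, but the client requires one to be set.
client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

response = client.chat.completions.create(
    model="meta.llama-4-scout-17b-16e-instruct",  # "display_name" from /v1/models
    temperature=0.9,
    stream=True,
    messages=[
        {"role": "system", "content": "You are a funny comedian. You can be crass."},
        {"role": "user", "content": "Tell me a funny joke about programming."},
    ],
)

for chunk in response:
    # Guard against empty choices and None deltas in streamed chunks.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)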

@dkennetzoracle changed the title from "Oci inference provider" to "feat: add oci genai service as chat inference provider" on Oct 21, 2025
@ashwinb
Contributor

ashwinb commented Oct 21, 2025

@github-actions run precommit

@github-actions
Contributor

⏳ Running pre-commit hooks on PR #3876...

🤖 Applied by @github-actions bot via pre-commit workflow
@github-actions
Contributor

✅ Pre-commit hooks completed successfully!

🔧 Changes have been committed and pushed to the PR branch.

@dkennetzoracle
Author

Removing docs additions at the request of @raghotham

@ashwinb
Contributor

ashwinb commented Oct 22, 2025

cc @mattf for a review since this touches the inference system

@dkennetzoracle
Author

Any updates here?



@json_schema_type
class OCIConfig(BaseModel):
Collaborator

Suggested change
class OCIConfig(BaseModel):
class OCIConfig(RemoteInferenceProviderConfig):
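
For context, a rough sketch of what the suggested subclassing might look like. The import paths and field names are assumptions (the fields mirror the env vars in the test plan), not the PR's actual code:

# Sketch only: import paths and fields are assumed, not taken from the PR.
from pydantic import Field

from llama_stack.providers.utils.inference.model_registry import (
    RemoteInferenceProviderConfig,  # assumed location; verify in llama_stack
)
from llama_stack.schema_utils import json_schema_type  # assumed location


@json_schema_type
class OCIConfig(RemoteInferenceProviderConfig):
    # Field names mirror OCI_AUTH_TYPE / OCI_REGION / OCI_COMPARTMENT_OCID
    # from the test plan; the real config schema may differ.
    auth_type: str = Field(default="config_file")
    region: str = Field(default="us-chicago-1")
    compartment_ocid: str = Field(default="")

Subclassing the shared config base keeps provider-wide options (such as model allow-lists) consistent across remote providers, which appears to be the intent of the suggestion.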

Comment on lines +303 to +307
# log_probs=params.get("log_probs", 0),
# tool_choice=params.get("tool_choice", {}), # Unsupported
# tools=params.get("tools", {}), # Unsupported
# web_search_options=params.get("web_search_options", {}), # Unsupported
# stop=params.get("stop", []),
Collaborator
Why comment out? Are all of them unsupported?
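
One possible answer, if some of these options truly are unsupported, is to reject them explicitly rather than commenting them out, so callers get an error instead of silently dropped parameters. A hypothetical sketch (names are illustrative, not from the PR):

# Hypothetical: fail loudly on OpenAI params the OCI backend cannot honor.
_UNSUPPORTED_PARAMS = ("tool_choice", "tools", "web_search_options")

def reject_unsupported(params: dict) -> None:
    for name in _UNSUPPORTED_PARAMS:
        if params.get(name):
            raise ValueError(f"remote::oci does not support '{name}'")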

)
return chat_details

async def chat_completion(
Collaborator
We have an OpenAIMixin class that exposes a lot of knobs; can you look at it and see if you can use it instead of writing "custom" code for the completion requests?
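
For illustration, a hedged sketch of what leaning on OpenAIMixin might look like, if OCI exposes an OpenAI-compatible endpoint. The hook names (get_api_key, get_base_url) and the import path are assumptions about OpenAIMixin's interface; verify against the actual class before using:

# Sketch only: assumes OpenAIMixin drives an OpenAI client from two hooks.
from llama_stack.providers.utils.inference.openai_mixin import OpenAIMixin


class OCIInferenceAdapter(OpenAIMixin):
    def __init__(self, config):
        self.config = config  # hypothetical OCIConfig from the suggestion above

    def get_api_key(self) -> str:
        # OCI normally signs requests rather than using static keys, so a
        # real adapter may need an auth shim here.
        return "none"

    def get_base_url(self) -> str:
        # Hypothetical region-scoped OpenAI-compatible endpoint.
        return self.config.base_url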

@mattf (Collaborator) left a comment

@dkennetzoracle -

  1. does oci provide an openai compatible endpoint?
  2. please include output of the inference tests against the remote::oci provider

@ashwinb
Copy link
Contributor

ashwinb commented Oct 27, 2025

I think it would be much preferable if we could work against an OpenAI-compatible endpoint. Otherwise, at the very least, we need a set of recorded tests against the provider. But before recordings, let's make sure the tests at least pass "live". Here's a command to run (roughly):

pytest -sv tests/integration/inference/  \
   --stack-config <your_distro> \
   --text-model <oci/...>   \
   --embedding-model sentence-transformers/nomic-ai/... \
   --inference-mode live

@dkennetzoracle
Copy link
Author

@leseb @ashwinb @mattf thanks for reviewing. If it would be strongly preferable for me to use an OpenAI-compatible endpoint, I can make those changes. I'll refactor and re-request when this is done.

Sorry also, I started the PR a few weeks ago before a conference, and when I got back, inference providers had changed significantly, although it seems for the better. I'll align with the changes and re-request!
