4 changes: 2 additions & 2 deletions docs/docs/providers/agents/index.mdx
@@ -1,7 +1,7 @@
 ---
 description: "Agents
 
-APIs for creating and interacting with agentic systems."
+APIs for creating and interacting with agentic systems."
 sidebar_label: Agents
 title: Agents
 ---
@@ -12,6 +12,6 @@ title: Agents
 
 Agents
 
-APIs for creating and interacting with agentic systems.
+APIs for creating and interacting with agentic systems.
 
 This section contains documentation for all available providers for the **agents** API.
24 changes: 12 additions & 12 deletions docs/docs/providers/batches/index.mdx
@@ -1,14 +1,14 @@
 ---
 description: "The Batches API enables efficient processing of multiple requests in a single operation,
-particularly useful for processing large datasets, batch evaluation workflows, and
-cost-effective inference at scale.
+particularly useful for processing large datasets, batch evaluation workflows, and
+cost-effective inference at scale.
 
-The API is designed to allow use of openai client libraries for seamless integration.
+The API is designed to allow use of openai client libraries for seamless integration.
 
-This API provides the following extensions:
-- idempotent batch creation
+This API provides the following extensions:
+- idempotent batch creation
 
-Note: This API is currently under active development and may undergo changes."
+Note: This API is currently under active development and may undergo changes."
 sidebar_label: Batches
 title: Batches
 ---
@@ -18,14 +18,14 @@ title: Batches
 ## Overview
 
 The Batches API enables efficient processing of multiple requests in a single operation,
-particularly useful for processing large datasets, batch evaluation workflows, and
-cost-effective inference at scale.
+particularly useful for processing large datasets, batch evaluation workflows, and
+cost-effective inference at scale.
 
-The API is designed to allow use of openai client libraries for seamless integration.
+The API is designed to allow use of openai client libraries for seamless integration.
 
-This API provides the following extensions:
-- idempotent batch creation
+This API provides the following extensions:
+- idempotent batch creation
 
-Note: This API is currently under active development and may undergo changes.
+Note: This API is currently under active development and may undergo changes.
 
 This section contains documentation for all available providers for the **batches** API.
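Since the description calls out idempotent batch creation and compatibility with openai client libraries, here is a minimal sketch of how that pairing might look. The base URL and the `idempotency_key` field name are assumptions for illustration, not something this diff defines:

```python
# Sketch only: exercising the Batches API through the stock openai client.
# ASSUMPTIONS: the stack serves an OpenAI-compatible endpoint at this base
# URL, and the idempotency extension is spelled "idempotency_key".
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8321/v1/openai/v1",  # assumed Llama Stack endpoint
    api_key="none",  # local deployments typically accept a placeholder key
)

batch = client.batches.create(
    input_file_id="file-abc123",  # a previously uploaded .jsonl of requests
    endpoint="/v1/chat/completions",
    completion_window="24h",
    extra_body={"idempotency_key": "nightly-eval-001"},  # hypothetical field
)
print(batch.id, batch.status)
```

Re-submitting with the same idempotency key would then return the original batch rather than creating a duplicate, which is the point of the extension.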
4 changes: 2 additions & 2 deletions docs/docs/providers/eval/index.mdx
@@ -1,7 +1,7 @@
 ---
 description: "Evaluations
 
-Llama Stack Evaluation API for running evaluations on model and agent candidates."
+Llama Stack Evaluation API for running evaluations on model and agent candidates."
 sidebar_label: Eval
 title: Eval
 ---
@@ -12,6 +12,6 @@ title: Eval
 
 Evaluations
 
-Llama Stack Evaluation API for running evaluations on model and agent candidates.
+Llama Stack Evaluation API for running evaluations on model and agent candidates.
 
 This section contains documentation for all available providers for the **eval** API.
4 changes: 2 additions & 2 deletions docs/docs/providers/files/index.mdx
@@ -1,7 +1,7 @@
 ---
 description: "Files
 
-This API is used to upload documents that can be used with other Llama Stack APIs."
+This API is used to upload documents that can be used with other Llama Stack APIs."
 sidebar_label: Files
 title: Files
 ---
@@ -12,6 +12,6 @@ title: Files
 
 Files
 
-This API is used to upload documents that can be used with other Llama Stack APIs.
+This API is used to upload documents that can be used with other Llama Stack APIs.
 
 This section contains documentation for all available providers for the **files** API.
16 changes: 8 additions & 8 deletions docs/docs/providers/inference/index.mdx
@@ -1,11 +1,11 @@
 ---
 description: "Inference
 
-Llama Stack Inference API for generating completions, chat completions, and embeddings.
+Llama Stack Inference API for generating completions, chat completions, and embeddings.
 
-This API provides the raw interface to the underlying models. Two kinds of models are supported:
-- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.
-- Embedding models: these models generate embeddings to be used for semantic search."
+This API provides the raw interface to the underlying models. Two kinds of models are supported:
+- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.
+- Embedding models: these models generate embeddings to be used for semantic search."
 sidebar_label: Inference
 title: Inference
 ---
@@ -16,10 +16,10 @@ title: Inference
 
 Inference
 
-Llama Stack Inference API for generating completions, chat completions, and embeddings.
+Llama Stack Inference API for generating completions, chat completions, and embeddings.
 
-This API provides the raw interface to the underlying models. Two kinds of models are supported:
-- LLM models: these models generate "raw" and "chat" (conversational) completions.
-- Embedding models: these models generate embeddings to be used for semantic search.
+This API provides the raw interface to the underlying models. Two kinds of models are supported:
+- LLM models: these models generate "raw" and "chat" (conversational) completions.
+- Embedding models: these models generate embeddings to be used for semantic search.
 
 This section contains documentation for all available providers for the **inference** API.
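To make the LLM-versus-embedding distinction above concrete, a hedged sketch using an OpenAI-compatible client; the endpoint URL and model IDs are placeholders, not defined by this diff:

```python
# Sketch only: the two kinds of models the Inference API distinguishes.
# ASSUMPTIONS: OpenAI-compatible endpoint location and both model IDs.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

# LLM model: a "chat" (conversational) completion
chat = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct",  # placeholder model ID
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(chat.choices[0].message.content)

# Embedding model: vectors to be used for semantic search
emb = client.embeddings.create(
    model="all-MiniLM-L6-v2",  # placeholder embedding model ID
    input=["how do I create a batch?"],
)
print(len(emb.data[0].embedding))  # dimensionality of the returned vector
```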
4 changes: 2 additions & 2 deletions docs/docs/providers/safety/index.mdx
@@ -1,7 +1,7 @@
 ---
 description: "Safety
 
-OpenAI-compatible Moderations API."
+OpenAI-compatible Moderations API."
 sidebar_label: Safety
 title: Safety
 ---
@@ -12,6 +12,6 @@ title: Safety
 
 Safety
 
-OpenAI-compatible Moderations API.
+OpenAI-compatible Moderations API.
 
 This section contains documentation for all available providers for the **safety** API.
9 changes: 9 additions & 0 deletions llama_stack/distributions/oci/__init__.py
@@ -0,0 +1,9 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.

from .oci import get_distribution_template

__all__ = ["get_distribution_template"]
34 changes: 34 additions & 0 deletions llama_stack/distributions/oci/build.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
version: 2
distribution_spec:
  description: Use Oracle Cloud Infrastructure (OCI) Generative AI for running LLM
    inference with scalable cloud services
  providers:
    inference:
    - provider_type: remote::oci
    vector_io:
    - provider_type: inline::faiss
    - provider_type: remote::chromadb
    - provider_type: remote::pgvector
    safety:
    - provider_type: inline::llama-guard
    agents:
    - provider_type: inline::meta-reference
    eval:
    - provider_type: inline::meta-reference
    datasetio:
    - provider_type: remote::huggingface
    - provider_type: inline::localfs
    scoring:
    - provider_type: inline::basic
    - provider_type: inline::llm-as-judge
    - provider_type: inline::braintrust
    tool_runtime:
    - provider_type: remote::brave-search
    - provider_type: remote::tavily-search
    - provider_type: remote::model-context-protocol
    files:
    - provider_type: inline::localfs
image_type: venv
additional_pip_packages:
- aiosqlite
- sqlalchemy[asyncio]
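As a quick sanity check that the spec parses and wires up the expected APIs, something like the following could be run from the repo root (PyYAML assumed available):

```python
# Sketch only: parse the new build spec and list the APIs it configures.
import yaml

with open("llama_stack/distributions/oci/build.yaml") as f:
    spec = yaml.safe_load(f)

assert spec["version"] == 2
print(sorted(spec["distribution_spec"]["providers"]))
# expected: agents, datasetio, eval, files, inference, safety,
#           scoring, tool_runtime, vector_io
```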
123 changes: 123 additions & 0 deletions llama_stack/distributions/oci/oci.py
@@ -0,0 +1,123 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.

from pathlib import Path

from llama_stack.core.datatypes import BuildProvider, Provider, ToolGroupInput
from llama_stack.distributions.template import DistributionTemplate, RunConfigSettings
from llama_stack.providers.inline.files.localfs.config import LocalfsFilesImplConfig
from llama_stack.providers.inline.vector_io.faiss.config import FaissVectorIOConfig
from llama_stack.providers.remote.inference.oci.config import OCIConfig


def get_distribution_template(name: str = "oci") -> DistributionTemplate:
    providers = {
        "inference": [BuildProvider(provider_type="remote::oci")],
        "vector_io": [
            BuildProvider(provider_type="inline::faiss"),
            BuildProvider(provider_type="remote::chromadb"),
            BuildProvider(provider_type="remote::pgvector"),
        ],
        "safety": [BuildProvider(provider_type="inline::llama-guard")],
        "agents": [BuildProvider(provider_type="inline::meta-reference")],
        "eval": [BuildProvider(provider_type="inline::meta-reference")],
        "datasetio": [
            BuildProvider(provider_type="remote::huggingface"),
            BuildProvider(provider_type="inline::localfs"),
        ],
        "scoring": [
            BuildProvider(provider_type="inline::basic"),
            BuildProvider(provider_type="inline::llm-as-judge"),
            BuildProvider(provider_type="inline::braintrust"),
        ],
        "tool_runtime": [
            BuildProvider(provider_type="remote::brave-search"),
            BuildProvider(provider_type="remote::tavily-search"),
            BuildProvider(provider_type="remote::model-context-protocol"),
        ],
        "files": [BuildProvider(provider_type="inline::localfs")],
    }

    inference_provider = Provider(
        provider_id="oci",
        provider_type="remote::oci",
        config=OCIConfig.sample_run_config(),
    )

    files_provider = Provider(
        provider_id="meta-reference-files",
        provider_type="inline::localfs",
        config=LocalfsFilesImplConfig.sample_run_config(f"~/.llama/distributions/{name}"),
    )
    vector_io_provider = Provider(
        provider_id="faiss",
        provider_type="inline::faiss",
        config=FaissVectorIOConfig.sample_run_config(f"~/.llama/distributions/{name}"),
    )

    default_tool_groups = [
        ToolGroupInput(
            toolgroup_id="builtin::websearch",
            provider_id="tavily-search",
        ),
    ]

    return DistributionTemplate(
        name=name,
        distro_type="remote_hosted",
        description="Use Oracle Cloud Infrastructure (OCI) Generative AI for running LLM inference with scalable cloud services",
        container_image=None,
        template_path=Path(__file__).parent / "doc_template.md",
        providers=providers,
        run_configs={
            "run.yaml": RunConfigSettings(
                provider_overrides={
                    "inference": [inference_provider],
                    "vector_io": [vector_io_provider],
                    "files": [files_provider],
                },
                default_tool_groups=default_tool_groups,
            ),
        },
        run_config_env_vars={
            "OCI_AUTH_TYPE": (
                "instance_principal",
                "OCI authentication type (instance_principal or config_file)",
            ),
            "OCI_USER_OCID": (
                "",
                "OCI user OCID for authentication",
            ),
            "OCI_TENANCY_OCID": (
                "",
                "OCI tenancy OCID for authentication",
            ),
            "OCI_FINGERPRINT": (
                "",
                "OCI API key fingerprint for authentication",
            ),
            "OCI_PRIVATE_KEY": (
                "",
                "OCI private key for authentication",
            ),
            "OCI_REGION": (
                "",
                "OCI region (e.g., us-ashburn-1, us-chicago-1, us-phoenix-1, eu-frankfurt-1)",
            ),
            "OCI_COMPARTMENT_OCID": (
                "",
                "OCI compartment ID for the Generative AI service",
            ),
            "OCI_CONFIG_FILE_PATH": (
                "~/.oci/config",
                "OCI config file path (required if OCI_AUTH_TYPE is config_file)",
            ),
            "OCI_CLI_PROFILE": (
                "DEFAULT",
                "OCI CLI profile name to use from config file",
            ),
        },
    )
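For completeness, a hedged smoke test of the new template, assuming this branch is installed in the current environment and that `DistributionTemplate` exposes its constructor arguments as attributes:

```python
# Sketch only: confirm the template is importable and matches build.yaml.
from llama_stack.distributions.oci import get_distribution_template

template = get_distribution_template()
print(template.name)                         # "oci"
print(template.distro_type)                  # "remote_hosted"
print(sorted(template.providers))            # the nine APIs from build.yaml
print(sorted(template.run_config_env_vars))  # the OCI_* variables above
```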