diff --git a/docs/cody/clients/enable-cody-enterprise.mdx b/docs/cody/clients/enable-cody-enterprise.mdx
index 905cd2f25..245bdfb62 100644
--- a/docs/cody/clients/enable-cody-enterprise.mdx
+++ b/docs/cody/clients/enable-cody-enterprise.mdx
@@ -1,4 +1,4 @@
-# Cody on Sourcegraph Enterprise
+# Cody for Enterprise
Cody enhances your coding experience by providing intelligent code suggestions, context-aware completions, and advanced code analysis. These docs will help you use Cody on your Sourcegraph Enterprise instance.
@@ -6,96 +6,112 @@
-## Cody Enterprise features
+## Setting up Cody Enterprise
-To cater to your Enterprise requirements, Cody offers the following features:
+You can set up Cody for your Enterprise instance in one of two ways:
-### IDE token expiry
+1. Sourcegraph Cloud
+2. Self-hosted Sourcegraph
-Site administrators can set the duration of access tokens for users connecting Cody from their IDEs (VS Code, JetBrains, etc.). This can be configured from the **Site admin** page of the Sourcegraph Enterprise instance. Available options include **7, 14, 30, 60, and 90 days**.
+## Cody on Sourcegraph Cloud
-
+With [Sourcegraph Cloud](/cloud/), you get Cody as a managed service, and you **do not** need to enable Cody as is required for self-hosted setup. However, you can still have Cody enabled or disabled on demand on your Sourcegraph instance by contacting your account manager.
-### Guardrails
+## Self-hosted Sourcegraph Enterprise
-Guardrails for public code is currently in Beta and is supported with VS Code and JetBrains IDEs extensions.
+### Prerequisites
-Open source attribution guardrails for public code, commonly called copyright guardrails, reduce the exposure to copyrighted code. This involves implementing a verification mechanism within Cody to ensure that any code generated by the platform does not replicate open source code.
+- You have Sourcegraph version `5.1.0` or above
+- A Sourcegraph Enterprise subscription with [Cody Gateway](/cody/core-concepts/cody-gateway) or an account with a third-party LLM provider
-Guardrails for public code are available to all Sourcegraph Enterprise instances and are **disabled** by default. You can enable them from the Site configuration section by setting `attribution.enabled` to `true`.
+### Enable Cody on your Sourcegraph instance
-Guardrails don't differentiate between license types. It matches any code snippet that is at least **ten lines** long from the **290,000** indexed open source repositories.
+Only site admins can enable Cody on the Sourcegraph instance. To do so:
-### Admin controls
+- First, configure your desired LLM provider either by [using Sourcegraph Cody Gateway](/cody/core-concepts/cody-gateway) or by directly using a third-party LLM provider
+- Next, go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
-Admin controls are supported with VS Code and JetBrains IDE extension.
+```json
+ {
+ // [...]
+ "cody.enabled": true,
+ "completions": {
+ "provider": "sourcegraph"
+ }
+ }
+```
-Site administrators have selective control over users' access to Cody Enterprise, which is managed via the Sourcegraph role-based access control system. This provides a more intuitive user interface for assigning permission to use Cody.
+- Cody is now enabled on your self-hosted Sourcegraph Enterprise instance
-### Analytics
+## Disable Cody
-Cody Analytics are supported with VS Code IDE extension and on the latest versions of JetBrains IDEs.
+To turn Cody off:
-Cody Enterprise users can view analytics for their instance. A separately managed cloud service for Cody analytics handles user auth, gets metrics data from Sourcegraph's BigQuery instance, and visualizes the metrics data.
+- Go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
-The following metrics are available for Cody Enterprise users:
+```json
+ {
+ // [...]
+ "cody.enabled": false
+ }
+```
-| **Metric Type** | **What is measured?** |
-| --------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| Active users | - Total active users
- Average daily users
- Average no. of days each user used Cody (of last 30 days)
- Cody users by day (last 30 days)
- Cody users by month (last two months)
- Cody users by number of days used |
-| Completions | - Total accepted completions
- Minutes saved per completion
- Hours saved by completions
- Cody completions by day
- Completions acceptance rate
- Weighted completions acceptance rate
- Average completion latency
- Acceptance rate by language
|
-| Chat | - Total chat events
- Minutes saved per chat
- Hours saved by chats
- Cody chats by day |
-| Commands | - Total command events
- Minutes saved per command
- Hours saved by commands
- Cody commands by day
- Most used commands |
+## Enable Cody only for some users
-To enable Cody Analytics:
+How to enable Cody only for _some_ users depends on what version of Sourcegraph you are running.
-- Create an account on [Sourcegraph Accounts](https://accounts.sourcegraph.com/)
-- A user already having a Sourcegraph.com account gets automatically migrated to Sourcegraph Accounts. Users can sign in to Cody Analytics using their email and password
-- Users without a Sourcegraph.com account should contact one of our team members. They can help with both the account setup and assigning instances to specific users
-- Map your user account to a Sourcegraph instance, and this gives you access to Cody analytics
+### Sourcegraph v5.3+
-### Multi-repository context
+In Sourcegraph v5.3+, access to Cody is managed via user roles. By default, all users have access.
-Cody supports multi-repository context, allowing you to search up to 10 repositories simultaneously for relevant information. Open a new chat, type `@`, and select `Remote Repositories.`
+First, ensure Cody is enabled in your site configuration. Go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
-Keep @-mentioning repos that you want to include in your context. This flexibility lets you get more comprehensive and accurate responses by leveraging information across multiple codebases.
+```json
+ {
+ // [...]
+ "cody.enabled": true,
+ // Make sure cody.restrictUsersFeatureFlag is not in your configuration! If it is, remove it.
+ }
+```
-### @-mention directory
+Ensure `cody.restrictUsersFeatureFlag` is **not** in your site configuration. If it is, remove it; otherwise, the old feature-flag approach from Sourcegraph 5.2 and earlier will be used.
-To better support teams working with large monorepos, Enterprise users can `@-mention` directories when chatting with Cody. This helps you define more specific directories and sub-directories within that monorepo to give more precise context.
+Next, go to **Site admin > Users & Auth > Roles** (`/site-admin/roles`) on your instance. On that page, you can:
-To do this, type `@` in the chat, and then select **Directories** to search other repositories for context in your codebase.
+- Control whether users **by default** have access to Cody (expand `User [System]` and toggle **Cody** > **Access** as desired)
+- Control whether groups of users have access to Cody (`+Create role` and enable the **Cody** > **Access** toggle as desired)
-
+### Sourcegraph v5.2 and earlier
-Please note that you can only `@-mention` remote directories (i.e., directories in your Sourcegraph instance) but not local directories. This means any recent changes to your directories can't be utilized as context until your Sourcegraph instance re-indexes any changes.
+In Sourcegraph v5.2 and earlier, you should use the feature flag `cody` to turn Cody on selectively for some users. To do so:
-If you want to include recent changes that haven't been indexed in your Sourcegraph instance, you can `@-mention` specific files, lines of code, or symbols.
+- Go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
-## Supported LLM models
+```json
+ {
+ // [...]
+ "cody.enabled": true,
+ "cody.restrictUsersFeatureFlag": true
+ }
+```
-Sourcegraph Enterprise supports different LLM providers and models, such as models from Anthropic and OpenAI. You can do this by adjusting your Sourcegraph instance configuration.
+- Next, go to **Site admin > Feature flags** (`/site-admin/feature-flags`)
+- Add a feature flag called `cody`
+- Select the `boolean` type and set it to `false`
+- Once added, click on the feature flag and use **add overrides** to pick users that will have access to Cody
-
-For the supported LLM models listed above, refer to the following notes:
+
-1. Microsoft Azure is planning to deprecate the APIs used in Sourcegraph version `>5.3.3` on July 1, 2024 [Source](https://learn.microsoft.com/en-us/azure/ai-services/openai/api-version-deprecation)
-2. Claude 2.1 is not recommended
-3. Sourcegraph doesn’t recommend using the GPT-4 (non-Turbo), Claude 1, or Claude 2 models anymore
-4. Only supported through legacy completions API
-5. BYOK (Bring Your Own Key) with managed services are only supported for Self-hosted Sourcegraph instances
-6. GPT-4 and GPT-4o for completions have a bug that is resulting in many failed completions
+## Configure Cody for LLM providers
-### Supported model configuration
+Cody supports several LLM providers and models. You can access these models via the Cody Gateway or directly through your own model provider account or infrastructure.
-Use the drop-down menu to make your desired selection and get a detailed breakdown of the supported model configuration for each provider on Cody Enterprise. This is an on-site configuration. Admins should pick a value from the table for `chatModel` to configure their chat model.
+There are two ways of configuring Cody for LLM providers: the legacy `completions` configuration and the newer `modelConfiguration` (see the sketch below).
-
+
-For the supported LLM model configuration listed above, refer to the following notes:
+
-1. Microsoft Azure is planning to deprecate the APIs used in Sourcegraph version `>5.3.3` on July 1, 2024 [Source](https://learn.microsoft.com/en-us/azure/ai-services/openai/api-version-deprecation)
-2. Claude 2.1 is not recommended
-3. Sourcegraph doesn't recommend GPT-4 non-turbo, Claude 1 or 2 models
-4. Only supported through legacy completions API
-5. BYOK (Bring Your Own Key) with managed services are only supported for Self-hosted Sourcegraph instances
+
+
+
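+For reference, here's a minimal sketch of the newer `modelConfiguration` style (a matching `completions` snippet appears in the enablement step above), assuming you rely on Sourcegraph-provided models served through Cody Gateway:
+
+```json
+{
+  // [...]
+  "cody.enabled": true,
+  "modelConfiguration": {
+    // An empty object uses Sourcegraph-provided models from Cody Gateway with default settings
+    "sourcegraph": {}
+  }
+}
+```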
diff --git a/docs/cody/clients/model-configuration.mdx b/docs/cody/clients/model-configuration.mdx
deleted file mode 100644
index 799f383b6..000000000
--- a/docs/cody/clients/model-configuration.mdx
+++ /dev/null
@@ -1,536 +0,0 @@
-# LLM Model Configuration
-
-This guide will walk you through the steps to customize the LLM models available from your Sourcegraph Enterprise instance.
-
-For Sourcegraph Cloud customers, configuring the available LLM models requires contacting your Sourcegraph account team representative
-
-Cody Enterprise can be configured using one of two methods:
-1. "Completions" Configuration
-2. Model Configuration (Early Access Program)
-
-The Model Configuration method is in Early Access Program and only avaiable on Sourcegraph v5.6.0 or later. In the future when Model Configuration exits EAP, configuring Cody via your Sourcegraph Enterprise instance via the "Completions" Configuration site configuration section will be deprecated. For now, both methods remain supported. We recommend you continue to use the Completions Configuration unless you have specific reason to do otherwise.
-
-## "Completions" Configuration
-
-
-## Setting up Cody Enterprise
-
-You can set up Cody for your Enterprise instance in one of the following ways:
-
-- [Self-hosted Sourcegraph](#cody-on-self-hosted-sourcegraph-enterprise)
-- [Sourcegraph Cloud](#cody-on-sourcegraph-cloud)
-
-## Cody on self-hosted Sourcegraph Enterprise
-
-### Prerequisites
-
-- You have Sourcegraph version 5.1.0 or above
-- A Sourcegraph Enterprise subscription with [Cody Gateway access](/cody/core-concepts/cody-gateway) or [an account with a third-party LLM provider](#supported-models-and-model-providers)
-
-### Enable Cody on your Sourcegraph instance
-
-Cody uses one or more third-party LLM (Large Language Model) providers. Make sure you review Cody's usage and privacy notice. Code snippets are sent to a third-party language model provider when you use the Cody extension.
-
-This requires site-admin privileges. To do so,
-
-1. First, configure your desired LLM provider either by [Using Sourcegraph Cody Gateway](/cody/core-concepts/cody-gateway#using-cody-gateway-in-sourcegraph-enterprise) (recommended) or [Using a third-party LLM provider directly](#supported-models-and-model-providers)
-
- If you are a Sourcegraph Cloud customer, skip directly to step 3.
-
-2. Next, go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
-
-```json
- {
- // [...]
- "cody.enabled": true,
- "completions": {
- "provider": "sourcegraph"
- }
- }
-```
-
-Cody is now fully enabled on your self-hosted Sourcegraph enterprise instance!
-
-## Cody on Sourcegraph Cloud
-
-- With [Sourcegraph Cloud](/cloud/), you get Cody as a managed service, and you **do not** need to [enable Cody as is required for self-hosted setup](#enable-cody-on-your-sourcegraph-instance)
-- However, by contacting your account manager, Cody can still be enabled on-demand on your Sourcegraph instance. The Sourcegraph team will refer to the handbook
-- Next, you can configure the [VS Code extension](#configure-the-vs-code-extension) by following the same steps as mentioned for the self-hosted environment
-- After which, you are all set to use Cody with Sourcegraph Cloud
-
-[Learn more about running Cody on Sourcegraph Cloud](/cloud/#cody).
-
-## Disable Cody
-
-To turn Cody off:
-
-- Go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
-
-```json
- {
- // [...]
- "cody.enabled": false
- }
-```
-
-- Next, remove `completions` configuration if they exist
-
-## Enable Cody only for some users
-
-To enable Cody only for some users, for example, when rolling out a Cody POC, follow all the steps mentioned in [Enabling Cody on your Sourcegraph instance](#enable-cody-on-your-sourcegraph-instance). Then, do the following:
-
-### Sourcegraph 5.3+
-
-In Sourcegraph 5.3+, access to Cody is managed via user roles. By default, all users have access.
-
-First, ensure Cody is enabled in your site configuration. Go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
-
-```json
- {
- // [...]
- "cody.enabled": true,
- // Make sure cody.restrictUsersFeatureFlag is not in your configuration! If it is, remove it.
- }
-```
-
- Ensure `cody.restrictUsersFeatureFlag` is **not** in your site configuration. If it is, remove it or else the old feature-flag approach from Sourcegraph 5.2 and earlier will be used.
-
-Next, go to **Site admin > Users & Auth > Roles** (`/site-admin/roles`) on your instance. On that page, you can:
-
-- Control whether users _by default_ have access to Cody (expand `User [System]` and toggle **Cody** > **Access** as desired)
-- Control whether groups of users have access to Cody (`+Create role` and enable the **Cody** > **Access** toggle as desired)
-
-### Sourcegraph 5.2 and earlier
-
-In Sourcegraph 5.2 and earlier, you should use the feature flag `cody` to turn Cody on selectively for some users. To do so:
-
-- Go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
-
-```json
- {
- // [...]
- "cody.enabled": true,
- "cody.restrictUsersFeatureFlag": true
- }
-```
-
-- Next, go to **Site admin > Feature flags** (`/site-admin/feature-flags`)
-- Add a feature flag called `cody`
-- Select the `boolean` type and set it to `false`
-- Once added, click on the feature flag and use **add overrides** to pick users that will have access to Cody
-
-
-
-## Supported models and model providers
-
-[Cody Enterprise](https://sourcegraph.com/enterprise) supports many models and model providers. You can configure Cody Enterprise to access models via Sourcegraph Cody Gateway or directly using your own model provider account or infrastructure.
-
-- Using [Sourcegraph Cody Gateway](/cody/core-concepts/cody-gateway):
- - Recommended for most organizations.
- - Supports [state-of-the-art models](/cody/capabilities/supported-models) from Anthropic, OpenAI, and more, without needing a separate account or incurring separate charges.
-- Using your organization's account with a model provider:
- - [Use your organization's Anthropic account](#use-your-organizations-anthropic-account)
- - [Use your organization's OpenAI account](#use-your-organizations-openai-account)
-- Using your organization's public cloud infrastructure:
- - [Use Amazon Bedrock (AWS)](#use-amazon-bedrock-aws)
- - [Use Azure OpenAI Service](#use-azure-openai-service)
- - *Use Vertex AI on Google Cloud (coming soon)*
-
-### Use your organization's Anthropic account
-
-First, [create your own key with Anthropic](https://console.anthropic.com/account/keys). Once you have the key, go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
-
-```json
-{
- // [...]
- "cody.enabled": true,
- "completions": {
- "provider": "anthropic",
- "chatModel": "claude-2.0", // Or any other model you would like to use
- "fastChatModel": "claude-instant-1.2", // Or any other model you would like to use
- "completionModel": "claude-instant-1.2", // Or any other model you would like to use
- "accessToken": ""
- }
-}
-```
-
-### Use your organization's OpenAI account
-
-First, [create your own key with OpenAI](https://beta.openai.com/account/api-keys). Once you have the key, go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
-
-```json
-{
- // [...]
- "cody.enabled": true,
- "completions": {
- "provider": "openai",
- "chatModel": "gpt-4", // Or any other model you would like to use
- "fastChatModel": "gpt-3.5-turbo", // Or any other model you would like to use
- "completionModel": "gpt-3.5-turbo-instruct", // Or any other model that supports the legacy completions endpoint
- "accessToken": ""
- }
-}
-```
-
-[Learn more about OpenAI models.](https://platform.openai.com/docs/models)
-
-### Use Amazon Bedrock (AWS)
-
-You can use Anthropic Claude models on [Amazon Bedrock](https://aws.amazon.com/bedrock/).
-
-First, make sure you can access Amazon Bedrock. Then, request access to the Anthropic Claude models in Bedrock.
-This may take some time to provision.
-
-Next, create an IAM user with programmatic access in your AWS account. Depending on your AWS setup, different ways may be required to provide access. All completion requests are made from the `frontend` service, so this service needs to be able to access AWS. You can use instance role bindings or directly configure the IAM user credentials in the configuration. Additionally, the `AWS_REGION` environment variable will need to be set in the `frontend` container for scoping the IAM credentials to the AWS region hosting the Bedrock endpoint.
-
-Once ready, go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
-
-```json
-{
- // [...]
- "cody.enabled": true,
- "completions": {
- "provider": "aws-bedrock",
- "chatModel": "anthropic.claude-3-opus-20240229-v1:0",
- "completionModel": "anthropic.claude-instant-v1",
- "endpoint": "",
- "accessToken": ""
- }
-}
-```
-
-For the `chatModel` and `completionModel` fields, see [Amazon's Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html) for an up-to-date list of supported model IDs, and cross reference against Sourcegraph's [supported LLM list](/cody/capabilities/supported-models) to verify compatibility with Cody.
-
-For `endpoint`, you can either:
-
-- For Pay-as-you-go, set it to an AWS region code (e.g., `us-west-2`) when using a public Amazon Bedrock endpoint
-- For Provisioned Throughput, set it to the provisioned VPC endpoint for the `bedrock-runtime` API (e.g., `"https://vpce-0a10b2345cd67e89f-abc0defg.bedrock-runtime.us-west-2.vpce.amazonaws.com"`)
-
-For `accessToken`, you can either:
-
-- Leave it empty and rely on instance role bindings or other AWS configurations in the `frontend` service
-- Set it to `:` if directly configuring the credentials
-- Set it to `::` if a session token is also required
-
-
-### Using GCP Vertex AI
-
-Right now, We only support Anthropic Claude models on [GCP Vertex](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude).
-
-1. Enable the [Vertex AI API](https://console.cloud.google.com/marketplace/product/google/aiplatform.googleapis.com) in the GCP console. Once Vertex has been enabled in your project, navigate to the [Vertex Model Garden](https://console.cloud.google.com/vertex-ai/model-garden) to select & enable the Anthropic Claude model(s) which you wish to use with Cody. See [Supported LLM Models](../capabilities/supported-models) for an up-to-date list of Anthropic Claude models supported by Cody.
-
-It may take some time to enable Vertex and provision access to the models you plan to use
-
-2. **Create a Service Account**:
- - Create a [service account](https://cloud.google.com/iam/docs/service-account-overview).
- - Assign the `Vertex AI User` role to the service account.
- - Generate a JSON key for the service account and download it.
-
-3. **Convert JSON Key to Base64** by doing:
-```python
-cat | base64
-```
-
-Once ready, go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
-
-
-```json
-{
- // [...]
- "cody.enabled": true,
- "completions": {
- "chatModel": "claude-3-opus@20240229",
- "completionModel": "claude-3-haiku@20240307",
- "provider": "google",
- "endpoint": "",
- "accessToken": ""
- }
-}
-
-```
-
-For the `Endpoint`, you can
-1. Navigate to the Documentation Page:
- Go to the Claude 3 Haiku Documentation on the GCP Console Model garden
-2. Locate the Example: if you scroll enough through the page to find the example that shows how to use the cURL command with the Claude 3 Haiku model. The example will include a sample request JSON body and the necessary endpoint URL. Copy the URL in the site-admin config:
- The endpoint URL will look something like this:
- `https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/`
-
-3. Example URL:
-`https://us-east5-aiplatform.googleapis.com/v1/projects/sourcegraph-vertex-staging/locations/us-east5/publishers/anthropic/models`
-
-
-### Use Azure OpenAI Service
-
-Create a project in the Azure OpenAI Service portal. Go to **Keys and Endpoint** from the project overview and get **one of the keys** on that page and the **endpoint**.
-
-Next, under **Model deployments**, click "manage deployments" and ensure you deploy the models you want, for example, `gpt-35-turbo`. Take note of the **deployment name**.
-
-Once done, go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
-
-```json
-{
- // [...]
- "cody.enabled": true,
- "completions": {
- "provider": "azure-openai",
- "chatModel": "",
- "fastChatModel": "",
- "completionModel": "", // the model must support the legacy completions endpoint such as gpt-3.5-turbo-instruct
- "endpoint": "",
- "accessToken": ""
- }
-}
-```
-
-For the access token, you can either:
-
-- As of 5.2.4 the access token can be left empty and it will rely on Environmental, Workload Identity or Managed Identity credentials configured for the `frontend` and `worker` services
-- Set it to `` if directly configuring the credentials using the API key specified in the Azure portal
-
-### Use StarCoder for Autocomplete
-
-When tested with other coder models for the autocomplete use case, [StarCoder](https://huggingface.co/blog/starcoder) offered significant improvements in quality and latency compared to our control groups for users on Sourcegraph.com. You can read more about the improvements in our [October 2023 release notes](https://sourcegraph.com/blog/feature-release-october-2023) and the [GA release notes](https://sourcegraph.com/blog/cody-is-generally-available).
-
-To ensure a fast and reliable experience, we are partnering with [Fireworks](https://fireworks.ai/) and have set up a dedicated hardware deployment for our Enterprise users. Sourcegraph supports StarCoder using the [Cody Gateway](/cody/core-concepts/cody-gateway).
-
-To enable StarCoder go to **Site admin > Site configuration** (`/site-admin/configuration`) and change the `completionModel`:
-
-```json
-{
- // [...]
- "cody.enabled": true,
- "completions": {
- "provider": "sourcegraph",
- "completionModel": "fireworks/starcoder"
- }
-}
-```
-
-Users of the Cody Extensions will automatically pick up this change when connected to your Enterprise instance.
-
-
-# Model Configuration
-
-Sourcegraph v5.6.0 or later supports the ability to choose between different LLM models, allowing developers to use the best model for Cody Chat as needed. This is accomplished exposing much more flexible configuration options for Cody when using Sourcegraph Enterprise. The newer style of configuration is described next. However, you can still use the [Older style "Completions" Configuration](#legacy-completions-configuration).
-
-## Quickstart
-
-Before you start, please note that the model configuration is an early access program (EAP) and we are working towards improving on its coverage of supported providers. If you are having any issues with this configuration, please reach out to your Sourcegraph Account Representative or roll back your configuration to the Legacy "Completions" configuration
-
-The simplest way to configure your Sourcegraph Enterprise would be to add the following configuration section to your instance's [site configuration](/admin/config/site_config):
-
-```json
- ...
- "cody.enabled": true,
- "modelConfiguration": {
- "sourcegraph": {}
- },
- ...
-```
-
-The `"modelConfiguration"` section defines which LLM models are supported by the Sourcegraph instance, and how to invoke them. The `"sourcegraph"` section defines how Sourcegraph-supplied LLM models should be configured. (That is, LLM models made available by the [Cody Gateway](/cody/core-concepts/cody-gateway) service.) The default settings will expose all current Cody Gateway models from your Sourcegraph instance, and make them available to users.
-
-However, if you are seeking more control and wish to restrict which LLM models are available, or if you wish to use your own API access key, you can expand upon the `"modelConfiguration"` section as needed.
-
-## Concepts
-
-The LLM models available for use from a Sourcegraph Enterprise instance are the union of "Sourcegraph-supplied models" and any custom models providers that you explicitly add to your Sourcegraph instance's site configuration. For most administrators, just relying on Sourcegraph-supplied models will ensure that you are using quality models without needing to worry about the specifics.
-
-### Sourcegraph-supplied Models
-
-The Sourcegraph-supplied models are those that are available from [Cody Gateway](/cody/core-concepts/cody-gateway), and your site configuration controls which of those models can be used.
-
-If you wish to not use _any_ Sourcegraph-supplied models, and instead _only_ rely on those you have explicitly defined in your site configuration, you can set the `"sourcegraph"` field to `null`.
-
-There are three top-level settings for configuring Sourcegraph-supplied LLM models:
-
-| Field | Description |
-| ----------- | ---------------------------------------------------------------------------------------- |
-| `endpoint` (optional) | The URL for connecting to Cody Gateway, defaults to the production instance. |
-| `accessToken` (optional) | The access token used to connect to Cody Gateway, defaulting to the current license key. |
-| `modelFilters` (optional) | Filters for which models to include from Cody Gateway. |
-
-**Model Filters**
-
-The `"modelFilters"` section is how you restrict which Cody Gateway models are made available to your Sourcegraph Enterprise instance's users.
-
-The first field is the `"statusFilter"`. Each LLM model is given a label by Sourcegraph as per its release, such as "stable", beta", or "experimental". By default, all models available on
-Cody Gateway are exposed. Using the category filter ensures that only models with a particular category are made available to your users.
-
-The `"allow"` and `"deny"` fields, are arrays of [model references](#model-configuration) for what models should or should not be included. These values accept wild cards.
-
-
-The following examples illustrate how to use all these settings in conjunction:
-
-```json
-"cody.enabled": true,
-"modelConfiguration": {
- "sourcegraph": {
- "modelFilters": {
- // Only allow "beta" and "stable" models.
- // Not "experimental" or "deprecated".
- "statusFilter": ["beta", "stable"],
-
- // Allow any models provided by Anthropic, OpenAI, Google and Fireworks.
- "allow": [
- "anthropic::*", // Anthropic models
- "openai::*", // OpenAI models
- "google::*", // Google Gemini models
- "fireworks::*", // Autocomplete models like StarCoder and DeepSeek-V2-Coder hosted on Fireworks
- ],
-
- // Do not include any models with the Model ID containing "turbo",
- // or any from AcmeCo.
- "deny": [
- "*turbo*",
- "acmeco::*"
- ]
- }
- }
-}
-```
-
-## Default Models
-
-The `"modelConfiguration"` setting also contains a `"defaultModels"` field that allows you to specify the LLM model used depending on the situation. If no default is specified, or refers to a model that isn't found, it will silently fallback to a suitable alternative.
-
-```json
- ...
- "cody.enabled": true,
- "modelConfiguration": {
- "defaultModels": {
- "chat": "anthropic::2023-06-01::claude-3.5-sonnet",
- "codeCompletion": "anthropic::2023-06-01::claude-3.5-sonnet",
- "fastChat": "anthropic::2023-06-01::claude-3-haiku"
- }
- }
- ...
-```
-
-The format of these strings is a "Model Reference", which is a format for uniquely identifying each LLM model exposed from your Sourcegraph instance.
-
-## Advanced Configuration
-
-For most administrators, relying on the LLM models made available by Cody Gateway is sufficient. However, if even more customization is required, you can configure your own LLM providers and models.
-
-Defining your own LLM providers and models is an advanced use-case and requires care to get the correct results. It also may bypass protections to ensure compatibility between your Sourcegraph instance and LLMs. If you need help contact your Sourcegraph account executive.
-
-### Overview
-
-The `"modelConfiguration"` section exposes two fields `"providerOverrides"` and `"modelOverrides"`. These may override any Sourcegraph-supplied data, or simply introduce new ones entirely.
-
-### Provider Configuration
-
-A "provider" is a way to organize LLM models. Typically a provider would be referring to the company that produced the model. Or the specific API/service being used to access the model. But conceptually, it's just a namespace.
-
-By defining a provider override in your Sourcegraph site configuration, you are introducing a new namespace to contain models. Or customize the existing provider namespace supplied by Sourcegraph. (e.g. all `"anthropic"` models.)
-
-The following configuration shippet defines a single provider override with the ID `"anthropic"`.
-
-```json
-"cody.enabled": true,
-"modelConfiguration": {
- // Do not use any Sourcegraph-supplied models.
- "sourcegraph": null,
-
- // Define a provider for "anthropic".
- "providerOverrides": [
- {
- "id": "anthropic",
- "displayName": "Anthropic models, sent directly to anthropic.com",
-
- // The server-side config section defines how this provider operates.
- "serverSideConfig": {
- "type": "anthropic",
- "accessToken": "sk-ant-api03-xxxxxxxxx",
- "endpoint": "https://api.anthropic.com/v1/messages"
- },
-
- // The default model configuration provides defaults for all LLM
- // models using this provider.
- "defaultModelConfig": {
- "capabilities": [
- "chat",
- "autocomplete"
- ],
- "contextWindow": {
- "maxInputTokens": 10000,
- "maxOutputTokens": 4000
- },
- "category": "balanced",
- "status": "stable"
- }
- }
- ],
- ...
-}
-```
-
-**Server-side Configuration**
-
-The most important part of a provider's configuration is the `"serverSideConfig"` field. That defines how the LLM model's should be invoked, i.e. which external service or API will be called to serve LLM requests.
-
-In the example, the `"type"` field was `"anthropic"`. Meaning that any interactions using the `"anthropic"` provider would be sent directly to Anthropic, at the supplied `endpoint` URL using the given `accessToken`.
-
-However, Sourcegraph supports several different types of LLM API providers natively. The current set of supported LLM API providers is:
-
-| Provider type | Description |
-| -------------------- | ------------ |
-| `"sourcegraph"` | [Cody Gateway](/cody/core-concepts/cody-gateway), which supports many different models from various services |
-| `"openaicompatible"` | Any OpenAI-compatible API implementation |
-| `"awsBedrock"` | [Amazon Bedrock](https://docs.aws.amazon.com/bedrock/) |
-| `"azureOpenAI"` | [Microsoft Azure OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service/) |
-| `"anthropic"` | [Anthropic](https://www.anthropic.com) |
-| `"fireworks"` | [Fireworks AI](https://fireworks.ai) |
-| `"google"` | [Google Gemini](http://cloud.google.com/gemini) and [Vertex](https://cloud.google.com/vertex-ai/) |
-| `"openai"` | [OpenAI](http://platform.openai.com) |
-| `"huggingface-tgi"` | [Hugging Face Text Generation Interface](https://huggingface.co/docs/text-generation-inference/en/index) |
-
-### Model Configuration
-
-With a provider defined, we can now specify custom models using that provider by adding them to the `"modelOverrides"` section.
-
-**Model Reference**
-
-The following configuration snippet defines a custom model, using the `"anthropic"` provider from the previous example.
-
-```json
-"cody.enabled": true,
-"modelConfiguration": {
- ...
- "modelOverrides": [
- {
- "modelRef": "anthropic::2024-06-20::claude-3-5-sonnet",
- "displayName": "Claude 3.5 Sonnet",
- "modelName": "claude-3-5-sonnet-20240620",
- "contextWindow": {
- "maxInputTokens": 45000,
- "maxOutputTokens": 4000
- },
- "capabilities": ["chat", "autocomplete"],
- "category": "balanced",
- "status": "stable"
- },
- ],
- ...
-}
-```
-
-Most of the configuration fields are self-explanatory, such as labeling the model's category ("stable", "beta") or category ("accuracy", "speed"). The more important fields are described below:
-
-**modelRef**. Each model is given a unique identifier, referred to as a model reference or "mref". This is a string of the form `${providerId}::${apiVersionId}::${modelId}`.
-
-In order to associate a model with your provider, the `${providerId}` must match. The `${modelId}` can be almost any URL-safe string.
-
-The `${apiVersionId}` is required in order to detect compatibility issues between newer models and Sourcegraph instances. In the following example, the string "2023-06-01" is used to clarify that this LLM model should formulate API requests using that version of the Anthropic API. If you are unsure, when defining your own models you can leave this as `"unknown"`.
-
-**contextWindow**. The context window is the number of "tokens" sent to an LLM. Either in the amount of contextual data sent in the LLM prompt (e.g. the question, relevant snippets, etc.) and the maximum size of the output allowed in the response. These values directly control factors such as the time it takes to respond to a prompt and the cost of the LLM request. And each LLM model or provider may have their own limits as well.
-
-**modelName**. The model _name_, is the value required by the LLM model's API provider. In this example, the `modelRef` defined the model's ID as `claude-3-sonnet` but the `modelName` was the more specific "claude-3-sonnet-20240229".
-
-**capabilities**. The capabilities of a model determine which situations the model can be used. For example, models only supported for "autocomplete" will not be available for Cody chats.
-
-
-
-It's recommended that every instance admin not using a third-party LLM provider makes this change and we are planning to make this the default in a future release.
diff --git a/docs/cody/enterprise/completions-configuration.mdx b/docs/cody/enterprise/completions-configuration.mdx
new file mode 100644
index 000000000..d543437d9
--- /dev/null
+++ b/docs/cody/enterprise/completions-configuration.mdx
@@ -0,0 +1,186 @@
+# Completions Configuration
+
+Learn how to configure Cody via `completions` on a Sourcegraph Enterprise instance.
+
+Configuring Cody via `completions` is a legacy method, but it's still supported. We recommend using the newer [`modelConfiguration`](/cody/enterprise/model-configuration) for flexible LLM model selection.
+
+[Cody Enterprise](https://sourcegraph.com/enterprise) supports many models and model providers. You can configure Cody Enterprise to access models via Sourcegraph Cody Gateway or directly using your own model provider account or infrastructure. Let's look at these options in more detail.
+
+## Using Sourcegraph Cody Gateway
+
+This is the recommended way to configure Cody Enterprise. It supports all the latest models from Anthropic, OpenAI, Mistral, and more without requiring a separate account or incurring separate charges. You can learn more about these in our [supported models](/cody/capabilities/supported-models) docs.
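+
+A minimal sketch of a `completions` block that routes requests through Cody Gateway looks like the following; if you leave out the model fields, the instance falls back to its default models:
+
+```json
+{
+  // [...]
+  "cody.enabled": true,
+  "completions": {
+    // Route chat and autocomplete requests through Cody Gateway
+    "provider": "sourcegraph"
+    // Optionally pin specific models, e.g. "completionModel": "fireworks/starcoder"
+  }
+}
+```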
+
+## Using your organization's account with a model provider
+
+### Example: Your organization's Anthropic account
+
+First, [create your own key with Anthropic](https://console.anthropic.com/account/keys). Once you have the key, go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
+
+```json
+{
+ // [...]
+ "cody.enabled": true,
+ "completions": {
+ "provider": "anthropic",
+ "chatModel": "claude-2.0", // Or any other model you would like to use
+ "fastChatModel": "claude-instant-1.2", // Or any other model you would like to use
+ "completionModel": "claude-instant-1.2", // Or any other model you would like to use
+ "accessToken": ""
+ }
+}
+```
+
+### Example: Your organization's OpenAI account
+
+First, [create your own key with OpenAI](https://beta.openai.com/account/api-keys). Once you have the key, go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
+
+```json
+{
+ // [...]
+ "cody.enabled": true,
+ "completions": {
+ "provider": "openai",
+ "chatModel": "gpt-4", // Or any other model you would like to use
+ "fastChatModel": "gpt-3.5-turbo", // Or any other model you would like to use
+ "completionModel": "gpt-3.5-turbo-instruct", // Or any other model that supports the legacy completions endpoint
+ "accessToken": ""
+ }
+}
+```
+
+[Learn more about OpenAI models.](https://platform.openai.com/docs/models)
+
+## Using your organization's public cloud infrastructure
+
+### Example: Use Amazon Bedrock
+
+You can use Anthropic Claude models on [Amazon Bedrock](https://aws.amazon.com/bedrock/).
+
+First, make sure you can access Amazon Bedrock. Then, request access to the Anthropic Claude models in Bedrock. This may take some time to provision.
+
+Next, create an IAM user with programmatic access in your AWS account. Depending on your AWS setup, you may need to provide access in different ways. All completion requests are made from the `frontend` service, so this service needs to be able to access AWS.
+
+You can use instance role bindings or directly configure the IAM user credentials in the configuration. The `AWS_REGION` environment variable must also be set in the `frontend` container to scope the IAM credentials for the AWS region hosting the Bedrock endpoint.
+
+Once ready, go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
+
+```json
+{
+ // [...]
+ "cody.enabled": true,
+ "completions": {
+ "provider": "aws-bedrock",
+ "chatModel": "anthropic.claude-3-opus-20240229-v1:0",
+ "completionModel": "anthropic.claude-instant-v1",
+ "endpoint": "",
+ "accessToken": ""
+ }
+}
+```
+
+For the `chatModel` and `completionModel` fields, see [Amazon's Bedrock docs](https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html) for an up-to-date list of supported model IDs, and cross reference against Sourcegraph's [supported LLM list](/cody/capabilities/supported-models) to verify compatibility with Cody.
+
+For `endpoint`, you can either:
+
+- For **pay-as-you-go**, set it to an AWS region code (e.g., `us-west-2`) when using a public Amazon Bedrock endpoint
+- For **provisioned throughput**, set it to the provisioned VPC endpoint for the `bedrock-runtime` API (e.g., `"https://vpce-0a10b2345cd67e89f-abc0defg.bedrock-runtime.us-west-2.vpce.amazonaws.com"`)
+
+For `accessToken`, you can either:
+
+- Leave it empty and rely on instance role bindings or other AWS configurations in the `frontend` service
+- Set it to `:` if directly configuring the credentials
+- Set it to `::` if a session token is also required
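+
+Putting these options together, a sketch of a pay-as-you-go setup in `us-west-2` that relies on instance role bindings (so `accessToken` stays empty) might look like this:
+
+```json
+{
+  // [...]
+  "cody.enabled": true,
+  "completions": {
+    "provider": "aws-bedrock",
+    "chatModel": "anthropic.claude-3-opus-20240229-v1:0",
+    "completionModel": "anthropic.claude-instant-v1",
+    // Pay-as-you-go: the public Bedrock endpoint is addressed by its AWS region code
+    "endpoint": "us-west-2",
+    // Left empty: credentials come from instance role bindings or other AWS configuration on the `frontend` service
+    "accessToken": ""
+  }
+}
+```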
+
+### Example: Using GCP Vertex AI
+
+On [GCP Vertex](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude), we only support Anthropic Claude models.
+
+1. Enable the [Vertex AI API](https://console.cloud.google.com/marketplace/product/google/aiplatform.googleapis.com) in the GCP console. Once Vertex has been enabled in your project, navigate to the [Vertex Model Garden](https://console.cloud.google.com/vertex-ai/model-garden) to select and enable the Anthropic Claude model(s) that you wish to use with Cody. See [Supported LLM Models](/cody/capabilities/supported-models) for an up-to-date list of Anthropic Claude models supported by Cody.
+
+It may take some time to enable Vertex and provision access to the models you plan to use.
+
+2. **Create a Service Account**:
+ - Create a [service account](https://cloud.google.com/iam/docs/service-account-overview)
+ - Assign the `Vertex AI User` role to the service account
+ - Generate a JSON key for the service account and download it
+
+3. **Convert the JSON key to Base64**:
+
+```bash
+cat | base64
+```
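+
+For example, assuming the downloaded key file is named `key.json` (a hypothetical filename; use whatever name your key file actually has):
+
+```bash
+# key.json stands in for the service account key downloaded in the previous step
+cat key.json | base64
+```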
+
+Once ready, go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
+
+```json
+{
+ // [...]
+ "cody.enabled": true,
+ "completions": {
+ "chatModel": "claude-3-opus@20240229",
+ "completionModel": "claude-3-haiku@20240307",
+ "provider": "google",
+ "endpoint": "",
+ "accessToken": ""
+ }
+}
+
+```
+
+For the `endpoint`, you can:
+
+- Go to the Claude 3 Haiku docs in the GCP Console Model Garden
+- Scroll through the page to find the example that shows how to use the `cURL` command with the Claude 3 Haiku model. The example includes a sample request JSON body and the necessary endpoint URL. Copy that URL into your site admin config
+- The endpoint URL will look something like this:
+ `https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/anthropic/models/`
+- Example URL:
+`https://us-east5-aiplatform.googleapis.com/v1/projects/sourcegraph-vertex-staging/locations/us-east5/publishers/anthropic/models`
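+
+Putting it together, a sketch that uses the example endpoint above (the region and project ID in the URL are illustrative and must be replaced with your own values):
+
+```json
+{
+  // [...]
+  "cody.enabled": true,
+  "completions": {
+    "provider": "google",
+    "chatModel": "claude-3-opus@20240229",
+    "completionModel": "claude-3-haiku@20240307",
+    // Example endpoint from above; substitute your own region and project ID
+    "endpoint": "https://us-east5-aiplatform.googleapis.com/v1/projects/sourcegraph-vertex-staging/locations/us-east5/publishers/anthropic/models",
+    // Base64-encoded service account key generated in the earlier step
+    "accessToken": ""
+  }
+}
+```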
+
+### Example: Use Azure OpenAI service
+
+Create a project in the Azure OpenAI Service portal. From the project overview, go to **Keys and Endpoint**, and get **one of the keys** on that page and the **endpoint**.
+
+Next, under **Model deployments**, click **manage deployments** and ensure you deploy the models you want, for example, `gpt-35-turbo`. Take note of the **deployment name**.
+
+Once done, go to **Site admin > Site configuration** (`/site-admin/configuration`) on your instance and set:
+
+```json
+{
+ // [...]
+ "cody.enabled": true,
+ "completions": {
+ "provider": "azure-openai",
+ "chatModel": "",
+ "fastChatModel": "",
+ "completionModel": "", // the model must support the legacy completions endpoint such as gpt-3.5-turbo-instruct
+ "endpoint": "",
+ "accessToken": ""
+ }
+}
+```
+
+For the access token, you can either:
+
+- Leave it empty on Sourcegraph `v5.2.4` or later; it will then rely on Environment, Workload Identity, or Managed Identity credentials configured for the `frontend` and `worker` services
+- Set it to `` if directly configuring the credentials using the API key specified in the Azure portal
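+
+As an illustration, here is a sketch with hypothetical deployment names and an empty access token (relying on identity-based credentials on Sourcegraph `v5.2.4` or later); the endpoint and deployment names are placeholders, not real values:
+
+```json
+{
+  // [...]
+  "cody.enabled": true,
+  "completions": {
+    "provider": "azure-openai",
+    // Azure deployment names from your project, not raw model names (hypothetical examples)
+    "chatModel": "my-gpt-4o",
+    "fastChatModel": "my-gpt-35-turbo",
+    "completionModel": "my-gpt-35-turbo-instruct",
+    // Your Azure OpenAI resource endpoint (placeholder)
+    "endpoint": "https://YOUR-RESOURCE-NAME.openai.azure.com/",
+    // Empty: uses Environment, Workload Identity, or Managed Identity credentials
+    "accessToken": ""
+  }
+}
+```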
+
+### Use StarCoder for autocomplete
+
+When tested with other coder models for the autocomplete use case, [StarCoder](https://huggingface.co/blog/starcoder) offered significant improvements in quality and latency compared to our control groups for users on Sourcegraph.com. You can read more about the improvements in our [October 2023 release notes](https://sourcegraph.com/blog/feature-release-october-2023) and the [GA release notes](https://sourcegraph.com/blog/cody-is-generally-available).
+
+To ensure a fast and reliable experience, we are partnering with [Fireworks](https://fireworks.ai/) and have set up a dedicated hardware deployment for our Enterprise users. Sourcegraph supports StarCoder using the [Cody Gateway](/cody/core-concepts/cody-gateway).
+
+To enable StarCoder, go to **Site admin > Site configuration** (`/site-admin/configuration`) and change the `completionModel`:
+
+```json
+{
+ // [...]
+ "cody.enabled": true,
+ "completions": {
+ "provider": "sourcegraph",
+ "completionModel": "fireworks/starcoder"
+ }
+}
+```
+
+Users of the Cody extensions will automatically pick up this change when connected to your Enterprise instance.
diff --git a/docs/cody/enterprise/features.mdx b/docs/cody/enterprise/features.mdx
new file mode 100644
index 000000000..76b842b6c
--- /dev/null
+++ b/docs/cody/enterprise/features.mdx
@@ -0,0 +1,113 @@
+# Cody Enterprise features
+
+Along with the core features, Cody Enterprise offers additional features to enhance your coding experience.
+
+## IDE token expiry
+
+Site administrators can set the duration of access tokens for users connecting Cody from their IDEs (VS Code, JetBrains, etc.). This can be configured from the **Site admin** page of the Sourcegraph Enterprise instance. Available options include **7, 14, 30, 60, and 90 days**.
+
+
+
+## Guardrails
+
+Guardrails for public code are currently in Beta and are supported in the VS Code and JetBrains IDE extensions.
+
+Open source attribution guardrails for public code, commonly called copyright guardrails, reduce the exposure to copyrighted code. This involves implementing a verification mechanism within Cody to ensure that any code generated by the platform does not replicate open source code.
+
+Guardrails for public code are available to all Sourcegraph Enterprise instances and are **disabled** by default. You can enable them from the Site configuration section by setting `attribution.enabled` to `true`.
+
+Guardrails don't differentiate between license types. They match any code snippet that is at least **ten lines** long against the **290,000** indexed open source repositories.
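+
+A sketch of the corresponding site configuration change, assuming the dotted `attribution.enabled` key sits at the top level of the site config like `cody.enabled`:
+
+```json
+{
+  // [...]
+  "cody.enabled": true,
+  // Enable guardrails (attribution search) for Cody-generated code
+  "attribution.enabled": true
+}
+```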
+
+## Admin controls
+
+Admin controls are supported with the VS Code and JetBrains IDE extensions.
+
+Site administrators have selective control over users' access to Cody Enterprise, which is managed via the Sourcegraph role-based access control system. This provides a more intuitive user interface for assigning permission to use Cody.
+
+## Analytics
+
+Cody Analytics is supported with the VS Code extension and the latest versions of the JetBrains IDE extensions.
+
+Cody Enterprise users can view analytics for their instance. A separately managed cloud service for Cody analytics handles user auth, gets metrics data from Sourcegraph's BigQuery instance, and visualizes the metrics data.
+
+The following metrics are available for Cody Enterprise users:
+
+| **Metric Type** | **What is measured?** |
+| --------------- | --------------------- |
+| Active users | - Total active users <br/> - Average daily users <br/> - Average no. of days each user used Cody (of last 30 days) <br/> - Cody users by day (last 30 days) <br/> - Cody users by month (last two months) <br/> - Cody users by number of days used |
+| Completions | - Total accepted completions <br/> - Minutes saved per completion <br/> - Hours saved by completions <br/> - Cody completions by day <br/> - Completions acceptance rate <br/> - Weighted completions acceptance rate <br/> - Average completion latency <br/> - Acceptance rate by language |
+| Chat | - Total chat events <br/> - Minutes saved per chat <br/> - Hours saved by chats <br/> - Cody chats by day |
+| Commands | - Total command events <br/> - Minutes saved per command <br/> - Hours saved by commands <br/> - Cody commands by day <br/> - Most used commands |
+
+To enable Cody Analytics:
+
+- Create an account on [Sourcegraph Accounts](https://accounts.sourcegraph.com/)
+- Users who already have a Sourcegraph.com account are automatically migrated to Sourcegraph Accounts and can sign in to Cody Analytics using their email and password
+- Users without a Sourcegraph.com account should contact one of our team members, who can help with both the account setup and assigning instances to specific users
+- Map your user account to a Sourcegraph instance to get access to Cody Analytics
+
+## Multi-repository context
+
+Cody supports multi-repository context, allowing you to search up to 10 repositories simultaneously for relevant information. Open a new chat, type `@`, and select **Remote Repositories**.
+
+Keep @-mentioning repos that you want to include in your context. This flexibility lets you get more comprehensive and accurate responses by leveraging information across multiple codebases.
+
+## @-mention directories
+
+To better support teams working with large monorepos, Enterprise users can `@-mention` directories when chatting with Cody. This helps you define more specific directories and sub-directories within that monorepo to give more precise context.
+
+To do this, type `@` in the chat, and then select **Directories** to search other repositories for context in your codebase.
+
+
+
+Please note that you can only `@-mention` remote directories (i.e., directories in your Sourcegraph instance) but not local directories. This means any recent changes to your directories can't be utilized as context until your Sourcegraph instance re-indexes any changes.
+
+If you want to include recent changes that haven't been indexed in your Sourcegraph instance, you can `@-mention` specific files, lines of code, or symbols.
+
+## Prompt pre-instructions
+
+Prompt pre-instructions are supported on Sourcegraph `v5.10` and later.
+
+**Site admins** can add and configure prompt pre-instructions for Cody. These are text instructions that Cody uses with every chat query, allowing organizations to configure how Cody responds to their users.
+
+For example, if you don’t want Cody to answer questions relating to sensitive non-code matters, you can pre-instruct Cody about it. In this case, if a user asks an unrelated question, Cody responds as pre-instructed.
+
+To configure pre-instructions, add the following to your site admin configuration file:
+
+```json
+{ ... "modelConfiguration": { "systemPreInstruction": "If the question is not directly related to software development, respond with \"I can only answer programming-related questions\"" } }
+```
+
+We cannot guarantee that these pre-instructions will always produce the intended behavior. If your pre-instructions are not working as expected, please get in touch with us.
+
+## Supported LLM models
+
+Sourcegraph Enterprise supports different LLM providers and models, such as models from Anthropic and OpenAI. You can choose among them by adjusting your Sourcegraph instance configuration.
+
+
+For the supported LLM models listed above, refer to the following notes:
+
+1. Microsoft Azure is planning to deprecate the APIs used in Sourcegraph version `>5.3.3` on July 1, 2024 [Source](https://learn.microsoft.com/en-us/azure/ai-services/openai/api-version-deprecation)
+2. Claude 2.1 is not recommended
+3. Sourcegraph doesn’t recommend using the GPT-4 (non-Turbo), Claude 1, or Claude 2 models anymore
+4. Only supported through legacy completions API
+5. BYOK (Bring Your Own Key) with managed services is only supported for self-hosted Sourcegraph instances
+6. GPT-4 and GPT-4o for completions have a bug that results in many failed completions
+
+{/*Temporarily removed*/}
+
+ {/* ## Supported model configuration
+
+Use the drop-down menu to make your desired selection and get a detailed breakdown of the supported model configuration for each provider on Cody Enterprise. This is an on-site configuration. Admins should pick a value from the table for `chatModel` to configure their chat model.
+
+
+
+For the supported LLM model configuration listed above, refer to the following notes:
+
+1. Microsoft Azure is planning to deprecate the APIs used in Sourcegraph version `>5.3.3` on July 1, 2024 [Source](https://learn.microsoft.com/en-us/azure/ai-services/openai/api-version-deprecation)
+2. Claude 2.1 is not recommended
+3. Sourcegraph doesn't recommend GPT-4 non-turbo, Claude 1 or 2 models
+4. Only supported through legacy completions API
+5. BYOK (Bring Your Own Key) with managed services are only supported for Self-hosted Sourcegraph instances
+
+ */}
diff --git a/docs/cody/enterprise/model-config-examples.mdx b/docs/cody/enterprise/model-config-examples.mdx
new file mode 100644
index 000000000..6c936f4cf
--- /dev/null
+++ b/docs/cody/enterprise/model-config-examples.mdx
@@ -0,0 +1,685 @@
+# Model Configuration Examples
+
+
+ This section includes examples about how to configure Cody to use
+ Sourcegraph-provided models with `modelConfiguration`. These examples will
+ use the following:
+
+
+- [Minimal configuration](/cody/enterprise/model-configuration#configure-sourcegraph-provided-models)
+- [Using model filters](/cody/enterprise/model-configuration#model-filters)
+- [Change default models](/cody/enterprise/model-configuration#default-models)
+
+## Sourcegraph-provided models and BYOK (Bring Your Own Key)
+
+By default, Sourcegraph is fully aware of several models from the following providers:
+
+- "anthropic"
+- "google"
+- "fireworks"
+- "mistral"
+- "openai"
+
+### Override configuration of a model provider
+
+Instead of Sourcegraph using its own servers to make LLM requests, it is possible to bring your own API keys for a given model provider. For example, if you wish for all Anthropic API requests to go directly to your own Anthropic account and use your own API keys instead of going via Sourcegraph's servers, you could override the `anthropic` provider's configuration:
+
+```json
+{
+"cody.enabled": true,
+"modelConfiguration": {
+ "sourcegraph": {},
+ "providerOverrides": [
+ {
+ "id": "anthropic",
+ "displayName": "Anthropic BYOK",
+ "serverSideConfig": {
+ "type": "anthropic",
+ "accessToken": "token",
+ "endpoint": "https://api.anthropic.com/v1/messages"
+ }
+ }
+ ],
+ "defaultModels": {
+ "chat": "anthropic::2024-10-22::claude-3.5-sonnet",
+ "fastChat": "anthropic::2023-06-01::claude-3-haiku",
+ "codeCompletion": "fireworks::v1::deepseek-coder-v2-lite-base"
+ }
+ }
+}
+```
+
+In the configuration above, we:
+
+- Enable Sourcegraph-provided models and do not set any overrides (note that `"modelConfiguration.modelOverrides"` is not specified)
+- Route requests for Anthropic models directly to the Anthropic API (via the provider override specified for "anthropic")
+- Route requests for other models (such as the Fireworks model for "autocomplete") through Cody Gateway
+
+### Partially override provider config in the namespace
+
+If you want to override the provider configuration for only some models in a namespace, you can route requests for those models directly to the LLM provider (bypassing Cody Gateway), while the remaining models continue to use the Sourcegraph-configured provider configuration.
+
+Example configuration:
+
+```json
+{
+"cody.enabled": true,
+"modelConfiguration": {
+ "sourcegraph": {},
+ "providerOverrides": [
+ {
+ "id": "anthropic-byok",
+ "displayName": "Anthropic BYOK",
+ "serverSideConfig": {
+ "type": "anthropic",
+ "accessToken": "token",
+ "endpoint": "https://api.anthropic.com/v1/messages"
+ }
+ }
+ ],
+ "modelOverrides": [
+ {
+ "modelRef": "anthropic-byok::2023-06-01::claude-3.5-sonnet",
+ "displayName": "Claude 3.5 Sonnet",
+ "modelName": "claude-3-5-sonnet-latest",
+ "capabilities": ["edit", "chat"],
+ "category": "accuracy",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 45000,
+ "maxOutputTokens": 4000
+ }
+ },
+ ],
+ "defaultModels": {
+ "chat": "anthropic-byok::2023-06-01::claude-3.5-sonnet",
+ "fastChat": "anthropic::2023-06-01::claude-3-haiku",
+ "codeCompletion": "fireworks::v1::deepseek-coder-v2-lite-base"
+ }
+}
+```
+
+In the configuration above, we:
+
+- Enable Sourcegraph-supplied models (the `sourcegraph` field is not empty or `null`)
+- Define a new provider with the ID `"anthropic-byok"` and configure it to use the Anthropic API
+- Since this provider is unknown to Sourcegraph, no Sourcegraph-supplied models are available. Therefore, we add a custom model in the `"modelOverrides"` section
+- Use the custom model configured in the previous step (`"anthropic-byok::2023-06-01::claude-3.5-sonnet"`) for `"chat"`. Requests are sent directly to the Anthropic API as set in the provider override
+- For `"fastChat"` and `"autocomplete"`, we use Sourcegraph-provided models via Cody Gateway
+
+## Config examples for various LLM providers
+
+Below are configuration examples for setting up various LLM providers using BYOK. These examples apply whether or not you are also using Sourcegraph-provided models.
+
+- In this section, all configuration examples have Sourcegraph-provided models disabled. Please refer to the previous section to use a combination of Sourcegraph-provided models and BYOK.
+- Ensure that at least one model is available for each Cody feature ("chat" and "autocomplete"), regardless of the provider and model overrides configured. To verify this, [view the configuration](/cody/enterprise/model-configuration#view-configuration) and confirm that appropriate models are listed in the `"defaultModels"` section.
+
+
+
+```json
+"cody.enabled": true,
+"modelConfiguration": {
+ "sourcegraph": null,
+ "providerOverrides": [
+ {
+ "id": "anthropic",
+ "displayName": "Anthropic",
+ "serverSideConfig": {
+ "type": "anthropic",
+ "accessToken": "token",
+ "endpoint": "https://api.anthropic.com/v1/messages"
+ }
+ }
+ ],
+ "modelOverrides": [
+ {
+ "modelRef": "anthropic::2024-10-22::claude-3.5-sonnet",
+ "displayName": "Claude 3.5 Sonnet",
+ "modelName": "claude-3-5-sonnet-latest",
+ "capabilities": ["chat"],
+ "category": "accuracy",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 45000,
+ "maxOutputTokens": 4000
+ }
+ },
+ {
+ "modelRef": "anthropic::2023-06-01::claude-3-haiku",
+ "displayName": "Claude 3 Haiku",
+ "modelName": "claude-3-haiku-20240307",
+ "capabilities": ["chat"],
+ "category": "speed",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 7000,
+ "maxOutputTokens": 4000
+ }
+ },
+ {
+ "modelRef": "anthropic::2023-06-01::claude-3-haiku",
+ "displayName": "Claude 3 Haiku",
+ "modelName": "claude-3-haiku-20240307",
+ "capabilities": ["edit", "chat"],
+ "category": "speed",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 7000,
+ "maxOutputTokens": 4000
+ }
+ }
+ ],
+ "defaultModels": {
+ "chat": "anthropic::2024-10-22::claude-3.5-sonnet",
+ "fastChat": "anthropic::2023-06-01::claude-3-haiku",
+ "codeCompletion": "fireworks::v1::deepseek-coder-v2-lite-base"
+ }
+}
+```
+
+In the configuration above,
+
+- Set up a provider override for Anthropic, routing requests for this provider directly to the specified Anthropic endpoint (bypassing Cody Gateway)
+- Add two Anthropic models:
+  - `"anthropic::2024-10-22::claude-3.5-sonnet"` with "chat" capability - used for "chat"
+  - `"anthropic::2023-06-01::claude-3-haiku"` with "chat" and "autocomplete" capabilities - used for "fastChat" and "autocomplete"
+- Set the configured models as default models for Cody features in the `"defaultModels"` field
+
+
+
+
+```json
+"cody.enabled": true,
+"modelConfiguration": {
+ "sourcegraph": null,
+ "providerOverrides": [
+ {
+ "id": "fireworks",
+ "displayName": "Fireworks",
+ "serverSideConfig": {
+ "type": "fireworks",
+ "accessToken": "token",
+ "endpoint": "https://api.fireworks.ai/inference/v1/completions"
+ }
+ }
+ ],
+ "modelOverrides": [
+ {
+ "modelRef": "fireworks::v1::mixtral-8x7b-instruct",
+ "displayName": "Mixtral 8x7B",
+ "modelName": "accounts/fireworks/models/mixtral-8x7b-instruct",
+ "capabilities": ["chat"],
+ "category": "other",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 7000,
+ "maxOutputTokens": 4000
+ }
+ },
+ {
+ "modelRef": "fireworks::v1::starcoder-16b",
+ "modelName": "accounts/fireworks/models/starcoder-16b",
+ "displayName": "(Fireworks) Starcoder 16B",
+ "contextWindow": {
+ "maxInputTokens": 8192,
+ "maxOutputTokens": 4096
+ },
+ "capabilities": ["autocomplete"],
+ "category": "balanced",
+ "status": "stable"
+ }
+ ],
+ "defaultModels": {
+ "chat": "fireworks::v1::mixtral-8x7b-instruct",
+ "fastChat": "fireworks::v1::mixtral-8x7b-instruct",
+ "codeCompletion": "fireworks::v1::starcoder-16b"
+ }
+}
+```
+
+In the configuration above,
+
+- Set up a provider override for Fireworks, routing requests for this provider directly to the specified Fireworks endpoint (bypassing Cody Gateway)
+- Add two Fireworks models:
+ - `"fireworks::v1::mixtral-8x7b-instruct"` with "chat" capabiity - used for "chat" and "fastChat"
+ - `"fireworks::v1::starcoder-16b"` with "autocomplete" capability - used for "autocomplete"
+
+
+
+
+
+```json
+"modelConfiguration": {
+ "sourcegraph": null,
+ "providerOverrides": [
+ {
+ "id": "openai",
+ "displayName": "OpenAI",
+ "serverSideConfig": {
+ "type": "openai",
+ "accessToken": "token",
+ "endpoint": "https://api.openai.com"
+ }
+ }
+ ],
+ "modelOverrides": [
+ {
+ "modelRef": "openai::2024-02-01::gpt-4o",
+ "displayName": "GPT-4o",
+ "modelName": "gpt-4o",
+ "capabilities": ["chat"],
+ "category": "accuracy",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 45000,
+ "maxOutputTokens": 4000
+ }
+ },
+ {
+ "modelRef": "openai::unknown::gpt-3.5-turbo-instruct",
+ "displayName": "GPT-3.5 Turbo Instruct",
+ "modelName": "gpt-3.5-turbo-instruct",
+ "capabilities": ["autocomplete"],
+ "category": "speed",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 7000,
+ "maxOutputTokens": 4000
+ }
+ }
+],
+ "defaultModels": {
+ "chat": "openai::2024-02-01::gpt-4o",
+ "fastChat": "openai::2024-02-01::gpt-4o",
+ "codeCompletion": "openai::unknown::gpt-3.5-turbo-instruct"
+ }
+}
+```
+
+In the configuration above,
+
+- Set up a provider override for OpenAI, routing requests for this provider directly to the specified OpenAI endpoint (bypassing Cody Gateway)
+- Add two OpenAI models:
+ - `"openai::2024-02-01::gpt-4o"` with "chat" capabilities - used for "chat" and "fastChat"
+ - `"openai::unknown::gpt-3.5-turbo-instruct"` with "autocomplete" capability - used for "autocomplete"
+
+
+
+
+
+```json
+"cody.enabled": true,
+"modelConfiguration": {
+ "sourcegraph": null,
+ "providerOverrides": [
+ {
+ "id": "azure-openai",
+ "displayName": "Azure OpenAI",
+ "serverSideConfig": {
+ "type": "azureOpenAI",
+ "accessToken": "token",
+ "endpoint": "https://acme-test.openai.azure.com/",
+ "user": "",
+ "useDeprecatedCompletionsAPI": true
+ }
+ }
+ ],
+ "modelOverrides": [
+ {
+ "modelRef": "azure-openai::unknown::gpt-4o",
+ "displayName": "GPT-4o",
+ "modelName": "gpt-4o",
+ "capabilities": ["chat"],
+ "category": "accuracy",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 45000,
+ "maxOutputTokens": 4000
+ }
+ },
+ {
+ "modelRef": "azure-openai::unknown::gpt-35-turbo-instruct-test",
+ "displayName": "GPT-3.5 Turbo Instruct",
+ "modelName": "gpt-35-turbo-instruct-test",
+ "capabilities": ["autocomplete"],
+ "category": "speed",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 7000,
+ "maxOutputTokens": 4000
+ }
+ }
+ ],
+ "defaultModels": {
+ "chat": "azure-openai::unknown::gpt-4o",
+ "fastChat": "azure-openai::unknown::gpt-4o",
+ "codeCompletion": "azure-openai::unknown::gpt-35-turbo-instruct-test"
+ }
+}
+```
+
+In the configuration above,
+
+- Set up a provider override for Azure OpenAI, routing requests for this provider directly to the specified Azure OpenAI endpoint (bypassing Cody Gateway).
+ **Note:** For Azure OpenAI, ensure that the `modelName` matches the name defined in your Azure portal configuration for the model.
+- Add two Azure OpenAI models:
+ - `"azure-openai::unknown::gpt-4o"` with "chat" capability - used for "chat" and "fastChat"
+ - `"azure-openai::unknown::gpt-35-turbo-instruct-test"` with "autocomplete" capability - used for "autocomplete"
+- Since `"azure-openai::unknown::gpt-35-turbo-instruct-test"` is not supported on the newer OpenAI `"v1/chat/completions"` endpoint, we set `"useDeprecatedCompletionsAPI"` to `true` to route requests to the legacy `"v1/completions"` endpoint. This setting is unnecessary if you are using a model supported on the `"v1/chat/completions"` endpoint.
+
+
+
+
+
+```json
+"cody.enabled": true,
+"modelConfiguration": {
+ "sourcegraph": null,
+ "providerOverrides": [
+ {
+ "id": "fireworks",
+ "displayName": "Fireworks",
+ "serverSideConfig": {
+ "type": "openaicompatible",
+ "endpoints": [
+ {
+ "url": "https://api.fireworks.ai/inference/v1",
+ "accessToken": "token"
+ }
+ ]
+ }
+ },
+ {
+ "id": "huggingface-codellama",
+ "displayName": "Hugging Face",
+ "serverSideConfig": {
+ "type": "openaicompatible",
+ "endpoints": [
+ {
+ "url": "https://api-inference.huggingface.co/models/meta-llama/CodeLlama-7b-hf/v1/",
+ "accessToken": "token"
+ }
+ ]
+ }
+ },
+ ],
+ "modelOverrides": [
+ {
+ "modelRef": "fireworks::v1::llama-v3p1-70b-instruct",
+ "modelName": "llama-v3p1-70b-instruct",
+ "displayName": "(Fireworks) Llama 3.1 70B Instruct",
+ "contextWindow": {
+ "maxInputTokens": 64000,
+ "maxOutputTokens": 8192
+ },
+ "serverSideConfig": {
+ "type": "openaicompatible",
+ "apiModel": "accounts/fireworks/models/llama-v3p1-70b-instruct"
+ },
+ "clientSideConfig": {
+ "openaicompatible": {}
+ },
+ "capabilities": ["chat"],
+ "category": "balanced",
+ "status": "stable"
+ },
+ {
+ "modelRef": "huggingface-codellama::v1::CodeLlama-7b-hf",
+ "modelName": "CodeLlama-7b-hf",
+ "displayName": "(HuggingFace) CodeLlama-7b-hf",
+ "contextWindow": {
+ "maxInputTokens": 8192,
+ "maxOutputTokens": 4096
+ },
+ "serverSideConfig": {
+ "type": "openaicompatible",
+ "apiModel": "meta-llama/CodeLlama-7b-hf"
+ },
+ "clientSideConfig": {
+ "openaicompatible": {}
+ },
+ "capabilities": ["autocomplete", "chat"],
+ "category": "balanced",
+ "status": "stable"
+ }
+ ],
+ "defaultModels": {
+ "chat": "fireworks::v1::llama-v3p1-70b-instruct",
+ "fastChat": "fireworks::v1::llama-v3p1-70b-instruct",
+ "codeCompletion": "huggingface-codellama::v1::CodeLlama-7b-hf"
+ }
+}
+```
+
+In the configuration above,
+
+- Configure two OpenAI-compatible providers: `"fireworks"` and `"huggingface-codellama"`
+- Add two OpenAI-compatible models: `"fireworks::v1::llama-v3p1-70b-instruct"` and `"huggingface-codellama::v1::CodeLlama-7b-hf"`. Additionally:
+ - Set `clientSideConfig.openaicompatible` to `{}` to indicate to Cody clients that these models are OpenAI-compatible, ensuring the appropriate code paths are utilized
+ - Designate these models as the default choices for chat and autocomplete, respectively
+
+
+
+
+
+```json
+"modelConfiguration": {
+ "sourcegraph": null,
+ "providerOverrides": [
+ {
+ "id": "google",
+ "displayName": "Google Gemini",
+ "serverSideConfig": {
+ "type": "google",
+ "accessToken": "token",
+ "endpoint": "https://generativelanguage.googleapis.com/v1beta/models"
+ }
+ }
+ ],
+ "modelOverrides": [
+ {
+ "modelRef": "google::v1::gemini-1.5-pro",
+ "displayName": "Gemini 1.5 Pro",
+ "modelName": "gemini-1.5-pro",
+ "capabilities": ["chat", "autocomplete"],
+ "category": "balanced",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 45000,
+ "maxOutputTokens": 4000
+ }
+ }
+ ],
+ "defaultModels": {
+ "chat": "google::v1::gemini-1.5-pro",
+ "fastChat": "google::v1::gemini-1.5-pro",
+ "codeCompletion": "google::v1::gemini-1.5-pro"
+ }
+}
+```
+
+In the configuration above,
+
+- Set up a provider override for Google Gemini, routing requests for this provider directly to the specified endpoint (bypassing Cody Gateway)
+- Add the `"google::v1::gemini-1.5-pro"` model, which is used for all Cody features. We do not add other models for simplicity, as adding multiple models is already covered in the examples above
+
+
+
+
+
+```json
+"modelConfiguration": {
+ "sourcegraph": null,
+ "providerOverrides": [
+ {
+ "id": "google",
+ "displayName": "Google Anthropic",
+ "serverSideConfig": {
+ "type": "google",
+ "accessToken": "token",
+ "endpoint": "https://us-east5-aiplatform.googleapis.com/v1/projects/project-name/locations/us-east5/publishers/anthropic/models"
+ }
+ }
+ ],
+ "modelOverrides": [
+ {
+ "modelRef": "google::unknown::claude-3-5-sonnet",
+ "displayName": "Claude 3.5 Sonnet (via Google/Vertex)",
+ "modelName": "claude-3-5-sonnet@20240620",
+ "contextWindow": {
+ "maxInputTokens": 45000,
+ "maxOutputTokens": 4000
+ },
+ "capabilities": ["chat"],
+ "category": "accuracy",
+ "status": "stable"
+ },
+ {
+ "modelRef": "google::unknown::claude-3-haiku",
+ "displayName": "Claude 3 Haiku",
+ "modelName": "claude-3-haiku@20240307",
+ "capabilities": ["autocomplete", "chat"],
+ "category": "speed",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 7000,
+ "maxOutputTokens": 4000
+ }
+ },
+ ],
+ "defaultModels": {
+ "chat": "google::unknown::claude-3-5-sonnet",
+ "fastChat": "google::unknown::claude-3-5-sonnet",
+ "codeCompletion": "google::unknown::claude-3-haiku"
+ }
+}
+```
+
+In the configuration above,
+
+- Set up a provider override for Google Anthropic, routing requests for this provider directly to the specified endpoint (bypassing Cody Gateway)
+- Add two Anthropic models:
+ - `"google::unknown::claude-3-5-sonnet"` with "chat" capabiity - used for "chat" and "fastChat"
+ - `"google::unknown::claude-3-haiku"` with "autocomplete" capability - used for "autocomplete"
+
+
+
+
+
+```json
+"cody.enabled": true,
+"modelConfiguration": {
+ "sourcegraph": null,
+ "providerOverrides": [
+ {
+ "id": "aws-bedrock",
+ "displayName": "Anthropic models through AWS Bedrock",
+ "serverSideConfig": {
+ "type": "awsBedrock",
+ "accessToken": "",
+ "region": "us-west-2"
+ }
+ }
+ ],
+ "modelOverrides": [
+ {
+ "modelRef": "aws-bedrock::2024-02-29::claude-3-sonnet",
+ "displayName": "Claude 3 Sonnet",
+ "modelName": "claude-3-sonnet",
+ "serverSideConfig": {
+ "type": "awsBedrockProvisionedThroughput",
+ "arn": "" // e.g., arn:aws:bedrock:us-west-2:548543007731:provisioned-model/47u2lgtk1rc1
+ },
+ "contextWindow": {
+ "maxInputTokens": 16000,
+ "maxOutputTokens": 4000
+ },
+ "capabilities": ["chat", "autocomplete"],
+ "category": "balanced",
+ "status": "stable"
+ },
+ ],
+ "defaultModels": {
+ "chat": "aws-bedrock::2024-02-29::claude-3-sonnet",
+ "codeCompletion": "aws-bedrock::2024-02-29::claude-3-sonnet",
+ "fastChat": "aws-bedrock::2024-02-29::claude-3-sonnet"
+ },
+}
+```
+
+In the configuration above,
+
+- Set up a provider override for Amazon Bedrock, routing requests for this provider directly to the specified endpoint, bypassing Cody Gateway
+- Add the `"aws-bedrock::2024-02-29::claude-3-sonnet"` model, which is used for all Cody features. We do not add other models for simplicity, as adding multiple models is already covered in the examples above
+- Note: Since the model in the example uses provisioned throughput, specify the ARN in the `serverSideConfig.arn` field of the model override.
+
+Provider override `serverSideConfig` fields:
+
+| **Field** | **Description** |
+| ------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `type` | Must be `"awsBedrock"`. |
+| `accessToken` | Leave empty to rely on instance role bindings or other AWS configurations in the frontend service. Use `ACCESS_KEY_ID:SECRET_ACCESS_KEY` for direct credential configuration, or `ACCESS_KEY_ID:SECRET_ACCESS_KEY:SESSION_TOKEN` if a session token is also required. |
+| `endpoint` | For pay-as-you-go, set it to an AWS region code (e.g., `us-west-2`) when using a public Amazon Bedrock endpoint. For provisioned throughput, set it to the provisioned VPC endpoint for the bedrock-runtime API (e.g., `https://vpce-0a10b2345cd67e89f-abc0defg.bedrock-runtime.us-west-2.vpce.amazonaws.com`). |
+| `region` | The region to use when configuring API clients. This is necessary because the 'frontend' binary container cannot access environment variables from the host OS. |
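+
+For example, a pay-as-you-go setup that relies on instance role bindings (no static credentials) might use a `serverSideConfig` sketch like the one below; the region values are placeholders for your own AWS setup:
+
+```json
+"serverSideConfig": {
+  "type": "awsBedrock",
+  "accessToken": "",
+  "endpoint": "us-west-2",
+  "region": "us-west-2"
+}
+```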
+
+Provisioned throughput for Amazon Bedrock models can be configured using the `"awsBedrockProvisionedThroughput"` server-side configuration type. Refer to the [Model Overrides](/cody/enterprise/model-configuration#model-overrides) section for more details.
+
+
diff --git a/docs/cody/enterprise/model-configuration.mdx b/docs/cody/enterprise/model-configuration.mdx
new file mode 100644
index 000000000..59f3c45b7
--- /dev/null
+++ b/docs/cody/enterprise/model-configuration.mdx
@@ -0,0 +1,481 @@
+# Model Configuration
+
+
+ Learn how to configure Cody via `modelConfiguration` on a Sourcegraph
+ Enterprise instance.
+
+
+
+ `modelConfiguration` is the recommended way to configure chat and
+ autocomplete models in Sourcegraph `v5.6.0` and later.
+
+
+The `modelConfiguration` field in the **Site config** section allows you to configure Cody to use different LLM models for chat and autocomplete, enabling greater flexibility in selecting the best model for your needs.
+
+With this configuration, users on an Enterprise instance get an LLM selector in the Cody chat that lets them pick the model they want to use.
+
+The LLM models available on a Sourcegraph Enterprise instance are a combination of Sourcegraph-provided models and any custom models or providers that you explicitly add to your Sourcegraph instance's site configuration.
+
+## `modelConfiguration`
+
+The model configuration for Cody is managed through the `"modelConfiguration"` field in the **Site config** section. It includes the following fields:
+
+| **Field** | **Description** |
+| --------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| [`sourcegraph`](/cody/enterprise/model-configuration#sourcegraph-provided-models) | Configures access to Sourcegraph-provided models available through Cody Gateway. |
+| [`providerOverrides`](/cody/enterprise/model-configuration#provider-overrides) | Configures access to models using your own keys with the most common LLM providers (BYOK), or to models hosted behind any OpenAI-compatible endpoint. |
+| [`modelOverrides`](/cody/enterprise/model-configuration#model-overrides) | Extends or modifies the list of models Cody recognizes and their configurations. |
+| [`selfHostedModels`](/cody/enterprise/model-configuration#self-hosted-models) | Adds models to Cody’s recognized models list with default configurations provided by Sourcegraph. Only available for certain models; general models can be configured in `modelOverrides`. |
+| [`defaultModels`](/cody/enterprise/model-configuration#default-models) | Specifies the models assigned to each Cody feature (chat, fast chat, autocomplete). |
+
+## Getting started with `modelConfiguration`
+
+The recommended and easiest way to set up model configuration is to use Sourcegraph-provided models through the Cody Gateway.
+
+For a minimal configuration example, see [Configure Sourcegraph-supplied models](/cody/enterprise/model-configuration#configure-sourcegraph-provided-models).
+
+## Sourcegraph-provided models
+
+Sourcegraph-provided models, accessible through the [Cody Gateway](/cody/core-concepts/cody-gateway), are managed via your site configuration.
+
+For most administrators, relying on these models alone ensures access to high-quality models without needing to manage specific configurations.
+
+The use of these models is controlled through the `"modelConfiguration.sourcegraph"` field in the site config.
+
+### Configure Sourcegraph-provided models
+
+The minimal configuration for Sourcegraph-provided models is:
+
+```json
+"cody.enabled": true,
+"modelConfiguration": {
+ "sourcegraph": {}
+}
+```
+
+The above configuration sets up the following:
+
+- Sourcegraph-provided models are enabled (`sourcegraph` is not set to `null`)
+- Requests to LLM providers are routed through the Cody Gateway (no `providerOverrides` field specified)
+- Sourcegraph-defined default models are used for Cody features (no `defaultModels` field specified)
+
+There are three main settings for configuring Sourcegraph-provided LLM models:
+
+| **Field** | **Description** |
+| -------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- |
+| `endpoint` | (Optional) The URL for connecting to Cody Gateway. The default is set to the production instance. |
+| `accessToken` | (Optional) The access token for connecting to Cody Gateway. Defaults to the current license key. |
+| [`modelFilters`](/cody/enterprise/model-configuration#model-filters) | (Optional) Filters specifying which models to include from Cody Gateway. |
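+
+For example, to point your instance at a specific Cody Gateway endpoint and supply an explicit access token rather than relying on the defaults, a sketch like the following could be used (the endpoint here is assumed to be the production Cody Gateway URL; both fields can usually be omitted):
+
+```json
+"cody.enabled": true,
+"modelConfiguration": {
+  "sourcegraph": {
+    "endpoint": "https://cody-gateway.sourcegraph.com",
+    "accessToken": "token"
+  }
+}
+```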
+
+### Disable Sourcegraph-provided models
+
+To disable all Sourcegraph-provided models and use only the models explicitly defined in your site configuration, set the `"sourcegraph"` field to `null` as shown in the example below.
+
+```json
+"cody.enabled": true,
+"modelConfiguration": {
+ "sourcegraph": null, // ignore Sourcegraph-provided models
+ "providerOverrides": {
+ // define access to the LLM providers
+ },
+ "modelOverrides": {
+ // define models available via providers defined in the providerOverrides
+ },
+ "defaultModels": {
+ // set default models per Cody feature from the list of models defined in modelOverrides
+ }
+}
+```
+
+## Default models
+
+The `"modelConfiguration"` setting includes a `"defaultModels"` field, which allows you to specify the LLM model used for each Cody feature (`"chat"`, `"fastChat"`, and `"autocomplete"`). The values for each feature should be `modelRef`s of either Sourcegraph-provided models or models configured in the `modelOverrides` section.
+
+If no default is specified or the specified model is not found, the configuration will silently fall back to a suitable alternative.
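+
+For example, keeping Sourcegraph-provided models enabled while pinning the defaults might look like the sketch below; the exact `modelRef` values depend on the models available to your instance:
+
+```json
+"cody.enabled": true,
+"modelConfiguration": {
+  "sourcegraph": {},
+  "defaultModels": {
+    "chat": "anthropic::2024-10-22::claude-3.5-sonnet",
+    "fastChat": "anthropic::2023-06-01::claude-3-haiku",
+    "codeCompletion": "fireworks::v1::deepseek-coder-v2-lite-base"
+  }
+}
+```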
+
+### Model Filters
+
+The `"modelFilters"` section allows you to control which Cody Gateway models are available to users of your Sourcegraph Enterprise instance. The following table describes each field:
+
+| **Field** | **Description** |
+| -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `statusFilter` | Filters models based on their release status, such as stable, beta, deprecated or experimental. By default, all models available on Cody Gateway are accessible. |
+| `allow` | An array of `modelRef`s specifying which models to include. Supports wildcards. |
+| `deny` | An array of `modelRef`s specifying which models to exclude. Supports wildcards. |
+
+The following examples demonstrate how to use each of these settings together:
+
+```json
+"cody.enabled": true,
+"modelConfiguration": {
+ "sourcegraph": {
+ "modelFilters": {
+ // Only allow "beta" and "stable" models.
+ // Not "experimental" or "deprecated".
+ "statusFilter": ["beta", "stable"],
+
+ // Allow any models provided by Anthropic, OpenAI, Google and Fireworks.
+ "allow": [
+ "anthropic::*", // Anthropic models
+ "openai::*", // OpenAI models
+ "google::*", // Google Gemini models
+ "fireworks::*", // Open-source models hosted by Sourcegraph on Fireworks.ai (typically used for autocomplete and Mistral Chat models)
+ ],
+
+ // Example: Do not include any models with the Model ID containing "turbo",
+ // or any models from a hypothetical provider "AcmeCo"
+ "deny": [
+ "*turbo*",
+ "acmeco::*"
+ ]
+ }
+ }
+}
+```
+
+## Provider Overrides
+
+A `provider` is an organizational concept for grouping LLM models. Typically, a provider refers to the company that produced the model or the specific API/service used to access it, serving as a namespace.
+
+By defining a provider override in your Sourcegraph site configuration, you can introduce a new namespace to organize models or customize the existing provider namespace supplied by Sourcegraph (e.g., for all `"anthropic"` models).
+
+Provider overrides are configured via the `"modelConfiguration.providerOverrides"` field in the site configuration.
+This field is an array of items, each containing the following fields:
+
+| **Field** | **Description** |
+| ------------------ | --------------------------------------------------------------------------------- |
+| `id` | The namespace for models accessed via the provider. |
+| `displayName` | A human-readable name for the provider. |
+| `serverSideConfig` | Defines how to access the provider. See the section below for additional details. |
+
+Example configuration:
+
+```json
+"cody.enabled": true,
+"modelConfiguration": {
+ "sourcegraph": {},
+ "providerOverrides": [
+ {
+ "id": "openai",
+ "displayName": "OpenAI (via BYOK)",
+ "serverSideConfig": {
+ "type": "openai",
+ "accessToken": "token",
+ "endpoint": "https://api.openai.com"
+ }
+ },
+ ],
+ "defaultModels": {
+ "chat": "google::v1::gemini-1.5-pro",
+ "fastChat": "anthropic::2023-06-01::claude-3-haiku",
+ "autocomplete": "fireworks::v1::deepseek-coder-v2-lite-base"
+ }
+}
+```
+
+In the example above:
+
+- Sourcegraph-provided models are enabled (`sourcegraph` is not set to `null`)
+- The `"openai"` provider configuration is overridden to be accessed using your own key, with the provider API accessed directly via the specified `endpoint` URL. In contrast, models from the `"anthropic"` provider are accessed through the Cody Gateway
+
+Refer to the [examples page](/cody/enterprise/model-config-examples) for additional examples.
+
+### Server-side config
+
+The most important part of a provider's configuration is the `"serverSideConfig"` field, which defines how the LLM models should be invoked, i.e., which external service or API will handle the LLM requests.
+
+Sourcegraph natively supports several types of LLM API providers. The current set of supported providers includes:
+
+| **Provider type** | **Description** |
+| -------------------- | ---------------------------------------------------------------------------------------------------------- |
+| `"sourcegraph"` | [Cody Gateway](/cody/core-concepts/cody-gateway), offering access to various models from multiple services |
+| `"openaicompatible"` | Any OpenAI-compatible API implementation |
+| `"awsBedrock"` | [Amazon Bedrock](https://docs.aws.amazon.com/bedrock/) |
+| `"azureOpenAI"` | [Microsoft Azure OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service/) |
+| `"anthropic"` | [Anthropic](https://www.anthropic.com) |
+| `"fireworks"` | [Fireworks AI](https://fireworks.ai) |
+| `"google"` | [Google Gemini](http://cloud.google.com/gemini) and [Vertex](https://cloud.google.com/vertex-ai/) |
+| `"openai"` | [OpenAI](http://platform.openai.com) |
+| `"huggingface-tgi"` | [Hugging Face Text Generation Interface](https://huggingface.co/docs/text-generation-inference/en/index) |
+
+The configuration for `serverSideConfig` varies by provider type.
+
+Refer to the [examples page](/cody/enterprise/model-config-examples) for examples.
+
+## Model Overrides
+
+With a provider defined (either a Sourcegraph-provided provider or a custom provider configured via the `providerOverrides` field), custom models can be specified for that provider by adding them to the `"modelConfiguration.modelOverrides"` section.
+
+This field is an array of items, each with the following fields:
+
+- `modelRef`: Uniquely identifies the model within the provider namespace
+ - A string in the format `${providerId}::${apiVersionId}::${modelId}`
+ - To associate a model with your provider, `${providerId}` must match the provider’s ID
+ - `${modelId}` can be any URL-safe string
+ - `${apiVersionId}` specifies the API version, which helps detect compatibility issues between models and Sourcegraph instances. For example, `"2023-06-01"` can indicate that the model uses that version of the Anthropic API. If unsure, you may set this to `"unknown"` when defining custom models
+- `displayName`: An optional, user-friendly name for the model. If not set, clients should display the `ModelID` part of the `modelRef` instead (not the `modelName`)
+- `modelName`: A unique identifier the API provider uses to specify which model is being invoked. This is the identifier that the LLM provider recognizes to determine the model you are calling
+- `capabilities`: A list of capabilities that the model supports. Supported values: **autocomplete** and **chat**
+- `category`: Specifies the model's category with the following options:
+ - `"balanced"`: Typically the best default choice for most users. This category is suited for models like Sonnet 3.5 (as of October 2024)
+ - `"speed"`: Ideal for low-parameter models that may not suit general-purpose chat but are beneficial for specialized tasks, such as query rewriting
+ - `"accuracy"`: Reserved for models, like OpenAI o1, that use advanced reasoning techniques to improve response accuracy, though with slower latency
+ - `"other"`: Used for older models without distinct advantages in reasoning or speed. Select this category if you are uncertain about which category to choose
+ - `"deprecated"`: For models that are no longer supported by the provider and are filtered out on the client side (not available for use)
+- `contextWindow`: An object that defines the **number of tokens** (units of text) that can be sent to the LLM. This setting influences response time and request cost and may vary according to the limits set by each LLM model or provider. It includes two fields:
+ - `maxInputTokens`: Specifies the maximum number of tokens for the contextual data in the prompt (e.g., question, relevant snippets)
+ - `maxOutputTokens`: Specifies the maximum number of tokens allowed in the response
+- `serverSideConfig`: Additional configuration for the model. It can be one of the following:
+
+ - `awsBedrockProvisionedThroughput`: Specifies provisioned throughput settings for AWS Bedrock models with the following fields:
+ - `type`: Must be `"awsBedrockProvisionedThroughput"`
+ - `arn`: The ARN (Amazon Resource Name) for provisioned throughput to use when sending requests to AWS Bedrock. The ARN format for provisioned models is: `arn:${Partition}:bedrock:${Region}:${Account}:provisioned-model/${ResourceId}`.
+ - `openaicompatible`: Configuration specific to models provided by an OpenAI-compatible provider, with the following fields:
+ - `type`: Must be `"openaicompatible"`
+ - `apiModel`: The literal string value of the `model` field to be sent to the `/chat/completions` API. If set, Sourcegraph treats this as an opaque string and sends it directly to the API without inferring any additional information. By default, the configured model name is sent
+
+#### Example configuration
+
+```json
+"cody.enabled": true,
+"modelConfiguration": {
+ "sourcegraph": {},
+ "providerOverrides": [
+ {
+ "id": "huggingface-codellama",
+ "displayName": "huggingface",
+ "serverSideConfig": {
+ "type": "openaicompatible",
+ "endpoints": [
+ {
+ "url": "https://api-inference.huggingface.co/models/meta-llama/CodeLlama-7b-hf/v1/",
+ "accessToken": "token"
+ }
+ ]
+ }
+ }
+ ],
+ "modelOverrides": [
+ {
+ "modelRef": "huggingface-codellama::v1::CodeLlama-7b-hf",
+ "modelName": "CodeLlama-7b-hf",
+ "displayName": "(HuggingFace) CodeLlama-7b-hf",
+ "contextWindow": {
+ "maxInputTokens": 8192,
+ "maxOutputTokens": 4096
+ },
+ "serverSideConfig": {
+ "type": "openaicompatible",
+ "apiModel": "meta-llama/CodeLlama-7b-hf"
+ },
+ "clientSideConfig": {
+ "openaicompatible": {}
+ },
+ "capabilities": ["autocomplete", "chat"],
+ "category": "balanced",
+ "status": "stable"
+ }
+ ],
+ "defaultModels": {
+ "chat": "google::v1::gemini-1.5-pro",
+ "fastChat": "anthropic::2023-06-01::claude-3-haiku",
+ "autocomplete": "huggingface-codellama::v1::CodeLlama-7b-hf"
+ }
+}
+```
+
+In the example above:
+
+- Sourcegraph-provided models are enabled (`sourcegraph` is not set to `null`)
+- An additional provider, `"huggingface-codellama"`, is configured to access Hugging Face’s OpenAI-compatible API directly
+- A custom model, `"CodeLlama-7b-hf"`, is added using the `"huggingface-codellama"` provider
+- Default models are set up as follows:
+ - Sourcegraph-provided models are used for `"chat"` and `"fastChat"` (accessed via Cody Gateway)
+ - The newly configured model, `"huggingface-codellama::v1::CodeLlama-7b-hf"`, is used for `"autocomplete"` (connecting directly to Hugging Face’s OpenAI-compatible API)
+
+Refer to the [examples page](/cody/enterprise/model-config-examples) for additional examples.
+
+## View configuration
+
+To view the current model configuration, run the following command:
+
+```bash
+export INSTANCE_URL="https://sourcegraph.test:3443" # Replace with your Sourcegraph instance URL
+export ACCESS_TOKEN="your access token"
+
+curl --location "${INSTANCE_URL}/.api/modelconfig/supported-models.json" \
+--header "Authorization: token $ACCESS_TOKEN"
+```
+
+The response includes:
+
+- Configured providers and models—both Sourcegraph-provided (if enabled, with any applied filters) and any overrides
+- Default models for Cody features
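+
+If you have `jq` installed, you can, for example, extract just the default models from the same endpoint:
+
+```bash
+curl --silent --location "${INSTANCE_URL}/.api/modelconfig/supported-models.json" \
+--header "Authorization: token $ACCESS_TOKEN" | jq '.defaultModels'
+```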
+
+#### Example response
+
+```json
+"schemaVersion": "1.0",
+"revision": "0.0.0+dev",
+"providers": [
+ {
+ "id": "anthropic",
+ "displayName": "Anthropic"
+ },
+ {
+ "id": "fireworks",
+ "displayName": "Fireworks"
+ },
+ {
+ "id": "google",
+ "displayName": "Google"
+ },
+ {
+ "id": "openai",
+ "displayName": "OpenAI"
+ },
+ {
+ "id": "mistral",
+ "displayName": "Mistral"
+ }
+],
+"models": [
+ {
+ "modelRef": "anthropic::2024-10-22::claude-3-5-sonnet-latest",
+ "displayName": "Claude 3.5 Sonnet (Latest)",
+ "modelName": "claude-3-5-sonnet-latest",
+ "capabilities": ["chat"],
+ "category": "accuracy",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 45000,
+ "maxOutputTokens": 4000
+ }
+ },
+ {
+ "modelRef": "anthropic::2023-06-01::claude-3-opus",
+ "displayName": "Claude 3 Opus",
+ "modelName": "claude-3-opus-20240229",
+ "capabilities": ["chat"],
+ "category": "other",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 45000,
+ "maxOutputTokens": 4000
+ }
+ },
+ {
+ "modelRef": "anthropic::2023-06-01::claude-3-haiku",
+ "displayName": "Claude 3 Haiku",
+ "modelName": "claude-3-haiku-20240307",
+ "capabilities": ["chat"],
+ "category": "speed",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 7000,
+ "maxOutputTokens": 4000
+ }
+ },
+ {
+ "modelRef": "fireworks::v1::starcoder",
+ "displayName": "StarCoder",
+ "modelName": "starcoder",
+ "capabilities": ["autocomplete"],
+ "category": "speed",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 2048,
+ "maxOutputTokens": 256
+ }
+ },
+ {
+ "modelRef": "fireworks::v1::deepseek-coder-v2-lite-base",
+ "displayName": "DeepSeek V2 Lite Base",
+ "modelName": "accounts/sourcegraph/models/deepseek-coder-v2-lite-base",
+ "capabilities": ["autocomplete"],
+ "category": "speed",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 2048,
+ "maxOutputTokens": 256
+ }
+ },
+ {
+ "modelRef": "google::v1::gemini-1.5-pro",
+ "displayName": "Gemini 1.5 Pro",
+ "modelName": "gemini-1.5-pro",
+ "capabilities": ["chat"],
+ "category": "balanced",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 45000,
+ "maxOutputTokens": 4000
+ }
+ },
+ {
+ "modelRef": "google::v1::gemini-1.5-flash",
+ "displayName": "Gemini 1.5 Flash",
+ "modelName": "gemini-1.5-flash",
+ "capabilities": ["chat"],
+ "category": "speed",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 45000,
+ "maxOutputTokens": 4000
+ }
+ },
+ {
+ "modelRef": "mistral::v1::mixtral-8x7b-instruct",
+ "displayName": "Mixtral 8x7B",
+ "modelName": "accounts/fireworks/models/mixtral-8x7b-instruct",
+ "capabilities": ["chat"],
+ "category": "speed",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 7000,
+ "maxOutputTokens": 4000
+ }
+ },
+ {
+ "modelRef": "openai::2024-02-01::gpt-4o",
+ "displayName": "GPT-4o",
+ "modelName": "gpt-4o",
+ "capabilities": ["chat"],
+ "category": "accuracy",
+ "status": "stable",
+ "contextWindow": {
+ "maxInputTokens": 45000,
+ "maxOutputTokens": 4000
+ }
+ },
+ {
+ "modelRef": "openai::2024-02-01::cody-chat-preview-001",
+ "displayName": "OpenAI o1-preview",
+ "modelName": "cody-chat-preview-001",
+ "capabilities": ["chat"],
+ "category": "accuracy",
+ "status": "waitlist",
+ "contextWindow": {
+ "maxInputTokens": 45000,
+ "maxOutputTokens": 4000
+ }
+ },
+ {
+ "modelRef": "openai::2024-02-01::cody-chat-preview-002",
+ "displayName": "OpenAI o1-mini",
+ "modelName": "cody-chat-preview-002",
+ "capabilities": ["chat"],
+ "category": "accuracy",
+ "status": "waitlist",
+ "contextWindow": {
+ "maxInputTokens": 45000,
+ "maxOutputTokens": 4000
+ }
+ }
+],
+"defaultModels": {
+ "chat": "anthropic::2024-10-22::claude-3-5-sonnet-latest",
+ "fastChat": "anthropic::2023-06-01::claude-3-haiku",
+ "codeCompletion": "fireworks::v1::deepseek-coder-v2-lite-base"
+}
+```
diff --git a/src/data/navigation.ts b/src/data/navigation.ts
index e49e911ba..6defd654a 100644
--- a/src/data/navigation.ts
+++ b/src/data/navigation.ts
@@ -38,8 +38,15 @@ export const navigation: NavigationItem[] = [
{ title: "Cody for JetBrains", href: "/cody/clients/install-jetbrains", },
{ title: "Cody for Visual Studio", href: "/cody/clients/install-visual-studio", },
{ title: "Cody for Web", href: "/cody/clients/cody-with-sourcegraph", },
- { title: "Cody for Enterprise", href: "/cody/clients/enable-cody-enterprise", },
- { title: "Model Configuration", href: "/cody/clients/model-configuration", },
+ ]
+ },
+ {
+ title: "Cody for Enterprise", href: "/cody/clients/enable-cody-enterprise",
+ subsections: [
+ { title: "Features", href: "/cody/enterprise/features", },
+ { title: "Completions Configuration", href: "/cody/enterprise/completions-configuration", },
+ { title: "Model Configuration", href: "/cody/enterprise/model-configuration", },
+ { title: "modelConfiguration examples", href: "/cody/enterprise/model-config-examples", },
]
},
{