articles/machine-learning/how-to-deploy-models-cohere-command.md (2 additions, 2 deletions)
@@ -112,7 +112,7 @@ To create a deployment:
 1. Select the endpoint to open its Details page.
 1. Select the **Test** tab to start interacting with the model.
 1. You can always find the endpoint's details, URL, and access keys by navigating to **Workspace** > **Endpoints** > **Serverless endpoints**.
-2. Take note of the **Target** URL and the **Secret Key**. For more information on using the APIs, see the [reference](#reference-for-cohere-models-deployed-as-a-service) section.
+2. Take note of the **Target** URL and the **Secret Key**. For more information on using the APIs, see the [reference](#reference-for-cohere-models-deployed-as-a-serverless-api) section.


 To learn about billing for models deployed with pay-as-you-go, see [Cost and quota considerations for Cohere models deployed as a service](#cost-and-quota-considerations-for-models-deployed-as-a-service).
@@ -125,7 +125,7 @@ The previously mentioned Cohere models can be consumed using the chat API.
 1. Copy the **Target** URL and the **Key** token values.
 2. Cohere exposes two routes for inference with the Command R and Command R+ models. The [Azure AI Model Inference API](reference-model-inference-api.md) on the route `/chat/completions` and the native [Cohere API](#cohere-chat-api).

-For more information on using the APIs, see the [reference](#reference-for-cohere-models-deployed-as-a-service) section.
+For more information on using the APIs, see the [reference](#reference-for-cohere-models-deployed-as-a-serverless-api) section.

 ## Reference for Cohere models deployed as a serverless API
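The two chat routes mentioned in this change can be exercised with a plain HTTP POST. The sketch below only assembles the request pieces; the endpoint URL and key are placeholders for the **Target** URL and **Secret Key** copied from the portal, and the payload fields follow the common chat-completions shape rather than anything stated in this diff:

```python
import json

# Placeholder values; substitute the Target URL and Secret Key copied
# from the Serverless endpoints page.
ENDPOINT_URL = "https://<your-endpoint>.<region>.inference.ai.azure.com"
API_KEY = "<your-secret-key>"

def build_chat_request(endpoint_url: str, api_key: str, user_message: str):
    """Assemble url, headers, and body for a POST to the Azure AI Model
    Inference API route /chat/completions."""
    url = f"{endpoint_url}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": 128,
    })
    return url, headers, body

url, headers, body = build_chat_request(ENDPOINT_URL, API_KEY, "Say hello.")
# To send it against a live endpoint:
#   import requests
#   r = requests.post(url, headers=headers, data=body)
#   print(r.json())
print(url)
```

The native Cohere route would be built the same way with a different path and payload schema; check both against the reference section before relying on field names.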
articles/machine-learning/how-to-deploy-models-cohere-embed.md (2 additions, 2 deletions)
@@ -91,7 +91,7 @@ To create a deployment:
 1. Select the endpoint to open its Details page.
 1. Select the **Test** tab to start interacting with the model.
 1. You can always find the endpoint's details, URL, and access keys by navigating to **Workspace** > **Endpoints** > **Serverless endpoints**.
-1. Take note of the **Target** URL and the **Secret Key**. For more information on using the APIs, see the [reference](#embed-api-reference-for-cohere-embed-models-deployed-as-a-service) section.
+1. Take note of the **Target** URL and the **Secret Key**. For more information on using the APIs, see the [reference](#embed-api-reference-for-cohere-embed-models-deployed-as-a-serverless-api) section.


 To learn about billing for models deployed with pay-as-you-go, see [Cost and quota considerations for Cohere models deployed as a service](#cost-and-quota-considerations-for-models-deployed-as-a-service).
@@ -104,7 +104,7 @@ The previously mentioned Cohere models can be consumed using the chat API.
 1. Copy the **Target** URL and the **Key** token values.
 1. Cohere exposes two routes for inference with the Embed v3 - English and Embed v3 - Multilingual models. `v1/embeddings` adheres to the Azure AI Generative Messages API schema, and `v1/embed` supports Cohere's native API schema.

-For more information on using the APIs, see the [reference](#embed-api-reference-for-cohere-embed-models-deployed-as-a-service) section.
+For more information on using the APIs, see the [reference](#embed-api-reference-for-cohere-embed-models-deployed-as-a-serverless-api) section.

 ## Embed API reference for Cohere Embed models deployed as a serverless API
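The two embedding routes described in this change differ only in path and payload schema, so route selection can be sketched as a small helper. The endpoint and key are placeholders, and the payload field names (`input` for the Azure schema, `texts` for Cohere's native schema) are assumptions to verify against the reference section:

```python
import json

# Placeholder values; substitute the Target URL and Key from the portal.
ENDPOINT_URL = "https://<your-endpoint>.<region>.inference.ai.azure.com"
API_KEY = "<your-secret-key>"

def build_embed_request(endpoint_url, api_key, texts, native=False):
    """Assemble a POST for one of the two embedding routes.

    native=False -> /v1/embeddings (Azure AI Generative Messages schema)
    native=True  -> /v1/embed (Cohere's native schema)
    The field names "input" and "texts" are schema assumptions.
    """
    route = "/v1/embed" if native else "/v1/embeddings"
    body = {"texts": texts} if native else {"input": texts}
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return endpoint_url + route, headers, json.dumps(body)

url, headers, body = build_embed_request(ENDPOINT_URL, API_KEY, ["hello world"])
print(url)
```

Sending either request (for example with `requests.post(url, headers=headers, data=body)`) returns the embedding vectors in the shape of the chosen schema.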
-If you need to deploy a different model, [deploy it to real-time endpoints](#deploy-meta-llama-models-to-real-time-endpoints) instead.
+If you need to deploy a different model, [deploy it to managed compute](#deploy-meta-llama-models-to-managed-compute) instead.

 # [Meta Llama 2](#tab/llama-two)
@@ -54,7 +54,7 @@ If you need to deploy a different model, [deploy it to real-time endpoints](#dep
 * Meta Llama-2-70B (preview)
 * Meta Llama 2 70B-Chat (preview)

-If you need to deploy a different model, [deploy it to managed compute](#deploy-meta-llama-models-to-real-time-endpoints) instead.
+If you need to deploy a different model, [deploy it to managed compute](#deploy-meta-llama-models-to-managed-compute) instead.

 ---
@@ -199,7 +199,7 @@ Models deployed as a service can be consumed using either the chat or the comple
 - For completions models, such as `Llama-3-8B`, use the [`<target_url>/v1/completions`](#completions-api) API.
 - For chat models, such as `Llama-3-8B-Instruct`, use the [`<target_url>/v1/chat/completions`](#chat-api) API.

-For more information on using the APIs, see the [reference](#reference-for-meta-llama-models-deployed-as-a-service) section.
+For more information on using the APIs, see the [reference](#reference-for-meta-llama-models-deployed-as-a-serverless-api) section.

 # [Meta Llama 2](#tab/llama-two)
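The completions-versus-chat route rule for the Meta Llama 3 models can be expressed as a small helper. The `-Instruct` suffix test is a naming-convention assumption based only on the two model names this change cites:

```python
def pick_route(model_name: str) -> str:
    """Return the inference route for a Meta Llama 3 serverless deployment.

    Chat models such as Llama-3-8B-Instruct use /v1/chat/completions;
    completions models such as Llama-3-8B use /v1/completions.
    The '-Instruct' suffix check is an assumed naming convention.
    """
    if model_name.endswith("-Instruct"):
        return "/v1/chat/completions"
    return "/v1/completions"

print(pick_route("Llama-3-8B"))           # /v1/completions
print(pick_route("Llama-3-8B-Instruct"))  # /v1/chat/completions
```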
@@ -211,7 +211,7 @@ Models deployed as a service can be consumed using either the chat or the comple
 - For completions models, such as `Meta-Llama-2-7B`, use the [`/v1/completions`](#completions-api) API or the [Azure AI Model Inference API](reference-model-inference-api.md) on the route `/completions`.
 - For chat models, such as `Meta-Llama-2-7B-Chat`, use the [`/v1/chat/completions`](#chat-api) API or the [Azure AI Model Inference API](reference-model-inference-api.md) on the route `/chat/completions`.

-For more information on using the APIs, see the [reference](#reference-for-meta-llama-models-deployed-as-a-service) section.
+For more information on using the APIs, see the [reference](#reference-for-meta-llama-models-deployed-as-a-serverless-api) section.