You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/ai-studio/how-to/deploy-models-cohere-command.md
+36-37Lines changed: 36 additions & 37 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -77,9 +77,7 @@ Above mentioned Cohere models can be deployed as a service with pay-as-you-go, a
77
77
> For Cohere family models, the pay-as-you-go model deployment offering is only available with AI hubs created in EastUS, EastUS2 or Sweden Central regions.
78
78
79
79
- An [Azure AI project](../how-to/create-projects.md) in Azure AI Studio.
80
-
- Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Studio. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group.
81
-
82
-
For more information on permissions, see [Role-based access control in Azure AI Studio](../concepts/rbac-ai-studio.md).
80
+
- Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure AI Studio. To perform the steps in this article, your user account must be assigned the __Azure AI Developer role__ on the resource group. For more information on permissions, see [Role-based access control in Azure AI Studio](../concepts/rbac-ai-studio.md).
83
81
84
82
85
83
### Create a new deployment
@@ -129,20 +127,21 @@ These models can be consumed using the chat API.
129
127
130
128
1. Cohere exposes two routes for inference with the Command R and Command R+ models. `v1/chat/completions` adheres to the Azure AI Generative Messages API schema, and `v1/chat` supports Cohere's native API schema.
131
129
132
-
For more information on using the APIs, see the [reference](#chat-api-reference-for-cohere-models-deployed-as-a-service) section.
130
+
For more information on using the APIs, see the [reference](#chat-api-reference-for-cohere-models-deployed-as-a-service) section.
133
131
134
132
## Chat API reference for Cohere models deployed as a service
135
133
136
-
## v1/chat/completions
137
-
### Request
134
+
### v1/chat/completions
135
+
136
+
#### Request
138
137
```
139
138
POST /v1/chat/completions HTTP/1.1
140
139
Host: <DEPLOYMENT_URI>
141
140
Authorization: Bearer <TOKEN>
142
141
Content-type: application/json
143
142
```
144
143
145
-
### v1/chat/completions request schema
144
+
####v1/chat/completions request schema
146
145
147
146
Cohere Command R and Command R+ accept the following parameters for a `v1/chat/completions` response inference call:
148
147
@@ -162,15 +161,15 @@ Cohere Command R and Command R+ accept the following parameters for a `v1/chat/c
162
161
`response_format` and `tool_choice` aren't yet supported parameters for the Command R and Command R+ models.
163
162
164
163
165
-
#### System or user message
164
+
166
165
A System or User Message supports the following properties:
167
166
168
167
| Property | Type | Default | Description |
169
168
| --- | --- | --- | --- |
170
169
|`role`|`enum`| Required |`role=system` or `role=user`. |
171
170
|`content`|`string`|Required |Text input for the model to respond to. |
172
171
173
-
#### Assistant message
172
+
174
173
An Assistant Message supports the following properties:
175
174
176
175
| Property | Type | Default | Description |
@@ -179,7 +178,7 @@ An Assistant Message supports the following properties:
179
178
|`content`|`string`|Required |The contents of the assistant message. |
180
179
|`tool_calls`|`array`|None |The tool calls generated by the model, such as function calls. |
181
180
182
-
#### Tool message
181
+
183
182
A Tool Message supports the following properties:
184
183
185
184
| Property | Type | Default | Description |
@@ -189,7 +188,7 @@ A Tool Message supports the following properties:
189
188
|`tool_call_id`|`string`|None |Tool call that this message is responding to. |
190
189
191
190
192
-
### v1/chat/completions response schema
191
+
####v1/chat/completions response schema
193
192
194
193
The response payload is a dictionary with the following fields:
195
194
@@ -219,9 +218,9 @@ The `usage` object is a dictionary with the following fields:
219
218
|`total_tokens`|`integer`| Total tokens. |
220
219
221
220
222
-
### Examples
221
+
####Examples
223
222
224
-
**Request**
223
+
Request:
225
224
226
225
```json
227
226
"messages": [
@@ -250,7 +249,7 @@ The `usage` object is a dictionary with the following fields:
250
249
]
251
250
```
252
251
253
-
**Response**
252
+
Response:
254
253
255
254
```json
256
255
{
@@ -276,8 +275,8 @@ The `usage` object is a dictionary with the following fields:
276
275
}
277
276
```
278
277
279
-
## v1/chat
280
-
## Request
278
+
###v1/chat
279
+
####Request
281
280
282
281
```
283
282
POST /v1/chat HTTP/1.1
@@ -286,7 +285,7 @@ The `usage` object is a dictionary with the following fields:
286
285
Content-type: application/json
287
286
```
288
287
289
-
### v1/chat request schema
288
+
####v1/chat request schema
290
289
291
290
Cohere Command R and Command R+ accept the following parameters for a `v1/chat` response inference call:
292
291
@@ -324,7 +323,7 @@ The `documents` object has the following optional fields:
324
323
|`id`|`string`|`None`|Can be supplied to identify the document in the citations. This field isn't passed to the model. |
325
324
|`_excludes`|`array of strings`|`None`| Can be optionally supplied to omit some key-value pairs from being shown to the model. The omitted fields still show up in the citation object. The `_excludes` field isn't passed to the model. |
326
325
327
-
### v1/chat response schema
326
+
####v1/chat response schema
328
327
329
328
Response fields are fully documented on [Cohere's Chat API reference](https://docs.cohere.com/reference/chat). The response object always contains:
330
329
@@ -339,7 +338,7 @@ Response fields are fully documented on [Cohere's Chat API reference](https://do
339
338
340
339
<br/>
341
340
342
-
### Documents
341
+
####Documents
343
342
If `documents` are specified in the request, there are two other fields in the response:
344
343
345
344
|Key |Type |Description |
@@ -356,7 +355,7 @@ If `documents` are specified in the request, there are two other fields in the r
356
355
|`text`|`string`|The text of the citation. For example, a generation of `Hello, world!` with a citation of `world` would have a text value of `world`. |
357
356
|`document_ids`|`array of strings`|Identifiers of documents cited by this section of the generated reply. |
358
357
359
-
### Tools
358
+
####Tools
360
359
If `tools` are specified and invoked by the model, there's another field in the response:
361
360
362
361
|Key |Type |Description |
@@ -370,7 +369,7 @@ If `tools` are specified and invoked by the model, there's another field in the
370
369
|`name`|`string`|Name of the tool to call. |
371
370
|`parameters`|`object`|The name and value of the parameters to use when invoking a tool. |
372
371
373
-
### Search_queries_only
372
+
####Search_queries_only
374
373
If `search_queries_only=TRUE` is specified in the request, there are two other fields in the response:
375
374
376
375
|Key |Type |Description |
@@ -385,12 +384,12 @@ If `search_queries_only=TRUE` is specified in the request, there are two other f
385
384
|`text`|`string`|The text of the search query. |
386
385
|`generation_id`|`string`|Unique identifier for the generated search query. Useful for submitting feedback. |
387
386
388
-
### Examples
387
+
####Examples
389
388
390
-
### Chat - Completions
389
+
#####Chat - Completions
391
390
The following example is a sample request call to get chat completions from the Cohere Command model. Use when generating a chat completion.
392
391
393
-
**Request**
392
+
Request:
394
393
395
394
```json
396
395
{
@@ -402,7 +401,7 @@ The following example is a sample request call to get chat completions from the
402
401
}
403
402
```
404
403
405
-
**Response**
404
+
Response:
406
405
407
406
```json
408
407
{
@@ -428,11 +427,11 @@ The following example is a sample request call to get chat completions from the
428
427
}
429
428
```
430
429
431
-
### Chat - Grounded generation and RAG capabilities
430
+
#####Chat - Grounded generation and RAG capabilities
432
431
433
432
Command R and Command R+ are trained for RAG via a mixture of supervised fine-tuning and preference fine-tuning, using a specific prompt template. We introduce that prompt template via the `documents` parameter. The document snippets should be chunks, rather than long documents, typically around 100-400 words per chunk. Document snippets consist of key-value pairs. The keys should be short descriptive strings. The values can be text or semi-structured.
434
433
435
-
**Request**
434
+
Request:
436
435
437
436
```json
438
437
{
@@ -450,7 +449,7 @@ Command R and Command R+ are trained for RAG via a mixture of supervised fine-tu
450
449
}
451
450
```
452
451
453
-
**Response**
452
+
Response:
454
453
455
454
```json
456
455
{
@@ -506,11 +505,11 @@ Command R and Command R+ are trained for RAG via a mixture of supervised fine-tu
506
505
}
507
506
```
508
507
509
-
### Chat - Tool Use
508
+
#####Chat - Tool Use
510
509
511
510
If invoking tools or generating a response based on tool results, use the following parameters.
512
511
513
-
**Request**
512
+
Request:
514
513
515
514
```json
516
515
{
@@ -569,7 +568,7 @@ If invoking tools or generating a response based on tool results, use the follow
569
568
}
570
569
```
571
570
572
-
**Response**
571
+
Response:
573
572
574
573
```json
575
574
{
@@ -634,7 +633,7 @@ If invoking tools or generating a response based on tool results, use the follow
634
633
635
634
Once you run your function and received tool outputs, you can pass them back to the model to generate a response for the user.
636
635
637
-
**Request**
636
+
Request:
638
637
639
638
```json
640
639
{
@@ -693,7 +692,7 @@ Once you run your function and received tool outputs, you can pass them back to
693
692
}
694
693
```
695
694
696
-
**Response**
695
+
Response:
697
696
698
697
```json
699
698
{
@@ -756,11 +755,11 @@ Once you run your function and received tool outputs, you can pass them back to
756
755
}
757
756
```
758
757
759
-
### Chat - Search queries
758
+
#####Chat - Search queries
760
759
If you're building a RAG agent, you can also use Cohere's Chat API to get search queries from Command. Specify `search_queries_only=TRUE` in your request.
761
760
762
761
763
-
**Request**
762
+
Request:
764
763
765
764
```json
766
765
{
@@ -769,7 +768,7 @@ If you're building a RAG agent, you can also use Cohere's Chat API to get search
769
768
}
770
769
```
771
770
772
-
**Response**
771
+
Response:
773
772
774
773
```json
775
774
{
@@ -791,7 +790,7 @@ If you're building a RAG agent, you can also use Cohere's Chat API to get search
0 commit comments