Commit 8462c64

Update deploy-models-llama.md
1 parent 9f9abf3 commit 8462c64

1 file changed: +20 -0 lines changed

articles/ai-studio/how-to/deploy-models-llama.md

Lines changed: 20 additions & 0 deletions
@@ -524,6 +524,26 @@ Follow these steps to deploy a model such as `Llama-2-7b-chat` to a real-time en
For details on how to invoke Llama models deployed to managed compute, see the model's card in the Azure AI Studio [model catalog](../how-to/model-catalog-overview.md). Each model's card has an overview page that includes a description of the model and samples for code-based inferencing, fine-tuning, and model evaluation.
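
As a rough illustration of what code-based inferencing against a managed compute deployment can look like, the following sketch builds an HTTP scoring request using only the Python standard library. The endpoint URL, the key, the helper names `build_scoring_request` and `invoke`, and the `input_data` payload shape are all assumptions made for illustration; the sample on the model's card remains the authoritative reference for the exact request schema.

```python
import json
import urllib.request


def build_scoring_request(
    endpoint_url: str,
    api_key: str,
    messages: list,
    max_new_tokens: int = 128,
    temperature: float = 0.6,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a chat-style scoring endpoint."""
    # Assumed payload shape; confirm the field names against the model card sample.
    payload = {
        "input_data": {
            "input_string": messages,
            "parameters": {
                "max_new_tokens": max_new_tokens,
                "temperature": temperature,
            },
        }
    }
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Key-based authentication passes the endpoint key as a bearer token.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def invoke(request: urllib.request.Request, timeout: float = 30.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

To try it, pass the scoring URL and key from your own deployment to `build_scoring_request`, along with a list of `{"role": ..., "content": ...}` messages, and hand the result to `invoke`. Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.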

##### More inference examples

| **Package** | **Sample Notebook** |
|----------------|----------------------------------------|
| CLI using cURL and Python web requests - Command R | [command-r.ipynb](https://aka.ms/samples/cohere-command-r/webrequests)|
| CLI using cURL and Python web requests - Command R+ | [command-r-plus.ipynb](https://aka.ms/samples/cohere-command-r-plus/webrequests)|
| OpenAI SDK (experimental) | [openaisdk.ipynb](https://aka.ms/samples/cohere-command/openaisdk) |
| LangChain | [langchain.ipynb](https://aka.ms/samples/cohere/langchain) |
| Cohere SDK | [cohere-sdk.ipynb](https://aka.ms/samples/cohere-python-sdk) |
| LiteLLM SDK | [litellm.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/litellm.ipynb) |

##### Retrieval Augmented Generation (RAG) and tool use samples
**Description** | **Package** | **Sample Notebook**
--|--|--
Create a local Facebook AI Similarity Search (FAISS) vector index, using Cohere embeddings - LangChain|`langchain`, `langchain_cohere`|[cohere_faiss_langchain_embed.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/cohere_faiss_langchain_embed.ipynb)
Use Cohere Command R/R+ to answer questions from data in a local FAISS vector index - LangChain|`langchain`, `langchain_cohere`|[command_faiss_langchain.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/command_faiss_langchain.ipynb)
Use Cohere Command R/R+ to answer questions from data in an Azure AI Search vector index - LangChain|`langchain`, `langchain_cohere`|[cohere-aisearch-langchain-rag.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/cohere-aisearch-langchain-rag.ipynb)
Use Cohere Command R/R+ to answer questions from data in an Azure AI Search vector index - Cohere SDK| `cohere`, `azure_search_documents`|[cohere-aisearch-rag.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/cohere-aisearch-rag.ipynb)
Command R+ tool/function calling, using LangChain|`cohere`, `langchain`, `langchain_cohere`|[command_tools-langchain.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/command_tools-langchain.ipynb)

## Cost and quotas

### Cost and quota considerations for Llama models deployed as a service
