Commit 9f9abf3

Update how-to-deploy-models-llama.md
1 parent 616fb3e commit 9f9abf3

1 file changed, +19 -0 lines changed

articles/machine-learning/how-to-deploy-models-llama.md

Lines changed: 19 additions & 0 deletions
@@ -544,6 +544,25 @@ For more information on how to deploy models to managed compute using the studio
 
 For reference about how to invoke Meta Llama 3 models deployed to real-time endpoints, see the model's card in Azure Machine Learning studio [model catalog](concept-model-catalog.md). Each model's card has an overview page that includes a description of the model, samples for code-based inferencing, fine-tuning, and model evaluation.
 
+#### Additional inference examples
+
+| **Package** | **Sample Notebook** |
+|----------------|----------------------------------------|
+| CLI using CURL and Python web requests | [cohere-embed.ipynb](https://aka.ms/samples/embed-v3/webrequests)|
+| OpenAI SDK (experimental) | [openaisdk.ipynb](https://aka.ms/samples/cohere-embed/openaisdk) |
+| LangChain | [langchain.ipynb](https://aka.ms/samples/cohere-embed/langchain) |
+| Cohere SDK | [cohere-sdk.ipynb](https://aka.ms/samples/cohere-embed/cohere-python-sdk) |
+| LiteLLM SDK | [litellm.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/litellm.ipynb) |
+
+##### Retrieval Augmented Generation (RAG) and tool use samples
+**Description** | **Package** | **Sample Notebook**
+--|--|--
+Create a local Facebook AI Similarity Search (FAISS) vector index, using Cohere embeddings - Langchain|`langchain`, `langchain_cohere`|[cohere_faiss_langchain_embed.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/cohere_faiss_langchain_embed.ipynb)
+Use Cohere Command R/R+ to answer questions from data in local FAISS vector index - Langchain|`langchain`, `langchain_cohere`|[command_faiss_langchain.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/command_faiss_langchain.ipynb)
+Use Cohere Command R/R+ to answer questions from data in AI search vector index - Langchain|`langchain`, `langchain_cohere`|[cohere-aisearch-langchain-rag.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/cohere-aisearch-langchain-rag.ipynb)
+Use Cohere Command R/R+ to answer questions from data in AI search vector index - Cohere SDK| `cohere`, `azure_search_documents`|[cohere-aisearch-rag.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/cohere-aisearch-rag.ipynb)
+Command R+ tool/function calling, using LangChain|`cohere`, `langchain`, `langchain_cohere`|[command_tools-langchain.ipynb](https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/cohere/command_tools-langchain.ipynb)
+
 ## Cost and quotas
 
 ### Cost and quota considerations for Meta Llama models deployed as a serverless API
