Commit c6cad9f

Merge pull request #617 from ssalgadodev/patch-13
Update how-to-deploy-models-llama.md
2 parents d4aaed2 + 42fa8bd commit c6cad9f


articles/machine-learning/how-to-deploy-models-llama.md

Lines changed: 49 additions & 33 deletions
@@ -1,7 +1,7 @@
 ---
-title: How to deploy Meta Llama 3.1 models with Azure Machine Learning studio
+title: How to use the Meta Llama family of models with Azure Machine Learning studio
 titleSuffix: Azure Machine Learning
-description: Learn how to deploy Meta Llama 3.1 models with Azure Machine Learning studio.
+description: How to use the Meta Llama family of models with Azure Machine Learning studio.
 manager: scottpolly
 ms.service: azure-machine-learning
 ms.subservice: inferencing
@@ -17,61 +17,77 @@ ms.custom: references_regions, build-2024
 ---
 
 
-# How to deploy Meta Llama 3.1 models with Azure Machine Learning studio
+# How to use the Meta Llama family of models with Azure Machine Learning studio
 
-In this article, you learn about the Meta Llama family of models (LLMs). You also learn how to use Azure Machine Learning studio to deploy models from this set either to serverless APIs with pay-as-you-go billing or to managed compute.
+In this article, you learn about the Meta Llama family of models (LLMs). Meta Llama models and tools are a collection of pretrained and fine-tuned generative AI text and image reasoning models, ranging in scale from SLMs (1B and 3B Base and Instruct models) for on-device and edge inferencing, to mid-size LLMs (7B, 8B, and 70B Base and Instruct models), and high-performance models like Meta Llama 3.1 405B Instruct for synthetic data generation and distillation use cases.
 
-> [!IMPORTANT]
-> Read more about the announcement of Meta Llama 3.1 405B Instruct and other Llama 3.1 models available now on the Azure AI Model Catalog: [Microsoft Tech Community Blog](https://aka.ms/meta-llama-3.1-release-on-azure) and the [Meta Announcement Blog](https://aka.ms/meta-llama-3.1-release-announcement).
-
-Now available on Azure Machine Learning studio Models-as-a-Service:
-- `Meta-Llama-3.1-405B-Instruct`
-- `Meta-Llama-3.1-70B-Instruct`
-- `Meta-Llama-3.1-8B-Instruct`
-
-The Meta Llama 3.1 family of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes (text in/text out). All models support a long context length (128k) and are optimized for inference with support for grouped query attention (GQA). The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.
+> [!TIP]
+> See our announcements of Meta's Llama 3.2 family of models, available now on the Azure AI Model Catalog, in [Meta's blog](https://aka.ms/llama-3.2-meta-announcement) and the [Microsoft Tech Community Blog](https://aka.ms/llama-3.2-microsoft-announcement).
 
 See the following GitHub samples to explore integrations with [LangChain](https://aka.ms/meta-llama-3.1-405B-instruct-langchain), [LiteLLM](https://aka.ms/meta-llama-3.1-405B-instruct-litellm), [OpenAI](https://aka.ms/meta-llama-3.1-405B-instruct-openai), and the [Azure API](https://aka.ms/meta-llama-3.1-405B-instruct-webrequests).
 
 [!INCLUDE [machine-learning-preview-generic-disclaimer](includes/machine-learning-preview-generic-disclaimer.md)]
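As an aside on the Azure API sample linked above: the snippet below is a minimal sketch (not the linked sample itself) of what a raw web request to a deployed Meta Llama serverless endpoint looks like. The endpoint URL, key, and the `/chat/completions` route are placeholders; substitute the values shown on your own deployment's details page.

```python
# Minimal sketch: call a deployed Meta Llama serverless API endpoint over HTTPS.
# The URL, key, and route are placeholders for your own deployment's values.
import os

import requests

endpoint_url = os.environ["LLAMA_ENDPOINT_URL"]  # e.g. https://<endpoint-name>.<region>.models.ai.azure.com
endpoint_key = os.environ["LLAMA_ENDPOINT_KEY"]  # key from the endpoint's consume/details page

payload = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize Azure Machine Learning in one sentence."},
    ],
    "max_tokens": 256,
    "temperature": 0.7,
}

response = requests.post(
    f"{endpoint_url}/chat/completions",  # assumed route; check your endpoint's documentation
    headers={"Authorization": f"Bearer {endpoint_key}", "Content-Type": "application/json"},
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

The LangChain, LiteLLM, and OpenAI samples wrap this same chat-completions exchange behind their own client objects.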

-## Deploy Meta Llama 3.1 405B Instruct as a serverless API
+## Meta Llama family of models
 
-Meta Llama 3.1 models - like `Meta Llama 3.1 405B Instruct` - can be deployed as a serverless API with pay-as-you-go billing, providing a way to consume them as an API without hosting them on your subscription while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription. Meta Llama 3.1 models are deployed as a serverless API with pay-as-you-go billing through Microsoft Azure Marketplace, which might add more terms of use and pricing.
+The Meta Llama family of models includes the following models, shown in the tabs below:
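The paragraph removed above describes the serverless API (pay-as-you-go) deployment option. For context, here is a rough sketch of scripting that option with the `azure-ai-ml` Python SDK instead of the studio UI; it assumes a recent SDK version that exposes serverless endpoints, and the workspace details, endpoint name, and model ID are placeholders.

```python
# Hedged sketch (assumes azure-ai-ml >= 1.16): create a pay-as-you-go serverless endpoint
# for a Meta Llama model from the azureml-meta registry. All names are placeholders.
from azure.ai.ml import MLClient
from azure.ai.ml.entities import MarketplaceSubscription, ServerlessEndpoint
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

model_id = "azureml://registries/azureml-meta/models/Meta-Llama-3.1-405B-Instruct"

# Subscribe to the Azure Marketplace offer that backs the model (pay-as-you-go billing).
subscription = MarketplaceSubscription(model_id=model_id, name="meta-llama-3-1-405b-instruct")
ml_client.marketplace_subscriptions.begin_create_or_update(subscription).result()

# Create the serverless endpoint that serves the model.
endpoint = ServerlessEndpoint(name="meta-llama-31-405b-ep", model_id=model_id)
endpoint = ml_client.serverless_endpoints.begin_create_or_update(endpoint).result()

keys = ml_client.serverless_endpoints.get_keys(endpoint.name)
print(endpoint.scoring_uri, keys.primary_key)
```

As the removed paragraph notes, this option doesn't draw quota from your subscription, but the Marketplace offer can carry its own terms of use and pricing.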

-### Azure Marketplace model offerings
+# [Llama-3.2](#tab/python-llama-3-2)
 
-The following models are available in Azure Marketplace for Llama 3.1 and Llama 3 when deployed as a service with pay-as-you-go billing:
+The Llama 3.2 collection of SLMs and image reasoning models is now available. Coming soon, Llama 3.2 11B Vision Instruct and Llama 3.2 90B Vision Instruct will be available as serverless API endpoints via Models-as-a-Service. Starting today, the following models are available for deployment via managed compute:
+* Llama 3.2 1B
+* Llama 3.2 3B
+* Llama 3.2 1B Instruct
+* Llama 3.2 3B Instruct
+* Llama Guard 3 1B
+* Llama Guard 3 11B Vision
+* Llama 3.2 11B Vision Instruct
+* Llama 3.2 90B Vision Instruct

-# [Meta Llama 3.1](#tab/llama-three)
+# [Meta Llama-3.1](#tab/python-meta-llama-3-1)
 
-* [Meta-Llama-3.1-405B-Instruct (preview)](https://aka.ms/aistudio/landing/meta-llama-3-405B-base)
-* [Meta-Llama-3.1-70B-Instruct (preview)](https://aka.ms/aistudio/landing/meta-llama-3-8B-refresh)
-* [Meta Llama-3.1-8B-Instruct (preview)](https://aka.ms/aistudio/landing/meta-llama-3-70B-refresh)
-* [Meta-Llama-3-70B-Instruct (preview)](https://aka.ms/aistudio/landing/meta-llama-3-70b-chat)
-* [Meta-Llama-3-8B-Instruct (preview)](https://aka.ms/aistudio/landing/meta-llama-3-8b-chat)
+The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 8B, 70B, and 405B sizes (text in/text out). The Llama 3.1 instruction-tuned, text-only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open-source and closed models on common industry benchmarks.
 
-If you need to deploy a different model, [deploy it to managed compute](#deploy-meta-llama-models-to-managed-compute) instead.
 
-# [Meta Llama 2](#tab/llama-two)
+The following models are available:
+
+* [Meta-Llama-3.1-405B-Instruct](https://aka.ms/azureai/landing/Meta-Llama-3.1-405B-Instruct)
+* [Meta-Llama-3.1-70B-Instruct](https://ai.azure.com/explore/models/Meta-Llama-3.1-70B-Instruct/version/1/registry/azureml-meta)
+* [Meta-Llama-3.1-8B-Instruct](https://ai.azure.com/explore/models/Meta-Llama-3.1-8B-Instruct/version/1/registry/azureml-meta)
+
+
+# [Meta Llama-3](#tab/python-meta-llama-3)
+
+Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety.
+
 
-* Meta Llama-2-7B (preview)
-* Meta Llama 2 7B-Chat (preview)
-* Meta Llama-2-13B (preview)
-* Meta Llama 2 13B-Chat (preview)
-* Meta Llama-2-70B (preview)
-* Meta Llama 2 70B-Chat (preview)
+The following models are available:
+
+* [Meta-Llama-3-70B-Instruct](https://ai.azure.com/explore/models/Meta-Llama-3-70B-Instruct/version/6/registry/azureml-meta)
+* [Meta-Llama-3-8B-Instruct](https://ai.azure.com/explore/models/Meta-Llama-3-8B-Instruct/version/6/registry/azureml-meta)
+
+
+# [Meta Llama-2](#tab/python-meta-llama-2)
+
+Meta has developed and publicly released the Llama 2 family of large language models (LLMs), a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama-2-Chat, are optimized for dialogue use cases. Llama-2-Chat models outperform open-source chat models on most benchmarks we tested, and in our human evaluations for helpfulness and safety, they are on par with some popular closed-source models like ChatGPT and PaLM. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama-2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs.
+
+
+The following models are available:
+
+* [Llama-2-70b-chat](https://ai.azure.com/explore/models/Llama-2-70b-chat/version/20/registry/azureml-meta)
+* [Llama-2-13b-chat](https://ai.azure.com/explore/models/Llama-2-13b-chat/version/20/registry/azureml-meta)
+* [Llama-2-7b-chat](https://ai.azure.com/explore/models/Llama-2-7b-chat/version/24/registry/azureml-meta)
 
-If you need to deploy a different model, [deploy it to managed compute](#deploy-meta-llama-models-to-managed-compute) instead.
 
 ---
 
+
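Several places above point to managed compute as the deployment path (the Llama 3.2 text and Guard models, and the removed "deploy it to managed compute" links). As a rough illustration only, the sketch below uses the `azure-ai-ml` SDK to host a model from the azureml-meta registry on a managed online endpoint; the endpoint name, model version, and GPU SKU are placeholders, and you need quota for whichever instance type you choose.

```python
# Hedged sketch: deploy a Llama model from the azureml-meta registry to managed compute
# (a managed online endpoint). Names, versions, and the VM SKU are placeholders.
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineDeployment, ManagedOnlineEndpoint
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

# Create the endpoint, then attach a deployment that hosts the model.
endpoint = ManagedOnlineEndpoint(name="llama-3-2-3b-instruct-ep", auth_mode="key")
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

deployment = ManagedOnlineDeployment(
    name="default",
    endpoint_name=endpoint.name,
    model="azureml://registries/azureml-meta/models/Llama-3.2-3B-Instruct/versions/1",  # placeholder version
    instance_type="Standard_NC24ads_A100_v4",  # pick a GPU SKU you have quota for
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(deployment).result()

# Send all traffic to the new deployment.
endpoint.traffic = {"default": 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()
```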
 ### Prerequisites
 
 # [Meta Llama 3](#tab/llama-three)
 
 - An Azure subscription with a valid payment method. Free or trial Azure subscriptions won't work. If you don't have an Azure subscription, create a [paid Azure account](https://azure.microsoft.com/pricing/purchase-options/pay-as-you-go) to begin.
-- An Azure Machine Learning workspace and a compute instance. If you don't have these, use the steps in the [Quickstart: Create workspace resources](quickstart-create-resources.md) article to create them. The serverless API model deployment offering for Meta Llama 3.1 and Llama 3 is only available with hubs created in these regions:
+- An Azure Machine Learning workspace and a compute instance. If you don't have these, use the steps in the [Quickstart: Create workspace resources](quickstart-create-resources.md) article to create them. The serverless API model deployment offering for Meta Llama 3.1 and Llama 3 is only available with workspaces created in these regions:
 
 * East US
 * East US 2
