description: Learn about small and large language models, including when to use them and how you can onboard them to your AI and machine learning workflows on Azure Kubernetes Service (AKS).
ms.topic: conceptual
ms.date: 06/20/2024
author: schaffererin
ms.author: schaffererin
---
# Concepts - Small and large language models
In this article, you learn about small and large language models, including when to use them and how you can use them with your AI and machine learning workflows on Azure Kubernetes Service (AKS).
## What are language models?
Language models are powerful machine learning models used for natural language processing (NLP) tasks, such as text generation and sentiment analysis. These models represent natural language based on the probability of words or sequences of words occurring in a given context.
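
For a concrete sense of what these tasks look like in code, the following sketch runs text generation and sentiment analysis with pre-trained models through the Hugging Face `transformers` library. The library, the `gpt2` checkpoint, and the example prompts are illustrative choices only and aren't tied to AKS or any specific tooling described in this article.

```python
# Illustrative sketch: assumes the Hugging Face transformers library (and a backend
# such as PyTorch) is installed and that model weights can be downloaded.
from transformers import pipeline

# Text generation: the model continues a prompt based on the probability of the
# next tokens given the preceding context.
generator = pipeline("text-generation", model="gpt2")
result = generator("Kubernetes is an open-source system for", max_new_tokens=20)
print(result[0]["generated_text"])

# Sentiment analysis: the pipeline's default fine-tuned classifier labels the input text.
classifier = pipeline("sentiment-analysis")
print(classifier("Deploying this workload was surprisingly easy."))
```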
*Conventional language models* have been used in supervised settings for research purposes where the models are trained on well-labeled text datasets for specific tasks. *Pre-trained language models* offer an accessible way to get started with AI and have become more widely used in recent years. These models are trained on large-scale text corpora from the internet using deep neural networks and can be fine-tuned on smaller datasets for specific tasks.

The size of a language model is determined by its number of parameters, or *weights*, which determine how the model processes input data and generates output. Parameters are learned during the training process by adjusting the weights within layers of the model to minimize the difference between the model's predictions and the actual data. The more parameters a model has, the more complex and expressive it is, but also the more computationally expensive it is to train and use.

In general, **small language models** have *fewer than 10 billion parameters*, and **large language models** have *more than 10 billion parameters*. For example, the new Microsoft Phi-3 model family has three versions with different sizes: mini (3.8 billion parameters), small (7 billion parameters), and medium (14 billion parameters).
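
If you want to check where a specific open-source model falls on this spectrum, you can count its weights directly. The following sketch assumes PyTorch and the Hugging Face `transformers` library are installed and uses the small `gpt2` checkpoint only because it downloads quickly; substitute the model you're evaluating.

```python
# Illustrative sketch: counts the parameters of a downloaded checkpoint.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Sum the elements of every weight tensor to get the total parameter count.
num_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {num_params / 1e6:.0f} million")

# Apply the rough 10-billion-parameter rule of thumb described in this article.
print("small language model" if num_params < 10_000_000_000 else "large language model")
```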
## When to use small language models
### Advantages
Small language models are a good choice if you want models that are:

* **Faster and more cost-effective to train and run**: They require less data and compute power.
* **Easy to deploy and maintain**: They have smaller storage and memory footprints.
* **Less prone to *overfitting***, which is when a model learns the noise or specific patterns of the training data and fails to generalize to new data.
* **Interpretable and explainable**: They have fewer parameters and components to understand and analyze.

The following table lists some popular, high-performance small language models:

| Model family | Model sizes (Number of parameters) | Software license |
|--|--|--|
## Experiment with small and large language models on AKS
Kubernetes AI Toolchain Operator (KAITO) is an open-source operator that automates small and large language model deployments in Kubernetes clusters. The KAITO add-on for AKS simplifies onboarding and reduces the time-to-inference for open-source models on your AKS clusters. The add-on automatically provisions right-sized GPU nodes and sets up the associated inference server as an endpoint for your chosen model.
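
After a KAITO workspace is ready, the provisioned inference server is reachable through a cluster-internal Kubernetes service. The following sketch is only an illustration of calling that endpoint from a pod in the cluster: the service name `workspace-phi-3-mini`, the `/chat` path, and the request body are assumptions modeled on KAITO's Hugging Face runtime presets, so check the how-to article linked below for the exact endpoint and payload of the preset you deploy.

```python
# Hedged illustration: the service name, path, and payload are assumptions and vary
# by KAITO preset and inference runtime; they are not a documented, fixed API here.
import requests

# KAITO fronts the inference server with a cluster-internal service; this name is hypothetical.
endpoint = "http://workspace-phi-3-mini/chat"

response = requests.post(
    endpoint,
    json={"prompt": "Summarize what a Kubernetes operator does in one sentence."},
    timeout=60,
)
response.raise_for_status()
print(response.json())
```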
For more information, see [Deploy an AI model on AKS with the AI toolchain operator][ai-toolchain-operator].
## Next steps
To learn more about containerized AI and machine learning workloads on AKS, see the following articles:
* [Use KAITO to forecast energy usage with intelligent apps][forecast-energy-usage]
* [Build and deploy data and machine learning pipelines with Flyte on AKS][flyte-aks]