Commit 9ecc139

Merge pull request #51254 from theresa-i/Updates-for-clarity-and-course-alignment

Updates for clarity and course alignment

2 parents 3741dad + 2928dc2

32 files changed: +559 -470 lines
Lines changed: 15 additions & 15 deletions
@@ -1,15 +1,15 @@
-### YamlMime:ModuleUnit
-uid: learn.wwl.introduction-language-models-databricks.introduction
-title: Introduction
-metadata:
-  title: Introduction
-  description: "Introduction"
-  ms.date: 03/20/2025
-  author: wwlpublish
-  ms.author: theresai
-  ms.topic: unit
-azureSandbox: false
-labModal: false
-durationInMinutes: 2
-content: |
-  [!include[](includes/01-introduction.md)]
+### YamlMime:ModuleUnit
+uid: learn.wwl.introduction-language-models-databricks.introduction
+title: Introduction
+metadata:
+  title: Introduction
+  description: "Introduction"
+  ms.date: 07/07/2025
+  author: theresa-i
+  ms.author: theresai
+  ms.topic: unit
+azureSandbox: false
+labModal: false
+durationInMinutes: 2
+content: |
+  [!include[](includes/01-introduction.md)]
Lines changed: 15 additions & 15 deletions
@@ -1,15 +1,15 @@
-### YamlMime:ModuleUnit
-uid: learn.wwl.introduction-language-models-databricks.what-is-generative-ai
-title: Understand Generative AI
-metadata:
-  title: Understand Generative AI
-  description: "Understand Generative AI"
-  ms.date: 03/20/2025
-  author: wwlpublish
-  ms.author: theresai
-  ms.topic: unit
-azureSandbox: false
-labModal: false
-durationInMinutes: 5
-content: |
-  [!include[](includes/02-what-is-generative-ai.md)]
+### YamlMime:ModuleUnit
+uid: learn.wwl.introduction-language-models-databricks.what-is-generative-ai
+title: Understand Generative AI
+metadata:
+  title: Understand Generative AI
+  description: "Understand Generative AI"
+  ms.date: 07/07/2025
+  author: theresa-i
+  ms.author: theresai
+  ms.topic: unit
+azureSandbox: false
+labModal: false
+durationInMinutes: 5
+content: |
+  [!include[](includes/02-what-is-generative-ai.md)]
Lines changed: 15 additions & 15 deletions
@@ -1,15 +1,15 @@
-### YamlMime:ModuleUnit
-uid: learn.wwl.introduction-language-models-databricks.what-are-large-language-models
-title: Understand Large Language Models (LLMs)
-metadata:
-  title: Understand Large Language Models (LLMs)
-  description: "Understand Large Language Models (LLMs)"
-  ms.date: 03/20/2025
-  author: wwlpublish
-  ms.author: theresai
-  ms.topic: unit
-azureSandbox: false
-labModal: false
-durationInMinutes: 5
-content: |
-  [!include[](includes/03-what-are-large-language-models.md)]
+### YamlMime:ModuleUnit
+uid: learn.wwl.introduction-language-models-databricks.what-are-large-language-models
+title: Understand Large Language Models (LLMs)
+metadata:
+  title: Understand Large Language Models (LLMs)
+  description: "Understand Large Language Models (LLMs)"
+  ms.date: 07/07/2025
+  author: theresa-i
+  ms.author: theresai
+  ms.topic: unit
+azureSandbox: false
+labModal: false
+durationInMinutes: 5
+content: |
+  [!include[](includes/03-what-are-large-language-models.md)]
Lines changed: 15 additions & 15 deletions
@@ -1,15 +1,15 @@
-### YamlMime:ModuleUnit
-uid: learn.wwl.introduction-language-models-databricks.key-components-llms
-title: Identify key components of LLM applications
-metadata:
-  title: Identify key components of LLM applications
-  description: "Identify key components of LLM applications"
-  ms.date: 03/20/2025
-  author: wwlpublish
-  ms.author: theresai
-  ms.topic: unit
-azureSandbox: false
-labModal: false
-durationInMinutes: 9
-content: |
-  [!include[](includes/04-key-components-llms.md)]
+### YamlMime:ModuleUnit
+uid: learn.wwl.introduction-language-models-databricks.key-components-llms
+title: Identify key components of LLM applications
+metadata:
+  title: Identify key components of LLM applications
+  description: "Identify key components of LLM applications"
+  ms.date: 07/07/2025
+  author: theresa-i
+  ms.author: theresai
+  ms.topic: unit
+azureSandbox: false
+labModal: false
+durationInMinutes: 9
+content: |
+  [!include[](includes/04-key-components-llms.md)]
Lines changed: 15 additions & 15 deletions
@@ -1,15 +1,15 @@
-### YamlMime:ModuleUnit
-uid: learn.wwl.introduction-language-models-databricks.use-llms
-title: Use LLMs for Natural Language Processing (NLP) tasks
-metadata:
-  title: Use LLMs for Natural Language Processing (NLP) tasks
-  description: "Use LLMs for Natural Language Processing (NLP) tasks"
-  ms.date: 03/20/2025
-  author: wwlpublish
-  ms.author: theresai
-  ms.topic: unit
-azureSandbox: false
-labModal: false
-durationInMinutes: 9
-content: |
-  [!include[](includes/05-use-llms.md)]
+### YamlMime:ModuleUnit
+uid: learn.wwl.introduction-language-models-databricks.use-llms
+title: Use LLMs for Natural Language Processing (NLP) tasks
+metadata:
+  title: Use LLMs for Natural Language Processing (NLP) tasks
+  description: "Use LLMs for Natural Language Processing (NLP) tasks"
+  ms.date: 07/07/2025
+  author: theresa-i
+  ms.author: theresai
+  ms.topic: unit
+azureSandbox: false
+labModal: false
+durationInMinutes: 9
+content: |
+  [!include[](includes/05-use-llms.md)]
Lines changed: 15 additions & 15 deletions
@@ -1,15 +1,15 @@
-### YamlMime:ModuleUnit
-uid: learn.wwl.introduction-language-models-databricks.exercise
-title: Exercise - Explore language models
-metadata:
-  title: Exercise - Explore language models
-  description: "Exercise - Explore language models"
-  ms.date: 03/20/2025
-  author: wwlpublish
-  ms.author: theresai
-  ms.topic: unit
-azureSandbox: false
-labModal: false
-durationInMinutes: 30
-content: |
-  [!include[](includes/06-exercise.md)]
+### YamlMime:ModuleUnit
+uid: learn.wwl.introduction-language-models-databricks.exercise
+title: Exercise - Explore language models
+metadata:
+  title: Exercise - Explore language models
+  description: "Exercise - Explore language models"
+  ms.date: 07/07/2025
+  author: theresa-i
+  ms.author: theresai
+  ms.topic: unit
+azureSandbox: false
+labModal: false
+durationInMinutes: 30
+content: |
+  [!include[](includes/06-exercise.md)]
Lines changed: 50 additions & 50 deletions
@@ -1,50 +1,50 @@
-### YamlMime:ModuleUnit
-uid: learn.wwl.introduction-language-models-databricks.knowledge-check
-title: Module assessment
-metadata:
-  title: Module assessment
-  description: "Knowledge check"
-  ms.date: 03/20/2025
-  author: wwlpublish
-  ms.author: theresai
-  ms.topic: unit
-  module_assessment: true
-azureSandbox: false
-labModal: false
-durationInMinutes: 3
-quiz:
-  questions:
-  - content: "What is the primary function of tokenization in Large Language Models (LLMs)?"
-    choices:
-    - content: "To generate responses for user queries."
-      isCorrect: false
-      explanation: "Incorrect. The primary function of tokenization isn't generating responses for user queries."
-    - content: "To convert text into smaller units for easier processing."
-      isCorrect: true
-      explanation: "Correct. Tokenization is a preprocessing step in LLMs where text is broken down into smaller units, such as words, subwords, or characters. Tokenization makes it easier for the model to process and understand the text."
-    - content: "To summarize long texts into shorter versions."
-      isCorrect: false
-      explanation: "Incorrect. Summarization is summarizing long texts into shorter versions and it isn't the primary function of tokenization."
-  - content: "Which of the following tasks involves determining the emotional tone of a piece of text?"
-    choices:
-    - content: "Summarization"
-      isCorrect: false
-      explanation: "Incorrect. Summarization is summarizing long texts into shorter versions and not determining the emotional tone of a piece of text."
-    - content: "Translation"
-      isCorrect: false
-      explanation: "Incorrect. Translation is converting text from one language to multiple languages and not determining the emotional tone of a piece of text."
-    - content: "Sentiment Analysis"
-      isCorrect: true
-      explanation: "Correct. Sentiment analysis is the task of identifying the emotional tone of a text, such as determining if the sentiment is positive, negative, or neutral. Sentiment analysis helps in understanding opinions and feelings expressed in the text."
-  - content: "In the context of Large Language Models (LLMs), what does zero-shot classification refer to?"
-    choices:
-    - content: "Classifying text into predefined categories without any prior training examples."
-      isCorrect: true
-      explanation: "Correct. Zero-shot classification involves categorizing text into predefined labels without seeing any labeled examples during training. Zero-shot classification is achieved by using the model's extensive general knowledge and language understanding."
-    - content: "Training the model on a few examples for a specific task."
-      isCorrect: false
-      explanation: "Incorrect. Training the model on a few examples isn't zero-shot classification."
-    - content: "Generating text responses based on a given prompt."
-      isCorrect: false
-      explanation: "Incorrect. Generating text responses based on a given prompt isn't zero-shot classification."
+### YamlMime:ModuleUnit
+uid: learn.wwl.introduction-language-models-databricks.knowledge-check
+title: Module assessment
+metadata:
+  title: Module assessment
+  description: "Knowledge check"
+  ms.date: 07/07/2025
+  author: theresa-i
+  ms.author: theresai
+  ms.topic: unit
+  module_assessment: true
+azureSandbox: false
+labModal: false
+durationInMinutes: 3
+quiz:
+  questions:
+  - content: "What is the primary function of tokenization in Large Language Models (LLMs)?"
+    choices:
+    - content: "To generate responses for user queries."
+      isCorrect: false
+      explanation: "Incorrect. The primary function of tokenization isn't generating responses for user queries."
+    - content: "To convert text into smaller units for easier processing."
+      isCorrect: true
+      explanation: "Correct. Tokenization is a preprocessing step in LLMs where text is broken down into smaller units, such as words, subwords, or characters. Tokenization makes it easier for the model to process and understand the text."
+    - content: "To summarize long texts into shorter versions."
+      isCorrect: false
+      explanation: "Incorrect. Summarization is summarizing long texts into shorter versions and it isn't the primary function of tokenization."
+  - content: "Which of the following tasks involves determining the emotional tone of a piece of text?"
+    choices:
+    - content: "Summarization"
+      isCorrect: false
+      explanation: "Incorrect. Summarization is summarizing long texts into shorter versions and not determining the emotional tone of a piece of text."
+    - content: "Translation"
+      isCorrect: false
+      explanation: "Incorrect. Translation is converting text from one language to multiple languages and not determining the emotional tone of a piece of text."
+    - content: "Sentiment Analysis"
+      isCorrect: true
+      explanation: "Correct. Sentiment analysis is the task of identifying the emotional tone of a text, such as determining if the sentiment is positive, negative, or neutral. Sentiment analysis helps in understanding opinions and feelings expressed in the text."
+  - content: "In the context of Large Language Models (LLMs), what does zero-shot classification refer to?"
+    choices:
+    - content: "Classifying text into predefined categories without any prior training examples."
+      isCorrect: true
+      explanation: "Correct. Zero-shot classification involves categorizing text into predefined labels without seeing any labeled examples during training. Zero-shot classification is achieved by using the model's extensive general knowledge and language understanding."
+    - content: "Training the model on a few examples for a specific task."
+      isCorrect: false
+      explanation: "Incorrect. Training the model on a few examples isn't zero-shot classification."
+    - content: "Generating text responses based on a given prompt."
+      isCorrect: false
+      explanation: "Incorrect. Generating text responses based on a given prompt isn't zero-shot classification."
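The first quiz question defines tokenization as converting text into smaller units for easier processing. A minimal, hypothetical sketch of that idea is below; real LLMs use learned subword tokenizers (such as byte-pair encoding), not this toy whitespace-and-punctuation rule.

```python
import re

def simple_tokenize(text):
    # Toy word-level tokenizer: lowercase, then capture either a run of
    # letters/digits or a single punctuation character. The point it
    # illustrates: raw text becomes a sequence of small, countable units
    # that a model can map to numeric IDs.
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", text.lower())

tokens = simple_tokenize("Tokenization makes text easier to process!")
print(tokens)
# ['tokenization', 'makes', 'text', 'easier', 'to', 'process', '!']
```

In a real pipeline each token would then be looked up in a fixed vocabulary and replaced by an integer ID before reaching the model.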
Lines changed: 15 additions & 15 deletions
@@ -1,15 +1,15 @@
-### YamlMime:ModuleUnit
-uid: learn.wwl.introduction-language-models-databricks.summary
-title: Summary
-metadata:
-  title: Summary
-  description: "Summary"
-  ms.date: 03/20/2025
-  author: wwlpublish
-  ms.author: theresai
-  ms.topic: unit
-azureSandbox: false
-labModal: false
-durationInMinutes: 1
-content: |
-  [!include[](includes/08-summary.md)]
+### YamlMime:ModuleUnit
+uid: learn.wwl.introduction-language-models-databricks.summary
+title: Summary
+metadata:
+  title: Summary
+  description: "Summary"
+  ms.date: 07/07/2025
+  author: theresa-i
+  ms.author: theresai
+  ms.topic: unit
+azureSandbox: false
+labModal: false
+durationInMinutes: 1
+content: |
+  [!include[](includes/08-summary.md)]

learn-pr/wwl-data-ai/introduction-language-models-databricks/includes/02-what-is-generative-ai.md

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ The underlying technology involves training on diverse text corpora, allowing th

 In the realm of visual arts, generative AI is making significant strides with the development of **Generative Adversarial Networks** (**GANs**).

-GANs consist of two neural networks—a **generator** and a **discriminator—that work in tandem to create realistic images. The generator creates images, while the discriminator evaluates them, leading to the production of increasingly authentic visuals over time. This technology is used to create stunning artwork, realistic human faces, and even design new products.
+GANs consist of two neural networks—a **generator** and a **discriminator**—that work in tandem to create realistic images. The generator creates images, while the discriminator evaluates them, leading to the production of increasingly authentic visuals over time. This technology is used to create stunning artwork, realistic human faces, and even design new products.

 The ability to generate high-quality images also finds applications in industries like fashion, where AI designs clothing, and in entertainment, where it creates special effects and virtual characters.
learn-pr/wwl-data-ai/introduction-language-models-databricks/includes/03-what-are-large-language-models.md

Lines changed: 5 additions & 7 deletions
@@ -11,22 +11,20 @@ Let's start by exploring what LLMs are.
 :::image type="content" source="../media/02-large-language-model.png" alt-text="Diagram of LLMs and foundation models as part of Generative AI.":::

 1. **Generative AI** refers to systems that can create new content, such as text, images, audio, or video.
-1. **Large Language Models** (**LLMs**) are a type of Generative AI that focus on language-related tasks.
+1. **Large Language Models** (**LLMs**) are a type of Generative AI that focuses on language-related tasks.
 1. **Foundation models** are the underlying models that serve as the basis for AI applications. The models are trained on broad and diverse datasets and can be adapted to a wide range of downstream tasks.

 When you want to achieve Generative AI, you can use LLMs to generate new content. You can use a publicly available foundation model as an LLM, or you can choose to train your own.

 ## Understand the LLM architecture

-The architecture of LLMs typically involves **transformer networks**, which is a type of neural network introduced by the [*Attention is all you need* paper by Vaswani, et al. from 2017](https://arxiv.org/abs/1706.03762?azure-portal=true).
+To understand how LLMs work, we need to start with neural networks - computer systems inspired by how the human brain processes information, with interconnected nodes that learn patterns from data. LLMs use a specific type of neural network called transformers, which have revolutionized how AI understands language.

-Transformers use **self-attention mechanisms** to weigh the importance of different words in a sentence, allowing the model to understand context more effectively than previous models like **recurrent neural networks** (**RNNs**) or **long short-term memory** (**LSTM**) **networks**.
+**Transformers** are the architectural foundation of modern LLMs, designed specifically to process and understand text. Unlike older neural network approaches that had to read text word-by-word in sequence, transformers can analyze all words in a sentence simultaneously and determine how they relate to each other.

-This architectural breakthrough greatly improved LLMs, making them better at handling long-range dependencies and understanding the overall structure of the text.
+The breakthrough innovation in LLM architecture is the **self-attention mechanism** - it allows the model to focus on the most relevant words when understanding any part of the text. For example, in "The dog that was barking loudly woke up the neighbors," the transformer architecture enables the LLM to instantly connect "barking" and "loudly" to "dog," even though they're separated by other words.

-Training language models requires substantial computational resources and large-scale datasets. The datasets often include a diverse range of texts from books, websites, articles, and other written materials.
-
-During training, the model learns to predict the next word in a sentence, given the preceding words, which help it understand context and develop language comprehension. The sheer size of these models, often consisting of billions of parameters, allows them to store a vast amount of linguistic knowledge. For instance, GPT-3, one of the most well-known LLMs, has 175 billion parameters, making it one of the largest AI models ever created.
+This **transformer architecture** is what makes LLMs so powerful at understanding context and generating coherent text. They have a parallel processing capability that allows LLMs to handle long documents effectively and maintain understanding across entire conversations or articles, which was impossible with previous neural network designs. This architectural foundation, combined with massive training datasets and billions of parameters, creates the sophisticated language understanding we see in modern LLMs.

 ## Explore LLM applications