- content: "What is the primary function of tokenization in Large Language Models (LLMs)?"
18
-
choices:
19
-
- content: "To generate responses for user queries."
20
-
isCorrect: false
21
-
explanation: "Incorrect. The primary function of tokenization isn't generating responses for user queries."
22
-
- content: "To convert text into smaller units for easier processing."
23
-
isCorrect: true
24
-
explanation: "Correct. Tokenization is a preprocessing step in LLMs where text is broken down into smaller units, such as words, subwords, or characters. Tokenization makes it easier for the model to process and understand the text."
25
-
- content: "To summarize long texts into shorter versions."
26
-
isCorrect: false
27
-
explanation: "Incorrect. Summarization is summarizing long texts into shorter versions and it isn't the primary function of tokenization."
28
-
- content: "Which of the following tasks involves determining the emotional tone of a piece of text?"
29
-
choices:
30
-
- content: "Summarization"
31
-
isCorrect: false
32
-
explanation: "Incorrect. Summarization is summarizing long texts into shorter versions and not determining the emotional tone of a piece of text."
33
-
- content: "Translation"
34
-
isCorrect: false
35
-
explanation: "Incorrect. Translation is converting text from one language to multiple languages and not determining the emotional tone of a piece of text."
36
-
- content: "Sentiment Analysis"
37
-
isCorrect: true
38
-
explanation: "Correct. Sentiment analysis is the task of identifying the emotional tone of a text, such as determining if the sentiment is positive, negative, or neutral. Sentiment analysis helps in understanding opinions and feelings expressed in the text."
39
-
- content: "In the context of Large Language Models (LLMs), what does zero-shot classification refer to?"
40
-
choices:
41
-
- content: "Classifying text into predefined categories without any prior training examples."
42
-
isCorrect: true
43
-
explanation: "Correct. Zero-shot classification involves categorizing text into predefined labels without seeing any labeled examples during training. Zero-shot classification is achieved by using the model's extensive general knowledge and language understanding."
44
-
- content: "Training the model on a few examples for a specific task."
45
-
isCorrect: false
46
-
explanation: "Incorrect. Training the model on a few examples isn't zero-shot classification."
47
-
- content: "Generating text responses based on a given prompt."
48
-
isCorrect: false
49
-
explanation: "Incorrect. Generating text responses based on a given prompt isn't zero-shot classification."
- content: "What is the primary function of tokenization in Large Language Models (LLMs)?"
18
+
choices:
19
+
- content: "To generate responses for user queries."
20
+
isCorrect: false
21
+
explanation: "Incorrect. The primary function of tokenization isn't generating responses for user queries."
22
+
- content: "To convert text into smaller units for easier processing."
23
+
isCorrect: true
24
+
explanation: "Correct. Tokenization is a preprocessing step in LLMs where text is broken down into smaller units, such as words, subwords, or characters. Tokenization makes it easier for the model to process and understand the text."
25
+
- content: "To summarize long texts into shorter versions."
26
+
isCorrect: false
27
+
explanation: "Incorrect. Summarization is summarizing long texts into shorter versions and it isn't the primary function of tokenization."
28
+
- content: "Which of the following tasks involves determining the emotional tone of a piece of text?"
29
+
choices:
30
+
- content: "Summarization"
31
+
isCorrect: false
32
+
explanation: "Incorrect. Summarization is summarizing long texts into shorter versions and not determining the emotional tone of a piece of text."
33
+
- content: "Translation"
34
+
isCorrect: false
35
+
explanation: "Incorrect. Translation is converting text from one language to multiple languages and not determining the emotional tone of a piece of text."
36
+
- content: "Sentiment Analysis"
37
+
isCorrect: true
38
+
explanation: "Correct. Sentiment analysis is the task of identifying the emotional tone of a text, such as determining if the sentiment is positive, negative, or neutral. Sentiment analysis helps in understanding opinions and feelings expressed in the text."
39
+
- content: "In the context of Large Language Models (LLMs), what does zero-shot classification refer to?"
40
+
choices:
41
+
- content: "Classifying text into predefined categories without any prior training examples."
42
+
isCorrect: true
43
+
explanation: "Correct. Zero-shot classification involves categorizing text into predefined labels without seeing any labeled examples during training. Zero-shot classification is achieved by using the model's extensive general knowledge and language understanding."
44
+
- content: "Training the model on a few examples for a specific task."
45
+
isCorrect: false
46
+
explanation: "Incorrect. Training the model on a few examples isn't zero-shot classification."
47
+
- content: "Generating text responses based on a given prompt."
48
+
isCorrect: false
49
+
explanation: "Incorrect. Generating text responses based on a given prompt isn't zero-shot classification."
learn-pr/wwl-data-ai/introduction-language-models-databricks/includes/03-what-are-large-language-models.md
1 addition & 1 deletion
@@ -11,7 +11,7 @@ Let's start by exploring what LLMs are.
 :::image type="content" source="../media/02-large-language-model.png" alt-text="Diagram of LLMs and foundation models as part of Generative AI.":::

 1. **Generative AI** refers to systems that can create new content, such as text, images, audio, or video.
-1. **Large Language Models** (**LLMs**) are a type of Generative AI that focus on language-related tasks.
+1. **Large Language Models** (**LLMs**) are a type of Generative AI that focuses on language-related tasks.
 1. **Foundation models** are the underlying models that serve as the basis for AI applications. The models are trained on broad and diverse datasets and can be adapted to a wide range of downstream tasks.

 When you want to achieve Generative AI, you can use LLMs to generate new content. You can use a publicly available foundation model as an LLM, or you can choose to train your own.
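As a hedged illustration of that last point (using a publicly available foundation model as an LLM), a minimal sketch with the Hugging Face transformers library; the gpt2 checkpoint is an illustrative stand-in, not a model named in this module:

```python
# A minimal sketch: load a publicly available foundation model and use it
# as an LLM for text generation. Assumes `pip install transformers`;
# "gpt2" is an illustrative public checkpoint, not one from this module.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Large Language Models are", max_new_tokens=20)
print(result[0]["generated_text"])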
learn-pr/wwl-data-ai/introduction-language-models-databricks/includes/04-key-components-llms.md
3 additions & 3 deletions
@@ -1,8 +1,8 @@
 **Large Language Models** (**LLMs**) are like sophisticated language processing systems designed to understand and generate human language. Think of them as having four essential parts that work together, similar to how a car needs an engine, fuel system, transmission, and steering wheel to function properly.

-- **Prompt**: Your instructions to the model. The prompt is how you communicate with the LLM. It's your question, request or instruction.
+- **Prompt**: Your instructions to the model. The prompt is how you communicate with the LLM. It's your question, request, or instruction.
 - **Tokenizer**: Breaks down language. The tokenizer is a language translator that converts human text into a format the computer can understand.
-- **Model**: The 'brain' of the operation. The model is the actual 'brain' that processes information and generates responses. It is typically based on the transformer architecture, utilizes self-attention mechanisms to process text and generates contextually relevant responses.
+- **Model**: The 'brain' of the operation. The model is the actual 'brain' that processes information and generates responses. It's typically based on the transformer architecture, utilizes self-attention mechanisms to process text, and generates contextually relevant responses.
 - **Tasks**: What LLMs can do. Tasks are the different language-related jobs that LLMs can perform, such as text classification, translation, and dialogue generation.

 These components create a powerful language processing system:
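A minimal sketch of those four components working together, assuming the Hugging Face transformers library; the gpt2 checkpoint and the prompt text are illustrative stand-ins:

```python
# A minimal sketch of the prompt -> tokenizer -> model -> task flow.
# Assumes `pip install transformers torch`; "gpt2" is an illustrative choice.
from transformers import AutoModelForCausalLM, AutoTokenizer

prompt = "When my dog was"                          # 1. Prompt: your instruction
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer(prompt, return_tensors="pt")     # 2. Tokenizer: text -> token IDs
outputs = model.generate(**inputs, max_new_tokens=10)  # 3. Model: processes tokens
print(tokenizer.decode(outputs[0]))                 # 4. Task: generated response
```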
@@ -82,7 +82,7 @@ Let's use this diagram as an example of how LLM processing works.
 The **LLM** is trained on a large volume of natural language text.
 **Step 1: Input** Training documents and a prompt "When my dog was..." enter the system.
 **Step 2: Encoder (The analyzer)** Breaks text into **tokens** and analyzes its meaning. The **encoder** block processes token sequences using **self-attention** to determine the relationships between tokens or words.
-**Step 3: Embeddings are created** The output from the encoder is a collection of **vectors** (multi-valued numeric arrays) in which each element of the vector represents a semantic attribute of the tokens. These vectors are referred to as **embeddings**. They are numerical representations that capture meaning:
+**Step 3: Embeddings are created** The output from the encoder is a collection of **vectors** (multi-valued numeric arrays) in which each element of the vector represents a semantic attribute of the tokens. These vectors are referred to as **embeddings**. They're numerical representations that capture meaning:

 - **dog [10,3,2]** - animal, pet, subject
 - **cat [10,3,1]** - animal, pet, different species
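To make the toy vectors concrete: embeddings that point in nearly the same direction represent related meanings. A minimal sketch, assuming only NumPy; the three-element vectors are the illustrative values from the text above, not real model output:

```python
# A minimal sketch: cosine similarity over the toy embeddings from the text.
# Real embeddings have hundreds or thousands of dimensions; these 3-element
# vectors are only the illustrative values shown above.
import numpy as np

dog = np.array([10, 3, 2])
cat = np.array([10, 3, 1])

def cosine_similarity(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(dog, cat))  # ~0.996: nearly identical meanings
```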