learn-pr/wwl-data-ai/introduction-language-models-databricks/includes/02-what-is-generative-ai.md
1 addition & 1 deletion
@@ -25,7 +25,7 @@ The underlying technology involves training on diverse text corpora, allowing th
In the realm of visual arts, generative AI is making significant strides with the development of **Generative Adversarial Networks** (**GANs**).
- GANs consist of two neural networks—a **generator** and a **discriminator—that work in tandem to create realistic images. The generator creates images, while the discriminator evaluates them, leading to the production of increasingly authentic visuals over time. This technology is used to create stunning artwork, realistic human faces, and even design new products.
+ GANs consist of two neural networks—a **generator** and a **discriminator**—that work in tandem to create realistic images. The generator creates images, while the discriminator evaluates them, leading to the production of increasingly authentic visuals over time. This technology is used to create stunning artwork, realistic human faces, and even design new products.
The ability to generate high-quality images also finds applications in industries like fashion, where AI designs clothing, and in entertainment, where it creates special effects and virtual characters.
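For illustration only, here is a minimal sketch of the generator/discriminator loop described above. It is not part of the changed module files; the network sizes, data, and hyperparameters are arbitrary placeholders.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64

# Tiny stand-ins for the two adversarial networks.
generator = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 128), nn.ReLU(), nn.Linear(128, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(100):
    real = torch.randn(32, data_dim)      # placeholder for real training images
    noise = torch.randn(32, latent_dim)
    fake = generator(noise)               # generator creates candidate images

    # Discriminator: learn to label real samples 1 and generated samples 0.
    d_loss = loss_fn(discriminator(real), torch.ones(32, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: try to make the discriminator label its output as real (1).
    g_loss = loss_fn(discriminator(fake), torch.ones(32, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```

Over many iterations the two losses push against each other, which is the "in tandem" training the paragraph describes.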
learn-pr/wwl-data-ai/introduction-language-models-databricks/includes/04-key-components-llms.md
13 additions & 9 deletions
@@ -80,17 +80,21 @@ Transformers consist of layers of **encoders** and **decoders** that work togeth
Let's use this diagram as an example of how LLM processing works.
The **LLM** is trained on a large volume of natural language text.
- **Step1: Input** Training documents and a prompt "When my dog was..." enter the system.
- **Step 2: Encoder (The analyzer)** Breaks text into **tokens** and analyzes its meaning. The **encoder** block processes token sequences using **self-attention** to determine the relationships between tokens or words.
- **Step 3: Embeddings are created** The output from the encoder is a collection of **vectors** (multi-valued numeric arrays) in which each element of the vector represents a semantic attribute of the tokens. These vectors are referred to as **embeddings**. They're numerical representations that capture meaning:
- -**dog [10,3,2]** - animal, pet, subject
- -**cat [10,3,1]** - animal, pet, different species
- -**puppy [5,2,1]** - young animal, related to dog
- -**skateboard [-3,3,2]** - object, unrelated to animals
- **Step 4: Decoder (The writer)** block works on a new sequence of text tokens and uses the embeddings generated by the encoder to generate an appropriate natural language output. It compares the options and chooses the most appropriate response.
- **Step 5: Output generated** Given an input sequence like `When my dog was`, the model can use the self-attention mechanism to analyze the input tokens and the semantic attributes encoded in the embeddings to predict an appropriate completion of the sentence, such as `a puppy`.
+ -**Step 1: Input** - Training documents and a prompt "When my dog was..." enter the system.
+ -**Step 2: Encoder (The analyzer)** - Breaks text into **tokens** and analyzes its meaning. The **encoder** block processes token sequences using **self-attention** to determine the relationships between tokens or words.
+ -**Step 3: Embeddings are created** - The output from the encoder is a collection of **vectors** (multi-valued numeric arrays) in which each element of the vector represents a semantic attribute of the tokens. These vectors are referred to as **embeddings**. They're numerical representations that capture meaning:
+ -**dog [10,3,2]** - animal, pet, subject
+ -**cat [10,3,1]** - animal, pet, different species
+ -**puppy [5,2,1]** - young animal, related to dog
+ -**skateboard [-3,3,2]** - object, unrelated to animals
+ -**Step 4: Decoder (The writer)** - The decoder block works on a new sequence of text tokens and uses the embeddings generated by the encoder to generate an appropriate natural language output. It compares the options and chooses the most appropriate response.
+ -**Step 5: Output generated** - Given an input sequence like `When my dog was`, the model can use the self-attention mechanism to analyze the input tokens and the semantic attributes encoded in the embeddings to predict an appropriate completion of the sentence, such as `a puppy`.
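For illustration only, the toy embeddings listed in Step 3 can be compared numerically with cosine similarity, a minimal sketch using plain NumPy (real LLM embeddings have hundreds or thousands of dimensions, not three):

```python
import numpy as np

# Toy embeddings from the step list above; each value is a made-up "semantic attribute".
embeddings = {
    "dog":        np.array([10.0, 3.0, 2.0]),
    "cat":        np.array([10.0, 3.0, 1.0]),
    "puppy":      np.array([5.0, 2.0, 1.0]),
    "skateboard": np.array([-3.0, 3.0, 2.0]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Close to 1.0 means the vectors point the same way (similar meaning)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = embeddings["dog"]
for word, vector in embeddings.items():
    print(f"dog vs {word}: {cosine_similarity(query, vector):.3f}")

# dog, cat, and puppy score high against each other, while skateboard scores much
# lower, which is loosely how the decoder can prefer "a puppy" as a completion
# for "When my dog was...".
```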
This architecture is highly parallelizable, making it efficient for training on large datasets. The size of the LLM, often defined by the number of parameters, determines its capacity to store linguistic knowledge and perform complex tasks. Think of parameters as millions or billions of tiny memory cells that store language rules and patterns. More memory cells mean the model can remember more about language and handle harder tasks. Large models, such as GPT-3 and GPT-4, contain billions of parameters, allowing them to store vast language knowledge.
learn-pr/wwl-data-ai/retrieval-augmented-generation-azure-databricks/includes/1-introduction.md
6 additions & 6 deletions
@@ -2,18 +2,18 @@ Retrieval Augmented Generation (RAG) is a technique in natural language processi
Language models have become incredibly popular because they can generate impressive, well-structured answers to user questions. When people interact with these models through chat interfaces, it feels like a natural and intuitive way to get information.
- However, there's a major challenge: ensuring the AI's responses are accurate and factual. This challenge is called **groundedness** - which simply means whether the AI's answer is based on real, reliable information rather than made-up or incorrect details. Without proper groundedness, language models might confidently stating things that aren't true. Another challenge is that traditional models use only information they were trained on that can be outdated or incomplete.
+ However, there's a challenge: ensuring the AI's responses are accurate and factual. This challenge is called **groundedness** - which simply means whether the AI's answer is based on real, reliable information rather than made-up or incorrect details. Without proper groundedness, language models might confidently state things that aren't true. Another challenge is that traditional models use only the information they were trained on, which can be outdated or incomplete.
When you want a language model to have access to specific knowledge, you have three main options:
:::image type="content" source="../media/learn-knowledge.png" alt-text="Diagram of three approaches for language models to learn knowledge.":::
- 1.**Model pretraining**: Build a language model from the ground up, which requires massive datasets with billions or trillions of text pieces or tokens. This is extremely expensive and time-consuming.
+ -**Model pretraining**: Build a language model from the ground up, which requires massive datasets with billions or trillions of text pieces or tokens. This is extremely expensive and time-consuming.
- 2.**Model fine-tuning**: Take an existing language model and train it further on your specific data or industry, which requires thousands of specialized examples. This is moderately expensive and complex.
+ -**Model fine-tuning**: Take an existing language model and train it further on your specific data or industry, which requires thousands of specialized examples. This is moderately expensive and complex.
- 3.**Passing contextual information**: Connect a language model to external databases or documents so it can look up information in real-time. This is a strategy known as Retrieval Augmented Generation (RAG). This requires setting up a knowledge base but is much simpler than the other options.
+ -**Passing contextual information**: Connect a language model to external databases or documents so it can look up information in real-time. This is a strategy known as Retrieval Augmented Generation (RAG). This requires setting up a knowledge base but is much simpler than the other options.
- RAG is most practical when you need an AI with access to current, verifiable information. It's easier to implement and uses less computing power than retraining entire models.
+ RAG is most practical when you need an AI system with access to current, verifiable information. It's easier to implement and uses less computing power than retraining entire models.
- In this module, you'll learn when and how to use RAG to make language models more reliable and accurate. You'll also discover how vector search technology helps the AI quickly find the most relevant information to include in its responses.
+ In this module, you'll learn when and how to use RAG to make language models more reliable and accurate. You'll also discover how vector search technology helps AI quickly find the most relevant information to include in its responses.
learn-pr/wwl-data-ai/retrieval-augmented-generation-azure-databricks/includes/2-workflow.md
8 additions & 8 deletions
@@ -8,7 +8,7 @@
3.**Add Context to Prompt**: The relevant information retrieved from your documents is combined with the user's original question to create an enhanced prompt that provides the LLM with the specific context it needs.
- 4.**LLM Generates Response**: The base language model processes both the original question and the retrieved context from your documents to generate an accurate, grounded response based on your specific data.**
+ 4.**LLM Generates Response**: The base language model processes both the original question and the retrieved context from your documents to generate an accurate, grounded response based on your specific data.
This process bridges the gap between a general-purpose LLM and your specific, private, or recent information, allowing you to get accurate answers based on your own documents without having to retrain the entire base model.
@@ -18,9 +18,9 @@ Let's look at when you can use RAG, and then review the main components and conc
You can use RAG for chatbots, search enhancement, and content creation and summarization.
- **Chatbots**: RAG helps chatbots provide more accurate answers by accessing your current company information. When integrated with customer support systems, RAG-powered chatbots can automate support and quickly resolve customer questions using up-to-date data.
+ **Chatbots**: RAG helps chatbots provide more accurate answers by accessing current information. When integrated with customer support systems, RAG-powered chatbots can automate support and quickly resolve customer questions using up-to-date data.
- **Search enhancement**: Instead of returning just links and snippets, RAG-powered search engines provide complete, conversational answers. Users get comprehensive responses that synthesize information from multiple sources, making it much easier to find what they need.
+ **Search enhancement**: Instead of returning just links and snippets, RAG-powered search engines provide complete, conversational answers. Users get comprehensive responses that synthesize information from multiple sources, making it easy to find what they need.
**Content creation and summarization**: Produce high-quality, fact-based content using your own data sources. RAG enables you to generate informed articles, create summaries from lengthy documents, and develop reports that synthesize information from multiple sources.
@@ -45,7 +45,7 @@ Document embedding, as shown in the diagram, is part of a preparation phase. Thi
:::image type="content" source="../media/document-embedding.png" alt-text="Diagram of embeddings model converting documents to vectors.":::
- Query embedding, shown in the diagram, happens each time a user asks a question. First, the user's question is converted into an embedding using the same embedding model. This real-time conversion prepares the query for comparison against your preprocessed document embeddings. Only after the query is embedded can the system begin searching for relevant documents.
+ Query embedding, shown in the diagram, happens each time a user asks a question. First, the user's question is converted into an embedding using the same embedding model that was used to process the documents. This real-time conversion prepares the query for comparison against your preprocessed document embeddings. Only after the query is embedded can the system begin searching for relevant documents.
:::image type="content" source="../media/query-embedding.png" alt-text="Diagram of embeddings model.":::
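For illustration only, a minimal sketch of embedding documents and a query with one shared model, assuming the open-source `sentence-transformers` package and the `all-MiniLM-L6-v2` model purely as examples (the module doesn't prescribe a specific embedding model):

```python
from sentence_transformers import SentenceTransformer, util

# The same model is used for BOTH phases, so document and query vectors
# live in the same vector space and can be compared meaningfully.
model = SentenceTransformer("all-MiniLM-L6-v2")  # example model, not prescribed by the module

# Preparation phase: embed document chunks once and keep the vectors.
chunks = [
    "Employees accrue 20 vacation days per year.",
    "The office is closed on public holidays.",
]
chunk_embeddings = model.encode(chunks, convert_to_tensor=True)

# Query phase: embed the user's question with the SAME model, then compare.
query_embedding = model.encode("What's our vacation policy?", convert_to_tensor=True)
scores = util.cos_sim(query_embedding, chunk_embeddings)  # shape: (1, num_chunks)
best = int(scores.argmax())
print(chunks[best], float(scores[0][best]))
```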
@@ -63,21 +63,21 @@ The vector store enables semantic search, which means it finds relevant content
### Augment your prompt with retrieved content
- After finding the most relevant documents, the RAG system combines this information with the user's original question to create an "enhanced prompt" that gives the LLM everything it needs to provide an accurate answer.
+ After finding the most relevant documents, the RAG system combines this information with the user's original question to create an "augmented prompt" that gives the LLM everything it needs to provide an accurate answer.
The augmentation process looks like this:
1. Start with the user's question: "What's our vacation policy?"
2. Add retrieved context: Include relevant excerpts from your HR documents
- 3. Create enhanced prompt: "Based on these HR policy documents: [retrieved content], what's our vacation policy?"
+ 3. Create augmented prompt: "Based on these HR policy documents: [retrieved content], what's our vacation policy?"
The LLM now has both the user's question **and** the specific information needed to answer it accurately. This is called "in-context learning" because the LLM learns from the context provided in the prompt rather than from its original training data.
- In the final step, the enhanced prompt is sent to the Large Language Model (LLM), which generates a response based on both the question and the retrieved information. The LLM can include citations of the original sources, allowing users to verify where the information came from.
+ In the final step, the augmented prompt is sent to the Large Language Model (LLM), which generates a response based on both the question and the retrieved information. The LLM can include citations of the original sources, allowing users to verify where the information came from.
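For illustration only, a minimal, self-contained sketch of the augmentation step described above; the HR excerpts are made-up placeholders standing in for whatever your vector search actually returned:

```python
def build_augmented_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Combine retrieved context with the user's question (in-context learning)."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Based on these HR policy documents:\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the documents above and cite the source of each claim."
    )

# Placeholder excerpts representing the retrieval results for this question.
retrieved_chunks = [
    "[vacation-policy.md] Full-time employees accrue 20 vacation days per year.",
    "[vacation-policy.md] Unused days roll over up to a maximum of 5 days.",
]
prompt = build_augmented_prompt("What's our vacation policy?", retrieved_chunks)
print(prompt)  # This augmented prompt is what gets sent to the LLM in the final step.
```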
The key benefit of the RAG workflow is that it gives you accurate, source-backed answers without having to retrain the entire language model on your specific documents.
## RAG architecture overview
- The complete RAG workflow combines all the components we've discussed into a unified system that transforms general-purpose LLMs into knowledgeable assistants for your specific domain.
+ The complete RAG workflow combines all the components we've reviewed into a unified system that transforms general-purpose LLMs into knowledgeable assistants for your specific domain.
The key mechanism is **in-context learning** - instead of retraining the LLM, you provide relevant information as context in each prompt, allowing the LLM to generate informed responses without permanent modification.
learn-pr/wwl-data-ai/retrieval-augmented-generation-azure-databricks/includes/4-vector-search.md
5 additions & 11 deletions
@@ -12,7 +12,7 @@ This mathematical representation is what makes vector search possible - instead
## How vector search works
- Vector search builds on the text embedding process. Here's how it works step-by-step:
+ Vector search builds on the text embedding process. Here's how it works step by step:
1.**Query conversion**: When a user asks a question, your system converts their query into an embedding using the same model you used for your document data chunks.
2.**Similarity calculation**: The system compares the query embedding to all document chunk embeddings using mathematical similarity measures.
@@ -30,29 +30,23 @@ This semantic understanding comes from the embedding model, which learned these
## Choosing your vector search approach
- You have two main options for implementing vector search in Azure Databricks: vector databases and vector libraries. Let's explore each of these options, so you can determine which approach best fits your needs.
+ There are two main options for implementing vector search in Azure Databricks: vector databases and vector libraries. Let's explore each of these options.
- ### Vector databases: For dynamic, large-scale data
-
- **What is a vector database?**
+ ### What is a vector database?
A vector database is a specialized database optimized to store and retrieve embeddings - those vectors with hundreds or thousands of numbers that represent meaning. Like traditional databases, vector databases use indices (organized structures that speed up searches) to quickly find relevant data, but these vector indices are designed to find mathematically similar vectors rather than exact matches. In RAG applications, vector databases primarily store text embeddings - vectors that represent the semantic meaning of your document chunks - along with metadata about each chunk (like source document, page numbers, or categories).
- :::image type="content" source="../media/matched-query.png" alt-text="Vector space visualization showing document vectors as blue dots and a query vector as an orange dot, with the relevant document vector positioned close to the query vector.":::
+ :::image type="content" source="../media/matched-query.png" alt-text="Vector space visualization showing both document and query vectors as dots. A relevant document vector is positioned close to the query vector.":::
This visualization shows how vectors work in practice. Each dot represents a vector - the blue dots are document chunk embeddings stored in the vector database, and the orange dot is a query vector. The image labels "Relevant document" and "Query" indicate vectors that are close together in the mathematical space, showing similarity. When you search, the database finds document vectors nearest to your query vector.
This image illustrates why vector databases are so powerful for RAG: documents with similar meanings are positioned close together in the vector space, making it easy to find relevant content by measuring mathematical distance between vectors.
- **Azure Databricks option: Mosaic AI Vector Search**
-
In Azure Databricks, you can use Mosaic AI Vector Search as a vector database to store the vector representations of your data and metadata. Mosaic AI Vector Search integrates with your Lakehouse data and provides search capabilities for your embeddings, with support for filtering results based on metadata you've stored with your document chunks.
Vector databases like Mosaic AI Vector Search can be used when you have large amounts of data and need to store your embeddings persistently for long-term use. Vector databases are well-suited for dynamic data because they support real-time operations - you can add new document chunks, update existing ones, or delete outdated content without rebuilding the entire search system. This makes them ideal for scenarios like knowledge bases that grow over time, document repositories with frequent updates, or applications where multiple users are adding content. They typically provide search capabilities and support filtering on metadata, making them applicable for scenarios where multiple applications need to access the same vector data.
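For illustration only, querying an existing Mosaic AI Vector Search index from Python might look roughly like the sketch below. It assumes the `databricks-vectorsearch` client, and the endpoint, index, column, and filter names are placeholders; check the current SDK documentation for exact signatures and filter syntax.

```python
from databricks.vector_search.client import VectorSearchClient

client = VectorSearchClient()  # assumes Databricks authentication is already configured

# Placeholder endpoint and index names - replace with the ones defined in your workspace.
index = client.get_index(
    endpoint_name="my_vector_search_endpoint",
    index_name="main.rag_demo.docs_index",
)

# Semantic search, with an optional metadata filter on a column stored with the chunks.
results = index.similarity_search(
    query_text="What's our vacation policy?",
    columns=["chunk_id", "chunk_text", "source_document"],
    filters={"source_document": "hr-handbook.pdf"},
    num_results=5,
)
print(results)
```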
- ### Vector libraries: For smaller, static datasets
-
- **What are vector libraries?**
+ ### What are vector libraries?
Vector libraries are tools that create vector indices to enable fast similarity search without requiring a separate database system. Think of these indices like a specialized filing system that organizes your embeddings so they can be searched quickly - similar to how a database index speeds up queries, but designed for mathematical similarity searches.
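For illustration only, a minimal sketch of the vector-library approach using FAISS as one example of such a library (the module doesn't prescribe a specific one). The in-memory index is built once from a static set of embeddings and then queried, with random vectors standing in for real chunk embeddings:

```python
import faiss
import numpy as np

dim = 384                      # embedding dimensionality (depends on your embedding model)
rng = np.random.default_rng(0)

# Stand-in chunk embeddings; in practice these come from your embedding model.
chunk_embeddings = rng.random((1000, dim), dtype=np.float32)

index = faiss.IndexFlatL2(dim)   # exact (brute-force) L2 index held in memory
index.add(chunk_embeddings)      # built once - suited to smaller, static datasets

query_embedding = rng.random((1, dim), dtype=np.float32)
distances, ids = index.search(query_embedding, 5)  # top-5 nearest chunks
print(ids[0], distances[0])
```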