Commit 0593c76

Merge pull request #2859 from antfin/cms/antfin/hpe-dev-portal/blog/llm-agentic-tool-mesh-empowering-gen-ai-with-retrieval-augmented-generation-rag
Create Blog “llm-agentic-tool-mesh-empowering-gen-ai-with-retrieval-augmented-generation-rag”
2 parents c1c01ae + a3eb11b commit 0593c76

4 files changed: +267 −0 lines changed
---
title: "LLM Agentic Tool Mesh: Empowering Gen AI with Retrieval-Augmented Generation (RAG)"
date: 2025-01-08T11:46:13.362Z
author: Antonio Fin
authorimage: /img/afin_photo.jpg
disable: false
tags:
- HPE
- GenAI
- LAT-Mesh
- RAG
---
<style>
li {
  font-size: 27px !important;
  line-height: 33px !important;
  max-width: none !important;
}
</style>

In our previous blog posts, we explored the [Chat Service](https://developer.hpe.com/blog/ll-mesh-exploring-chat-service-and-factory-design-pattern/) and the [Agents Service](https://developer.hpe.com/blog/llm-agentic-tool-mesh-harnessing-agent-services-and-multi-agent-ai-for-next-level-gen-ai/) of [LLM Agentic Tool Mesh](https://developer.hpe.com/blog/ll-mesh-democratizing-gen-ai-through-open-source-innovation-1/), highlighting how they simplify the integration of Generative AI (Gen AI) into applications.

Today, we'll dive into another pivotal feature of LLM Agentic Tool Mesh: the **Retrieval-Augmented Generation (RAG) Service**. We'll explain what RAG is, how LLM Agentic Tool Mesh handles it, delve into the RAG services, and showcase an example of an agentic tool using RAG.

# Understanding Retrieval-Augmented Generation (RAG)

RAG is a technique that enhances the capabilities of language models by giving them access to external knowledge sources. Instead of relying solely on the information contained within the model's parameters, RAG allows models to retrieve and utilize relevant data from external documents or databases.

This approach improves the accuracy and relevance of generated responses, especially in domains requiring up-to-date or specialized information. The key benefits of RAG are:

* **Enhanced accuracy**: Provides more precise and factual responses by drawing on data available outside the model
* **Domain specialization**: Enables models to handle specialized topics by leveraging domain-specific documents
* **Reduced hallucinations**: Minimizes the generation of incorrect or nonsensical information by grounding responses in real data
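
To make the idea concrete, here is a minimal, framework-agnostic sketch of the retrieve-then-generate loop. It is illustrative only and is not LLM Agentic Tool Mesh code: the keyword-overlap retriever, the document list, and the function names are placeholder assumptions, and the resulting grounded prompt would normally be sent to an LLM.

```python
# Minimal, illustrative RAG loop: retrieve relevant snippets, then ground the
# model's prompt in them. Not LLM Agentic Tool Mesh code; names are placeholders.

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    query_terms = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_terms & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_grounded_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved context and the user question into one prompt."""
    context_block = "\n".join(f"- {chunk}" for chunk in context)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}"
    )


documents = [
    "5G standalone networks use a service-based core architecture.",
    "RAG grounds language model answers in retrieved documents.",
    "3GPP publishes the technical specifications for 5G.",
]
question = "Who publishes the 5G specifications?"
prompt = build_grounded_prompt(question, retrieve(question, documents))
print(prompt)  # This grounded prompt would then be passed to the language model.
```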

# RAG in LLM Agentic Tool Mesh

In the LLM Agentic Tool Mesh platform, RAG is a **crucial tool** that enhances a language model's capabilities by integrating external knowledge. LLM Agentic Tool Mesh implements RAG through two main stages:

* **Injection**
* **Retrieval**

Each stage is designed to standardize and optimize data use, ensuring generated content is both relevant and accurate.

## The injection process

The injection process involves preparing and integrating data into a storage system where it can be efficiently retrieved when content is being generated.

This process is abstracted into three key steps:

1. **Extraction**
2. **Transformation**
3. **Loading**

![](/img/ingestion.png)

### Extraction

The extraction phase gathers information from sources in various formats, such as DOCX or PDF, and converts it into a common format, typically JSON, to ensure consistency.

Example usage

```python
from athon.rag import DataExtractor  # import path assumed to follow the other athon.rag examples

# Configuration for the Data Extractor
EXTRACTOR_CONFIG = {
    'type': 'UnstructuredSections',
    'document_type': 'Pdf',
    'cache_elements_to_file': True,
    'extract_text': True,
    'exclude_header': True,
    'exclude_footer': True,
    'extract_image': False,
    'image_output_folder': './images'
}

# Initialize the Data Extractor
data_extractor = DataExtractor.create(EXTRACTOR_CONFIG)

# Parse a document file
file_path = 'example_document.pdf'
result = data_extractor.parse(file_path)

# Handle the extraction result
if result.status == "success":
    print(f"EXTRACTED ELEMENTS:\n{result.elements}")
else:
    print(f"ERROR:\n{result.error_message}")
```

### Transformation

The transformation phase cleans the data by removing irrelevant or redundant information, enriches it with metadata to improve searchability during retrieval, and transforms the cleaned data using LLMs to generate summaries, question-and-answer pairs, or other structured outputs.

Example usage

````python
from athon.rag import DataTransformer

# Configuration for the Data Transformer
TRANSFORMER_CONFIG = {
    'type': 'CteActionRunner',
    'clean': {
        'headers_to_remove': ['Confidential', 'Draft'],
        'min_section_length': 100
    },
    'transform': {
        'llm_config': {
            'type': 'LangChainChatOpenAI',
            'api_key': 'your-api-key-here',
            'model_name': 'gpt-4o'
        },
        'system_prompt': 'Summarize the following content.',
        'transform_delimeters': ['```', '```json']
    },
    'enrich': {
        'metadata': {
            'source': 'LLM Agentic Tool Mesh Platform',
            'processed_by': 'CteActionRunner'
        }
    }
}

# Initialize the Data Transformer
data_transformer = DataTransformer.create(TRANSFORMER_CONFIG)

# List of extracted elements to be transformed
extracted_elements = [
    {"text": "Confidential Report on AI Development", "metadata": {"type": "Header"}},
    {"text": "AI is transforming industries worldwide...", "metadata": {"type": "Paragraph"}}
]

# Define the actions to be performed
actions = ['RemoveSectionsByHeader', 'TransformInSummary', 'EnrichMetadata']

# Process the elements
result = data_transformer.process(actions, extracted_elements)

# Handle the transformation result
if result.status == "success":
    print(f"TRANSFORMED ELEMENTS:\n{result.elements}")
else:
    print(f"ERROR:\n{result.error_message}")
````

### Loading

The loading phase injects the transformed data into the chosen storage solution, such as a vector database, and further adapts it as needed, for example by chunking it into smaller pieces for efficient retrieval.

Example usage

```python
from athon.rag import DataLoader

# Configuration for the Data Loader
LOADER_CONFIG = {
    'type': 'ChromaForSentences'
}

# Initialize the Data Loader
data_loader = DataLoader.create(LOADER_CONFIG)

# Example collection (retrieved from a DataStorage instance)
collection = data_storage.get_collection().collection

# List of elements to be inserted
elements = [
    {"text": "Generative AI is transforming industries.", "metadata": {"category": "AI", "importance": "high"}},
    {"text": "This document discusses the impact of AI.", "metadata": {"category": "AI", "importance": "medium"}}
]

# Insert the elements into the collection
result = data_loader.insert(collection, elements)

# Handle the insertion result
if result.status == "success":
    print("Data successfully inserted into the collection.")
else:
    print(f"ERROR:\n{result.error_message}")
```
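
The loader and storage backend take care of the storage-specific details. Purely for illustration, the chunking mentioned above can be pictured with a naive character-based splitter like the one below; this is a hedged sketch, not the platform's implementation, and the chunk size and overlap values are assumptions.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with a small overlap,
    so content cut at a boundary still appears at the start of the next chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks


# Example: chunk a long document before inserting it into the vector store.
long_text = "Generative AI is transforming industries. " * 50
print(len(chunk_text(long_text)))  # number of chunks produced
```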

## The retrieval process

Once the data has been injected and is ready for use, the retrieval process focuses on fetching the most relevant information based on a given input query. This ensures that the language model has access to the right data to generate accurate and contextually relevant outputs. The process involves three main steps:

1. **Data retrieval**: Uses various methods, such as dense or sparse retrieval, to fetch the most relevant data from storage
2. **Metadata filtering**: Applies metadata filters to narrow down search results, ensuring the retrieved data matches the specific needs of the query
3. **Chunk expansion**: Expands the retrieved data chunks to provide comprehensive information for the language model

![](/img/retrieve.png)

Example usage

```python
from athon.rag import DataRetriever

# Configuration for the Data Retriever
RETRIEVER_CONFIG = {
    'type': 'ChromaForSentences',
    'expansion_type': 'Section',
    'sentence_window': 3,
    'n_results': 10,
    'include': ['documents', 'metadatas']
}

# Initialize the Data Retriever
data_retriever = DataRetriever.create(RETRIEVER_CONFIG)

# Example collection (retrieved from a DataStorage instance)
collection = data_storage.get_collection().collection

# Query to search within the collection
query = "What is the impact of Generative AI on industries?"

# Retrieve relevant data based on the query
result = data_retriever.select(collection, query)

# Handle the retrieval result
if result.status == "success":
    for element in result.elements:
        print(f"TEXT:\n{element['text']}\nMETADATA:\n{element['metadata']}\n")
else:
    print(f"ERROR:\n{result.error_message}")
```

# LLM Agentic Tool Mesh in action: Agentic tool using RAG

In the [LLM Agentic Tool Mesh GitHub](https://github.com/HewlettPackard/llmesh), there is an example of a RAG-based tool that provides quick and accurate access to 5G specifications: the **telco expert** (inside the `examples/tool_rag` folder).

This agentic tool leverages the RAG services in LLM Agentic Tool Mesh to read telco standards, build or use a vector store from them, and then use a query engine to find and return relevant information based on user queries.

For enhanced observability, the telco expert not only provides the answer but also displays the retrieved chunks used to formulate the response. This includes both the text of the chunks and their associated metadata, such as the document source, date, and other relevant details. This feature allows users to verify the origin of the information and gain deeper insights into the data supporting the answer.

Tool code snippet

```python
@AthonTool(config, logger)
def telco_expert(query: str) -> str:
    """
    This function reads the telco standards, builds or uses a vector store
    from them, and then uses a query engine to find and return relevant
    information to the input question.
    """
    collection = _get_collection()
    if LOAD:
        _load_files_into_db(collection)
    augment_query = _augment_query_generated(query)
    rag_results = _retrieve_from_collection(collection, augment_query)
    ordered_rag_results = _rerank_answers(augment_query, rag_results)
    summary_answer = _summary_answer(augment_query, ordered_rag_results)
    chunk_answer = _create_chunk_string(ordered_rag_results)
    return summary_answer + "\n\n" + chunk_answer
```
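
The helper functions in this snippet live alongside the tool in the repository. As a purely hypothetical illustration of the observability step described above, a chunk-formatting helper in the spirit of `_create_chunk_string` might look like the following; the element structure and field names are assumptions, not the repository's actual code.

```python
def _create_chunk_string(rag_results: list[dict]) -> str:
    """Format retrieved chunks and their metadata so users can verify sources."""
    lines = ["Retrieved chunks:"]
    for index, element in enumerate(rag_results, start=1):
        metadata = element.get("metadata", {})
        source = metadata.get("source", "unknown source")
        lines.append(f"[{index}] {element['text']}")
        lines.append(f"    source: {source} | metadata: {metadata}")
    return "\n".join(lines)
```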

The functionalities shown are:

* **Data injection**: Loads telco standards into a vector store
* **Query augmentation**: Enhances the user's query for better retrieval (see the sketch after the figure below)
* **Data retrieval**: Retrieves relevant chunks from the vector store
* **Answer generation**: Summarizes and formats the retrieved information to provide a comprehensive answer

![](/img/rag_tool.png)
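
Query augmentation can be implemented in several ways; a common approach is to have an LLM draft a short hypothetical answer and append it to the original query so that retrieval matches on richer text. The sketch below only illustrates that idea, assuming a generic `llm` callable; it is not the repository's `_augment_query_generated` implementation.

```python
from typing import Callable

def augment_query(query: str, llm: Callable[[str], str]) -> str:
    """Enrich a user query with an LLM-drafted hypothetical answer to improve retrieval."""
    prompt = (
        "Draft a short, plausible answer to the question below. "
        "It will only be used to improve document retrieval.\n\n"
        f"Question: {query}"
    )
    hypothetical_answer = llm(prompt)
    # Concatenating the original query with the drafted answer gives the
    # retriever more relevant terms to match against.
    return f"{query}\n{hypothetical_answer}"
```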

# Conclusion

The RAG Service in LLM Agentic Tool Mesh exemplifies how advanced design principles and innovative engineering simplify and enhance the adoption of Gen AI. By abstracting complexities and providing versatile examples, LLM Agentic Tool Mesh enables developers and users alike to unlock the transformative potential of Gen AI in various domains.

Stay tuned for our next post, where we'll explore the System Service of LLM Agentic Tool Mesh, essential for creating and managing a mesh of tools, as we continue our journey to democratize Gen AI!

static/img/ingestion.png (170 KB)
static/img/rag_tool.png (1.49 MB)
static/img/retrieve.png (105 KB)
