
Commit 64dff8b

committed
Start adding AutoGen documentation
1 parent b8e45f3 commit 64dff8b

File tree

5 files changed: +69 −21 lines


text_2_sql/README.md

Lines changed: 40 additions & 20 deletions
@@ -39,6 +39,7 @@ Three different iterations are presented and code provided for:
 - **Iteration 2:** A brief description of the available entities is injected into the prompt. This limits the number of tokens used and avoids filling the prompt with confusing schema information.
 - **Iteration 3:** Indexing the entity definitions in a vector database, such as AI Search, and querying it to retrieve the most relevant entities for the key terms from the query.
 - **Iteration 4:** Keeping an index of commonly asked questions and which schema / SQL query they resolve to - this index is generated by the LLM when it encounters a question that has not been previously asked. Additionally, indexing the entity definitions in a vector database, such as AI Search _(same as Iteration 3)_. First querying this index to see if a similar SQL query can be obtained _(if there is a high probability of an exact SQL query match, the results can be pre-fetched)_. If not, falling back to the schema index and querying it to retrieve the most relevant entities for the key terms from the query.
+- **Iteration 5:** Moves the Iteration 4 approach into a multi-agent system for improved reasoning and query generation. With separation into agents, each agent can focus on a single task, providing a better overall flow and response quality.

 All approaches limit the number of tokens used and avoid filling the prompt with confusing schema information.
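The vector retrieval described in Iteration 3 can be sketched as follows. This is an illustrative stand-in, not the repository's code: a toy bag-of-words embedding and an in-memory cosine ranking replace the real embedding model and the AI Search index, and the entity names and descriptions are invented for the example.

```python
# Sketch of Iteration 3: rank indexed entity descriptions against the
# key terms of a question and return the most relevant entities.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: a lower-cased bag of words (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical entity (table) descriptions, as they would be stored in the index.
ENTITY_INDEX = {
    "SalesOrderHeader": "sales orders with order date total due and customer id",
    "Customer": "customer names contact details and account numbers",
    "Product": "products with list price colour and category",
}

def select_entities(question: str, top: int = 2) -> list:
    """Return the entity names most relevant to the question's key terms."""
    q = embed(question)
    ranked = sorted(ENTITY_INDEX, key=lambda e: cosine(q, embed(ENTITY_INDEX[e])), reverse=True)
    return ranked[:top]

print(select_entities("What is the total due for each customer?"))
```

In the real solution the ranking is offloaded to AI Search, so only the retrieved schemas need to enter the prompt.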

@@ -48,15 +49,17 @@ To improve the scalability and accuracy in SQL Query generation, the entity rela
 For the query cache enabled approach, AI Search is used as a vector based cache, but any other cache that supports vector queries could be used, such as Redis.

-### Full Logical Flow for Vector Based Approach
+### Full Logical Flow for Agentic Vector Based Approach

-The following diagram shows the logical flow within the Vector Based plugin. In an ideal scenario, the questions will follow the _Pre-Fetched Cache Results Path_, which leads to the quickest answer generation. In cases where the question is not known, the plugin will fall back to the other paths accordingly and generate the SQL query using the LLM. The cache is then updated with the newly generated query and schemas.
+The following diagram shows the logical flow within the multi-agent system. In an ideal scenario, the questions will follow the _Pre-Fetched Cache Results Path_, which leads to the quickest answer generation. In cases where the question is not known, the group chat selector will fall back to the other agents accordingly and generate the SQL query using the LLMs. The cache is then updated with the newly generated query and schemas.
+
+Unlike the previous approaches, **gpt-4o-mini** can be used, as each agent's prompt is small and focused on a single simple task.

 As the query cache is shared between users (no data is stored in the cache), a new user can benefit from the pre-mapped question and schema resolution in the index.

 **Database results were deliberately not stored within the cache. Storing them would have removed one of the key benefits of the Text2SQL plugin, the ability to get near-real time information inside a RAG application. Instead, the query is stored so that the most-recent results can be obtained quickly. Additionally, this retains the ability to apply Row or Column Level Security.**

-![Vector Based with Query Cache Logical Flow.](./images/Text2SQL%20Query%20Cache.png "Vector Based with Query Cache Logical Flow")
+![Agentic Vector Based with Query Cache Logical Flow.](./images/Agentic%20Text2SQL%20Query%20Cache.png "Agentic Vector Based with Query Cache Logical Flow")

 ### Caching Strategy

@@ -68,20 +71,22 @@ The cache strategy implementation is a simple way to prove that the system works
 - **Always update:** Always add all questions into the cache when they are asked. The sample code in the repository currently implements this approach, but this could lead to poor SQL queries reaching the cache. One of the other caching strategies would be a better production version.

 ### Comparison of Iterations
-| | Common Text2SQL Approach | Prompt Based Multi-Shot Text2SQL Approach | Vector Based Multi-Shot Text2SQL Approach | Vector Based Multi-Shot Text2SQL Approach With Query Cache |
-|-|-|-|-|-|
+| | Common Text2SQL Approach | Prompt Based Multi-Shot Text2SQL Approach | Vector Based Multi-Shot Text2SQL Approach | Vector Based Multi-Shot Text2SQL Approach With Query Cache | Agentic Vector Based Multi-Shot Text2SQL Approach With Query Cache |
+|-|-|-|-|-|-|
 |**Advantages** | Fast for a limited number of entities. | Significant reduction in token usage. | Significant reduction in token usage. | Significant reduction in token usage.
-| | | | Scales well to multiple entities. | Scales well to multiple entities. |
-| | | | Uses a vector approach to detect the best fitting entity which is faster than using an LLM. Matching is offloaded to AI Search. | Uses a vector approach to detect the best fitting entity which is faster than using an LLM. Matching is offloaded to AI Search. |
-| | | | | Significantly faster to answer similar questions as best fitting entity detection is skipped. Observed tests resulted in almost half the time for final output compared to the previous iteration. |
-| | | | | Significantly faster execution time for known questions. Total execution time can be reduced by skipping the query generation step. |
-|**Disadvantages** | Slows down significantly as the number of entities increases. | Uses LLM to detect the best fitting entity which is slow compared to a vector approach. | AI Search adds additional cost to the solution. | Slower than other approaches for the first time a question with no similar questions in the cache is asked. |
-| | Consumes a significant number of tokens as number of entities increases. | As number of entities increases, token usage will grow but at a lesser rate than Iteration 1. | | AI Search adds additional cost to the solution. |
-| | LLM struggled to differentiate which table to choose with the large amount of information passed. | | |
+| | | | Scales well to multiple entities. | Scales well to multiple entities. | Scales well to multiple entities with small agents. |
+| | | | Uses a vector approach to detect the best fitting entity which is faster than using an LLM. Matching is offloaded to AI Search. | Uses a vector approach to detect the best fitting entity which is faster than using an LLM. Matching is offloaded to AI Search. | Uses a vector approach to detect the best fitting entity which is faster than using an LLM. Matching is offloaded to AI Search. |
+| | | | | Significantly faster to answer similar questions as best fitting entity detection is skipped. Observed tests resulted in almost half the time for final output compared to the previous iteration. | Significantly faster to answer similar questions as best fitting entity detection is skipped. Observed tests resulted in almost half the time for final output compared to the previous iteration. |
+| | | | | Significantly faster execution time for known questions. Total execution time can be reduced by skipping the query generation step. | Significantly faster execution time for known questions. Total execution time can be reduced by skipping the query generation step. |
+| | | | | | Instruction following and accuracy are improved by decomposing the task into smaller tasks. |
+| | | | | | Handles query decomposition for complex questions. |
+|**Disadvantages** | Slows down significantly as the number of entities increases. | Uses an LLM to detect the best fitting entity, which is slow compared to a vector approach. | AI Search adds additional cost to the solution. | Slower than other approaches the first time a question with no similar questions in the cache is asked. | Slower than other approaches the first time a question with no similar questions in the cache is asked. |
+| | Consumes a significant number of tokens as the number of entities increases. | As the number of entities increases, token usage will grow, but at a lesser rate than Iteration 1. | | AI Search adds additional cost to the solution. | AI Search and multiple agents add additional cost to the solution. |
+| | The LLM struggled to differentiate which table to choose with the large amount of information passed. | | | |
 |**Code Availability**| | | | |
-| Semantic Kernel | Yes :heavy_check_mark: | Yes :heavy_check_mark: | Yes :heavy_check_mark: | Yes :heavy_check_mark: |
-| LangChain | | | | |
-| AutoGen | | | | | |
+| Semantic Kernel | Yes :heavy_check_mark: | Yes :heavy_check_mark: | Yes :heavy_check_mark: | Yes :heavy_check_mark: | |
+| LangChain | | | | | |
+| AutoGen | | | | | Yes :heavy_check_mark: |
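The cache flow and the "always update" strategy described above can be sketched as follows. This is a minimal illustration, not the repository's code: a Jaccard token overlap stands in for the vector similarity score, the threshold is an assumption, and only the `Text2Sql__UseQueryCache` flag from the configuration section is read.

```python
# Sketch of the query cache: look up a question against previously cached
# questions; on a confident hit return the stored SQL (the query, never the
# database results, is cached), otherwise signal a fallback to the schema
# index and LLM generation.
import os

QUERY_CACHE = {}  # question -> previously generated SQL

def similarity(a: str, b: str) -> float:
    """Stand-in for a vector similarity score between two questions."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def lookup(question: str, threshold: float = 0.8):
    # Mirrors the Text2Sql__UseQueryCache environment variable.
    if os.environ.get("Text2Sql__UseQueryCache", "True") != "True":
        return None
    best = max(QUERY_CACHE, key=lambda q: similarity(q, question), default=None)
    if best is not None and similarity(best, question) >= threshold:
        return QUERY_CACHE[best]
    return None  # fall back to the schema index and query generation

def update(question: str, sql: str) -> None:
    """'Always update' strategy: cache every newly generated query."""
    QUERY_CACHE[question] = sql

update("total sales per customer",
       "SELECT CustomerID, SUM(TotalDue) FROM SalesOrderHeader GROUP BY CustomerID")
print(lookup("total sales per customer"))
```

Because only the SQL text is stored, re-running the cached query still returns near-real time data and preserves Row or Column Level Security.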

### Complete Execution Time Comparison for Approaches

@@ -247,13 +252,28 @@ The following environmental variables control the behaviour of the Vector Based
 - **Text2Sql__UseQueryCache** - controls whether the query cache index is checked before using the standard schema index.
 - **Text2Sql__PreRunQueryCache** - controls whether the top result from the query cache index (if enabled) is pre-fetched against the data source to include the results in the prompt.

+## Agentic Vector Based Approach (Iteration 5)
+
+This approach builds on the Vector Based SQL Plugin approach, but adds an agentic flow to the solution.
+
+This agentic system contains the following agents:
+
+- **Query Cache Agent:** Responsible for checking the cache for previously asked questions.
+- **Query Decomposition Agent:** Responsible for decomposing complex questions into sub-questions that can be answered with SQL.
+- **Schema Selection Agent:** Responsible for extracting key terms from the question and checking the index store for the relevant schemas.
+- **SQL Query Generation Agent:** Responsible for using the previously extracted schemas and generated SQL queries to answer the question. This agent can request more schemas if needed, and will run the query.
+- **SQL Query Verification Agent:** Responsible for verifying that the SQL query and its results will answer the question.
+- **Answer Generation Agent:** Responsible for taking the database results and generating the final answer for the user.
+
+The combination of these agents allows the system to answer complex questions, whilst staying under the token limits when including the database schemas. The query cache ensures that previously asked questions can be answered quickly, to avoid degrading the user experience.
+
 ## Code Availability

-| | Common Text2SQL Approach | Prompt Based Multi-Shot Text2SQL Approach | Vector Based Multi-Shot Text2SQL Approach | Vector Based Multi-Shot Text2SQL Approach With Query Cache |
-|-|-|-|-|-|
-| Semantic Kernel | Yes :heavy_check_mark: | Yes :heavy_check_mark: | Yes :heavy_check_mark: | Yes :heavy_check_mark: |
-| LangChain | | | | |
-| AutoGen | | | | | |
+| | Common Text2SQL Approach | Prompt Based Multi-Shot Text2SQL Approach | Vector Based Multi-Shot Text2SQL Approach | Vector Based Multi-Shot Text2SQL Approach With Query Cache | Agentic Vector Based Multi-Shot Text2SQL Approach With Query Cache |
+|-|-|-|-|-|-|
+| Semantic Kernel | Yes :heavy_check_mark: | Yes :heavy_check_mark: | Yes :heavy_check_mark: | Yes :heavy_check_mark: | |
+| LangChain | | | | | |
+| AutoGen | | | | | Yes :heavy_check_mark: |

See the relevant directory for the code in the provided framework.
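The Iteration 5 agent flow can be sketched in plain Python, with each function standing in for one of the agents listed above. This is a hedged illustration under stated assumptions: the agent internals are stubs (a simple "and" split instead of LLM decomposition, a fixed table instead of the schema index), and it does not use AutoGen's actual API, where each step would be an LLM-backed agent chosen by the group chat selector.

```python
# Stubbed pipeline mirroring the agents: cache check -> decomposition ->
# schema selection -> SQL generation -> verification -> answer generation.
def query_cache_agent(question, cache):
    return cache.get(question)

def query_decomposition_agent(question):
    # Stand-in for LLM decomposition into SQL-answerable sub-questions.
    return [q.strip() for q in question.split(" and ")]

def schema_selection_agent(sub_question):
    return {"table": "SalesOrderHeader"}  # stub: would query the schema index

def sql_generation_agent(sub_question, schema):
    return f"SELECT ... FROM {schema['table']} /* answers: {sub_question} */"

def sql_verification_agent(sql, sub_question):
    return sub_question in sql  # stub: would check the query and its results

def answer_generation_agent(results):
    return "; ".join(results)

def answer(question, cache):
    cached = query_cache_agent(question, cache)
    if cached:
        return cached  # Pre-Fetched Cache Results Path
    parts = []
    for sub in query_decomposition_agent(question):
        schema = schema_selection_agent(sub)
        sql = sql_generation_agent(sub, schema)
        assert sql_verification_agent(sql, sub)
        parts.append(sql)
    final = answer_generation_agent(parts)
    cache[question] = final  # update the shared cache for future users
    return final

print(answer("total sales and average order value", {}))
```

Decomposing the work this way is what lets each agent run with a short, single-purpose prompt.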

text_2_sql/autogen/README.md

Lines changed: 17 additions & 1 deletion

@@ -1,3 +1,19 @@
 # Multi-Shot Text2SQL Component - AutoGen

-Very much still work in progress, more documentation coming soon.
+The implementation is written for [AutoGen](https://github.com/microsoft/autogen) in Python, although it can easily be adapted for C#.
+
+**The provided AutoGen code only implements Iteration 5 (Agentic Approach).**
+
+## Full Logical Flow for Agentic Vector Based Approach
+
+The following diagram shows the logical flow within the multi-agent system. In an ideal scenario, the questions will follow the _Pre-Fetched Cache Results Path_, which leads to the quickest answer generation. In cases where the question is not known, the group chat selector will fall back to the other agents accordingly and generate the SQL query using the LLMs. The cache is then updated with the newly generated query and schemas.
+
+Unlike the previous approaches, **gpt-4o-mini** can be used, as each agent's prompt is small and focused on a single simple task.
+
+As the query cache is shared between users (no data is stored in the cache), a new user can benefit from the pre-mapped question and schema resolution in the index.
+
+**Database results were deliberately not stored within the cache. Storing them would have removed one of the key benefits of the Text2SQL plugin, the ability to get near-real time information inside a RAG application. Instead, the query is stored so that the most-recent results can be obtained quickly. Additionally, this retains the ability to apply Row or Column Level Security.**
+
+![Agentic Vector Based with Query Cache Logical Flow.](../images/Agentic%20Text2SQL%20Query%20Cache.png "Agentic Vector Based with Query Cache Logical Flow")
+
 ## Provided Notebooks & Scripts
2 image files changed (329 KB, 583 KB)

text_2_sql/semantic_kernel/README.md

Lines changed: 12 additions & 0 deletions

@@ -2,6 +2,18 @@

 The implementation is written for [Semantic Kernel](https://github.com/microsoft/semantic-kernel) in Python, although it can easily be adapted for C#.

+**The provided Semantic Kernel code implements Iterations 2, 3 & 4.**
+
+## Full Logical Flow for Vector Based Approach
+
+The following diagram shows the logical flow within the Vector Based plugin. In an ideal scenario, the questions will follow the _Pre-Fetched Cache Results Path_, which leads to the quickest answer generation. In cases where the question is not known, the plugin will fall back to the other paths accordingly and generate the SQL query using the LLMs.
+
+As the query cache is shared between users (no data is stored in the cache), a new user can benefit from the pre-mapped question and schema resolution in the index.
+
+**Database results were deliberately not stored within the cache. Storing them would have removed one of the key benefits of the Text2SQL plugin, the ability to get near-real time information inside a RAG application. Instead, the query is stored so that the most-recent results can be obtained quickly. Additionally, this retains the ability to apply Row or Column Level Security.**
+
+![Vector Based with Query Cache Logical Flow.](../images/Text2SQL%20Query%20Cache.png "Vector Based with Query Cache Logical Flow")
+
 ## Provided Notebooks & Scripts

 - `./rag_with_prompt_based_text_2_sql.ipynb` provides an example of how to utilise the Prompt Based Text2SQL plugin to query the database.
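The prompt-based approach (Iteration 2) that this notebook demonstrates can be sketched as follows. This is an illustrative example only, not the notebook's actual prompt: the entity names, descriptions, and template are invented to show how brief entity descriptions, rather than full schemas, are injected into the prompt.

```python
# Sketch of Iteration 2: inject short entity descriptions into the prompt
# to keep token usage low while still steering SQL generation.
ENTITY_DESCRIPTIONS = {
    "SalesOrderHeader": "One row per sales order; includes TotalDue and CustomerID.",
    "Customer": "Customer names and account details.",
}

def build_prompt(question: str) -> str:
    entities = "\n".join(f"- {name}: {desc}" for name, desc in ENTITY_DESCRIPTIONS.items())
    return (
        "Generate a T-SQL query to answer the user's question.\n"
        f"Available entities:\n{entities}\n"
        f"Question: {question}"
    )

print(build_prompt("What is the total due per customer?"))
```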
