|
8 | 8 | "source": [ |
9 | 9 | "\n", |
10 | 10 | "\n", |
11 | | - "# Full featured Agent Architecture\n", |
| 11 | + "# Full-Featured Agent Architecture\n", |
| 12 | + "The following example demonstrates how to build a tool-enabled agentic workflow with a semantic cache and an allow/block list router. This approach helps reduce latency and costs in the final solution.\n", |
12 | 13 | "\n", |
13 | | - "The following example covers how to build a tool call agentic workflow with a semantic cache and allow/block list router to help reduce the latency and cost of the final solution. \n", |
14 | | - "\n", |
15 | | - "Note: This notebook summarizes [this workshop](https://github.com/redis-developer/oregon-trail-agent-workshop) for a more detailed step by step walk through of each element see the repo. \n", |
| 14 | + "Note: This notebook summarizes [this workshop](https://github.com/redis-developer/oregon-trail-agent-workshop). For a more detailed step-by-step walkthrough of each element, please refer to the repository.\n", |
16 | 15 | "\n", |
17 | 16 | "## Let's Begin!\n", |
18 | 17 | "<a href=\"https://colab.research.google.com/github/redis-developer/redis-ai-resources/blob/main/python-recipes/agents/02_full_featured_agent.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" |
|
37 | 36 | }, |
38 | 37 | "outputs": [], |
39 | 38 | "source": [ |
| 39 | + "# NBVAL_SKIP\n", |
40 | 40 | "%%capture --no-stderr\n", |
41 | 41 | "%pip install -U --quiet langchain langchain-openai langchain-redis langgraph" |
42 | 42 | ] |
|
71 | 71 | ], |
72 | 72 | "source": [ |
73 | 73 | "# NBVAL_SKIP\n", |
74 | | - "import getpass\n", |
75 | 74 | "import os\n", |
| 75 | + "import getpass\n", |
| 76 | + "\n", |
76 | 77 | "\n", |
77 | 78 | "\n", |
78 | 79 | "def _set_env(key: str):\n", |
|
169 | 170 | } |
170 | 171 | ], |
171 | 172 | "source": [ |
| 173 | + "import os\n", |
172 | 174 | "from redis import Redis\n", |
| 175 | + "\n", |
173 | 176 | "# Use the environment variable if set, otherwise default to localhost\n", |
174 | 177 | "REDIS_URL = os.getenv(\"REDIS_URL\", \"redis://localhost:6379\")\n", |
175 | 178 | "\n", |
|
185 | 188 | "source": [ |
186 | 189 | "# Motivation\n", |
187 | 190 | "\n", |
188 | | - "This notebook is a consolidated version of the [Redis Developer Oregon Trail Agent Workshop](https://github.com/redis-developer/oregon-trail-agent-workshop) check out that repo for a more detailed explanation and project structure if new to agents.\n", |
189 | | - "\n", |
190 | | - "The goal of the workshop is to create an agent workflow that can handle 5 Oregon Trailed themed scenarios that mimic situation that often arise when implementing agent workflows in practice.\n", |
| 191 | + "The goal of the workshop is to create an agent workflow that can handle five Oregon Trail-themed scenarios, mimicking situations that often arise when implementing agent workflows in practice.\n", |
191 | 192 | "\n", |
192 | 193 | "## Scenario 1 - name of the wagon leader\n", |
193 | 194 | "\n", |
|
239 | 240 | "\n", |
240 | 241 | "\n", |
241 | 242 | "\n", |
| 243 | + "As a reminder, for more detail, see the [Redis Developer Oregon Trail Agent Workshop](https://github.com/redis-developer/oregon-trail-agent-workshop).\n", |
| 244 | + "\n", |
242 | 245 | "# Defining the agent with LangGraph\n", |
243 | 246 | "\n", |
244 | 247 | "## Tools\n", |
|
247 | 250 | "\n", |
248 | 251 | "### Restock tool\n", |
249 | 252 | "\n", |
250 | | - "The first tool we will define implements the restocking formula. LLMs are designed to predict text responses not to do deterministic math. In this case, the agent will act as a parser and extract the necessary information from the human query and call the tool with the appropriate schema. One of the nice things about LangGraph is that the schema for the tool can be defined as a `pydantic` model. Note: it's also essential that a good doc_string be used with the tool function such that the agent can determine the appropriate situation to use the tool. " |
| 253 | + "The first tool we will define implements the restocking formula. LLMs are designed to predict text responses, not to perform deterministic math. In this case, the agent will act as a parser, extracting the necessary information from the human query and calling the tool with the appropriate schema.\n", |
| 254 | + "\n", |
| 255 | + "One of the advantages of `LangGraph` is that the schema for the tool can be defined as a `pydantic` model. Note: It is also essential to include a well-written `doc_string` with the tool function so the agent can determine the appropriate situation to use the tool." |
251 | 256 | ] |
252 | 257 | }, |
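The tool-as-parser idea above can be sketched in plain Python. The workshop's actual restocking formula and pydantic field names are not reproduced here, so the schema and arithmetic below are illustrative assumptions; the notebook itself defines the schema as a `pydantic` `BaseModel` so LangGraph can expose a JSON schema to the agent, while this sketch uses a stdlib `dataclass` to stay self-contained.

```python
from dataclasses import dataclass

# Hypothetical input schema -- in the notebook this is a pydantic
# BaseModel; field names and the formula are illustrative, not the
# workshop's exact definition.
@dataclass
class RestockInput:
    daily_usage: int     # units of food consumed per day
    travel_days: int     # days of travel remaining
    current_stock: int   # units currently in the wagon

def restock_tool(args: RestockInput) -> int:
    """Return how many units of food must be purchased to finish the
    journey. The agent selects this tool from its docstring, parses
    the human query into RestockInput, and calls it rather than
    attempting the deterministic math itself."""
    needed = args.daily_usage * args.travel_days
    return max(0, needed - args.current_stock)
```

For example, `restock_tool(RestockInput(daily_usage=5, travel_days=30, current_stock=100))` reports a shortfall of 50 units, and a fully stocked wagon yields 0.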
253 | 258 | { |
|
282 | 287 | "source": [ |
283 | 288 | "## Retriever tool\n", |
284 | 289 | "\n", |
285 | | - "Sometimes an LLM might need access to data that it was not trained on wether because that data is proprietary, time bound, etc. \n", |
| 290 | + "Sometimes an LLM might need access to data that it was not trained on, whether because the data is proprietary, time-sensitive, or otherwise unavailable.\n", |
286 | 291 | "\n", |
287 | | - "In cases like these, RAG (Retrieval Augmented Generation) is often necessary wherein a vector search is used to augment a final LLM prompt with helpful necessary context.\n", |
| 292 | + "In such cases, Retrieval-Augmented Generation (RAG) is often necessary. Here, a vector search is used to augment the final LLM prompt with helpful and necessary context.\n", |
288 | 293 | "\n", |
289 | | - "RAG and Agents are not mutually exclusive and below we define a retriever tool that performs RAG when the agent determines it necessary." |
| 294 | + "RAG and agents are not mutually exclusive. Below, we define a retriever tool that performs RAG whenever the agent determines it is necessary." |
290 | 295 | ] |
291 | 296 | }, |
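The retrieve-then-augment pattern the retriever tool follows can be shown with a toy in-memory index. In the notebook the embeddings come from a real embedding model and the search runs in Redis; the tiny hand-made vectors and documents below are assumptions used only to make the flow concrete.

```python
import math

# Toy stand-in for the Redis vector index: document text mapped to a
# hand-made embedding. Real embeddings would come from a model.
DOCS = {
    "Fort Laramie sells oxen and supplies.": [0.9, 0.1, 0.0],
    "Caulk the wagon to cross deep rivers.": [0.1, 0.9, 0.0],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def retrieve(query_vec, k=1):
    """Vector search: return the k documents most similar to the query."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, DOCS[d]), reverse=True)
    return ranked[:k]

def retriever_tool(query_vec):
    """RAG step: fetch context and splice it into the prompt that the
    agent will send to the LLM."""
    context = "\n".join(retrieve(query_vec))
    return f"Answer using this context:\n{context}"
```

The agent only pays the retrieval cost when it decides the tool is needed; otherwise this step is skipped entirely.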
292 | 297 | { |
|
492 | 497 | "source": [ |
493 | 498 | "# Graph\n", |
494 | 499 | "\n", |
495 | | - "The graph composes the tools and nodes into a compilable workflow that we can invoke. " |
| 500 | + "The graph composes the tools and nodes into a compilable workflow that can be invoked." |
496 | 501 | ] |
497 | 502 | }, |
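The compiled workflow's control flow can be sketched without LangGraph at all: nodes are functions over a shared state, static edges are name lookups, and the conditional edge is a router function. The node names mirror the notebook, but the node bodies here are placeholders, not the notebook's implementations.

```python
# Minimal stand-in for the compiled LangGraph workflow. In the real
# notebook, StateGraph wires these together and `agent` calls an LLM.
def agent(state):
    state["messages"].append("agent-turn")
    return state

def tools(state):
    state["messages"].append("tool-result")
    state["tools_done"] = True
    return state

def structure_response(state):
    state["messages"].append("final-answer")
    return state

def should_continue(state):
    # Conditional edge: call tools once, then format the final answer.
    return "tools" if not state.get("tools_done") else "structure_response"

NODES = {"agent": agent, "tools": tools, "structure_response": structure_response}
EDGES = {"tools": "agent", "structure_response": None}  # static edges

def invoke(state):
    node = "agent"
    while node is not None:
        state = NODES[node](state)
        node = should_continue(state) if node == "agent" else EDGES[node]
    return state
```

Invoking with an empty message list walks agent → tools → agent → structure_response, the same loop shape the compiled graph executes.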
498 | 503 | { |
|
556 | 561 | "source": [ |
557 | 562 | "# Evaluate graph structure\n", |
558 | 563 | "\n", |
559 | | - "When we invoke the graph, it takes 4 primary steps:\n", |
| 564 | + "When we invoke the graph, it follows four primary steps: \n", |
560 | 565 | "\n", |
561 | | - "1. Evaluates the conditional edge between tools and agent via the `should_continue` function to determine if it should `continue` and call a tool or if it should `structure_response` and format for a user.\n", |
562 | | - "2. If it invokes the tools it appends the response from the tool as a message to state and passes back to the agent.\n", |
563 | | - "3. If it has already called tools or has decided tools are not necessary it moves to the `structure_response` node. \n", |
564 | | - "4. If the question is determined to be a **multiple choice question** within the `structure_response` node a model is invoked to make sure the response is in returns a literal `A, B, C, or D` as the game would expect otherwise it just moves forward." |
| 566 | + "1. **Evaluate Conditional Edge**: The graph evaluates the conditional edge between tools and the agent via the `should_continue` function. This determines whether it should `continue` and call a tool or move to `structure_response` to format the output for the user. \n", |
| 567 | + "2. **Invoke Tools**: If it decides to invoke the tools, the response from the tool is appended as a message to the state and passed back to the agent. \n", |
| 568 | + "3. **Determine Next Step**: If tools have already been called or are deemed unnecessary, the graph moves to the `structure_response` node. \n", |
| 569 | + "4. **Handle Multiple-Choice Questions**: If the question is identified as a **multiple-choice question** within the `structure_response` node, a model is invoked to ensure the response is returned as a literal `A, B, C, or D`, as expected by the game. Otherwise, it simply proceeds forward. " |
565 | 570 | ] |
566 | 571 | }, |
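Step 4 above, the multiple-choice coercion inside `structure_response`, can be illustrated with a small function. In the notebook an LLM call performs the coercion; the regex below is an assumed stand-in that only shows the contract: multiple-choice answers come back as a bare letter, everything else passes through.

```python
import re

# Illustrative version of the structure_response step. The notebook
# invokes a model to force a literal A/B/C/D; a regex stands in here.
def structure_response(answer: str, multiple_choice: bool) -> str:
    if not multiple_choice:
        return answer  # free-form answers pass through unchanged
    match = re.search(r"\b([ABCD])\b", answer.upper())
    return match.group(1) if match else answer
```

So a verbose reply like "The answer is B, because oxen are cheaper there" is reduced to `"B"`, which is what the game expects.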
567 | 572 | { |
|
596 | 601 | "\n", |
597 | 602 | "## Scenario 1 - name of wagon leader\n", |
598 | 603 | "\n", |
599 | | - "This test just confirms that our graph has been setup correctly and can handle a case where tools don't need to be invoked." |
| 604 | + "This test confirms that our graph has been set up correctly and can handle a case where tools don't need to be invoked." |
600 | 605 | ] |
601 | 606 | }, |
602 | 607 | { |
|
742 | 747 | "source": [ |
743 | 748 | "## Scenario 4 - Semantic caching\n", |
744 | 749 | "\n", |
745 | | - "Agent workflows are highly flexible and can handle many different scenarios but they do this at a cost. Even in our simple example there can be multiple large context LLM calls in the same execution which can lead to high latency and high service costs at the end of the month. A good practice is to cache answers to known questions. Often chatbot interactions are fairly predictable, especially if related to support or FAQ type use cases, and therefore good candidates for caching.\n", |
| 750 | + "Agent workflows are highly flexible and capable of handling a wide range of scenarios, but this flexibility comes at a cost. Even in our simple example, there can be multiple large-context LLM calls in the same execution, leading to high latency and increased service costs by the end of the month.\n", |
| 751 | + "\n", |
| 752 | + "A good practice is to cache answers to known questions. Chatbot interactions are often fairly predictable, particularly in support or FAQ-type use cases, making them excellent candidates for caching.\n", |
746 | 753 | "\n", |
747 | 754 | "\n", |
748 | 755 | "\n", |
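The caching idea can be sketched with a toy in the spirit of redisvl's `SemanticCache`: store (embedding, answer) pairs and return a cached answer when a new query embeds close enough to a stored one. The hand-made vectors and the 0.9 threshold are assumptions for illustration; redisvl keeps the vectors in Redis and handles embedding and distance for you.

```python
import math

# Toy semantic cache: a list of (embedding, answer) pairs. Real
# embeddings would come from a model; redisvl stores them in Redis.
CACHE = []
THRESHOLD = 0.9  # assumed similarity cutoff for a cache hit

def _cos(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def cache_check(query_vec):
    """Return a cached answer if a semantically similar query was seen."""
    for vec, answer in CACHE:
        if _cos(query_vec, vec) >= THRESHOLD:
            return answer
    return None  # cache miss: fall through to the full agent graph

def cache_store(query_vec, answer):
    CACHE.append((query_vec, answer))
```

On a hit, the whole multi-call graph is skipped, which is where the latency and cost savings come from; only misses pay full price and then populate the cache.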
|
865 | 872 | "source": [ |
866 | 873 | "## Scenario 5 - Allow/block list router\n", |
867 | 874 | "\n", |
868 | | - "When ChatGPT first came out there was a famous example where a car dealership accidentally made available to everyone the latest model for free. They assumed everyone would only ask question about cars to their chatbot but a group of developers quickly realized that the model was also powerful enough to answer coding questions so they started using the chevy dealership's chatbot for free. To prevent this from happening to your system adding an allow/block router to the front of your application is important. This is also very easy to do with redisvl.\n", |
| 875 | + "When ChatGPT first launched, there was a famous example where a car dealership accidentally made one of the latest language models available for free to everyone. They assumed users would only ask questions about cars through their chatbot. However, a group of developers quickly realized that the model was powerful enough to answer coding questions, so they started using the dealership's chatbot for free.\n", |
| 876 | + "\n", |
| 877 | + "To prevent this kind of misuse in your system, adding an allow/block router to the front of your application is essential. Fortunately, this is very easy to implement using `redisvl`.\n", |
869 | 878 | "\n", |
870 | 879 | "" |
871 | 880 | ] |
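The routing idea can be sketched with a toy in the spirit of redisvl's `SemanticRouter`: compare an incoming query's embedding against reference embeddings for the permitted topic and refuse anything that falls outside it, before any LLM is called. The reference vectors and the 0.8 cutoff below are made up for illustration; the workshop stores its route references in Redis.

```python
import math

# Toy allow/block router: hand-made reference embeddings standing in
# for the allowed "Oregon Trail" topic. Real references would be
# embedded example phrases stored in Redis via redisvl.
ALLOWED_REFS = [[0.9, 0.1], [0.8, 0.2]]
CUTOFF = 0.8  # assumed similarity cutoff for the allow route

def _cos(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def route(query_vec):
    """Return 'allow' when the query matches the permitted topic,
    'block' otherwise -- before any LLM spend is incurred."""
    best = max(_cos(query_vec, ref) for ref in ALLOWED_REFS)
    return "allow" if best >= CUTOFF else "block"
```

An off-topic query (say, a coding question) lands far from every reference and is blocked cheaply at the front door, which is exactly the failure the dealership chatbot lacked protection against.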
|