|
| 1 | +--- |
| 2 | +title: LangGraph |
| 3 | +--- |
| 4 | + |
| 5 | +# LangGraph Agents |
| 6 | + |
| 7 | +<div class="subtitle"> |
| 8 | +Write tests for your <code>langgraph</code> applications. |
| 9 | +</div> |
| 10 | + |
| 11 | +LangGraph is a [library](https://github.com/langchain-ai/langgraph) for building stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows. In this example, we build a weather agent that helps us answer queries about the weather by using tool calling. |
| 12 | + |
| 13 | +## Setup |
| 14 | +To use `langgraph`, you need to need to install the corresponding package: |
| 15 | + |
| 16 | +```bash |
| 17 | +pip install langgraph |
| 18 | +``` |
| 19 | + |
| 20 | +## Agent code |
| 21 | + |
| 22 | +You can view the agent code [here](https://github.com/invariantlabs-ai/testing/blob/main/sample_tests/langgraph/weather_agent/weather_agent.py). |
| 23 | + |
| 24 | +This can be invoked as: |
| 25 | + |
| 26 | +```python |
| 27 | +from langchain_core.messages import HumanMessage |
| 28 | + |
| 29 | +from .weather_agent import WeatherAgent |
| 30 | + |
| 31 | +invocation_response = WeatherAgent().get_graph().invoke( |
| 32 | + {"messages": [HumanMessage(content="what is the weather in sf")]}, |
| 33 | + config={"configurable": {"thread_id": 42}}, |
| 34 | +) |
| 35 | +``` |
| 36 | + |
| 37 | + |
| 38 | +## Running example tests |
| 39 | + |
| 40 | +You can run the example tests discussed in this notebook by running the following command in the root of the repository: |
| 41 | + |
| 42 | +```bash |
| 43 | +poetry run invariant test sample_tests/langgraph/weather_agent/test_weather_agent.py --push --dataset_name langgraph_weather_agent |
| 44 | +``` |
| 45 | + |
| 46 | +!!! note |
| 47 | + |
| 48 | + If you want to run the example without sending the results to the Explorer UI, you can always run without the `--push` flag. You will still see the parts of the trace that fail |
| 49 | + as higihlighted in the terminal. |
| 50 | + |
| 51 | +## Unit tests |
| 52 | + |
| 53 | +We can now use `testing` to assess the correctness of our agent. We will write two tests to verify different properties of the agents' behavior. For this, we want to verify that: |
| 54 | + |
| 55 | +1. The agent can correctly answer a query about the weather in San Francisco. |
| 56 | + |
| 57 | +2. The agent can correctly answer queries when asked about both the weather in San Francisco and New York City. |
| 58 | + |
| 59 | +For this, we will use `TraceFactory` to create traces from the invocation response and then use the corresponding `Trace` methods to examine the resulting runtime traces. |
| 60 | + |
| 61 | +### Test 1: |
| 62 | + |
| 63 | +<div class='tiles'> |
| 64 | +<a target="_blank" href="https://explorer.invariantlabs.ai/u/hemang1729/langgraph_weather_agent-1733695457/t/1" class='tile'> |
| 65 | + <span class='tile-title'>Open in Explorer →</span> |
| 66 | + <span class='tile-description'>See this example in the Invariant Explorer</span> |
| 67 | +</a> |
| 68 | +</div> |
| 69 | + |
| 70 | +```python |
| 71 | +def test_weather_agent_with_only_sf(weather_agent): |
| 72 | + """Test the weather agent with San Francisco.""" |
| 73 | + invocation_response = weather_agent.invoke( |
| 74 | + {"messages": [HumanMessage(content="what is the weather in sf")]}, |
| 75 | + config={"configurable": {"thread_id": 42}}, |
| 76 | + ) |
| 77 | + |
| 78 | + trace = TraceFactory.from_langgraph(invocation_response) |
| 79 | + |
| 80 | + with trace.as_context(): |
| 81 | + find_weather_tool_calls = trace.tool_calls(name="_find_weather") |
| 82 | + assert_true(F.len(find_weather_tool_calls) == 1) |
| 83 | + assert_true( |
| 84 | + find_weather_tool_calls[0]["function"]["arguments"].contains( |
| 85 | + "San francisco" |
| 86 | + ) |
| 87 | + ) |
| 88 | + |
| 89 | + find_weather_tool_outputs = trace.messages(role="tool") |
| 90 | + assert_true(F.len(find_weather_tool_outputs) == 1) |
| 91 | + assert_true( |
| 92 | + find_weather_tool_outputs[0]["content"].contains("60 degrees and foggy") |
| 93 | + ) |
| 94 | + |
| 95 | + assert_true(trace.messages(-1)["content"].contains("60 degrees and foggy")) |
| 96 | +``` |
| 97 | + |
| 98 | +We first use the `tool_calls()` method to retrieve all tool calls where the name is `_find_weather`, and we assert that there is exactly one such call. We also verify that the argument passed to the tool call includes `San Francisco`. |
| 99 | + |
| 100 | +Next, we use the `messages()` method with the `role="tool"` filter to check the output for `_find_weather` tool call, ensuring that the content of this output contains our desired answer. |
| 101 | + |
| 102 | +Finally, we confirm that the last message also includes our desired answer. |
| 103 | + |
| 104 | +### Test 2: |
| 105 | + |
| 106 | +<div class='tiles'> |
| 107 | +<a target="_blank" href="https://explorer.invariantlabs.ai/u/hemang1729/langgraph_weather_agent-1733695457/t/2" class='tile'> |
| 108 | + <span class='tile-title'>Open in Explorer →</span> |
| 109 | + <span class='tile-description'>See this example in the Invariant Explorer</span> |
| 110 | +</a> |
| 111 | +</div> |
| 112 | + |
| 113 | +```python |
| 114 | +def test_weather_agent_with_sf_and_nyc(weather_agent): |
| 115 | + """Test the weather agent with San Francisco and New York City.""" |
| 116 | + _ = weather_agent.invoke( |
| 117 | + {"messages": [HumanMessage(content="what is the weather in sf")]}, |
| 118 | + config={"configurable": {"thread_id": 41}}, |
| 119 | + ) |
| 120 | + invocation_response = weather_agent.invoke( |
| 121 | + {"messages": [HumanMessage(content="what is the weather in nyc")]}, |
| 122 | + config={"configurable": {"thread_id": 41}}, |
| 123 | + ) |
| 124 | + |
| 125 | + trace = TraceFactory.from_langgraph(invocation_response) |
| 126 | + |
| 127 | + with trace.as_context(): |
| 128 | + find_weather_tool_calls = trace.tool_calls(name="_find_weather") |
| 129 | + assert_true(len(find_weather_tool_calls) == 2) |
| 130 | + find_weather_tool_call_args = str( |
| 131 | + F.map(lambda x: x.argument(), find_weather_tool_calls) |
| 132 | + ) |
| 133 | + assert_true( |
| 134 | + "San Francisco" in find_weather_tool_call_args |
| 135 | + and "New York City" in find_weather_tool_call_args |
| 136 | + ) |
| 137 | + |
| 138 | + find_weather_tool_outputs = trace.messages(role="tool") |
| 139 | + assert_true(F.len(find_weather_tool_outputs) == 2) |
| 140 | + assert_true( |
| 141 | + find_weather_tool_outputs[0]["content"].contains("60 degrees and foggy") |
| 142 | + ) |
| 143 | + assert_true( |
| 144 | + find_weather_tool_outputs[1]["content"].contains("90 degrees and sunny") |
| 145 | + ) |
| 146 | + |
| 147 | + assistant_response_messages = F.filter( |
| 148 | + lambda m: m.get("tool_calls") is None, trace.messages(role="assistant") |
| 149 | + ) |
| 150 | + assert_true(len(assistant_response_messages) == 2) |
| 151 | + assert_true( |
| 152 | + assistant_response_messages[0]["content"].contains( |
| 153 | + "weather in San Francisco is" |
| 154 | + ) |
| 155 | + ) |
| 156 | + assert_true( |
| 157 | + assistant_response_messages[1]["content"].contains( |
| 158 | + "weather in New York City is" |
| 159 | + ) |
| 160 | + ) |
| 161 | +``` |
| 162 | +In this test, we use `F.map` to extract the arguments of the tool calls from the list of tool calls. We then assert that both our queries are present in the arguments list. |
| 163 | + |
| 164 | +There are two types of messages with `role="assistant"`: those where tool calls are made and those corresponding to the final response back to the caller. We use `F.filter` to filter out messages where `role="assistant"` but `tool_calls` is `None`. Finally, we assert that these response messages contain the results of the weather queries. |
| 165 | + |
| 166 | +## Conclusion |
| 167 | + |
| 168 | +We have seen how to to write unit tests for specific test cases when building an agent with the Langgraph library. |
0 commit comments