- [Chapter 10: Prompt Engineering for Mellea](#chapter-10-prompt-engineering-for-m)
- [Custom Templates](#custom-templates)
- [Chapter 11: Tool Calling](#chapter-11-tool-calling)
- [Chapter 12: Asynchronicity](#chapter-12-asynchronicity)
- [Appendix: Contributing to Mellea](#appendix-contributing-to-mellea)

## Chapter 1: What Is Generative Programming
or the entire last turn (user query + assistant response):
```python
print(m.ctx.last_turn())
```

You can also use `session.clone()` to create a copy of a session, together with its context, at a given point in time. This lets you make multiple generation requests that each start from the same objects in your context:
```python
m = start_session(ctx=ChatContext())
m.instruct("Multiply 2x2.")

m1 = m.clone()
m2 = m.clone()

# Need to run this code in an async event loop.
co1 = m1.ainstruct("Multiply that by 3")
co2 = m2.ainstruct("Multiply that by 5")

print(await co1)  # 12
print(await co2)  # 20
```
In the example above, both requests have `Multiply 2x2.` and the LLM's response to it (presumably `4`) in their context. Because the session was cloned, the two new requests operate independently on that shared context and arrive at the correct answers for 4 x 3 and 4 x 5.
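
If you are running this snippet from a plain Python script rather than an environment that already provides a running event loop (such as a Jupyter notebook), one option is to wrap the calls in an `async` function and drive it with `asyncio.run`. A minimal sketch, assuming the Mellea import paths shown below; adjust them to whatever imports the earlier chapters established:

```python
import asyncio

# Import paths are assumptions; adjust to match your setup.
from mellea import start_session
from mellea.stdlib.base import ChatContext


async def main():
    m = start_session(ctx=ChatContext())
    # Use the async variant inside the event loop.
    await m.ainstruct("Multiply 2x2.")

    m1, m2 = m.clone(), m.clone()
    co1 = m1.ainstruct("Multiply that by 3")
    co2 = m2.ainstruct("Multiply that by 5")

    # Each clone resolves its request against its own copy of the context.
    print(await co1)  # 12
    print(await co2)  # 20


asyncio.run(main())
```

In a notebook, or any other context that already has a running event loop, you can keep the top-level `await`s exactly as written in the original snippet.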

## Chapter 8: Implementing Agents

> **Definition:** An *agent* is a generative program in which an LLM determines the control flow of the program.

```python
assert "web_search" in output.tool_calls
result = output.tool_calls["web_search"].call_func()
```

## Chapter 12: Asynchronicity
Mellea supports asynchronous behavior in two ways: asynchronous session functions that you can `await` yourself, and asynchronous event loops used internally by its synchronous functions.

### Asynchronous Functions
`MelleaSession`s have asynchronous functions that work just like regular async functions in Python. These async session functions mirror their synchronous counterparts:
```python
m = start_session()
result = await m.ainstruct("Write your instruction here!")
```

However, if you want to run multiple async functions at the same time, you need to be careful with your context. By default, `MelleaSession`s use a `SimpleContext` that keeps no history, which works just fine when running multiple async requests at once:
```python
import asyncio

m = start_session()
coroutines = []

for i in range(5):
    coroutines.append(m.ainstruct(f"Write a math problem using {i}"))

results = await asyncio.gather(*coroutines)
```

If you use a `ChatContext` instead, you will need to await each request before issuing the next so that the context can be updated properly:
```python
m = start_session(ctx=ChatContext())

result = await m.ainstruct("Write a short fairy tale.")
print(result)

main_character = await m.ainstruct("Who is the main character of the previous fairy tale?")
print(main_character)
```

Otherwise, your requests will use outdated contexts that are missing the messages you expect. For example:
```python
import asyncio

m = start_session(ctx=ChatContext())

co1 = m.ainstruct("Write a very long math problem.")  # Start the first request.
co2 = m.ainstruct("Solve the math problem.")  # Start the second request with an empty context.

results = await asyncio.gather(co1, co2)
for result in results:
    print(result)  # Neither request had anything in its context.

print(m.ctx)  # Only shows the operations from the second request.
```

Additionally, see [Chapter 7: Context Management](#chapter-7-on-context-management) for an example of how to use `session.clone()` to avoid these context issues.

### Asynchronicity in Synchronous Functions
Mellea uses asynchronicity internally: when you call `m.instruct`, you are running synchronous code that dispatches an asynchronous request to an LLM and waits for the result. For a single request, this makes no difference to execution speed.

When using `SamplingStrategy`s or during validation, however, Mellea can speed up your program by generating multiple results and validating them against multiple requirements concurrently. Whether you call the synchronous `m.instruct` or the asynchronous `m.ainstruct`, Mellea dispatches these requests as quickly as possible and awaits their results asynchronously.
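
To see this concurrency in action, you can attach several requirements and a sampling strategy to a single call, and Mellea can fan out the candidate generations and per-requirement validations behind the scenes. A minimal sketch, assuming the `requirements` and `strategy` parameters and the `RejectionSamplingStrategy` introduced in earlier chapters (the import paths here are assumptions; adjust them to your setup):

```python
# Sketch only; import paths and parameter names follow earlier chapters.
from mellea import start_session
from mellea.stdlib.sampling import RejectionSamplingStrategy

m = start_session()

# A single synchronous call. Mellea may generate candidate answers and
# validate them against the requirements concurrently under the hood.
summary = m.instruct(
    "Summarize the rules of chess in one paragraph.",
    requirements=[
        "The summary is a single paragraph.",
        "The summary mentions check and checkmate.",
    ],
    strategy=RejectionSamplingStrategy(loop_budget=3),
)
print(summary)
```

Because the validations are dispatched concurrently, adding requirements does not necessarily add proportional wall-clock time.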

## Appendix: Contributing to Mellea

### Contributor Guide: Requirements and Verifiers