
Commit 689e1a9

feat: add async functions (#169)
* feat: add pytest-asyncio
* feat: add async funcs and session funcs
* fix: add default RejectionSamplingStrategy to session funcs to make it explicit
* fix: default sampling strat for session funcs
* test: add minimum test examples for async session funcs
* docs: add async to tutorial
* feat: add warning for async with non-Simple contexts
* fix: add async session tests with chat context
* feat: add clone method to session; remove refs to model_opts in session
* feat: add async generative slots
* fix: remove sessions' backend stack
* fix: remove testing code
* fix: docstrings
* fix: test failing due to sampling copy
1 parent 517e9c5 commit 689e1a9

File tree

17 files changed: +1236 additions, -300 deletions


docs/tutorial.md

Lines changed: 71 additions & 0 deletions
@@ -21,6 +21,7 @@
- [Chapter 10: Prompt Engineering for Mellea](#chapter-10-prompt-engineering-for-m)
- [Custom Templates](#custom-templates)
- [Chapter 11: Tool Calling](#chapter-11-tool-calling)
- [Chapter 12: Asynchronicity](#chapter-12-asynchronicity)
- [Appendix: Contributing to Mellea](#appendix-contributing-to-mellea)

## Chapter 1: What Is Generative Programming
@@ -943,6 +944,23 @@ or the entire last turn (user query + assistant response):
print(m.ctx.last_turn())
```

You can also use `session.clone()` to create a copy of a given session with its context at a given point in time. This allows you to make multiple generation requests against the same context:
```python
m = start_session(ctx=ChatContext())
m.instruct("Multiply 2x2.")

m1 = m.clone()
m2 = m.clone()

# This code must be run inside an async event loop.
co1 = m1.ainstruct("Multiply that by 3")
co2 = m2.ainstruct("Multiply that by 5")

print(await co1)  # 12
print(await co2)  # 20
```
In the above example, both requests have `Multiply 2x2.` and the LLM's response to it (presumably `4`) in their context. Because the session was cloned, the two requests operate independently on that context and arrive at the correct answers to 4 x 3 and 4 x 5.
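Why cloning isolates the branches can be sketched with plain Python objects. This is a hypothetical stand-in, not Mellea's implementation: `Session`, `add`, and `clone` below are illustrative names, and the key idea is that each clone deep-copies the context so the branches stop sharing state.

```python
import copy

class Session:
    """Hypothetical minimal session: the context is just a list of turns."""

    def __init__(self):
        self.ctx = []

    def add(self, msg: str) -> None:
        self.ctx.append(msg)

    def clone(self) -> "Session":
        # Deep-copy the context so the clone's history diverges independently.
        new = Session()
        new.ctx = copy.deepcopy(self.ctx)
        return new

m = Session()
m.add("Multiply 2x2.")

m1, m2 = m.clone(), m.clone()
m1.add("Multiply that by 3")
m2.add("Multiply that by 5")

print(m1.ctx)  # ['Multiply 2x2.', 'Multiply that by 3']
print(m2.ctx)  # ['Multiply 2x2.', 'Multiply that by 5']
```

If the clone merely copied the reference to the same list, both branches would see each other's messages; the deep copy is what makes the two follow-up requests independent.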

## Chapter 8: Implementing Agents

> **Definition:** An *agent* is a generative program in which an LLM determines the control flow of the program.
@@ -1317,6 +1335,59 @@ assert "web_search" in output.tool_calls
result = output.tool_calls["web_search"].call_func()
```

## Chapter 12: Asynchronicity
Mellea supports asynchronous behavior in two ways: through asynchronous session functions, and through the asynchronous event loop that its synchronous functions use internally.

### Asynchronous Functions
`MelleaSession`s have asynchronous functions that work just like regular async functions in Python. These async session functions mirror their synchronous counterparts:
```python
m = start_session()
result = await m.ainstruct("Write your instruction here!")
```
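If you are calling an async function like this from plain synchronous code, rather than from inside an existing event loop, the standard-library entry point is `asyncio.run`. A minimal sketch using a stand-in coroutine, since running the real `m.ainstruct` requires a configured backend (the `ainstruct` below is a hypothetical placeholder):

```python
import asyncio

async def ainstruct(prompt: str) -> str:
    # Hypothetical stand-in for m.ainstruct: yield control as a real
    # network call would, then return a canned response.
    await asyncio.sleep(0)
    return f"response to: {prompt}"

async def main() -> str:
    return await ainstruct("Write your instruction here!")

# asyncio.run starts an event loop, runs main() to completion, and closes the loop.
result = asyncio.run(main())
print(result)  # response to: Write your instruction here!
```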

However, if you want to run multiple async functions at the same time, you need to be careful with your context. By default, `MelleaSession`s use a `SimpleContext` that keeps no history, which works just fine when running multiple async requests at once:
```python
import asyncio

m = start_session()
coroutines = []

for i in range(5):
    coroutines.append(m.ainstruct(f"Write a math problem using {i}"))

results = await asyncio.gather(*coroutines)
```

If you use a `ChatContext` instead, you will need to await each request before starting the next so that the context can be properly updated:
```python
m = start_session(ctx=ChatContext())

result = await m.ainstruct("Write a short fairy tale.")
print(result)

main_character = await m.ainstruct("Who is the main character of the previous fairy tale?")
print(main_character)
```

Otherwise, your requests will use outdated contexts that don't contain the messages you expect. For example:
```python
m = start_session(ctx=ChatContext())

co1 = m.ainstruct("Write a very long math problem.")  # Start first request.
co2 = m.ainstruct("Solve the math problem.")  # Start second request with an empty context.

results = await asyncio.gather(co1, co2)
for result in results:
    print(result)  # Neither request had anything in its context.

print(m.ctx)  # Only shows the operations from the second request.
```

Additionally, see [Chapter 7: Context Management](#chapter-7-on-context-management) for an example of how to use `session.clone()` to avoid these context issues.

### Asynchronicity in Synchronous Functions
Mellea uses asynchronicity internally. When you call `m.instruct`, you are running synchronous code that executes an asynchronous request to an LLM to generate the result. For a single request, this makes no difference in execution speed.

When using `SamplingStrategy`s or during validation, Mellea can speed up your program by generating multiple results and validating them against multiple requirements simultaneously. Whether you use `m.instruct` or the asynchronous `m.ainstruct`, Mellea dispatches those requests as quickly as possible and awaits the results asynchronously.
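The benefit of dispatching requests concurrently can be sketched with plain `asyncio`. The validator below is a hypothetical stand-in for an LLM-backed requirement check, not Mellea's API; three simulated 0.1-second "requests" finish in roughly 0.1 seconds total because the waits overlap:

```python
import asyncio
import time

async def check_requirement(req: str) -> bool:
    # Hypothetical stand-in for an LLM-backed validator; each call
    # simulates a 0.1s network round trip.
    await asyncio.sleep(0.1)
    return len(req) > 0

async def validate_all(reqs: list[str]) -> list[bool]:
    # Dispatch every validation request at once and await them together,
    # instead of awaiting each one before starting the next.
    return await asyncio.gather(*(check_requirement(r) for r in reqs))

start = time.perf_counter()
results = asyncio.run(validate_all(["is polite", "is concise", "cites sources"]))
elapsed = time.perf_counter() - start

print(results)  # [True, True, True]
print(f"elapsed: {elapsed:.2f}s")  # roughly 0.1s, not 0.3s, since the waits overlap
```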

## Appendix: Contributing to Mellea

### Contributor Guide: Requirements and Verifiers
