Monty avoids the cost, latency, complexity, and general faff of using a full container. Instead, it lets you safely run Python code written by an LLM embedded in your agent, with startup times measured in single-digit microseconds, not hundreds of milliseconds.

What Monty **can** do:

- Run a reasonable subset of Python code - enough for your agent to express what it wants to do
- Completely block access to the host environment: filesystem, env variables, and network access are all implemented via external function calls the developer can control
- Call functions on the host - only functions you give it access to
- Run typechecking - Monty supports full modern Python type hints and comes with [ty](https://docs.astral.sh/ty/) included in a single binary to run typechecking
- Be snapshotted to bytes at external function calls, meaning you can store the interpreter state in a file or database and resume later
- Start up extremely fast (<1μs to go from code to execution result), with runtime performance similar to CPython (generally between 5x faster and 5x slower)
- Be called from Rust, Python, or JavaScript - because Monty has no dependency on CPython, you can use it anywhere you can run Rust
- Control resource usage - Monty can track memory usage, allocations, stack depth, and execution time, and cancel execution if it exceeds preset limits
- Collect stdout and stderr and return them to the caller
- Run async or sync code, driven by either async or sync code on the host

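The snapshot-at-external-calls flow can be mimicked in plain Python with a generator, as a rough conceptual sketch - the `PendingCall` class and the driving loop here are illustrative stand-ins, not Monty's actual API:

```python
from dataclasses import dataclass
from typing import Any, Generator


@dataclass
class PendingCall:
    """Stand-in for the snapshot taken when sandboxed code calls the host."""
    function_name: str
    args: tuple[Any, ...]


def agent_script() -> Generator[PendingCall, Any, None]:
    """Pretend 'sandboxed' code: it yields instead of touching the host directly."""
    data = yield PendingCall("fetch", ("https://example.com",))
    yield PendingCall("log", (f"fetched {len(data)} bytes",))


# Host side: drive the script, performing each external call ourselves.
gen = agent_script()
call = next(gen)                 # paused at the first external call
assert call.function_name == "fetch"
result = "<html>...</html>"      # the host does the real I/O
call = gen.send(result)          # resume execution with the call's return value
print(call.args[0])
```

Real Monty snapshots go further - they can be serialized to bytes - but the pause/resume shape is the same.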
What Monty **cannot** do:

- Use the standard library (except a few select modules: `sys`, `typing`, `asyncio`, `dataclasses` (soon), `json` (soon))
- Use third-party libraries (like Pydantic) - support for external Python libraries is not a goal
- Define classes (support should come soon)
- Use match statements (again, support should come soon)

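Even within those limits, plain functions, comprehensions, and type hints go a long way. A sketch of the kind of code that stays inside the supported subset (a hypothetical example, not taken from Monty's docs):

```python
# No classes, no match statements, no imports beyond the allowed modules -
# just functions, control flow, comprehensions, and modern type hints.
def summarize(scores: dict[str, float]) -> str:
    passing = [name for name, score in scores.items() if score >= 0.5]
    mean = sum(scores.values()) / len(scores) if scores else 0.0
    return f"{len(passing)}/{len(scores)} passing, mean={mean:.2f}"


print(summarize({"a": 0.9, "b": 0.4, "c": 0.7}))
```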
---

In short, Monty is extremely limited and designed for **one** use case:

**To run code written by agents.**

For motivation on why you might want to do this, see:

- [Codemode](https://blog.cloudflare.com/code-mode/) from Cloudflare
- [Programmatic Tool Calling](https://platform.claude.com/docs/en/agents-and-tools/tool-use/programmatic-tool-calling) from Anthropic
- [Code Execution with MCP](https://www.anthropic.com/engineering/code-execution-with-mcp) from Anthropic
- [Smol Agents](https://github.com/huggingface/smolagents) from Hugging Face

In very simple terms, the idea behind all of the above is that LLMs can work faster, cheaper, and more reliably if they're asked to write Python (or JavaScript) code instead of relying on traditional tool calling. Monty makes that possible without the complexity of a sandbox or the risk of running code directly on the host.
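To make the pattern concrete, here is a deliberately naive sketch of "code mode" built on bare `exec` - the very approach Monty exists to replace, since `exec` with a trimmed namespace is *not* a real sandbox; the `get_weather` tool is invented for illustration:

```python
# WARNING: exec() is NOT a sandbox - this only illustrates the shape of
# "code mode": the model writes one script that composes tool calls,
# instead of many round trips of JSON tool calling.
def get_weather(city: str) -> float:
    return {"London": 11.0, "Tokyo": 18.0}[city]  # stub tool


llm_written_code = """
temps = {city: get_weather(city) for city in ["London", "Tokyo"]}
warmest = max(temps, key=temps.get)
"""

# Only the names we hand over are visible to the "agent" code.
namespace = {"__builtins__": {"max": max}, "get_weather": get_weather}
exec(llm_written_code, namespace)
print(namespace["warmest"])
```

Monty gives you the same shape - model-written code calling host functions you expose - with actual isolation underneath.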
```python
prompt: str = ''

m = pydantic_monty.Monty(
    code,
    inputs=['prompt'],
    script_name='agent.py',
    type_check=True,
    type_check_stubs=type_definitions,
)
```
```python
data = fetch(url)
len(data)
"""

m = pydantic_monty.Monty(code, inputs=['url'])

# Start execution - pauses when fetch() is called
result = m.start(inputs={'url': 'https://example.com'})

print(type(result))
#> <class 'pydantic_monty.FunctionSnapshot'>
print(result.function_name)
#> fetch
print(result.args)
```

#### Serialization

Both `Monty` and snapshot types like `FunctionSnapshot` can be serialized to bytes and restored later.
This allows caching parsed code or suspending execution across process boundaries:
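As a loose illustration of the suspend-and-restore idea, here is a generic `pickle` round trip on a stand-in dataclass - the fields mirror the snapshot attributes shown above (`function_name`, `args`), but this is plain Python, not Monty's own wire format:

```python
import pickle
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeSnapshot:
    """Minimal stand-in for a paused-execution snapshot."""
    function_name: str
    args: tuple[Any, ...]
    local_vars: dict[str, Any]


snap = FakeSnapshot("fetch", ("https://example.com",), {"retries": 0})
blob = pickle.dumps(snap)      # bytes: store in a file, a database, a queue...
restored = pickle.loads(blob)  # ...then restore later, even in another process
print(restored.function_name)
```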
I'll try to run through the most obvious alternatives, and why they aren't right for this use case.

NOTE: all these technologies are impressive and have widespread uses; this commentary on their limitations for our use case should not be seen as criticism. Most of these solutions were not conceived with the goal of providing an LLM sandbox, which is why they're not necessarily great at it.
See [./scripts/startup_performance.py](scripts/startup_performance.py) for the script used to calculate the startup performance numbers.
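To get a feel for the gap on your own machine, you can compare running precompiled code in-process against spawning a fresh interpreter - note this measures plain CPython, not Monty, and absolute numbers will vary:

```python
import subprocess
import sys
import time

code = compile("x = sum(range(100))", "<agent>", "exec")

# In-process: precompiled code starts in microseconds.
t0 = time.perf_counter()
ns: dict = {}
exec(code, ns)
in_process = time.perf_counter() - t0

# Fresh interpreter: the same work costs tens of milliseconds of startup.
t0 = time.perf_counter()
subprocess.run([sys.executable, "-c", "x = sum(range(100))"], check=True)
fresh_interpreter = time.perf_counter() - t0

print(f"in-process: {in_process * 1e6:.0f}us, subprocess: {fresh_interpreter * 1e3:.0f}ms")
```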
Running Python in WebAssembly via [Wasmer](https://wasmer.io/).

- **Security**: In principle WebAssembly should provide strong sandboxing guarantees.
- **Start latency**: The [wasmer](https://pypi.org/project/wasmer/) Python package hasn't been updated for 3 years and I couldn't find docs on calling Python in Wasmer from Python, so I called it via subprocess. Start latency was 66ms.
- **Setup complexity**: The Wasmer download is 100MB; the "python/python" package is 50MB.
- **FOSS**: I marked this as "free \*" since the cost is zero but not everything appears to be open source. As of 2026-02-10 the [`python/python` Wasmer package](https://wasmer.io/python/python) has no readme, no license, no source link, and no indication of how it's built; the recently uploaded versions show size as "0B" although the download is ~50MB - the build process for the Python binary is not clear and transparent. _(If I'm wrong here, please create an issue to correct me)_