---
title: "Filesystem Backed by an LLM"
date: "2025-07-07"
tags: ["go"]
description: "FUSE filesystem where file operations are handled by an LLM."
---

I came across [a post](https://x.com/simonw/status/1941190140380201431) discussing the [gremllm](https://github.com/awwaiid/gremllm) Python library that hallucinates and then evaluates method implementations as you call them.

I thought this was pretty cool, and it reminded me of experiments people tried shortly after GPT-3's launch, where they prompted it to hallucinate a Linux system that they could interact with via terminal commands sent in the chat UI. The earliest article I could find on this is [Building A Virtual Machine Inside ChatGPT](https://www.engraved.blog/building-a-virtual-machine-inside/).

I had an idea for a middle ground — not just a hallucinating library, nor an entirely hallucinated system. What if *parts* of the OS were backed by an LLM?

My idea is a FUSE-based filesystem where every file operation is handled by an LLM. In [llmfs](https://github.com/healeycodes/llmfs), content is generated on the fly by calling out to OpenAI's API.

In the video above, you can see me interacting with this mounted FUSE filesystem. The latency is expected, as every operation must be run past the LLM.
| 19 | + |
| 20 | +```bash |
| 21 | +$ cat generate_20_bytes_of_binary_data.py | python3 |
| 22 | +b'\x94\xc2(\xbd\x17<|\xd7\x01*\x01\xdeWvM\xaa\x8fX\xfa\xb1' |
| 23 | +``` |
| 24 | + |
| 25 | +The resulting data is not stored on disk. It's stored in an in-memory history log of actions. |
| 26 | + |
| 27 | +This means the LLM can remember which data exists at which path. |
| 28 | + |
| 29 | +```bash |
| 30 | +$ echo "andrew" > my_name.txt |
| 31 | +$ cat my_name.txt |
| 32 | +andrew |
| 33 | +``` |
| 34 | + |

As the LLM handles all file operations, it's free to deny certain actions. The system prompt allows the LLM to deny file operations with UNIX error codes.

```text
For failed operations (only use for actual errors), respond with:
{"error": 13} (where 13 = EACCES for "Permission denied")

Examples:
- Writing passwd: {"error": 13} (system files)
- Writing malicious_script.sh: {"error": 13} (dangerous content)
```

These error codes are bubbled up through the filesystem.

```bash
$ cat secrets.txt
cat: secrets.txt: Permission denied
```

## Interacting With FUSE

Once the filesystem is mounted with the Go library [bazil.org/fuse](http://bazil.org/fuse), the kernel intercepts Virtual File System (VFS) calls like open/read/write and forwards them through `/dev/fuse` to the userspace daemon.

```go
import "bazil.org/fuse"

mnt := os.Args[1]
c, err := fuse.Mount(
	mnt,
	fuse.FSName("llmfs"),
	fuse.Subtype("llmfs"),
	fuse.AllowOther(),
)
if err != nil {
	log.Fatal(err)
}
defer c.Close()
```

The library reads from `/dev/fuse`, services each request, and writes the reply back to the same device.
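
In llmfs, this loop is driven by the library's `fs.Serve`. A minimal sketch, assuming a type `llmFS` that implements `fs.FS` by returning the root directory node (the type name is my assumption, not necessarily the repo's):

```go
import "bazil.org/fuse/fs"

// Dispatch kernel requests to our handlers until the filesystem is unmounted.
// llmFS is an assumed name for the fs.FS implementation; its Root() method
// returns the root directory node.
if err := fs.Serve(c, llmFS{}); err != nil {
	log.Fatal(err)
}
```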

My Go code, which implements interfaces like `fs.Node`, handles the file operations and provides file contents, metadata, and error codes.

```go
func (h *fileHandle) Write(
	_ context.Context, req *fuse.WriteRequest, resp *fuse.WriteResponse,
) error {

	// Record the operation in the history log so future prompts see it
	appendHistory("user",
		fmt.Sprintf("Write %s offset %d data %q", h.name, req.Offset, string(req.Data)))

	// Ask the LLM to accept (or reject) the write
	prompt := buildPrompt()
	rc := StreamLLM(prompt)
	llmResp, err := ParseLLMResponse(rc)
	_ = rc.Close()
	if err != nil {
		return fuse.Errno(syscall.EIO)
	}
	if ferr := FuseError(llmResp); ferr != nil {
		return fuse.Errno(ferr.(syscall.Errno))
	}
	appendHistory("assistant", "ok")
	resp.Size = len(req.Data)
	return nil
}
```
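
For context, `FuseError` maps the LLM's JSON reply onto a filesystem error. A minimal sketch, assuming the `LLMResponse` struct shown in the JSON Schema section below (the real helper may differ):

```go
// Sketch: a non-nil "error" field in the LLM's reply becomes a
// syscall.Errno, which the Write handler above converts to a fuse.Errno.
func FuseError(resp LLMResponse) error {
	if resp.Error != nil {
		return syscall.Errno(*resp.Error) // e.g. 13 -> EACCES
	}
	return nil
}
```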

These responses are written back to `/dev/fuse`, and the kernel then continues processing the syscall from the original process.

Given that each operation incurs *hundreds* of milliseconds of delay, I'm not too worried about performance. Instead of per-inode locks, I simply serialise everything behind `llmMu`.

```go
var llmMu sync.Mutex // global – one request at a time

// lockedReader releases llmMu exactly once, when the FUSE
// handler closes the response stream
type lockedReader struct {
	io.Reader
	once sync.Once
}

func (lr *lockedReader) Close() error {
	lr.once.Do(llmMu.Unlock)
	return nil
}
```
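
The lock is taken before the API call and released when the handler closes the reader. A sketch of how `StreamLLM` might wire this up (assumed, not the exact llmfs code; `openAIStream` is a hypothetical helper):

```go
func StreamLLM(prompt string) io.ReadCloser {
	llmMu.Lock()                 // released exactly once by lockedReader.Close
	body := openAIStream(prompt) // hypothetical helper returning an io.Reader
	return &lockedReader{Reader: body}
}
```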

## LLM Context

File system operations append actions to the history log.

```text
user: Read nums.txt
assistant: Data nums.txt content "123456\n"
```
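
A minimal sketch of that log (the entry shape is my assumption; llmfs may store it differently):

```go
// Each file operation appends a role-tagged line; the whole log is
// replayed into every prompt.
type histEntry struct {
	Role    string // "user" or "assistant"
	Content string
}

var history []histEntry

func appendHistory(role, content string) {
	history = append(history, histEntry{Role: role, Content: content})
}
```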

A new prompt is generated for each file operation. It starts with the system prompt, which begins with the following:

```text
system: You are a filesystem that generates file content on demand.

IMPORTANT: You must respond with EXACTLY ONE valid JSON object. No other text.

When a file is requested:
- If it's a new file, create content based on the filename, extension, and context
- If it's an existing file, return the content of the file
```

After this, the entire history log is appended. So, if the user has sent two different writes to a file, the LLM can understand these actions and generate the correct file, even though the complete file is never explicitly stored.

```text
user: Write nums.txt offset 0 data "123\n"
assistant: ok
user: Write nums.txt offset 4 data "456\n"
assistant: ok
user: Read nums.txt
assistant: Data nums.txt content "123456\n"
```
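
Under the history-log sketch above, `buildPrompt` can be little more than string concatenation (again a sketch; `systemPrompt` is an assumed variable holding the text above):

```go
import "strings"

func buildPrompt() string {
	var b strings.Builder
	b.WriteString("system: " + systemPrompt + "\n")
	// Replay every past operation so the LLM stays consistent
	for _, e := range history {
		b.WriteString(e.Role + ": " + e.Content + "\n")
	}
	return b.String()
}
```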

File errors also need to be stored so that they are consistently handled.

```text
user: Read private
assistant: error 13
```

## JSON Schema

My interactions with the LLM are simple enough that I didn't reach for any special tools and just rolled my own JSON parsing. This seemed to work well with various GPT-4 models.

```go
// LLMResponse should match the JSON schema:
//
//	{ "data": "<utf-8 text>" }
//	{ "error": <errno> }
//
// Exactly one of Data or Error is non-nil.
type LLMResponse struct {
	Data  *string `json:"data,omitempty"`
	Error *int    `json:"error,omitempty"`
}
```
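
The hand-rolled parsing can then be a plain `encoding/json` decode plus an exactly-one-field check. A sketch (assumed; the real `ParseLLMResponse` reads from the streamed response body):

```go
import (
	"encoding/json"
	"fmt"
	"io"
)

func ParseLLMResponse(r io.Reader) (LLMResponse, error) {
	var resp LLMResponse
	if err := json.NewDecoder(r).Decode(&resp); err != nil {
		return resp, err
	}
	// Enforce the schema: exactly one of "data" or "error" must be set
	if (resp.Data == nil) == (resp.Error == nil) {
		return resp, fmt.Errorf("expected exactly one of data or error")
	}
	return resp, nil
}
```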

Let's take file creation as an example. First, we append the user action like `Create nums.txt` to the history, and then we make the LLM call.

```go
func (rootDir) Create(
	_ context.Context, req *fuse.CreateRequest, resp *fuse.CreateResponse,
) (fs.Node, fs.Handle, error) {

	appendHistory("user", fmt.Sprintf("Create %s", req.Name))

	prompt := buildPrompt()
	rc := StreamLLM(prompt)
	llmResp, err := ParseLLMResponse(rc) // (LLMResponse, error)

	// ..
```

We block on the call and on parsing the response. The prompt steers the LLM towards JSON by requesting it directly as well as by providing examples.

The schema is quite loose in that I re-use the `data` field to report that operations like creating files are successful, as seen in the examples that are part of the system prompt:

```text
When writing to a file:
- Accept the write operation and acknowledge it was successful
- Only reject writes that are clearly malicious or dangerous
- For successful writes, respond with: {"data": "ok\n"}

For successful operations, respond with:
{"data": "content of the file\n"} (for reads)
{"data": "ok\n"} (for writes)

For failed operations (only use for actual errors), respond with:
{"error": 13} (where 13 = EACCES for "Permission denied")

Examples:
- Reading hello_world.txt: {"data": "Hello, World!\n"}
- Reading config.json: {"data": "{\"version\": \"1.0\", \"magic\": true}\n"}
- Reading print_hello.py: {"data": "print('Hello, World!')\n"}
- Writing some_file.txt: {"data": "ok\n"}
- Writing passwd: {"error": 13} (system files)
- Writing malicious_script.sh: {"error": 13} (dangerous content)

Example error codes:
- 5 (EIO): I/O error
- 13 (EACCES): Permission denied

Writing at offsets is supported:
- user: Write nums.txt offset 0 data "123\n"
- assistant: ok
- user: Write nums.txt offset 5 data "456\n"
- assistant: ok
```

One issue I thought I'd run into was data encoding. When I was running some tests to generate script files, I expected the LLM to reply with invalid JSON when there were unescaped characters in the response, like `{"data": "\"}`, which would then bubble up into a file error.

However, GPT-4 models understand the context (we're generating JSON) and escape it automatically by returning things like `{"data": "\\"}`.

```bash
$ cat a_single_backslash.txt
\

# history log:
# user: Read a_single_backslash.txt
# assistant: Data a_single_backslash.txt content "\\"
# raw response:
# {"data": "\\"}
```

A more robust solution might be to return a single character indicating the type of response, followed by pure data.
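
A sketch of parsing that tagged format (assumed; llmfs doesn't implement this): the first byte says whether the rest is data or an errno, so no JSON escaping is involved.

```go
import (
	"fmt"
	"strconv"
	"strings"
)

func parseTagged(raw string) (LLMResponse, error) {
	if raw == "" {
		return LLMResponse{}, fmt.Errorf("empty response")
	}
	rest := raw[1:]
	switch raw[0] {
	case 'D': // data: everything after the tag, byte-for-byte
		return LLMResponse{Data: &rest}, nil
	case 'E': // error: an errno in decimal, e.g. "E13"
		code, err := strconv.Atoi(strings.TrimSpace(rest))
		if err != nil {
			return LLMResponse{}, err
		}
		return LLMResponse{Error: &code}, nil
	}
	return LLMResponse{}, fmt.Errorf("unknown tag %q", raw[0])
}
```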

## What's Next

I'm pretty happy with this demo. I set out to intercept and handle file operations with an LLM, and it works better than I expected.

To extend support to *all* file operations, like a good filesystem, I think I'll need to rethink my schema design. In fact, I'd like to throw it all away and remove this mapping layer altogether.

In order to support more features, I'm wondering if I can serialize and deserialize entire [bazil.org/fuse](http://bazil.org/fuse) library objects so everything works out of the box. My gut says this could work with the latest LLM models given a good setup.

Let me know if you have other ideas.