Running an AI example with Walrus project
Zoltan Herczeg edited this page Oct 13, 2025
The llama project ( https://www.llama.com/ ) was chosen for running the model. A few changes were made since wasm is not a regular target for the project:
- a simple build script was written, because the default build system does not support wasm (Emscripten) compilation
- the following features were disabled: llama_curl, multithreading, and ggml_native
- some minor code changes were made: process priority setting was disabled, a few include errors were fixed, mmap was emulated (wasm has no concept of mmap), and a few missing libc functions were implemented
Since Walrus does not support Memory64 yet, the maximum available memory is 4GB, which limits the models that can be used. The following model was selected: lille-130m-instruct-f16.gguf:
https://huggingface.co/Nikity/lille-130m-instruct
The x86 measurements were made on an Intel i7-7700 CPU @ 3.6 GHz with 32GB RAM.
64-bit results:
- Interpreter: 234s
- JIT no-reg-alloc: 73s (3.2 times as fast)
- JIT: 27s (8.6 times as fast)
32-bit results:
- Interpreter: 407s
- JIT no-reg-alloc: 74s (5.5 times as fast)
- JIT: 30s (13.5 times as fast)
The ARM measurements were made on a Raspberry Pi 4 (ARM Cortex-A72 @ 1.8 GHz) with 4GB RAM.
ARM64 mode:
- Interpreter: 1060.2s
- JIT no-reg-alloc: 429.91s (2.46 times as fast)
- JIT: 148.67s (7.1 times as fast)
ARM32 (Thumb-2 instruction set) mode:
- Interpreter: 1439.4s
- JIT no-reg-alloc: 485.06s (2.9 times as fast)
- JIT: 169.95s (8.4 times as fast)