This is a spec-compliant ao Compute Unit, implemented using Node.js.
- Prerequisites
- Usage
- Environment Variables
- Tests
- Debug Logging
- Manually Trigger Checkpointing
- Project Structure
- System Requirements
## Prerequisites

You will need Node `>=18` installed. If you use `nvm`, simply run `nvm install`, which will install Node 22.

In order to use wasm64 with more than 4GB of process memory, you will need to use Node 22.
## Usage

First, install dependencies using `npm i`.

You will need a `.env` file. A minimal example is provided in `.env.example`. Make sure to set the `WALLET` environment variable to the JWK Interface of the Arweave wallet the CU will use.

Then simply start the server using `npm start`.

During development, you can run `npm run dev`, which will start a hot-reload process.

Either command will start a server listening on `PORT` (`6363` by default).
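For reference, a minimal `.env` might look like the sketch below. The values are placeholders, not a working wallet:

```shell
# Minimal .env sketch -- WALLET must contain the stringified JWK JSON
# of the Arweave wallet (or point WALLET_FILE at a JSON file instead)
WALLET='{"kty":"RSA","n":"...","e":"AQAB"}'
PORT=6363
```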
There are a few environment variables that you can set. Besides `WALLET`/`WALLET_FILE`, they each have a default:

- `GATEWAY_URL`: The URL of the Arweave gateway to use. (Defaults to `https://arweave.net`)

> `GATEWAY_URL` is solely used as a fallback for both `ARWEAVE_URL` and `GRAPHQL_URL`, if they are not provided (see below).

- `ARWEAVE_URL`: The URL for the Arweave HTTP API server, to be used by the CU to fetch transaction data from Arweave, specifically ao `Modules` and Message `Assignments`. (Defaults to `GATEWAY_URL`)
- `GRAPHQL_URL`: The URL for the Arweave Gateway GraphQL server to be used by the CU. (Defaults to `${GATEWAY_URL}/graphql`)
- `CHECKPOINT_GRAPHQL_URL`: The URL for the Arweave Gateway GraphQL server to be used by the CU specifically for querying for Checkpoints, if the default gateway fails. (Defaults to `GRAPHQL_URL`)
- `UPLOADER_URL`: The URL of the uploader to use to upload Process `Checkpoints` to Arweave. (Defaults to `up.arweave.net`)
- `WALLET`/`WALLET_FILE`: The JWK Interface stringified JSON that will be used by the CU, or a file to load it from
- `PORT`: Which port the web server should listen on (defaults to `6363`)
- `DB_MODE`: Whether the database being used by the CU is embedded within the CU or remote to the CU. Can be either `embedded` or `remote` (defaults to `embedded`)
- `DB_URL`: The name of the embedded database (defaults to `ao-cache`)
- `DUMP_PATH`: The path to send heap snapshots to. (See Heap Snapshots)
- `PROCESS_WASM_MEMORY_MAX_LIMIT`: The maximum `Memory-Limit`, in bytes, supported for `ao` processes (defaults to `1GB`)
- `PROCESS_WASM_COMPUTE_MAX_LIMIT`: The maximum `Compute-Limit`, in bytes, supported for `ao` processes (defaults to `9 billion`)
- `PROCESS_WASM_SUPPORTED_FORMATS`: The wasm module formats that this CU supports, as a comma-delimited string (defaults to `['wasm32-unknown-emscripten', 'wasm32-unknown-emscripten2']`)
- `PROCESS_WASM_SUPPORTED_EXTENSIONS`: The wasm extensions that this CU supports, as a comma-delimited string (defaults to no extensions)
- `WASM_EVALUATION_MAX_WORKERS`: The number of workers to use for evaluating messages (defaults to `3`)
- `WASM_BINARY_FILE_DIRECTORY`: The directory used to cache wasm binaries downloaded from Arweave (defaults to the OS temp directory)
- `WASM_MODULE_CACHE_MAX_SIZE`: The maximum size of the in-memory cache used for Wasm modules (defaults to `5` wasm modules)
- `WASM_INSTANCE_CACHE_MAX_SIZE`: The maximum size of the in-memory cache used for loaded Wasm instances (defaults to `5` loaded wasm instances)
- `PROCESS_CHECKPOINT_FILE_DIRECTORY`: The directory used to cache created/found Checkpoints, from Arweave, for quick retrieval later (defaults to the OS temp directory)
- `PROCESS_MEMORY_CACHE_MAX_SIZE`: The maximum size, in bytes, of the LRU in-memory cache used to cache the latest memory evaluated for ao processes.
- `PROCESS_MEMORY_CACHE_TTL`: The time-to-live for an entry in the process latest memory LRU in-memory cache. An entry's age is reset each time it is accessed.
- `PROCESS_MEMORY_CACHE_DRAIN_TO_FILE_THRESHOLD`: The time, in milliseconds, that a process' Memory should be kept hot in memory, as part of the cache entry, before being drained to a file. This is useful to free up memory from processes that have been evaluated and cached, but not accessed recently. Set to `0` to always keep process Memory hot in memory, at the expense of more heap usage (defaults to `1m`)
- `PROCESS_MEMORY_CACHE_FILE_DIR`: The directory used to store cached process memory (defaults to the OS temp directory)
- `PROCESS_MEMORY_CACHE_CHECKPOINT_INTERVAL`: The interval at which the CU should Checkpoint all processes stored in its cache. Set to `0` to disable (defaults to `4h`)
- `PROCESS_CHECKPOINT_CREATION_THROTTLE`: The amount of time, in milliseconds, that the CU should wait before creating a process `Checkpoint` IFF it has already created a Checkpoint for that process since it last started. This is effectively a throttle on `Checkpoint` creation for a given process (defaults to `30` minutes)
- `DISABLE_PROCESS_CHECKPOINT_CREATION`: Whether to disable process `Checkpoint` creation uploads to Arweave. Set to any value to disable `Checkpoint` creation. (You must explicitly enable `Checkpoint` creation by setting `DISABLE_PROCESS_CHECKPOINT_CREATION` to `'false'`)
- `EAGER_CHECKPOINT_THRESHOLD`: If an evaluation stream evaluates this number of messages, the CU will immediately create a Checkpoint at the end of the evaluation stream.
- `MEM_MONITOR_INTERVAL`: The interval, in milliseconds, at which to log memory usage on this CU.
- `BUSY_THRESHOLD`: The amount of time, in milliseconds, the CU should wait for a process evaluation stream to complete before sending a "busy" response to the client (defaults to `0s` i.e. disabled). If disabled, the CU may maintain long-open connections with clients until it completes long process evaluation streams.
- `RESTRICT_PROCESSES`: A list of process ids that the CU should restrict, aka a `blacklist` (defaults to none)
- `ALLOW_PROCESSES`: The counterpart to `RESTRICT_PROCESSES`. When configured, the CU will only execute these processes, aka a `whitelist` (defaults to allowing all processes)
- `ALLOW_OWNERS`: A list of process owners whose processes are allowed to execute on the CU, aka an owner `whitelist` (defaults to allowing all owners)
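As a concrete example, a `.env` tuning a few of these variables might look like the sketch below. The values shown are illustrative, not recommendations:

```shell
# Illustrative .env sketch -- values are examples, not recommendations
GATEWAY_URL="https://arweave.net"
WALLET_FILE="/path/to/wallet.json"
PORT=6363
DB_MODE="embedded"
WASM_EVALUATION_MAX_WORKERS=3
PROCESS_MEMORY_CACHE_CHECKPOINT_INTERVAL="4h"
DISABLE_PROCESS_CHECKPOINT_CREATION="false"
```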
## Tests

You can execute unit tests by running `npm test`.
## Debug Logging

You can enable verbose debug logging on the server by setting the `DEBUG` environment variable to the scope of logs you're interested in.

All logging is scoped under the name `ao-cu*`. You can use wildcards to enable a subset of logs, e.g. `ao-cu:readState*`.
## Manually Trigger Checkpointing

If you'd like to manually trigger the CU to Checkpoint all Processes it has in its in-memory cache, you can do so by sending the node process a `SIGUSR2` signal.

First, obtain the process id for the CU:

```sh
pgrep node
# or
lsof -i :$PORT
```

Then send a `SIGUSR2` signal to that process:

```sh
kill -USR2 <process>
```
## Project Structure

This ao Compute Unit project loosely implements the Ports and Adapters Architecture.

```
Driving Adapter <--> [ Port (Business Logic) Port ] <--> Driven Adapter
```
All business logic is in `src/domain`, where each public api is implemented, tested, and exposed via an `index.js` (see Entrypoint).

`/domain/lib` contains all of the business logic steps that can be composed into public apis (e.g. `domain/readState.js`, `domain/readResults.js`, and `domain/readScheduledMessages.js`).
`dal.js` contains the contracts for the driven adapters, aka side effects. Implementations of those contracts are injected into, then parsed and invoked by, the business logic. This is how we inject specific integrations with other ao components and providers while keeping them separated from the business logic -- the business logic simply consumes a black-box API -- making them easy to stub, and the business logic easy to unit test for correctness.

Because the contract wrapping is done by the business logic itself, it also ensures the stubs we use in our unit tests accurately implement the contract API. Thus our unit tests are simultaneously contract tests.
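The pattern described above can be sketched as follows. This is an illustrative example, not the actual CU source: `loadTransactionDataWith`, `readStateWith`, and their shapes are hypothetical names chosen for the sketch.

```javascript
// A "contract" wraps a driven-adapter implementation, validating its
// inputs and outputs. The business logic applies the wrapper itself,
// so any stub used in a unit test is held to the same API.
const loadTransactionDataWith = (impl) => async (id) => {
  if (typeof id !== 'string') throw new Error('id must be a string')
  const data = await impl(id)
  if (!(data instanceof Uint8Array)) throw new Error('impl must return a Uint8Array')
  return data
}

// Business logic consumes the injected implementation as a black box
const readStateWith = ({ loadTransactionData }) => async (processId) => {
  const load = loadTransactionDataWith(loadTransactionData)
  const bytes = await load(processId)
  return { processId, size: bytes.length }
}

// In a unit test, inject a stub; the contract wrapper checks it just
// like a real adapter, making the unit test a contract test as well
const stub = async () => new Uint8Array([1, 2, 3])
readStateWith({ loadTransactionData: stub })('some-process-id')
  .then((state) => console.log(state.size)) // logs 3
```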
All driven adapters are located in `/domain/client`.

`domain/client` contains implementations of the contracts in `dal.js`, for various platforms. The unit tests for the implementations in `client` also import contracts from `dal.js` to help ensure that each implementation properly satisfies the API.
Finally, the entrypoint `/domain/index.js` sets up the appropriate implementations from `client` and injects them into the public apis.

Anything outside of `domain` should only ever import from `domain/index.js`.
All public routes exposed by the ao Compute Unit can be found in `/routes`. Each route is composed in `/routes/index.js`, which is then composed further in `app.js`, the Fastify server. This is the Driving Adapter.
This ao Compute Unit uses simple function composition to achieve middleware behavior on routes. This allows for a more idiomatic developer experience -- if an error occurs, it can simply be thrown, which bubbles up and is caught by a middleware composed at the top (see `withErrorHandler.js`).

In fact, our routes don't even import `fastify`; instead, an instance of `fastify` is injected for routes to be mounted onto.

> While `fastify` middleware is still leveraged, it is abstracted away from the majority of the developer experience, only existing in `app.js`.
Business logic is injected into routes via a composed middleware, `withDomain.js`, that attaches config and business logic apis to `req.domain`. This is how routes call into business logic, thus completing the Ports and Adapters Architecture.
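A minimal sketch of this composition pattern, using hypothetical names (`compose`, `withErrorHandler`, `withDomain`, `stateRoute` are illustrative, not the CU's actual source):

```javascript
// Compose middleware right-to-left around a route handler
const compose = (...fns) => (handler) =>
  fns.reduceRight((acc, fn) => fn(acc), handler)

// Composed at the top: catches anything thrown further down the chain
const withErrorHandler = (handler) => async (req, res) => {
  try {
    return await handler(req, res)
  } catch (err) {
    res.status = 500
    res.body = { error: err.message }
  }
}

// Attaches config and business logic apis to req.domain
const withDomain = (domain) => (handler) => async (req, res) => {
  req.domain = domain
  return handler(req, res)
}

// The route simply throws on error and calls business logic via req.domain
const stateRoute = async (req, res) => {
  const result = await req.domain.apis.readState(req.params.processId)
  res.status = 200
  res.body = result
}

const domain = { apis: { readState: async (id) => ({ id, ok: true }) } }
const handler = compose(withErrorHandler, withDomain(domain))(stateRoute)

const res = {}
handler({ params: { processId: 'abc' } }, res)
  .then(() => console.log(res.status)) // logs 200
```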
## System Requirements

The ao Compute Unit Server is a stateless application, and can be deployed to any containerized environment using its `Dockerfile`, or using `node` directly.

Make sure you set the `WALLET` environment variable so that it is available to the CU runtime.

It will need to accept ingress from the Internet over HTTP(S) in order to fulfill incoming requests, and egress to other ao Units over HTTP(S).

It will also need some sort of file system available, whether persistent or ephemeral.

So, in summary, this ao Compute Unit's system requirements are:

- a Containerization Environment, or `node`, to run the application
- a Filesystem to store files and an embedded database
- an ability to accept Ingress from the Internet
- an ability to Egress to other `ao` Units and to the Internet