Why the release pipeline and Electron bundle work the way they do.
We ship separate `Milady-arm64.dmg` and `Milady-x64.dmg` because:
- Native Node addons (e.g. `onnxruntime-node`, `whisper-node`) ship prebuilt `.node` binaries per OS and arch. There is no single "universal" npm artifact that contains both arm64 and x64; the addon is built for the arch of the machine that ran `npm install`/`bun install`.
- CI runs on arm64 (macos-14). If we only ran `bun install` and `bun run build` in the host arch, `node_modules` would contain only arm64 `.node` files. The packaged app would then fail on Intel with "Cannot find module .../darwin/x64/onnxruntime_binding.node".
- So for the macos-x64 artifact we run install and Electron build under Rosetta (`arch -x86_64 bun install`, `arch -x86_64 bun run build`). That makes the install and any native rebuilds produce x64 binaries, so the Intel DMG works.
See `.github/workflows/release.yml`: the "Install root dependencies", "Install Electron dependencies", and "Build Electron app" steps branch on `matrix.platform.artifact-name === "macos-x64"` and wrap the command in `arch -x86_64` when building the Intel artifact.
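The branching above can be sketched as a small helper (names like `wrap_for_arch` and `ARTIFACT_NAME` are illustrative, not the workflow's actual identifiers):

```shell
#!/usr/bin/env bash
# Sketch: choose a Rosetta prefix for the Intel artifact only.
# `arch -x86_64` launches the command under Rosetta 2 on Apple Silicon,
# so native addons install/rebuild as x64 instead of the host's arm64.
wrap_for_arch() {
  if [ "$1" = "macos-x64" ]; then
    echo "arch -x86_64"
  else
    echo ""
  fi
}

# Hypothetical usage:
#   $(wrap_for_arch "$ARTIFACT_NAME") bun install
#   $(wrap_for_arch "$ARTIFACT_NAME") bun run build
```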
The packaged app runs the agent from `milady-dist/` (bundled JS + node_modules). The main bundle is built by tsdown with dependencies inlined where possible, but:
- Plugins (`@elizaos/plugin-*`) are loaded at runtime; their `dist/` and any runtime-only dependencies (native addons, optional requires, etc.) must be present in `milady-dist/node_modules`.
- Why not rely on a single global node_modules at pack time? The app is built into an ASAR (and unpacked dirs); resolution at runtime is from the app directory. So we copy the subset we need into `apps/app/electron/milady-dist/node_modules` before `electron-builder` runs.
The script `scripts/copy-electron-plugins-and-deps.mjs`:
- Discovers which `@elizaos/*` packages to copy (from the root package.json; plugins must have a `dist/` folder).
- Copies those packages into `milady-dist/node_modules`.
- Walks each package's `package.json` dependencies (and optionalDependencies) recursively and copies those too. Why: plugins declare what they need; we derive the full set so we don't maintain a manual list and miss new deps.
- Skips known dev/renderer-only packages (e.g. `typescript`, `lucide-react`) to avoid bloating the bundle. See the script header and `DEP_SKIP` for rationale.
We do not try to exclude deps that might already be inlined by tsdown into plugin `dist/`, because plugins can `require()` at runtime; excluding them would risk "Cannot find module" errors in the packaged app.
The release workflow (.github/workflows/release.yml) is designed for reproducible, fail-fast builds and diagnosable failures. Key choices and their reasons:
- Strict shell (`bash -euo pipefail`): applied as the job default for `build-desktop` so every step exits on the first error, undefined variable, or pipe failure. Why: without it, a failing command in the middle of a script can be ignored and the step still "succeeds", producing broken artifacts or confusing later failures.
- Retry loops with final assertion: `bun install` steps retry up to 3 times, then run the same install command once more after the loop. Why: if all retries failed, the loop exits without failing the step; the final run ensures the step fails with a clear install error instead of silently continuing.
- Crash dump uses `@electron/asar`: when electron-builder crashes, we list ASAR contents with `npx @electron/asar list`, not the deprecated `asar` package. Why: the deprecated package can be missing or incompatible; `@electron/asar` is the maintained tool and works when the build fails.
- `find -print0` and `while IFS= read -r -d ''`: copying JS into `milady-dist` and removing node-gyp artifacts use null-delimited find + read. Why: filenames with newlines or spaces would break `find | while read`; null-delimited iteration is safe for any path.
- DMG path via `find` + `stat -f`: we pick the newest DMG with `find dist -name '*.dmg' -exec stat -f '%m\t%N' {} \; | sort -rn | head -1` instead of `ls -t dist/*.dmg`. Why: `ls -t` with a glob can fail or behave oddly when no DMG exists or paths have spaces; find + stat is robust, and this step runs only on macOS where `stat -f` is available.
- Remove node-gyp build artifacts before packaging: we delete `build-tmp*` and `node_gyp_bins` under `node_modules` (root and milady-dist). Why: @tensorflow/tfjs-node and other native addons leave symlinks to system Python there; electron-builder refuses to pack symlinks to paths outside the app (security), so the pack step would fail without removal.
- Size report includes `milady-dist`: we report sizes of both `app.asar.unpacked/node_modules` and `app.asar.unpacked/milady-dist` (and its node_modules when present). Why: both regions contribute to artifact size; reporting both makes it obvious where bloat comes from.
- Size report `du | sort | head` pipelines: we run each pipeline in a subshell and capture the exit code with `( pipeline ) || r=$?`, then allow 0 or 141; we also redirect `sort` stderr to `/dev/null`. Why: under `bash -euo pipefail`, when `head` closes the pipe after N lines, `sort` gets SIGPIPE and exits 141; the step would exit before `r=$?` ran. The subshell + `||` lets us treat 141 as success. Silencing `sort` avoids noisy "Broken pipe" messages in logs.
- Windows: plugin prepare script uses `npx -p typescript tsc`: in `packages/plugin-bnb-identity/build.ts` we invoke `npx -p typescript tsc` instead of `npx tsc`. Why: on Windows (and some CI environments), `npx tsc` can resolve to the npm package `tsc` (a joke package that prints "This is not the tsc command you are looking for") instead of the TypeScript compiler. Explicitly using the `typescript` package avoids that and makes the release Windows build succeed.
- Single Capacitor build step: one "Build Capacitor app" step runs `npx vite build` on all platforms. Why: the previous split (non-Windows vs Windows) was redundant; vite build works everywhere, so one step reduces drift and confusion.
- Packaged DMG E2E: 240s CDP timeout in CI, with a stdout/stderr dump on timeout. In CI we use a longer CDP wait, and on timeout we log app stdout/stderr before failing. Why: CI can be slower, so a longer timeout reduces flaky failures; dumping logs makes CDP timeouts debuggable instead of silent.
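The retry-then-assert pattern from the list above can be sketched like this (a simplified stand-in for the workflow's actual step; `retry_then_assert` is a hypothetical name):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Retry a command up to N times; if every attempt fails, run it one final
# time so the step fails with the command's own error. Without that final
# run, the loop itself exits 0 under `set -e` and the step would "pass"
# even though the install never succeeded.
retry_then_assert() {
  local attempts="$1"; shift
  local i
  for i in $(seq 1 "$attempts"); do
    if "$@"; then return 0; fi
    echo "attempt $i/$attempts failed, retrying..." >&2
  done
  "$@"   # final assertion: surfaces the real failure to CI
}

# Hypothetical usage: retry_then_assert 3 bun install
```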
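The null-delimited find + read pattern looks like this (a generic sketch that counts files, not the workflow's exact copy step):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Iterate over files safely even when names contain spaces or newlines.
# `-print0` emits NUL-terminated paths and `read -r -d ''` splits on NUL,
# so no filename can smuggle in a separator the way `find | while read`
# (newline-delimited) would allow.
count_files() {
  local dir="$1" pattern="$2" n=0 f
  while IFS= read -r -d '' f; do
    n=$((n + 1))
  done < <(find "$dir" -name "$pattern" -print0)
  echo "$n"
}
```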
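And the SIGPIPE-tolerant size-report pipeline can be sketched as follows (`report_top_sizes` is an illustrative name, not the workflow's):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Under pipefail, `head -5` closing the pipe early makes `sort` exit 141
# (128 + SIGPIPE). Running the pipeline in a subshell and capturing its
# status with `|| r=$?` keeps `set -e` from killing the step first; we
# then accept 0 (clean) or 141 (early pipe close) as success.
report_top_sizes() {
  local r=0
  ( du -sk "$1"/* 2>/dev/null | sort -rn 2>/dev/null | head -5 ) || r=$?
  [ "$r" -eq 0 ] || [ "$r" -eq 141 ]
}
```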
CI workflows that need Node (for node-gyp / native modules or the npm registry) were timing out on Node download and install. We fixed this as follows:
- `useblacksmith/setup-node@v5` on Blacksmith runners: in test.yml, jobs that run on `blacksmith-4vcpu-ubuntu-2404` (app-startup-e2e, electron-ui-e2e Linux) use `useblacksmith/setup-node` instead of `actions/setup-node`. Why: Blacksmith's action uses their colocated cache (same DC as the runner), so Node binaries are served at ~400MB/s and we avoid slow or failing downloads from nodejs.org.
- `actions/setup-node@v3` (not v4) on GitHub-hosted runners: release, test (macOS legs), nightly, publish-npm, and other workflows pin to `@v3`. Why: v4 has a known slow post-action step and often triggers nodejs.org downloads that time out; v3 uses the runner toolcache when the version is present and avoids the regression.
- `check-latest: false`: we set this explicitly on every `actions/setup-node` step (Blacksmith jobs use `useblacksmith/setup-node`, which has its own caching behavior). Why: with the default, the action can hit nodejs.org to check for a newer patch; that adds latency and can time out. We want a fixed, cached Node version for reproducible CI.
- Bun global cache (`~/.bun/install/cache`): test.yml, release.yml, benchmark-tests.yml, publish-npm.yml, and nightly.yml all cache this path with `actions/cache@v4`, keyed by `bun.lock`. Why: Bun install is fast, but re-downloading every package every run was still a major cost; caching the global cache avoids re-downloading tarballs while letting `bun install` do its fast hardlink/clonefile into `node_modules`. We do not cache `node_modules` itself: the compression/upload cost exceeds the gain.
- `timeout-minutes` on jobs: we set explicit timeouts (e.g. 20–30 min for test jobs, 45 for release build-desktop). Why: a hung or extremely slow run fails in bounded time instead of burning runner hours; it also makes flakiness visible.
- Release: `.github/workflows/release.yml` runs on version tag push; it builds all platforms and uploads artifacts.
- Local desktop build: from the repo root, build core and app, then e.g. `cd apps/app/electron && bunx electron-builder build --mac --arm64 --publish never`. For a full signed/notarized local test, see `scripts/verify-build.sh` (macOS).
- Electron startup and exception handling: why the agent keeps the API server up on load failure.
- Plugin resolution and `NODE_PATH`: why dynamic plugin imports need `NODE_PATH` in dev/CLI/Electron.
- CHANGELOG: concrete changes and WHYs per release.