AgentOS Workbench

An AgentOS product by Frame.dev

React + Vite dashboard for inspecting AgentOS sessions locally. The goal is to give builders a zero-config cockpit that mirrors how Frame.dev debugs adaptive agents.

GMIs, Agents, and Agency

GMIs (Generalised Mind Instances) package persona prompts, memory policies, tool permissions, language preferences, and guardrail hooks into reusable minds.
Agents wrap GMIs for product surfaces (labels, icons, availability) while preserving the GMI’s cognition and policy.
Agencies coordinate multiple GMIs (and humans) via workflows; the workbench visualises WORKFLOW_UPDATE and AGENCY_UPDATE events in the timeline.

Benefits:

Cohesive cognition: one unit to version, export, and reuse across apps
Guardrail-first: policy decisions are streamed and auditable
Portable: same GMI across cloud/desktop/mobile/browser (capability-aware)

Highlights

Sidebar session switcher backed by a lightweight zustand store
Timeline inspector that renders streaming @framers/agentos chunks with color-coded context
Request composer for prototyping turns or replaying transcripts (wire it to your backend when ready)
Adaptive execution dashboard (task-outcome KPI, fail-open overrides, tool-exposure recovery state)
Multi-tenant telemetry slices (scope + routing mode visibility for single-tenant and multi-tenant runs)
Discovery telemetry visibility (default tool-selection mode + recall profile from runtime config/stream payload)
Runtime inspector for the latest high-level and orchestration exports (generateText, generateImage, AgentGraph, workflow(), mission())
Dark, neon-drenched UI that matches the Frame.dev production command centre

Current orchestration status

The workbench can inspect whether the installed @framers/agentos package exports the new unified orchestration APIs.
The main compose stream path now forwards workflowRequest, agencyRequest, and preferred model selection into AgentOS through /api/agentos/stream.
The backend /api/agentos/agency/stream route now forwards real agency requests into AgentOS instead of emitting a timer-based fake stream.
The demo /api/agentos/agency/workflow/start + /stream pair now boot a real AgentOS-backed run, but they still emit demo-shaped UI events rather than exposing raw graph/checkpoint data.
The timeline can render legacy WORKFLOW_UPDATE and AGENCY_UPDATE chunks from the backend.
The Planning panel now reflects runtime-backed workflow and agency snapshots through the workbench planning store, including checkpoint history for inspection.
The plan inspector now supports restoring manual checkpoints and forking runtime-backed checkpoint snapshots into editable manual plans.
The backend now persists graph-run records for streamed workflow and agency executions, and the Planning panel includes a Runtime Runs browser plus recent runtime event traces for selected runtime-backed plans.
Runtime-run checkpoints can now be restored directly inside the persisted graph-run record and forked into new editable manual plans, even when there is no mirrored runtime plan row yet.
The workbench still lacks graph-native authoring and true GraphRuntime pause/resume controls.

Scripts

pnpm dev       # launch Vite dev server on http://localhost:5175
pnpm build     # production build (emits dist/)
pnpm bundle:report  # summarize dist/assets into output/bundle-report.{json,md}
pnpm bundle:baseline  # refresh bundle-baseline.json from the current report
pnpm bundle:check   # fail if the current bundle exceeds the workbench budgets
pnpm bundle:compare # fail if the current bundle regresses too far from bundle-baseline.json
pnpm build:check    # build, generate the bundle report, then enforce budgets
pnpm build:report   # build, then generate the bundle report
pnpm preview   # preview the built app
pnpm lint      # eslint
pnpm typecheck
pnpm e2e       # all Playwright suites (including smoke + screenshots)
pnpm e2e:chromium
pnpm e2e:firefox
pnpm e2e:workbench      # split workbench suites only
pnpm e2e:workbench:cross-browser  # chromium + firefox + serial webkit
pnpm e2e:workbench:chromium
pnpm e2e:workbench:firefox
pnpm e2e:workbench:webkit         # serialized for WebKit stability
pnpm e2e:core           # tabs/composer/personas/agency/header
pnpm e2e:eval-planning  # evaluation + planning flows
pnpm e2e:quality        # responsive/a11y/console scans
pnpm e2e:screenshots    # screenshot matrix
pnpm e2e:webkit         # full suite on WebKit, serialized
pnpm e2e:smoke:pw       # smoke.spec.ts via Playwright
pnpm e2e:smoke          # legacy smoke script (tsx e2e-test.ts)

The workbench WebKit run is currently serialized on purpose. The browser passes cleanly with one worker, but startup/navigation becomes flaky under the default parallelism used by Chromium and Firefox.

The CI workflow now uploads the generated bundle report as an artifact from the Typecheck + Build + Tests job, and it enforces both explicit bundle budgets and checked-in baseline deltas before the browser matrix starts. The current defaults cap total raw size, total gzip size, largest JS asset, entry JS, and largest CSS asset, and they also limit how far those metrics can drift above bundle-baseline.json. Both sets of thresholds can be overridden with WORKBENCH_BUNDLE_MAX_* and WORKBENCH_BUNDLE_BASELINE_MAX_* environment variables when needed.

For local fixture testing or ad hoc bundle analysis, the bundle scripts also support path overrides:

WORKBENCH_BUNDLE_DIST_ASSETS_DIR redirects bundle:report input away from the default dist/assets.
WORKBENCH_BUNDLE_OUTPUT_DIR redirects the generated bundle-report.{json,md} files and the bundle:baseline:update review artifacts.
WORKBENCH_BUNDLE_REPORT_PATH tells bundle:baseline, bundle:check, bundle:compare, and bundle:baseline:update which report JSON to read.
WORKBENCH_BUNDLE_BASELINE_PATH tells bundle:baseline, bundle:compare, and bundle:baseline:update which baseline JSON to write/read.

If a deliberate size increase is accepted, run the Refresh AgentOS Workbench Bundle Baseline workflow from GitHub Actions. It rebuilds the workbench, regenerates bundle-baseline.json, and uploads the refreshed baseline plus a patch, branch-name hint, PR body, apply script, summary, and suggested commit message as artifacts so the new snapshot can be reviewed and committed intentionally. The generated apply script now refuses to run unless the target repo is clean, the branch name is unused, and the patch still applies cleanly, and it accepts either the monorepo root or the nested apps/agentos-workbench repo root.

Workbench data modes

The workbench now exposes whether a surface is runtime, demo, or mixed instead of implying every panel is live.
The RAG workspace specifically can combine live retrieval with demo-backed document-library fallbacks.
The RAG upload flow also lets you choose a collection up front, and that choice now determines whether the ingest lands in the live runtime store or the demo library.
See RAG_RUNTIME_MODES.md for the current ingestion/search matrix and runtime setup requirements.

Storage, export, and import

Data is stored locally in your browser using IndexedDB (no server writes).
Stored: personas (remote + local), agencies, and sessions (timeline events).
Export per-session from the timeline header: "Export session", "Export agency", "Export workflow".
Export everything from Settings → Data → "Export all" (also available in the timeline).
Import from Settings → Data → "Import…" (schema: agentos-workbench-export-v1).
Clear local data from Settings → Data → "Clear storage" (export first if needed).

See docs/CLIENT_STORAGE_AND_EXPORTS.md for details.

Wiring it up

Copy .env.example → .env.local (or set env vars in your shell) and point the workbench at your backend:

# Option A: explicit API base URL
VITE_API_URL=http://localhost:3001

# Option B: same-origin `/api/*` with dev proxy target
VITE_BACKEND_PORT=3001
VITE_BACKEND_HOST=localhost
VITE_BACKEND_PROTOCOL=http

VITE_AGENTOS_* overrides are still supported for specialized stream/persona/workflow path tuning.

In the backend, ensure provider keys are set and configure runtime if needed:

AGENTOS_WORKBENCH_BACKEND_PORT=3001
AGENTOS_WORKBENCH_BACKEND_HOST=0.0.0.0
AGENTOS_WORKBENCH_EVALUATION_STORE_PATH=../.data/evaluation-store.json
AGENTOS_WORKBENCH_PLANNING_STORE_PATH=../.data/planning-store.json

Start the backend (pnpm --filter backend dev) and then run the workbench (pnpm --filter @framersai/agentos-workbench dev).
Use Compose for turns, Evaluation for benchmark runs, and Planning for plan lifecycle experiments.

The client mirrors the streaming contracts from @framers/agentos, so backend responses flow straight into the UI with no reshaping.

Onboarding

A first-run guided tour highlights tabs and controls. You can "Remind me later" or "Don't show again" (saved locally).

AgentOS HTTP endpoints (quick list)

POST /api/agentos/chat — send a turn (messages, mode, optional workflow)
GET /api/agentos/stream — SSE stream for incremental updates
GET /api/agentos/personas — list personas (filters: capability, tier, search)
GET /api/agentos/workflows/definitions — list workflow definitions
POST /api/agentos/agency/workflow/start — start the legacy agency workflow demo route
GET /api/agentos/graph-runs — list persisted runtime graph-run records mirrored from live workflow/agency streams
GET /api/agentos/graph-runs/:runId — inspect a single persisted runtime graph-run record
GET /api/evaluation/runs — list persisted evaluation runs
POST /api/evaluation/run — start a new evaluation run
GET /api/planning/plans — list persisted plans
POST /api/planning/plans — create a new plan

See backend/docs/index.html for the generated backend route docs.

Licensing

AgentOS core (@framers/agentos) — Apache 2.0
Marketplace and site components — MIT (vca.chat is the public marketplace we operate)

Links

Website: https://agentos.sh
Frame: https://frame.dev
Marketplace: https://vca.chat
GitHub: https://github.com/framersai/agentos
NPM: https://www.npmjs.com/package/@framers/agentos, https://www.npmjs.com/package/@framers/sql-storage-adapter

_{AgentOS product by Frame.dev}

Name		Name	Last commit message	Last commit date
Latest commit History 99 Commits
.github		.github
backend		backend
demo-automation		demo-automation
dist		dist
output		output
public		public
screenshots-e2e		screenshots-e2e
screenshots		screenshots
scripts		scripts
src		src
tests/e2e		tests/e2e
.DS_Store		.DS_Store
.env.example		.env.example
.env.local		.env.local
.eslintignore		.eslintignore
.eslintrc.cjs		.eslintrc.cjs
.gitignore		.gitignore
ACCESSIBILITY.md		ACCESSIBILITY.md
CHANGELOG.md		CHANGELOG.md
DESIGN_IMPROVEMENTS.md		DESIGN_IMPROVEMENTS.md
LICENSE		LICENSE
RAG_RUNTIME_MODES.md		RAG_RUNTIME_MODES.md
README.md		README.md
SEARCH_SETUP.md		SEARCH_SETUP.md
bundle-baseline.json		bundle-baseline.json
e2e-test.ts		e2e-test.ts
index.html		index.html
package.json		package.json
playwright.config.ts		playwright.config.ts
postcss.config.js		postcss.config.js
screenshot-all.ts		screenshot-all.ts
screenshot-chat-output.ts		screenshot-chat-output.ts
screenshot-chat.ts		screenshot-chat.ts
screenshot-full.ts		screenshot-full.ts
screenshot-output.ts		screenshot-output.ts
tailwind.config.ts		tailwind.config.ts
tsconfig.json		tsconfig.json
tsconfig.node.json		tsconfig.node.json
tsconfig.tsbuildinfo		tsconfig.tsbuildinfo
typedoc.json		typedoc.json
vite.config.ts		vite.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AgentOS Workbench

An AgentOS product by Frame.dev

GMIs, Agents, and Agency

Highlights

Current orchestration status

Scripts

Workbench data modes

Storage, export, and import

Wiring it up

Onboarding

AgentOS HTTP endpoints (quick list)

Licensing

Links

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

AgentOS Workbench

An AgentOS product by Frame.dev

GMIs, Agents, and Agency

Highlights

Current orchestration status

Scripts

Workbench data modes

Storage, export, and import

Wiring it up

Onboarding

AgentOS HTTP endpoints (quick list)

Licensing

Links

About

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages