← Back to README

Roadmap

Release narrative

0.5.0   Agents with boundaries                       — shipped
0.6.0   Boundaries you can audit                     — shipped (epic [#1911](../../issues/1911))
0.7.0   Boundaries you can enforce externally        — shipped (epic [#2879](../../issues/2879))
0.8.0   Interoperable, multimodal agents (+ grants)  — shipped (A2A v1, multimodal, RAG, composition, Gemini, capability grants)
0.9.0   Layer-2 sandbox backends                     — next (Docker/proxy/read-confinement)

0.8.0 shipped: agent-to-agent interop (A2A v1 — server + typed client), full multimodal (audio STT/TTS, vision, image generation), the RAG seam, richer composition (handoff / firstOf / .speculative / loopUntil / built-in aggregators / forum captains), human-in-the-loop gates + the eval harness, history compression, an eighth model provider (Google Gemini #1917), agent.json serialization (#4516), and capability grants (grants { allow / confirm } #4545), plus the agentic-web standards groundwork (AGNTCY / AG-UI / x402 / NLWeb, PRD §12.6–§12.9). The "sandbox backends" originally pencilled for 0.8 slipped: WasmSandbox (#2894) was closed won't-do (embedded-WASM-for-tools isn't rational; agent → WASM export is the separate forward track #4547), and DockerSandbox (#2895), the egress hostname-allowlist proxy (#2893), and read confinement (#4546) move to 0.9.0 (they want a Linux-capable environment to build + verify).

0.7.0 shipped (epic #2879): runtime enforcement of declared tool policies — Layer 1 in-JVM filesystem gate (#2890) + Layer 2 OS sandbox (#1916): macOS Seatbelt, Linux bubblewrap, firejail setuid fallback, plain-ProcessBuilder fallback; write-root + env + cwd confinement; default-deny network — and the standalone agents-kt CLI (#1923) for manifest generate/inspect/verify outside Gradle. Deferred to 0.8: WasmSandbox (#2894), DockerSandbox (#2895), the network hostname-allowlist proxy (#2893), and the grants { } structure DSL.

0.6.0 hero feature: the permission manifest / capability graph (#1912) — a deterministic YAML/JSON artifact showing every agent / skill / tool / memory access / MCP endpoint / provider / budget / policy boundary in a system. Build-time evidence for security review; the manifest hash (#1913) propagates into every runtime audit event so dynamic behaviour ties back to the signed-off capability graph.
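
The idea of a deterministic artifact with a stable hash can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's manifest format: the graph shape (agent name mapped to tool names) and the functions `canonicalize` and `manifestHash` are hypothetical. It only shows why sorted, canonical output is what makes a SHA-256 usable as a correlation key.

```kotlin
import java.security.MessageDigest

// Hypothetical stand-in for a capability graph: agent name -> tools it may call.
// Agents and tools are rendered in sorted order so the same graph always
// produces the same text, regardless of insertion order.
fun canonicalize(graph: Map<String, Set<String>>): String =
    graph.toSortedMap().entries.joinToString("\n") { (agent, tools) ->
        "$agent: [${tools.sorted().joinToString(", ")}]"
    }

// SHA-256 over the canonical text, hex-encoded (64 characters).
fun manifestHash(graph: Map<String, Set<String>>): String =
    MessageDigest.getInstance("SHA-256")
        .digest(canonicalize(graph).toByteArray(Charsets.UTF_8))
        .joinToString("") { "%02x".format(it) }

fun main() {
    val a = mapOf("reviewer" to setOf("readFile"), "builder" to setOf("writeFile", "compile"))
    val b = mapOf("builder" to setOf("compile", "writeFile"), "reviewer" to setOf("readFile"))
    println(canonicalize(a))
    // Same graph declared in a different order yields the same hash.
    println(manifestHash(a) == manifestHash(b))
}
```

Any widening of the graph (a new tool, a new agent) changes the canonical text and therefore the hash, which is what lets a runtime event be tied back to one reviewed version.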

The 0.6.0 epic (#1911) tracks the full acceptance criteria. The phase layout below remains time-based; the release-arc tags below each item show which release that item targets.


Phase 1 — Core DSL (in progress)

  • Agent<IN, OUT> with SRP enforcement
  • Agent.prompt — base context string for the LLM
  • Skills-only execution — all agents run through skills { implementedBy { } }
  • Skill.description — sells the skill to the LLM alongside its type signature
  • Skill.knowledge("key", "description") { } — named lazy context providers; loadFile() inside lambdas
  • Skill.toLlmDescription() — auto-generated markdown (name, types, description, knowledge index); llmDescription("...") override
  • Skill.toLlmContext() — full context: description markdown + all knowledge content
  • Skill.knowledgeTools() / KnowledgeTool(name, description, call) — tools model with lazy per-entry loading
  • then — sequential pipeline with composed execution (no runtime casts)
  • / — parallel fan-out with coroutine concurrency
  • * — forum shorthand with concurrent participants, last-agent captain, and onMentionEmitted
  • forum { participant(...); captain(...); allowForumReturn(...) } — explicit forum roles and finalization permissions
  • Single-placement enforcement across all structure types
  • .loop {} — iterative execution with (OUT) -> IN? feedback block
  • .branch {} — conditional routing on sealed types, composable with then
  • @Generable("desc") / @Guide / @LlmDescription — runtime reflection: toLlmDescription(), jsonSchema(), promptFragment(), fromLlmOutput<T>(), PartiallyGenerated<T>
  • model { } — Ollama backend; host, port, temperature; injectable ModelClient for tests; auto-fallback to inline JSON tool-call format for models without native tool support (#706)
  • model { claude(name); apiKey = ... } — Anthropic Messages API adapter mapping LlmMessage/LlmResponse to/from Anthropic's structured tool_use / tool_result content blocks; live integration tests against claude-haiku-4-5-20251001 (#1644)
  • model { openai(name); apiKey = ... } — OpenAI Chat Completions adapter; tool_calls / tool_call_id paired by synthesized id, parameters schema field (vs Anthropic's input_schema); live integration tests against gpt-4o-mini (#1656)
  • Agentic execution loop — multi-turn tool calling with budget controls (maxTurns, maxToolCalls, maxDuration, perToolTimeout, maxTokens, maxConsecutiveSameTool) + onToolUse observability hook (#637, #963, #969)
  • Skill selection — manual skillSelection {} + automatic LLM routing when multiple skills match
  • onError { Throwable -> } — infrastructure-error observability hook (LLM transport, response parse, budget); pure observability — original exception always rethrows (#962)
  • Agent.observe { event -> } — sealed PipelineEvent bridges the four hooks (skill / tool / knowledge / error) into one typed stream; composes additively (#965)
  • Agent.toString() + Agent.describe() — readable single-line + multi-line debug output replacing the JVM identity-hash default (#970)
  • onBudgetThreshold(threshold) { reason, usedPercent -> } — pre-cap warning hook; fires once per BudgetReason when cumulative usage crosses the fraction, before the cap throws (#966)
  • loadResource(path) / loadResourceOrNull(path) — read agent system prompts from classpath resources; fail-fast at agent construction when path is missing; UTF-8 decoded; leading-slash normalized (#980)
  • wrap — teacher→student prompt-override operator (teacher wrap student returns a Pipeline<IN, OUT> where the teacher's String output becomes the student's system prompt for that one call; restored after). Two framings: education (one generalist student specialized by many teachers) and security (the student's task surface is locked to what the teacher emits). The PRD calls this the >> operator; Kotlin can't overload >> so the function is named wrap (#1698)
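
The pre-cap warning behaviour described for onBudgetThreshold (fire once per reason when usage crosses a fraction, then throw at the cap) can be sketched in isolation. Everything below is hypothetical: `BudgetTracker`, `BudgetExceeded`, and the two-value `BudgetReason` enum are illustrative names, not the Agents.KT types.

```kotlin
// Hypothetical names throughout; this is not the Agents.KT API.
enum class BudgetReason { TOOL_CALLS, TURNS }

class BudgetExceeded(val reason: BudgetReason) : RuntimeException("budget exceeded: $reason")

class BudgetTracker(
    private val caps: Map<BudgetReason, Int>,
    private val threshold: Double,
    private val onThreshold: (reason: BudgetReason, usedPercent: Int) -> Unit,
) {
    private val used = mutableMapOf<BudgetReason, Int>()
    private val warned = mutableSetOf<BudgetReason>()

    fun record(reason: BudgetReason) {
        val cap = caps[reason] ?: return             // no cap declared: nothing to track
        val next = (used[reason] ?: 0) + 1
        if (next > cap) throw BudgetExceeded(reason) // the cap itself
        used[reason] = next
        // Warn the first time cumulative usage reaches the fraction, and only once.
        if (next >= cap * threshold && warned.add(reason)) {
            onThreshold(reason, next * 100 / cap)
        }
    }
}

fun main() {
    val tracker = BudgetTracker(mapOf(BudgetReason.TOOL_CALLS to 5), threshold = 0.8) { reason, pct ->
        println("warning: $reason at $pct%")
    }
    repeat(5) { tracker.record(BudgetReason.TOOL_CALLS) } // warns once, on the 4th call
    try {
        tracker.record(BudgetReason.TOOL_CALLS)           // 6th call exceeds the cap
    } catch (e: BudgetExceeded) {
        println(e.message)
    }
}
```

Keeping the warning separate from the exception means a host can degrade gracefully (for example, ask the model to wrap up) before the hard stop.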

Phase 2 — Runtime + Distribution (Q2 2026)

Priority — 0.6.0 hero:

  • Permission manifest / capability graph: pipeline.permissionManifest { } DSL on agents and compositions; writeYaml(file) / writeJson(file) emit deterministic output; Gradle task agentManifest plus verifyAgentManifest fails CI when high-risk boundaries widen. Captures agents, skills, tools, memory R/W, budgets, MCP client/server snapshots, providers (secrets masked), guardrail hooks, and composition structure. Lives in :agents-kt-manifest (zero vendor deps). The manifest SHA-256 is attached to every agent in the graph for runtime correlation. (#1912)
  • Manifest hash + request/session IDs in runtime audit events: AgentRuntimeContext carries requestId (UUIDv4 per invoke), sessionId (per agent.session()), manifestHash (sha256 of the deterministic manifest, null until generated). Every PipelineEvent / AgentEvent includes these three; consumed by the OTel bridge (#1908) and the JSONL exporter (#1914). Closes the loop from build-time evidence to runtime behaviour. (#1913)
  • JSONL audit log exporter — append-only, one event per line, grep/jq-friendly. Schema covers requestId / sessionId / manifestHash / agentId / skillId / toolId / eventType / timestamp / inputType / outputType / budgetState / guardrailDecision / mcpClientId / toolPolicyRisk / usedDeclaredCapability / provider / model. Lives in :agents-kt-observability, masks raw args/results by omission, supports size/day rotation, and handles write backpressure without throwing into the agent path. Sibling to the OTel bridge (#1908) for teams that need a deterministic on-disk record. (#1914)
  • Declarative tool sandbox policy DSL (0.6.0 — declarative only, enforcement in 0.7.0): tool(..., policy { risk = ToolRisk.Medium; filesystem { read("/uploads/**"); writeNone() }; network { denyAll() } }). ToolPolicy captures risk, filesystem, network, and environment sub-policies with deterministic map/JSON/YAML manifest helpers. Audit events note toolPolicyRisk and usedDeclaredCapability. The enforcement layer is sibling #1916. (#1915)
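
To make the shape of a declarative policy concrete, here is a small self-contained builder that accepts the same surface syntax as the `policy { ... }` snippet above and renders it to a key-sorted map. It is a sketch under assumptions: the classes are re-declared locally, `toManifestMap` and the `"deny-all"` / `"allow-all"` strings are invented for illustration, and nothing here enforces anything (matching the "declarative only" scope of 0.6.0).

```kotlin
// Hypothetical re-declaration for illustration; not the library's ToolPolicy.
enum class ToolRisk { Low, Medium, High }

class FilesystemPolicy {
    val readGlobs = mutableListOf<String>()
    val writeGlobs = mutableListOf<String>()
    fun read(glob: String) { readGlobs += glob }
    fun write(glob: String) { writeGlobs += glob }
    fun writeNone() { writeGlobs.clear() }
}

class NetworkPolicy {
    var open = false
    fun denyAll() { open = false }
    fun allowAll() { open = true }
}

class ToolPolicy {
    var risk = ToolRisk.Low
    private val fs = FilesystemPolicy()
    private val net = NetworkPolicy()

    fun filesystem(block: FilesystemPolicy.() -> Unit) { fs.block() }
    fun network(block: NetworkPolicy.() -> Unit) { net.block() }

    // Keys and glob lists are sorted so two equal policies render identically.
    fun toManifestMap(): Map<String, Any> = sortedMapOf<String, Any>(
        "risk" to risk.name,
        "filesystem.read" to fs.readGlobs.sorted(),
        "filesystem.write" to fs.writeGlobs.sorted(),
        "network" to (if (net.open) "allow-all" else "deny-all"),
    )
}

fun policy(block: ToolPolicy.() -> Unit): ToolPolicy = ToolPolicy().apply(block)

fun main() {
    val p = policy {
        risk = ToolRisk.Medium
        filesystem { read("/uploads/**"); writeNone() }
        network { denyAll() }
    }
    println(p.toManifestMap())
}
```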

Priority — 0.6.0 platform + follow-ups:

  • Tool<IN, OUT> hierarchy + McpTool<IN, OUT> — typed tool inheritance refining the current skills-shape (#1948). MCP capabilities still ship as Skill<Map<String,Any?>, String> via McpClient.toolSkills(), and now also as first-class McpTool<Map<String,Any?>, String> handles via McpClient.tools(). The typed-tool layer is additive and gives grants { tools(...) } / manifests a shared local+MCP boundary object.
  • MCP client integration — McpClient.toolSkills() / promptSkills() / resourceSkills() expose every MCP capability as a Skill consumable in skills { +... }. The McpTool type-hierarchy refinement (above) is a future ergonomic upgrade; the user-facing feature shipped in 0.5.0 as the skills-shape (#1795 / #1796 / #1810). McpServer ships DSLs to register prompts and resources alongside agents-as-tools, plus McpServerInfo for the full capability snapshot
  • McpServer hardening baseline — first-class incoming auth (McpServerAuth), origin/host allowlist on HTTP transport, ClientPrincipal, per-principal toolPolicy, capability negotiation filtered per client, and default-deny outside localhost. Rate limiting and structured request audit events remain gateway / observability follow-ups. (#1902)
  • DeepSeek provider adapter — fourth built-in ModelClient, implemented on the OpenAI-compatible Chat Completions shape with DeepSeek provider identity, token usage normalization, streaming through the OpenAI-compatible SSE path, and manifest metadata. Constrained decoding stays disabled until DeepSeek supports OpenAI-style response_format.json_schema.
  • Google Gemini provider adapter — eighth ModelClient alongside Ollama / Anthropic / OpenAI / DeepSeek / Kimi / OpenRouter / Perplexity. A full from-scratch adapter (Gemini is not OpenAI-compatible): contents/parts with user/model roles, systemInstruction, functionDeclarations tool calling, functionResponse paired by name, parametersJsonSchema/responseJsonSchema, native SSE streaming (:streamGenerateContent?alt=sse), inlineData vision, and thought-summary reasoning. 9 unit + 4 live tests (verified green against gemini-2.5-flash). (#1917)
  • grants { allow(...) / confirm(...) }: agent-level capability grants (#4545). allow(...) tools are freely callable; confirm(...) tools require the granting agent's authorization at runtime (a GrantConfirmer, fail-closed — not a human gate). Opt-in; build-validates that every tool an agent's skills use is granted (allow ∪ confirm), that grants reference real tools, and that allow/confirm are disjoint. The full structure { root { delegates {} } } topology DSL remains a later (Phase-3) effort; permission-manifest surfacing of grants is a follow-up. References the typed Tool<IN,OUT> hierarchy (#1948).
  • Permission model: 3 states — Granted / Confirmed / Absent. Folded into the guardrails issue (#1907): Granted = Allow or no interceptor registered; Confirmed = Escalate(reason, reviewerRole) resumed by host app; Absent = existing pre-guardrail allowedToolMap rejection now surfaced via onUnauthorizedToolCall
  • KSP annotation processor — compile-time @Generable codegen: shape validation (#1700), schema emitter + field-type validation (#1701), sealed-root schema (#1702), toLlmDescription() + multi-constant cache (#1703), constructFromMap codegen (#1704), drop runtime kotlin-reflect + empty-variants gate (#1705). Ships as agents-kt-ksp module
  • Provider-level constrained decoding — agentic skills returning @Generable types pass JSON Schema to supporting providers automatically: OpenAI response_format.json_schema, Ollama format, and Anthropic's structured-output tool path. Unsupported adapters keep the existing repair-loop behavior. (#1949)
  • Native CLI binary (0.7.0) — standalone agents-kt CLI (:agents-kt-cli, application plugin) with generate / inspect / verify subcommands (verify fails on policy-relaxing diffs). Shares the Gradle-free ManifestEntrypointLoader with the plugin, so a build and the CLI emit byte-identical manifests. jlink/GraalVM single-file image is a packaging follow-up (the entrypoint-loading commands reflect into arbitrary user classes and need a JVM). See cli.md. (#1923)
  • jlink minimal JRE bundle for runtime (~35MB)
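
The three build-time checks listed for grants (every used tool is granted, grants reference real tools, allow and confirm are disjoint) reduce to set arithmetic. The sketch below models tools as plain strings; `Grants` and `validateGrants` are hypothetical names chosen for illustration and are not the library's validator.

```kotlin
// Hypothetical validator; names are illustrative, not the Agents.KT API.
data class Grants(val allow: Set<String>, val confirm: Set<String>)

fun validateGrants(
    knownTools: Set<String>,   // tools that actually exist in the system
    usedBySkills: Set<String>, // tools the agent's skills call
    grants: Grants,
): List<String> {
    val errors = mutableListOf<String>()
    val granted = grants.allow + grants.confirm // allow ∪ confirm
    (usedBySkills - granted).sorted().forEach {
        errors += "tool '$it' is used by a skill but not granted"
    }
    (granted - knownTools).sorted().forEach {
        errors += "grant references unknown tool '$it'"
    }
    (grants.allow intersect grants.confirm).sorted().forEach {
        errors += "tool '$it' is in both allow and confirm"
    }
    return errors
}

fun main() {
    val errors = validateGrants(
        knownTools = setOf("readFile"),
        usedBySkills = setOf("readFile", "compile"),
        grants = Grants(allow = setOf("readFile", "deploy"), confirm = setOf("readFile")),
    )
    errors.forEach(::println)
}
```

Returning all violations at once, rather than failing on the first, tends to suit a build-time check because one run reports everything that needs fixing.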

Secondary:

  • Session model — multi-turn AgentSession shipped (cold events: Flow<AgentEvent> + await() + snapshot/resume; #1736 — see "Streaming session surface" below). Remaining: automatic compaction (SUMMARIZE / SLIDING_WINDOW / CUSTOM).
  • onBefore* interceptor family — Rails-style onBeforeSkill / onBeforeToolCall / onBeforeTurn returning a sealed Decision { Proceed | ProceedWith(args) | Deny(reason) | Substitute(result) }. Sibling to today's post-hoc observer hooks (onToolUse / onSkillChosen / onError). Unifies per-client tool policy (McpServer), action confirmation, prompt-injection filtering (one-liner: onBeforeTurn { msgs -> if (filter.flag(msgs)) Decision.Deny(...) else Decision.Proceed }), and uniform perToolTimeout wrapping. Chain semantics: registration order, all run, first non-Proceed wins. (#1907, feeds #1908)
  • Agent memory — MemoryBank, memory_read/memory_write/memory_search auto-injected tools
  • .spawn {} — independent sub-agent lifecycle, AgentHandle<OUT>, parent-managed join
  • Streaming foundation — LlmChunk sealed type (TextDelta / ToolCallStarted / ToolCallArgumentsDelta / ToolCallFinished / End) + ModelClient.chatStream(messages): Flow<LlmChunk> with a default impl that wraps chat() so non-streaming providers keep working unchanged. Provider-native streaming (Anthropic SSE, OpenAI SSE, Ollama stream: true) overrides land per-adapter. LlmChunk stays narrow — no agentic concepts like skillName / agentId (#1722)
  • Streaming session surface — AgentEvent sealed hierarchy (Token / ToolCallStarted / ToolCallArgumentsDelta / ToolCallFinished / SkillStarted / SkillCompleted / Completed<OUT> / Failed, every event carrying agentId), AgentSession<OUT> (cold events: Flow<AgentEvent<OUT>> + suspend fun await(): OUT), and free function Agent<IN, OUT>.session(input): AgentSession<OUT> (#1736). Existing Agent.invokeSuspend delegates to a new internal invokeSuspendForSession with a no-op skill listener — backward-compat byte-for-byte. Today emits only bracket events (SkillStarted / SkillCompleted / Completed / Failed) — the Token / ToolCall* subtypes are defined and ready for consumers but not yet emitted (next entry). Integration coverage: failure-path identity-preserved cause, concurrent sessions, agentic-stub bracketing, live-LLM π-to-20-decimals against Ollama (#1737), and prompt-cancellation of the events collector (#1738).
  • Agentic-loop rewire onto FlowCollector<AgentEvent>: Token and ToolCall* events fire mid-loop; tokensUsed threaded through SkillCompleted / Completed. Shipped in 0.5.0 (#1739 / #1740). Regular blocking tools still use a sacrificial worker thread for per-tool timeouts; session-aware suspend tools now use coroutine cancellation (#1903).
  • Enforce perToolTimeout on session-aware tool path: sessionExecutor calls now respect budget.perToolTimeout, emit failed ToolCallFinished events on timeout, and surface BudgetExceededException(PER_TOOL_TIMEOUT). (#1903)
  • Streaming docs reconcile — README Limitations / Roadmap bullets are tagged as shipped / experimental / planned; the stale "no per-adapter native streaming yet" wording is gone, and DeepSeek is called out as using the OpenAI-compatible SSE path. (#1901)
  • Per-adapter native streaming overrides — Anthropic SSE (ClaudeClient.chatStream), OpenAI SSE (OpenAiClient.chatStream), Ollama NDJSON stream: true (OllamaClient.chatStream) all emit real partial chunks at the wire. Live integration tests measure 19 / 2 / 19 chunks per response respectively. See v0.5.0 streaming premortem
  • Reactive event stream for UIs — shipped: AgentSession.events: Flow<AgentEvent> (#1736) + agent.observe { (PipelineEvent) -> } (#965); a UI consumes the typed agent stream today (Token / ToolCall* / SkillStarted / SkillCompleted / Completed / Failed). Remaining: composition-stage event types (StageStarted, PipelineCompleted) at the Pipeline level.
  • Multimodal input — image/document (vision): SHIPPED end-to-end across Anthropic / OpenAI / Ollama via Content.Image → ImagePart → provider wire and agent.invokeWithAttachments (#2466–#2470). The Content sealed type (Text / Image / Audio / Video / Document, each with a typed ContentRef + closed mime) is in place. Remaining: audio/video input (below) + a Gemini provider to extend vision to.
    • Image/document input — [x] shipped for Anthropic / OpenAI / Ollama (image bytes + media type as a content block alongside text; Gemini pending — no provider yet).
    • Audio/video input — [ ] remaining: Content.Audio / Content.Video variants are typed but not yet sent to providers (Gemini, GPT-4o-audio). Optional STT-only helper audio.transcribe(file) for the Whisper-style use case.
    • Architectural change: LlmMessage.content: String needs to evolve into a List<LlmContent> sealed type (Text / Image / Audio blocks). Binary-compat risk: add a sibling contentBlocks: List<LlmContent>? field first with the existing String form auto-coerced into a single Text block; deprecate the String form once the API surface settles. Typed boundaries are unaffected — Agent<Image, String> (image classifier) and Agent<AudioClip, String> (transcriber) become coherent agent shapes.
  • Serialization — agent.json, A2A AgentCard
  • JAR bundles and folder-based assembly
  • Gradle plugin
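
The chain semantics stated for the onBefore* interceptors ("registration order, all run, first non-Proceed wins") can be sketched without any framework. The `Decision` variants mirror the names in the bullet above, but the payload types and the `InterceptorChain` class are assumptions made for this example. The sketch also assumes that ProceedWith counts as a non-Proceed decision, which the roadmap does not spell out.

```kotlin
// Variant names follow the roadmap bullet; payload types are assumptions.
sealed interface Decision {
    object Proceed : Decision
    data class ProceedWith(val args: Map<String, Any?>) : Decision
    data class Deny(val reason: String) : Decision
    data class Substitute(val result: String) : Decision
}

// Hypothetical chain; not the Agents.KT implementation.
class InterceptorChain<T> {
    private val interceptors = mutableListOf<(T) -> Decision>()

    fun register(interceptor: (T) -> Decision) {
        interceptors += interceptor
    }

    fun decide(input: T): Decision {
        // Every interceptor runs, in registration order (map on a List is eager),
        // so later interceptors still observe the call even after an earlier Deny.
        val decisions = interceptors.map { it(input) }
        // The first decision that is not Proceed wins; otherwise the call proceeds.
        return decisions.firstOrNull { it !is Decision.Proceed } ?: Decision.Proceed
    }
}

fun main() {
    val chain = InterceptorChain<String>()
    chain.register { Decision.Proceed }
    chain.register { tool -> if (tool == "deleteAll") Decision.Deny("destructive tool") else Decision.Proceed }
    chain.register { Decision.Substitute("cached result") }
    println(chain.decide("deleteAll"))
    println(chain.decide("readFile"))
}
```

Running every interceptor even after a Deny is what lets audit-style interceptors see all attempted calls; a short-circuiting chain would be a different design.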

Phase 3 — Production (Q3 2026)

  • Layer 2: Full Structure DSL with delegates, grants, authority, routing, escalation
  • All 37 compile-time validations enforced by Gradle plugin
  • AgentUnit testing framework — unit, semantic (LLM-as-judge), Skill Coverage metrics
  • A2A protocol support (server + client)
  • File-based knowledge: skill.md, reference, examples, checklist + RAG pipeline
  • Production observability foundation — vendor-neutral ObservabilityBridge + adapter modules. Core ships a zero-dep :agents-kt-observability module exposing ObservabilityBridge { onPipelineEvent / onAgentEvent / onInterceptorDecision } and an agent.observe(bridge) extension that wires both event surfaces plus the onBefore* decisions (#1907) into the bridge. Adapters live in separate Gradle modules so local-first users never pull vendor SDKs.
    • :agents-kt-otel — OpenTelemetry adapter using the GenAI semantic conventions: skill = root span (agent.invoke), model turn = child span (gen_ai.operation.name=chat, gen_ai.system, token-usage attrs), tool call = child span (gen_ai.operation.name=tool, tool.name, tool.call.id), errors as span status, and interceptor decisions as span events. Parent-context propagation via Context.current(). (#1908)
    • :agents-kt-langsmith — LangSmith run-tree adapter (chain → llm → tool runs), async batch dispatch with oldest-drop backpressure logging, and fixture-style tests with no live LangSmith calls. (#1909)
    • :agents-kt-langfuse — Langfuse traces / spans / generations adapter with native ingestion, async batch dispatch, oldest-drop backpressure logging, and fixture-style tests with no live Langfuse calls. (#1910)
  • Threat-model + deployment-pattern guide (shipped): docs/threat-model.md (now the canonical what's-enforced-where table) with worked scenarios (safe local assistant; internal business tool; MCP server behind gateway; anti-patterns), each calling out which Agents.KT guardrails apply and which gaps the deployer must close themselves. Linked from README security section and SECURITY.md. (#1904)
  • Release-signing hardening — replace the no-passphrase GPG example in the publishing guide with a passphrase-protected default; add a CI-signing section (secrets-manager-injected passphrase, short-lived subkey, or OIDC-to-signing-service); demote the no-protection variant to a clearly-labelled "local-only sandbox keys" subsection. (#1905)
  • Post-release 0.6.x demos: (1) safe MCP filesystem agent (read-only allowlist, rejection visible in audit log), (2) typed approval workflow for high-risk paths (the shipped primitives: humanApproval { } → resumeWith(HumanDecision) #2489, or HumanGateRegistry #3868 — the originally-sketched Escalate decision type was never shipped), (3) multi-agent audit pipeline binding every model + tool call to the manifest hash. Each lives in examples/<name>/, runs against Ollama by default, emits manifest + JSONL audit on one invocation. Validates the 0.6.0 story end-to-end. (#1918)
  • Production hardening checklist + regulated deployment guide: docs/production-hardening.md checkbox list (tool allowlists, MCP auth, conservative budgets, output wrapping, audit logs, manifest review in CI, etc.) and docs/regulated-deployment.md for finance / healthcare / public-sector buyers (audit retention, evidence pack, manifest-hash chain-of-custody). Companion to threat-model (#1904). (#1919)
  • AI Act-aligned whitepaper — 8–12 page engineering-guidance document (explicitly not legal advice) on bounded agent systems, the manifest as static evidence, audit events as dynamic evidence, human-oversight hooks, shared-responsibility model. Timed for 2026 AI-governance attention. (#1921)
  • README + landing repositioning — boundary-first / auditable register; "what Agents.KT owns" + "what it doesn't try to own" sections; marketing-register and compliance-language audit (avoid "fastest" / "fully compliant"; keep "auditable" / "least privilege" / "compliance-supporting"). Feeds off the comparison page (#1906). (#1922)
  • Scarf integration + Maven adoption verification — set up Scarf on ai.deep-code:agents-kt:*, 30-day baseline before public adoption claims, keep public wording soft ("Maven pull-through stronger than GitHub stars suggest") until verified. Outreach template prepped but not sent. (#1920)
  • Team DSL — swarm coordination (if isolated execution available)
  • Generative outputs (image + audio) — sibling client interfaces to ModelClient for non-chat model families.
    • ImageModelClient.generate(prompt, options): ImageBytes — text → image. Adapters: OpenAI DALL-E 3, Google Imagen, Stability. Optional streaming via generateStream(...): Flow<LlmChunk.ImageDelta> for partial-preview UX.
    • TTSModelClient.synthesize(text, voice, options): AudioBytes — text → speech. Adapters: OpenAI TTS, ElevenLabs, Google Cloud TTS. Streaming via LlmChunk.AudioDelta(pcmChunk) for low-latency playback (relevant for IDE voice agents, chat UIs).
    • These keep the typed-boundary identity: Agent<String, ImageBytes> and Agent<TextRequest, AudioBytes> are first-class. Composition operators (then, wrap) work unchanged across modalities.
  • Tool-policy enforcement — Layer 1 (in-JVM filesystem gate) (0.7.0) (#2890, under #1916) — a tool's declared ToolPolicy is now enforced at runtime by default: a call whose absolute filesystem-path argument falls outside the declared read/write globs is denied before the executor runs (normalized first, so .. can't escape a glob), surfacing through onToolDenied / PipelineEvent.ToolDenied. Opt-in by declaration (undeclared tools are never gated); enforceToolPolicies = false restores the inert 0.6.0 behavior. In-process scope only — relative paths and network/environment isolation are deferred to the Layer-2 OS sandbox below. See tool-policy-enforcement.md.
  • Sandboxed tool execution — Layer 2 (OS sandbox) (0.7.0 — core shipped; some backends 0.8) (#1916) — ProcessSandbox confines subprocess-shaped tools at the OS level: macOS Seatbelt, Linux bubblewrap, firejail setuid fallback, and a plain-ProcessBuilder + loud UNCONFINED warning where no tool is present. forPolicy(policy) derives write roots, env allow-list, and cwd from the declared ToolPolicy; network is default-deny (only network { allowAll() } opens it). Auto-wired via processTool(...). Deferred to 0.8: WasmSandbox (#2894), DockerSandbox (#2895), the network hostname-allowlist proxy (#2893), and the grants { } structure DSL. See tool-policy-enforcement.md.
    • ProcessSandbox — subprocess executor with env / cwd / timeout / network constraints. Backends: Seatbelt on macOS (the framework behind sandbox-exec; built into the OS), bwrap (bubblewrap) on Linux as the primary, firejail as the fallback. On WSL2 same as Linux; WSL1 unsupported (no namespace support). Plain ProcessBuilder with a loud warning on platforms with no native sandboxing tool. Most pragmatic — every dev box has at least one path. Cribs profile shape + socat-proxy plumbing from anthropic-experimental/sandbox-runtime (Anthropic's open-source Linux bwrap reference).
    • Network sub-policy: outbound blocked by default; allowlist via sandbox.network.allowedDomains. A proxy server (running outside the sandbox) intercepts DNS + connections and gates by hostname. TLS caveat: the default proxy doesn't terminate TLS — it allows/denies by hostname only. Allowing broad domains (github.com, googleapis.com) leaves room for domain-fronting; consumers needing real traffic inspection plug in their own MITM proxy. Document this explicitly so it's not a surprise.
    • Permission/sandbox interaction: sandbox path config and grants { } path config merge — both layers apply (matches Claude Code semantics). Sandbox cannot accidentally widen what grants denies. A tool with both must satisfy both.
    • WasmSandbox — JAR-embedded WASM runtime via Chicory (pure-Java; no host setup). Tools compiled to WASM; filesystem and network capabilities granted explicitly at registration. Most truly embedded — works anywhere a JVM runs.
    • DockerSandbox — opt-in extras module (agents-kt-docker-sandbox) via docker-java. Talks to whatever Docker daemon the host already runs. Not embeddable — library ships in the JAR, daemon does not. For teams that already operate Docker.
    • Why this axis matters: today grants { tools(writeFile, compile) } controls which tools an agent can call; sandboxing controls what those tools can do once invoked. Pairs with frozen agents + typed args to give a security model that's strictly stronger than "trust the executor lambda."
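
The Layer 1 rule described above (normalize an absolute path argument first, then match it against the declared globs, and leave relative paths to the OS sandbox) can be sketched with the JDK's own glob matcher. `GateResult` and `checkPath` are hypothetical names, and the example assumes a Unix-like default filesystem where `/uploads/...` is an absolute path.

```kotlin
import java.nio.file.FileSystems
import java.nio.file.Paths

// Hypothetical result type; not the library's ToolDenied event.
enum class GateResult { ALLOWED, DENIED, NOT_GATED }

fun checkPath(pathArg: String, allowedGlobs: List<String>): GateResult {
    val path = Paths.get(pathArg)
    // Relative paths are outside the in-process gate in this sketch.
    if (!path.isAbsolute) return GateResult.NOT_GATED
    // Collapse "." and ".." BEFORE matching, so "/uploads/../etc/passwd"
    // is judged as "/etc/passwd" rather than as something under /uploads.
    val normalized = path.normalize()
    val fs = FileSystems.getDefault()
    val matches = allowedGlobs.any { glob ->
        fs.getPathMatcher("glob:$glob").matches(normalized)
    }
    return if (matches) GateResult.ALLOWED else GateResult.DENIED
}

fun main() {
    val readGlobs = listOf("/uploads/**")
    println(checkPath("/uploads/reports/q1.csv", readGlobs))
    println(checkPath("/uploads/../etc/passwd", readGlobs))
    println(checkPath("notes.txt", readGlobs))
}
```

Without the normalize step, the raw string `/uploads/../etc/passwd` would itself match `/uploads/**`, which is exactly the escape the roadmap entry calls out.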

Phase 4 — Ecosystem (Q4 2026)

  • Knowledge packs — battle-tested prompt libraries for common domains
  • Agent generation from natural language (NL → Kotlin DSL)
  • Skillify — extract reusable skills from session transcripts
  • Visual structure editor, UML bidirectional conversion
  • Knowledge marketplace
  • Comparison page: docs/comparison.md with a feature matrix vs LangChain (Py + LangChain4j), Microsoft Semantic Kernel, AutoGen, and a raw MCP client; covers typed Agent<IN,OUT>, runtime tool allowlist, MCP client/server, native streaming, budgets, sandboxing, KSP/compile-time validation, language, local-first model support; honest "where Agents.KT is weaker" subsection. (#1906)