Last updated: 2026-06-21
This roadmap translates STRATEGY.md into the next few product phases. It is intentionally short: the goal is to keep priorities and boundaries visible without turning the roadmap into a plan dump.
- AgentV remains the repo-native, workspace-native runner, grader, and artifact source of truth.
- The AgentV Dashboard is the supported zero-infra local cockpit for AgentV-owned runs, traces, sessions, transcripts, and Git-backed artifacts.
- Phoenix is optional external trace infrastructure, relevant only when Codex, Arize, or another hook has already emitted spans independently; AgentV may correlate with those sessions but does not write AgentV artifacts into Phoenix.
- Harbor stays an optional benchmark-grade runner boundary, not AgentV core.
- AgentV YAML remains the authoring surface even when execution moves behind another runner; prefer a lightweight translation layer over duplicated specs.
- Adapters, workers, and artifact projections are preferred over rebuilding adjacent platforms inside AgentV, except Phoenix: Phoenix integration is link-out correlation only and not an AgentV-to-Phoenix projection path.
- Keep the canonical handoff surface centered on completed run bundles, `index.jsonl`, grading/timing/metrics artifacts, normalized transcripts, and optional `external_trace` link metadata.
- Finish the vendor-neutral local export seams that let completed runs be re-read, compared, exported, and attached to non-Phoenix adapters without vendor-specific logic in core.
- Keep OTLP/OpenInference mapping generic and reusable before building backend-specific upload or import paths.
- Document the Phoenix boundary across public docs, CLI help, Dashboard copy, and AI-facing guides.
- Allow optional Phoenix links when safe `external_trace` metadata points at spans emitted independently by Codex, Arize, or another hook.
- Keep Phoenix from owning AgentV transcript, index, run, dataset, experiment, or storage state.
- Keep Dashboard free of a Phoenix runtime dependency: no `px` CLI requirement, no Phoenix GraphQL/REST proxy, and no direct reads from Phoenix database tables.
- Keep the AgentV Dashboard focused on quick local inspection, comparison, and zero-infra review.
- Add lightweight links from local AgentV artifacts to external trace viewers when safe metadata exists.
- Clarify the split across docs, CLI help, Dashboard copy, and skills so users know that AgentV artifacts stay local/Git-backed and Phoenix is optional read-only context.
- Docs hygiene follow-up: split large AI-facing guidance such as `AGENTS.md` into linked instruction files if that makes the boundary easier to maintain.
- Harbor: move toward Harbor becoming the benchmark-grade runner behind a lightweight translation layer from AgentV YAML, while AgentV stays the authoring, gating, import, and comparison surface.
- Harbor: in the near term, launch or import benchmark-grade runs through a runner boundary; over time, converge on Harbor as the execution layer for the suites it already owns.
- Opik and similar systems: consume completed AgentV projection bundles as post-run adapters rather than as runtime owners.
- Phoenix: remain excluded from AgentV post-run export/projection; use link-out correlation only.
- Additional observability backends should reuse the same projection and export seams instead of adding new core product models.
- Rebuilding Phoenix, Opik, Langfuse, or similar products inside AgentV.
- Turning AgentV into a hosted experiment platform, benchmark catalog, or generic dashboard platform.
- Deleting the local Dashboard in favor of Phoenix.
- Exporting or projecting AgentV runs, traces, transcripts, datasets, experiments, or indexes into Phoenix.
- Requiring `px`, Phoenix database access, or Phoenix-owned storage for Dashboard runtime.