87 changes: 87 additions & 0 deletions .github/workflows/e2e.yml
@@ -0,0 +1,87 @@
name: E2E (cross-agent matrix)

# Manual trigger only. This workflow spawns real agent CLIs against real
# provider APIs and a dedicated Deeplake test workspace — every run costs
# real money and takes ~10 minutes. We deliberately do NOT run it on
# every PR; the source + bundle byte-checks in `npm test` keep gating
# merges. Use this workflow as a release-readiness gate by triggering it
# manually from the Actions tab against your feature branch.

on:
  workflow_dispatch:
    inputs:
      case_filter:
        description: "Only run this case id (e.g. 01-capture-smoke). Leave blank for all."
        required: false
        type: string
      agent_filter:
        description: "Only run this agent id (e.g. claude-code). Leave blank for all."
        required: false
        type: string

permissions:
  contents: read

jobs:
  e2e:
    name: Tier-1 cross-agent matrix
    runs-on: ubuntu-latest
    timeout-minutes: 30
    # Gate the job on creds being present. Forks without the e2e secret
    # see a clean skip in the Actions UI rather than a misleading red.
    if: ${{ github.event.repository.full_name == 'activeloopai/hivemind' }}
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 22

      - name: Install dependencies
        run: npm install

      - name: Build bundles
        # The harness drives the actual bundles for codex/cursor/hermes/pi
        # (claude-code uses --plugin-dir against the source tree). Without
        # build, `hivemind <agent> install` would copy stale or missing
        # bundle files into the tmp HOME.
        run: npm run build

      - name: Install agent CLIs
        # Each tier-1 agent CLI must be on PATH for its driver to spawn.
        # We install the npm-distributed CLIs here; cursor-agent and
        # hermes are typically installed via the agent vendor's own
        # installer outside the npm ecosystem. If those binaries are
        # not on a CI runner, their driver will fail with a clear
        # "spawn error" and the matrix continues.
        run: |
          npm install -g @anthropic-ai/claude-code @openai/codex
          # Pi ships via npm too.
          npm install -g @piapp/cli || true
          # cursor-agent and hermes: install via curl when available;
          # if not, their points fail loudly rather than silently skip.
          curl -fsSL https://cursor.com/install-cli.sh | bash -s -- --print 2>/dev/null || echo "cursor-agent install skipped"
Comment on lines +59 to +64
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

find . -name "e2e.yml" -o -name "e2e.yaml" | head -5

Repository: activeloopai/hivemind

Length of output: 93


🏁 Script executed:

cat -n .github/workflows/e2e.yml

Repository: activeloopai/hivemind

Length of output: 4198


Pin and verify the agent installers.

This step pulls unpinned CLI versions, making runs non-reproducible across days or re-runs. More significantly, the curl-piped installer at line 64 executes a mutable remote script from cursor.com without checksum verification—a supply-chain risk. Pin CLI versions and replace the curl installer with a verified binary or checksum-validated script.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e.yml around lines 59 - 64, Pin the CLI installs and
remove the insecure curl|bash by specifying explicit versions for the npm
installs (replace "npm install -g `@anthropic-ai/claude-code` `@openai/codex`" and
"npm install -g `@piapp/cli` || true" with locked version specifiers like
`@version`) and replace the cursor installer pipeline ("curl -fsSL
https://cursor.com/install-cli.sh | bash -s -- --print") with a verified
download-and-verify flow: download the release artifact to a temp file, validate
its SHA256 (or signature) against a checked-in or CI-managed fingerprint, then
execute the verified binary/installer; ensure CI fails if checksum verification
fails and avoid swallowing errors with "|| true".
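
A minimal sketch of the download-and-verify flow described above, assuming a bash shell with `curl` and `sha256sum` on PATH. The helper names `verify_sha256` and `install_verified` are hypothetical, and the URL and expected digest are caller-supplied placeholders; the real fingerprint would have to come from a checked-in or CI-managed source.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeed only if FILE hashes to EXPECTED (hex SHA-256).
verify_sha256() {
  local file="$1" expected="$2" actual
  actual="$(sha256sum "$file" | awk '{print $1}')"
  if [ "$actual" != "$expected" ]; then
    echo "checksum mismatch for $file: got $actual" >&2
    return 1
  fi
}

# Hypothetical helper: download to a temp file, verify it, then execute it.
# A failed download or a checksum mismatch aborts before anything runs.
install_verified() {
  local url="$1" expected="$2" tmp
  tmp="$(mktemp)"
  curl -fsSL "$url" -o "$tmp"
  verify_sha256 "$tmp" "$expected"
  bash "$tmp"
  rm -f "$tmp"
}
```

In a workflow step this would be called as `install_verified "$INSTALLER_URL" "$EXPECTED_SHA256"` with no trailing `|| true`, so a mismatch fails the step instead of being swallowed.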

          # Hermes install would go here; install method varies by vendor.
          which claude codex pi cursor-agent hermes 2>&1 || true

      - name: Run e2e matrix
        env:
          HIVEMIND_E2E_CREDS_JSON: ${{ secrets.HIVEMIND_E2E_CREDS_JSON }}
          ANTHROPIC_API_KEY: ${{ secrets.HIVEMIND_E2E_ANTHROPIC_API_KEY }}
          OPENAI_API_KEY: ${{ secrets.HIVEMIND_E2E_OPENAI_API_KEY }}
          GOOGLE_API_KEY: ${{ secrets.HIVEMIND_E2E_GOOGLE_API_KEY }}
        run: |
          args=()
          if [ -n "${{ inputs.case_filter }}" ]; then args+=(--case "${{ inputs.case_filter }}"); fi
          if [ -n "${{ inputs.agent_filter }}" ]; then args+=(--agent "${{ inputs.agent_filter }}"); fi
          npm run e2e -- "${args[@]}"

      - name: Upload summary artifact
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: e2e-summary
          path: tests/e2e/results/
          if-no-files-found: warn
          retention-days: 30
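
The `Run e2e matrix` step builds its argument list by interpolating `${{ inputs.* }}` into the script text. A possible alternative, sketched here under assumptions, is to pass the inputs through the step's `env:` block and read them as ordinary shell variables, so the input text is never parsed as shell. The variable names `CASE_FILTER` and `AGENT_FILTER` and the function `build_e2e_args` are hypothetical and not part of this PR.

```bash
#!/usr/bin/env bash
set -eo pipefail

# Hypothetical: CASE_FILTER and AGENT_FILTER would be set by the step's `env:`
# block from inputs.case_filter and inputs.agent_filter.
# Populates the global array E2E_ARGS with only the flags that were supplied.
build_e2e_args() {
  E2E_ARGS=()
  if [ -n "${CASE_FILTER:-}" ]; then E2E_ARGS+=(--case "$CASE_FILTER"); fi
  if [ -n "${AGENT_FILTER:-}" ]; then E2E_ARGS+=(--agent "$AGENT_FILTER"); fi
}

# Intended use inside the step (left commented so the sketch has no side effects):
#   build_e2e_args
#   npm run e2e -- "${E2E_ARGS[@]}"
```

Because each value is appended as a single quoted array element, a filter containing spaces or shell metacharacters reaches the runner as one literal argument.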
2 changes: 2 additions & 0 deletions .gitignore
@@ -18,3 +18,5 @@ bun.lock
deploy-to-cache.sh
.followups-pr97.md
.followups-pr98.md
# e2e harness per-run output artifacts (summary.json + sandbox dumps)
tests/e2e/results/
7 changes: 7 additions & 0 deletions README.md
@@ -316,6 +316,13 @@ Interactive shell against Deeplake:
npm run shell
```

Cross-agent end-to-end matrix: drives all six agent runtimes (five as CLI subprocesses, OpenClaw via programmatic event firing) through real prompts against a Deeplake test workspace. It is triggered manually, not on every PR:

```bash
npm run e2e # full matrix; see tests/e2e/README.md for env vars
npm run e2e -- --list # print the matrix without spawning
```

## License

Apache License 2.0 — © Activeloop, Inc. See [LICENSE](LICENSE) for details.
1 change: 1 addition & 0 deletions package.json
@@ -37,6 +37,7 @@
"cli": "tsx src/cli/index.ts",
"test": "vitest run",
"typecheck": "tsc --noEmit",
"e2e": "tsx tests/e2e/runner.ts",
"dup": "jscpd src",
"audit:openclaw": "node scripts/audit-openclaw-bundle.mjs",
"pack:check": "node scripts/pack-check.mjs",