Skip to content

embeddings: drop user-visible 'deps missing' banner, keep recycle#182

Merged
kaghni merged 6 commits into
mainfrom
fix/drop-embed-deps-missing-banner
May 19, 2026
Merged

embeddings: drop user-visible 'deps missing' banner, keep recycle#182
kaghni merged 6 commits into
mainfrom
fix/drop-embed-deps-missing-banner

Conversation

@kaghni
Copy link
Copy Markdown
Collaborator

@kaghni kaghni commented May 18, 2026

Summary

  • Strip the `enqueueNotification({id: "embed-deps-missing", title: "Hivemind embeddings disabled — deps missing", ...})` call from `handleTransformersMissing()` in `src/embeddings/client.ts`.
  • Keep the stuck-daemon recycle (SIGTERM + sock/pid cleanup) — that's the actual self-heal, fixes the issue silently on the next call.
  • Remove the now-orphaned `_signalledMissingDeps` flag, `embeddingsStatus()` user-disabled check, and `enqueueNotification` / `embeddingsStatus` imports.

Why

The banner kept stacking on top of the primary session-start message even for users whose embeddings work correctly (the daemon recycles silently and embeddings are fine on next call). The CLI's `embeddings status` already documents the install command for users with persistent failures, so the banner doesn't carry unique value. Removing it reduces session-start noise without losing self-heal capability.

Test plan

  • `npm run typecheck`
  • `npm run build`
  • `npx vitest run tests/claude-code/embeddings-client.test.ts tests/claude-code/embeddings-bundle-scan.test.ts tests/claude-code/notifications.test.ts tests/claude-code/notifications-queue-lock.test.ts` — 130/130 passing
  • Full suite: 2704/2705 (one unrelated flake in deeplake-fs.test.ts — confirmed pre-existing on origin/main at 40% failure rate over 5 runs)
  • After merge: confirm session-start no longer shows the embeddings-disabled warning even with a known-broken daemon

Tests pinned to the new contract

  • `embeddings-client.test.ts`: four cases in "transformers-missing handling" flipped to assert `enqueueNotificationMock` NEVER fires
  • `embeddings-bundle-scan.test.ts`: scan flipped from "capture.js carries embed-deps-missing" to "capture.js does NOT carry embed-deps-missing" — guards against accidental reintroduction
  • Queue tests using `embed-deps-missing` as a fixture id switched to neutral `dedup-fixture` (those tests validate queue dedup, not embeddings-specific behavior)

Summary by CodeRabbit

  • Bug Fixes

    • Removed unnecessary user notifications about missing embeddings dependencies; the system now silently manages daemon recovery without disrupting workflows.
  • Chores

    • Updated internal daemon lifecycle management and logging infrastructure across multiple bundles for improved reliability.

Review Change Stack

The handleTransformersMissing() path enqueued a session-start banner
("Hivemind embeddings disabled — deps missing · Run `hivemind embeddings
install` to enable") whenever the daemon returned a transformers
MODULE_NOT_FOUND. The recycle/self-heal step it pairs with usually fixes
the issue silently on the next call, so the banner kept stacking on top
of the primary session-start message for users whose embeddings WERE
actually working — and even for users with persistent failures, the
recommended action (`hivemind embeddings install`) is documented on the
CLI's `embeddings status` output, not a problem the banner uniquely solves.

What stays:
- Stuck-daemon recycle: SIGTERM the daemon + clear sock/pid on transformers
  MODULE_NOT_FOUND so the next call spawns fresh from the current bundle.
  This is the actual self-heal and is preserved unchanged.
- isTransformersMissingError() helper: still used by the recycle path.

What goes:
- enqueueNotification() call from handleTransformersMissing
- _signalledMissingDeps process-local flag (no longer needed)
- embeddingsStatus() user-disabled check (only existed to gate the now-
  removed notification)
- Bundle-scan tests asserting `embed-deps-missing` is shipped
- Source tests asserting the notification is enqueued

Tests updated to pin the new contract: handleTransformersMissing fires
zero notifications under any conditions. Fixture ids in queue tests
that used "embed-deps-missing" as a representative example switched to
neutral "dedup-fixture" — those tests validate queue dedup semantics,
not embeddings-specific behavior.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 18, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 8c95f130-c624-4757-936b-6967cbd2222c

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Embeddings client now recycles on transformers-missing without user notifications. Daemon lifecycle, socket/pid handling, and logging aliases are updated across multiple bundles and hooks. Capture/pre/stop/start flows add retry-on-missing-table and improved error logs. Tests and src updated to reflect silent behavior.

Changes

Embeddings recycle-only behavior propagation

Layer / File(s) Summary
Core embed client behavior and helpers
src/embeddings/client.ts, */bundle/*(embeddings client sections)
Removes notification enqueueing on transformers-missing; keeps recycle-only path. Updates shared daemon path/UID, verify/recycle/spawn/connect, and logging aliases across bundles.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested reviewers

  • efenocchi

Poem

I thump my paw on sockets’ glow,
No banners now—just cleanly go.
A sleepy loop, a pid goodbye,
Recycle burrow, logs reply.
Quiet warren, tests in tune—
Embeds will bloom again, and soon. 🐇✨

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/drop-embed-deps-missing-banner

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 18, 2026

Coverage Report

Scope: files changed in this PR. Enforced threshold: 90% per metric (per file via vitest.config.ts).

Status Category Percentage Covered / Total
🟢 Lines 96.68% (🎯 90%) 583 / 603
🟢 Statements 96.25% (🎯 90%) 667 / 693
🟢 Functions 97.14% (🎯 90%) 102 / 105
🔴 Branches 84.86% (🎯 90%) 297 / 350
File Coverage — 7 files changed
File Stmts Branches Functions Lines
src/deeplake-api.ts 🟢 98.8% 🟢 90.1% 🟢 97.7% 🟢 99.5%
src/embeddings/client.ts 🟢 96.4% 🔴 85.5% 🟢 96.7% 🟢 97.0%
src/hooks/capture.ts 🔴 85.4% 🔴 73.2% 🔴 83.3% 🔴 85.2%
src/notifications/index.ts 🟢 100.0% 🟢 93.8% 🟢 100.0% 🟢 100.0%
src/notifications/queue.ts 🟢 97.2% 🔴 79.2% 🟢 100.0% 🟢 98.4%
src/notifications/state.ts 🟢 98.0% 🔴 77.3% 🟢 100.0% 🟢 97.9%
src/notifications/types.ts 🟢 100.0% 🟢 100.0% 🟢 100.0% 🟢 100.0%

Generated for commit 241a22c.

@coderabbitai coderabbitai Bot requested a review from efenocchi May 18, 2026 23:53
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
cursor/bundle/pre-tool-use.js (1)

1550-1558: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the bundle-local embed daemon path here.

With this file’s runtime location, dirname(fileURLToPath(import.meta.url)) is already cursor/bundle, so the extra .. resolves the daemon to cursor/embeddings/embed-daemon.js. That leaves getEmbedClient() pointing at a missing entry and silently drops semantic grep back to lexical-only fallback on installs without a shared daemon.

Suggested fix
 function resolveDaemonPath() {
-  return join7(dirname2(fileURLToPath(import.meta.url)), "..", "embeddings", "embed-daemon.js");
+  return join7(dirname2(fileURLToPath(import.meta.url)), "embeddings", "embed-daemon.js");
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cursor/bundle/pre-tool-use.js` around lines 1550 - 1558, resolveDaemonPath
currently ascends one directory which makes it point to
cursor/embeddings/embed-daemon.js (missing in bundle installs); update
resolveDaemonPath so it joins dirname2(fileURLToPath(import.meta.url)) with
"embeddings" and "embed-daemon.js" (remove the ".." path segment) so
getEmbedClient uses the bundle-local daemon entry instead of a non-existent
parent-level path.
claude-code/bundle/pre-tool-use.js (1)

2581-2603: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sanitize sessionId before joining it into the query-cache path.

getSessionQueryCacheDir() uses the raw sessionId, and writeCachedIndexContent() now eagerly mkdirSyncs that directory before writing. A session_id containing ../ can therefore escape DEFAULT_CACHE_ROOT and create/write files outside the cache tree. writeReadCacheFile() already does the right root-escape check; this helper needs the same protection.

Suggested fix
 function getSessionQueryCacheDir(sessionId, deps = {}) {
   const { cacheRoot = DEFAULT_CACHE_ROOT } = deps;
-  return join8(cacheRoot, sessionId);
+  const safeSessionId = sessionId.replace(/[^a-zA-Z0-9._-]/g, "_") || "unknown";
+  return join8(cacheRoot, safeSessionId);
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@claude-code/bundle/pre-tool-use.js` around lines 2581 - 2603, The sessionId
is used raw when constructing cache dirs which allows path-traversal (e.g.,
"../") to escape DEFAULT_CACHE_ROOT; update getSessionQueryCacheDir(sessionId,
deps) to first sanitize/validate sessionId (e.g., reject empty or containing
path separators, or normalize it and ensure path.relative(cacheRoot, joinedPath)
does not start with ".."), and make readCachedIndexContent and
writeCachedIndexContent call the sanitized/validated helper; if validation
fails, return null or throw/log appropriately so mkdirSync in
writeCachedIndexContent cannot create directories outside the intended cacheRoot
(mirror the root-escape check behavior used by writeReadCacheFile).
hermes/bundle/capture.js (1)

1419-1441: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restrict permissions on the wiki worker temp config.

This writes token and other workspace identifiers into tmpdir() with default directory/file permissions. On multi-user machines that can leave the config readable outside the current user. spawnSkillifyWorker hardens the same pattern a few lines below, so this path should too.

🔒 Proposed fix
-  mkdirSync6(tmpDir, { recursive: true });
+  mkdirSync6(tmpDir, { recursive: true, mode: 448 });
   const pluginVersion = getInstalledVersion(bundleDir, ".claude-plugin") ?? "";
   const configFile = join11(tmpDir, "config.json");
-  writeFileSync4(configFile, JSON.stringify({
+  writeFileSync4(configFile, JSON.stringify({
     apiUrl: config.apiUrl,
     token: config.token,
     orgId: config.orgId,
     workspaceId: config.workspaceId,
     memoryTable: config.tableName,
     sessionsTable: config.sessionsTableName,
     sessionId,
     userName: config.userName,
     project: projectName,
     pluginVersion,
     tmpDir,
     hermesBin: findHermesBin(),
     hermesProvider: process.env.HIVEMIND_HERMES_PROVIDER ?? "openrouter",
     hermesModel: process.env.HIVEMIND_HERMES_MODEL ?? "anthropic/claude-haiku-4-5",
     wikiLog: WIKI_LOG,
     hooksDir: join11(HOME, ".hermes", "hooks"),
     promptTemplate: WIKI_PROMPT_TEMPLATE
-  }));
+  }), { mode: 384 });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hermes/bundle/capture.js` around lines 1419 - 1441, The temp config file is
created with default permissions; tighten them by creating tmpDir with mode
0o700 and ensuring configFile is created with mode 0o600 (or chmod'd immediately
after write) to prevent other system users from reading tokens and workspace
IDs. Update the mkdirSync6 call for tmpDir to include mode: 0o700 and ensure
writeFileSync4 is followed by a chmodSync(configFile, 0o600) (or open/write with
explicit mode) so tmpDir and configFile are permission-restricted, matching the
hardening used by spawnSkillifyWorker.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@claude-code/bundle/capture.js`:
- Around line 1459-1464: handleTransformersMissing currently sets a one-time
guard (_recycledStuckDaemon) and calls recycleDaemon(null) but never clears the
guard, preventing future self-heal attempts; change the logic in
handleTransformersMissing / recycleDaemon so the guard is cleared when a recycle
fails (or after a retry-delay) to allow subsequent attempts — e.g., on
recycleDaemon rejection or after a timeout reset _recycledStuckDaemon to false
so future calls to handleTransformersMissing can trigger recycleDaemon again;
reference _recycledStuckDaemon, handleTransformersMissing, and recycleDaemon
when making the change.

In `@claude-code/bundle/pre-tool-use.js`:
- Around line 1491-1515: Change migrationValueFromEnv() so it returns undefined
when process.env.HIVEMIND_EMBEDDINGS is unset (only return boolean for explicit
"false" or any other value), and update getEmbeddingsEnabled() so it only
writes/persists the embeddings enabled value (calls writeUserConfig or updates
_cache) when migrationValueFromEnv() returns a boolean; do not persist a false
default when the env var is absent—use the undefined result to skip
migration/persistence while still honoring existing cfg and the _migrated/_cache
logic.

In `@codex/bundle/pre-tool-use.js`:
- Around line 1214-1217: The current check only recycles when the daemon path
differs and the old path is missing; change it to recycle whenever
hello.daemonPath !== this.daemonEntry. Modify the block around the comparison
(referencing hello.daemonPath, this.daemonEntry, _recycledStuckDaemon, log3 and
this.recycleDaemon) so it unconditionally sets _recycledStuckDaemon = true and
calls this.recycleDaemon(hello.pid) when the paths differ; use
existsSync3(hello.daemonPath) only to choose the log wording (e.g., "(gone)" vs
"(present)"), not as a gate for recycling.

In `@codex/bundle/stop.js`:
- Around line 1340-1343: The current branch only recycles when the daemonPath
differs AND the old path is missing (uses existsSync8), which leaves a stale but
different daemon on disk active; change the condition to recycle whenever
hello.daemonPath !== this.daemonEntry regardless of existsSync8 so that
_recycledStuckDaemon is set and this.recycleDaemon(hello.pid) is always called
on a mismatch (update the log using hello.daemonPath and this.daemonEntry as
currently done); locate the check that references hello.daemonPath,
this.daemonEntry, existsSync8, _recycledStuckDaemon, and
this.recycleDaemon(hello.pid) and remove/ignore the existsSync8 gating so
mismatched daemon paths are always recycled.

In `@hermes/bundle/shell/deeplake-shell.js`:
- Around line 67849-67857: The global boolean _recycledStuckDaemon is being used
as a hard gate in the hello handler so once set by recycleDaemon it prevents
later transformer-missing failures from triggering another recycle; change this
to be per-daemon (e.g. key by hello.pid or hello.daemonPath) or clear/reset the
flag on a successful hello from the expected daemon. Concretely, replace or
augment _recycledStuckDaemon with a map/set keyed by hello.pid or
hello.daemonPath (or call a reset function when a hello matches this.daemonEntry
and is valid) and update the checks around recycleDaemon(hello.pid) and the
successful-hello path to remove the per-daemon entry so subsequent bad-hello
events can still call recycleDaemon; update all occurrences (including the
second occurrence around lines 67876-67880) to use the new scoped gate or
clearing logic.

In `@pi/bundle/wiki-worker.js`:
- Around line 6-7: The worker currently relies on its constructor default for
the embed-daemon path which causes it to miss the bundled embed-daemon.js;
update the worker instantiation(s) that spawn the embed daemon (e.g., the new
Worker(...) / EmbedDaemon constructor calls around where join6 is imported and
the other occurrences at the block referenced by lines ~771-773) to pass the
explicit path to the bundled embed-daemon.js (use join6 to build the path to the
bundled file and supply that value as the daemon path argument to the
constructor/spawn function) so the daemon always uses the bundled
embed-daemon.js instead of falling back to HIVE... or ~/.hivemind defaults.

---

Outside diff comments:
In `@claude-code/bundle/pre-tool-use.js`:
- Around line 2581-2603: The sessionId is used raw when constructing cache dirs
which allows path-traversal (e.g., "../") to escape DEFAULT_CACHE_ROOT; update
getSessionQueryCacheDir(sessionId, deps) to first sanitize/validate sessionId
(e.g., reject empty or containing path separators, or normalize it and ensure
path.relative(cacheRoot, joinedPath) does not start with ".."), and make
readCachedIndexContent and writeCachedIndexContent call the sanitized/validated
helper; if validation fails, return null or throw/log appropriately so mkdirSync
in writeCachedIndexContent cannot create directories outside the intended
cacheRoot (mirror the root-escape check behavior used by writeReadCacheFile).

In `@cursor/bundle/pre-tool-use.js`:
- Around line 1550-1558: resolveDaemonPath currently ascends one directory which
makes it point to cursor/embeddings/embed-daemon.js (missing in bundle
installs); update resolveDaemonPath so it joins
dirname2(fileURLToPath(import.meta.url)) with "embeddings" and "embed-daemon.js"
(remove the ".." path segment) so getEmbedClient uses the bundle-local daemon
entry instead of a non-existent parent-level path.

In `@hermes/bundle/capture.js`:
- Around line 1419-1441: The temp config file is created with default
permissions; tighten them by creating tmpDir with mode 0o700 and ensuring
configFile is created with mode 0o600 (or chmod'd immediately after write) to
prevent other system users from reading tokens and workspace IDs. Update the
mkdirSync6 call for tmpDir to include mode: 0o700 and ensure writeFileSync4 is
followed by a chmodSync(configFile, 0o600) (or open/write with explicit mode) so
tmpDir and configFile are permission-restricted, matching the hardening used by
spawnSkillifyWorker.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 7d15280e-df40-4702-94f2-dd3d79ace4ef

📥 Commits

Reviewing files that changed from the base of the PR and between 66ad723 and a2d6d13.

📒 Files selected for processing (25)
  • claude-code/bundle/capture.js
  • claude-code/bundle/pre-tool-use.js
  • claude-code/bundle/session-start-setup.js
  • claude-code/bundle/shell/deeplake-shell.js
  • claude-code/bundle/wiki-worker.js
  • codex/bundle/capture.js
  • codex/bundle/pre-tool-use.js
  • codex/bundle/shell/deeplake-shell.js
  • codex/bundle/stop.js
  • codex/bundle/wiki-worker.js
  • cursor/bundle/capture.js
  • cursor/bundle/pre-tool-use.js
  • cursor/bundle/shell/deeplake-shell.js
  • cursor/bundle/wiki-worker.js
  • hermes/bundle/capture.js
  • hermes/bundle/pre-tool-use.js
  • hermes/bundle/shell/deeplake-shell.js
  • hermes/bundle/wiki-worker.js
  • pi/bundle/wiki-worker.js
  • src/embeddings/client.ts
  • src/notifications/queue.ts
  • tests/claude-code/embeddings-bundle-scan.test.ts
  • tests/claude-code/embeddings-client.test.ts
  • tests/claude-code/notifications-queue-lock.test.ts
  • tests/claude-code/notifications.test.ts

Comment on lines +1459 to 1464
handleTransformersMissing(_detail) {
if (!_recycledStuckDaemon) {
_recycledStuckDaemon = true;
this.recycleDaemon(null);
}
if (_signalledMissingDeps)
return;
_signalledMissingDeps = true;
let status;
try {
status = embeddingsStatus();
} catch {
status = "enabled";
}
if (status === "user-disabled")
return;
enqueueNotification({
id: "embed-deps-missing",
severity: "warn",
title: "Hivemind embeddings disabled \u2014 deps missing",
body: `Semantic memory search is off because @huggingface/transformers is not installed where the daemon can find it. Run \`hivemind embeddings install\` to enable.`,
dedupKey: { reason: "transformers-missing", detail: detail.slice(0, 200) }
}).catch((e) => {
log4(`enqueue embed-deps-missing failed: ${e instanceof Error ? e.message : String(e)}`);
});
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Allow subsequent self-heal attempts after a failed recycle.

Line 1459 only recycles once behind _recycledStuckDaemon, but this path never resets that guard. If the first recycle doesn’t recover the daemon, later transformers-missing errors won’t trigger another recycle and embeddings can stay degraded for the rest of the process lifetime.

🔧 Proposed fix
-  handleTransformersMissing(_detail) {
-    if (!_recycledStuckDaemon) {
-      _recycledStuckDaemon = true;
-      this.recycleDaemon(null);
-    }
-  }
+  handleTransformersMissing(_detail) {
+    this.recycleDaemon(null);
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
handleTransformersMissing(_detail) {
if (!_recycledStuckDaemon) {
_recycledStuckDaemon = true;
this.recycleDaemon(null);
}
if (_signalledMissingDeps)
return;
_signalledMissingDeps = true;
let status;
try {
status = embeddingsStatus();
} catch {
status = "enabled";
}
if (status === "user-disabled")
return;
enqueueNotification({
id: "embed-deps-missing",
severity: "warn",
title: "Hivemind embeddings disabled \u2014 deps missing",
body: `Semantic memory search is off because @huggingface/transformers is not installed where the daemon can find it. Run \`hivemind embeddings install\` to enable.`,
dedupKey: { reason: "transformers-missing", detail: detail.slice(0, 200) }
}).catch((e) => {
log4(`enqueue embed-deps-missing failed: ${e instanceof Error ? e.message : String(e)}`);
});
}
handleTransformersMissing(_detail) {
this.recycleDaemon(null);
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@claude-code/bundle/capture.js` around lines 1459 - 1464,
handleTransformersMissing currently sets a one-time guard (_recycledStuckDaemon)
and calls recycleDaemon(null) but never clears the guard, preventing future
self-heal attempts; change the logic in handleTransformersMissing /
recycleDaemon so the guard is cleared when a recycle fails (or after a
retry-delay) to allow subsequent attempts — e.g., on recycleDaemon rejection or
after a timeout reset _recycledStuckDaemon to false so future calls to
handleTransformersMissing can trigger recycleDaemon again; reference
_recycledStuckDaemon, handleTransformersMissing, and recycleDaemon when making
the change.

Comment on lines +1491 to +1515
function getEmbeddingsEnabled() {
const cfg = readUserConfig();
if (cfg.embeddings && typeof cfg.embeddings.enabled === "boolean") {
return cfg.embeddings.enabled;
}
if (_migrated) {
return migrationValueFromEnv();
}
_migrated = true;
const enabled = migrationValueFromEnv();
try {
writeUserConfig({ embeddings: { enabled } });
} catch {
_cache = { ...cfg ?? {}, embeddings: { ...cfg?.embeddings ?? {}, enabled } };
}
return enabled;
}
function migrationValueFromEnv() {
const raw = process.env.HIVEMIND_EMBEDDINGS;
if (raw === void 0)
return false;
if (raw === "false")
return false;
return true;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't migrate unset HIVEMIND_EMBEDDINGS to enabled=false.

When no config exists, migrationValueFromEnv() returns false for an unset env var and getEmbeddingsEnabled() persists that into config.json on first load. That silently turns embeddings off for fresh/upgraded users, so semantic grep and capture stop running even though this PR is supposed to keep the recycle/self-heal path working. Treat an unset env var as “no migration value” instead of an opt-out.

Suggested fix
 function getEmbeddingsEnabled() {
   const cfg = readUserConfig();
   if (cfg.embeddings && typeof cfg.embeddings.enabled === "boolean") {
     return cfg.embeddings.enabled;
   }
+  const migrated = migrationValueFromEnv();
+  if (migrated === void 0) {
+    return true;
+  }
   if (_migrated) {
-    return migrationValueFromEnv();
+    return migrated;
   }
   _migrated = true;
-  const enabled = migrationValueFromEnv();
+  const enabled = migrated;
   try {
     writeUserConfig({ embeddings: { enabled } });
   } catch {
     _cache = { ...cfg ?? {}, embeddings: { ...cfg?.embeddings ?? {}, enabled } };
   }
   return enabled;
 }
 function migrationValueFromEnv() {
   const raw = process.env.HIVEMIND_EMBEDDINGS;
-  if (raw === void 0)
-    return false;
+  if (raw === void 0)
+    return void 0;
   if (raw === "false")
     return false;
   return true;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function getEmbeddingsEnabled() {
const cfg = readUserConfig();
if (cfg.embeddings && typeof cfg.embeddings.enabled === "boolean") {
return cfg.embeddings.enabled;
}
if (_migrated) {
return migrationValueFromEnv();
}
_migrated = true;
const enabled = migrationValueFromEnv();
try {
writeUserConfig({ embeddings: { enabled } });
} catch {
_cache = { ...cfg ?? {}, embeddings: { ...cfg?.embeddings ?? {}, enabled } };
}
return enabled;
}
function migrationValueFromEnv() {
const raw = process.env.HIVEMIND_EMBEDDINGS;
if (raw === void 0)
return false;
if (raw === "false")
return false;
return true;
}
function getEmbeddingsEnabled() {
const cfg = readUserConfig();
if (cfg.embeddings && typeof cfg.embeddings.enabled === "boolean") {
return cfg.embeddings.enabled;
}
const migrated = migrationValueFromEnv();
if (migrated === void 0) {
return true;
}
if (_migrated) {
return migrated;
}
_migrated = true;
const enabled = migrated;
try {
writeUserConfig({ embeddings: { enabled } });
} catch {
_cache = { ...cfg ?? {}, embeddings: { ...cfg?.embeddings ?? {}, enabled } };
}
return enabled;
}
function migrationValueFromEnv() {
const raw = process.env.HIVEMIND_EMBEDDINGS;
if (raw === void 0)
return void 0;
if (raw === "false")
return false;
return true;
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@claude-code/bundle/pre-tool-use.js` around lines 1491 - 1515, Change
migrationValueFromEnv() so it returns undefined when
process.env.HIVEMIND_EMBEDDINGS is unset (only return boolean for explicit
"false" or any other value), and update getEmbeddingsEnabled() so it only
writes/persists the embeddings enabled value (calls writeUserConfig or updates
_cache) when migrationValueFromEnv() returns a boolean; do not persist a false
default when the env var is absent—use the undefined result to skip
migration/persistence while still honoring existing cfg and the _migrated/_cache
logic.

Comment on lines +1214 to 1217
if (hello.daemonPath !== this.daemonEntry && !existsSync3(hello.daemonPath)) {
_recycledStuckDaemon = true;
log4(`daemon path no longer on disk \u2014 running=${hello.daemonPath} (gone) expected=${this.daemonEntry}; recycling`);
log3(`daemon path no longer on disk \u2014 running=${hello.daemonPath} (gone) expected=${this.daemonEntry}; recycling`);
this.recycleDaemon(hello.pid);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Recycle on any daemon-path mismatch.

Line 1214 only recycles when the running daemon path differs and the old path is already gone. After an upgrade, the previous bundle path can still exist on disk, so this hook will accept the stale daemon, set helloVerified = true, and keep old embed behavior until idle timeout. Recycle on any hello.daemonPath !== this.daemonEntry; keep the existence check only for log wording.

Suggested fix
-    if (hello.daemonPath !== this.daemonEntry && !existsSync3(hello.daemonPath)) {
+    if (hello.daemonPath !== this.daemonEntry) {
       _recycledStuckDaemon = true;
-      log3(`daemon path no longer on disk — running=${hello.daemonPath} (gone) expected=${this.daemonEntry}; recycling`);
+      const suffix = existsSync3(hello.daemonPath) ? "" : " (gone)";
+      log3(`daemon path mismatch — running=${hello.daemonPath}${suffix} expected=${this.daemonEntry}; recycling`);
       this.recycleDaemon(hello.pid);
       return true;
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (hello.daemonPath !== this.daemonEntry && !existsSync3(hello.daemonPath)) {
_recycledStuckDaemon = true;
log4(`daemon path no longer on disk \u2014 running=${hello.daemonPath} (gone) expected=${this.daemonEntry}; recycling`);
log3(`daemon path no longer on disk \u2014 running=${hello.daemonPath} (gone) expected=${this.daemonEntry}; recycling`);
this.recycleDaemon(hello.pid);
if (hello.daemonPath !== this.daemonEntry) {
_recycledStuckDaemon = true;
const suffix = existsSync3(hello.daemonPath) ? "" : " (gone)";
log3(`daemon path mismatch — running=${hello.daemonPath}${suffix} expected=${this.daemonEntry}; recycling`);
this.recycleDaemon(hello.pid);
return true;
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@codex/bundle/pre-tool-use.js` around lines 1214 - 1217, The current check
only recycles when the daemon path differs and the old path is missing; change
it to recycle whenever hello.daemonPath !== this.daemonEntry. Modify the block
around the comparison (referencing hello.daemonPath, this.daemonEntry,
_recycledStuckDaemon, log3 and this.recycleDaemon) so it unconditionally sets
_recycledStuckDaemon = true and calls this.recycleDaemon(hello.pid) when the
paths differ; use existsSync3(hello.daemonPath) only to choose the log wording
(e.g., "(gone)" vs "(present)"), not as a gate for recycling.

Comment thread codex/bundle/stop.js
Comment on lines +1340 to 1343
if (hello.daemonPath !== this.daemonEntry && !existsSync8(hello.daemonPath)) {
_recycledStuckDaemon = true;
log4(`daemon path no longer on disk \u2014 running=${hello.daemonPath} (gone) expected=${this.daemonEntry}; recycling`);
log3(`daemon path no longer on disk \u2014 running=${hello.daemonPath} (gone) expected=${this.daemonEntry}; recycling`);
this.recycleDaemon(hello.pid);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Recycle on any daemon-path mismatch here too.

This branch has the same upgrade hole as the pre-tool bundle: if the old daemon path still exists, the stop hook treats the stale daemon as valid and never switches to the current bundle. That can keep the pre-PR daemon behavior alive until idle timeout.

Suggested fix
-    if (hello.daemonPath !== this.daemonEntry && !existsSync8(hello.daemonPath)) {
+    if (hello.daemonPath !== this.daemonEntry) {
       _recycledStuckDaemon = true;
-      log3(`daemon path no longer on disk — running=${hello.daemonPath} (gone) expected=${this.daemonEntry}; recycling`);
+      const suffix = existsSync8(hello.daemonPath) ? "" : " (gone)";
+      log3(`daemon path mismatch — running=${hello.daemonPath}${suffix} expected=${this.daemonEntry}; recycling`);
       this.recycleDaemon(hello.pid);
       return true;
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (hello.daemonPath !== this.daemonEntry && !existsSync8(hello.daemonPath)) {
_recycledStuckDaemon = true;
log4(`daemon path no longer on disk \u2014 running=${hello.daemonPath} (gone) expected=${this.daemonEntry}; recycling`);
log3(`daemon path no longer on disk \u2014 running=${hello.daemonPath} (gone) expected=${this.daemonEntry}; recycling`);
this.recycleDaemon(hello.pid);
if (hello.daemonPath !== this.daemonEntry) {
_recycledStuckDaemon = true;
const suffix = existsSync8(hello.daemonPath) ? "" : " (gone)";
log3(`daemon path mismatch — running=${hello.daemonPath}${suffix} expected=${this.daemonEntry}; recycling`);
this.recycleDaemon(hello.pid);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@codex/bundle/stop.js` around lines 1340 - 1343, The current branch only
recycles when the daemonPath differs AND the old path is missing (uses
existsSync8), which leaves a stale but different daemon on disk active; change
the condition to recycle whenever hello.daemonPath !== this.daemonEntry
regardless of existsSync8 so that _recycledStuckDaemon is set and
this.recycleDaemon(hello.pid) is always called on a mismatch (update the log
using hello.daemonPath and this.daemonEntry as currently done); locate the check
that references hello.daemonPath, this.daemonEntry, existsSync8,
_recycledStuckDaemon, and this.recycleDaemon(hello.pid) and remove/ignore the
existsSync8 gating so mismatched daemon paths are always recycled.

Comment on lines 67849 to 67857
_recycledStuckDaemon = true;
log4(`daemon does not implement hello (older protocol); recycling`);
log3(`daemon does not implement hello (older protocol); recycling`);
this.recycleDaemon(hello.pid);
return true;
}
if (hello.daemonPath !== this.daemonEntry && !existsSync5(hello.daemonPath)) {
if (hello.daemonPath !== this.daemonEntry && !existsSync4(hello.daemonPath)) {
_recycledStuckDaemon = true;
log4(`daemon path no longer on disk \u2014 running=${hello.daemonPath} (gone) expected=${this.daemonEntry}; recycling`);
log3(`daemon path no longer on disk \u2014 running=${hello.daemonPath} (gone) expected=${this.daemonEntry}; recycling`);
this.recycleDaemon(hello.pid);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't make the recycle path a one-shot global.

Lines 67849 and 67855 already set _recycledStuckDaemon, and this handler now uses the same flag as a hard gate. After any earlier recycle, later transformers-missing failures stop triggering recycleDaemon(), so the new silent flow can get stuck failing for the rest of the session. Please reset this gate after a successful hello from the expected daemon, or scope it to the current daemon PID/path instead of a module-global boolean.

Also applies to: 67876-67880

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hermes/bundle/shell/deeplake-shell.js` around lines 67849 - 67857, The global
boolean _recycledStuckDaemon is being used as a hard gate in the hello handler
so once set by recycleDaemon it prevents later transformer-missing failures from
triggering another recycle; change this to be per-daemon (e.g. key by hello.pid
or hello.daemonPath) or clear/reset the flag on a successful hello from the
expected daemon. Concretely, replace or augment _recycledStuckDaemon with a
map/set keyed by hello.pid or hello.daemonPath (or call a reset function when a
hello matches this.daemonEntry and is valid) and update the checks around
recycleDaemon(hello.pid) and the successful-hello path to remove the per-daemon
entry so subsequent bad-hello events can still call recycleDaemon; update all
occurrences (including the second occurrence around lines 67876-67880) to use
the new scoped gate or clearing logic.

Comment thread pi/bundle/wiki-worker.js
Comment on lines +6 to 7
import { join as join6 } from "node:path";

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Pass the bundled embed-daemon.js explicitly here too.

This worker still relies on the constructor default, but Lines 194-195 now only auto-spawn from HIVEMIND_EMBED_DAEMON or ~/.hivemind/embed-deps/embed-daemon.js. On installs that only have the bundled daemon, PI summaries will start writing NULL embeddings.

Suggested fix
-import { join as join6 } from "node:path";
+import { dirname as dirname2, join as join6 } from "node:path";
+import { fileURLToPath } from "node:url";
...
-            embedding = await new EmbedClient({}).embed(text, "document");
+            const daemonEntry = join6(dirname2(fileURLToPath(import.meta.url)), "embeddings", "embed-daemon.js");
+            embedding = await new EmbedClient({ daemonEntry }).embed(text, "document");

Also applies to: 771-773

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pi/bundle/wiki-worker.js` around lines 6 - 7, The worker currently relies on
its constructor default for the embed-daemon path which causes it to miss the
bundled embed-daemon.js; update the worker instantiation(s) that spawn the embed
daemon (e.g., the new Worker(...) / EmbedDaemon constructor calls around where
join6 is imported and the other occurrences at the block referenced by lines
~771-773) to pass the explicit path to the bundled embed-daemon.js (use join6 to
build the path to the bundled file and supply that value as the daemon path
argument to the constructor/spawn function) so the daemon always uses the
bundled embed-daemon.js instead of falling back to HIVE... or ~/.hivemind
defaults.

kaghni added 5 commits May 19, 2026 00:25
Companion to the embeddings-banner removal in this same PR. When the
server returns 402 with the "insufficient balance, please top up" body,
the SDK currently logs "fatal" to ~/.deeplake/hook-debug.log (which only
exists with HIVEMIND_DEBUG=1) and throws. Callers catch + swallow:
captures fail silently, memory recall returns empty, the agent reasons
from no data, the user is told nothing. Verified live against beta with
balance_cents=0 — full session ran without the user seeing any signal.

This fix adds a catch in DeeplakeApi._queryWithRetry right before the
throw: when status === 402 AND body contains "balance_cents", enqueue a
warn-severity notification. Process-local dedup so retries within one
hook process don't spam the queue; the queue's existing sameDedupKey
guard handles cross-process dedup. DedupKey carries the UTC date so the
banner re-fires daily until the user tops up.

Banner copy:
  Title: "Hivemind credits exhausted — top up to keep capturing"
  Body:  "Sessions are not being saved and memory recall is returning
          empty. Top up at https://app.deeplake.ai/billing to restore
          capture and recall."

Caller contracts unchanged — the original throw still fires, callers'
existing try/catch logic keeps working.

Test plan: 5 new cases in tests/claude-code/deeplake-api-balance-
exhausted.test.ts pin the contract — enqueues on the specific 402 shape,
process-local dedup, ignores 402 without balance_cents, ignores other
statuses, preserves the original error.

Out of scope for this PR (filed as follow-up):
  - Mid-session surface via additionalContext injection (model tells
    user immediately rather than waiting for next session-start)
  - Proactive low-balance warning before reaching zero
Fix-up to the prior balance-exhausted commit. The billing URL was a
generic `https://app.deeplake.ai/billing` which doesn't exist; the
canonical shape is `https://deeplake.ai/{orgName}/workspace/{workspaceId}/billing`.

Read orgName + workspaceId from ~/.deeplake/credentials.json at banner
construction time via the existing loadCredentials() helper. Falls back
to the bare `https://deeplake.ai` host when creds are missing/malformed
(rather than producing a URL with literal `undefined` segments).

Test added: rmSync the planted creds and assert the body still has a
real URL (no `undefined`, no broken `/workspace/` segment).
… per session)

The balance-exhausted banner shipped earlier in this PR was deduped to
once-per-day via a date in dedupKey. That's wrong for a critical error
that recurs until the user takes action — by design a session-start
banner that fires only the first time the user runs claude on a given
day, then silently dedups for 23 hours, defeats its own purpose.

Add a `transient: true` flag on Notification that opts out of state.shown
tracking AND auto-releases the atomic claim file after firing. Marked
balance-exhausted with the flag and dropped the date from its dedupKey.

Semantics now:
- Real 402 → SDK enqueues balance-exhausted with stable dedupKey
- Multiple hook processes within one session → queue lock collapses to
  one entry (sameDedupKey check)
- Session-start drain → banner fires, state.shown NOT written, claim
  file released
- Balance still 0 next session → next 402 → fresh enqueue → fires again
- Balance restored → no 402, no enqueue, queue stays empty, silence

Welcome / savings recap / backend pushes / queued items keep their
existing dedup-across-sessions behavior (transient defaults to false).
A new control test pins that contract so a future change doesn't
accidentally turn off state.shown for non-transient notifications.

Tests:
- tests/claude-code/notifications.test.ts: 2 new cases —
  (1) transient: refires after re-enqueue, no state.shown record
  (2) non-transient (control): state.shown blocks refire on same key
- tests/claude-code/deeplake-api-balance-exhausted.test.ts: assert
  the SDK now marks the notification transient and drops date from key

Live-verified against beta with balance_cents=0:
- Run 1: 402 → enqueue → next session drain → banner fires with
  "Top up at https://deeplake.ai/mind/workspace/default/billing"
- state.shown has no balance-exhausted; claim file gone post-drain
- Queue already has a fresh entry ready for the next session
…text

The session-start banner shipped earlier in this PR only surfaces the
"credits exhausted" warning at the NEXT session start — a user mid-
session whose balance hits 0 sees nothing for the rest of their current
session. Captures keep failing, memory keeps returning empty, the model
keeps reasoning from no data, no signal reaches the user until they
exit and re-enter claude.

Add inline signaling: capture.ts's top-level catch detects the 402+
balance_cents error and emits a hook response with additionalContext
that Claude Code injects as a system reminder for the next turn. The
model surfaces the issue to the user in its response.

systemMessage was tried first but Claude Code only renders it for
SessionStart hooks — for PostToolUse / UserPromptSubmit / Stop it's
silently dropped. The additionalContext + model-relay path was
verified live: after one failing tool call, claude wrote "Hivemind
credits are exhausted — sessions are not being saved and memory recall
is empty. Top up at https://deeplake.ai/mind/workspace/default/billing
to restore." in its response.

URL construction mirrors deeplake-api.ts's billingUrl(): reads
orgName + workspaceId from ~/.deeplake/credentials.json, falls back to
the bare https://deeplake.ai host on missing/malformed creds. ESM
imports at the top (readFileSync, homedir) rather than dynamic
require() — the bundle is ESM and require isn't available.

Scope: claude-code only. Codex/Cursor/Hermes have their own capture.ts
files and still get the session-start banner via the SDK enqueue;
adding their inline signal is a small follow-up if/when needed.
CI coverage gate on src/notifications/state.ts:branches was failing at
63.63% (threshold 75%) because the new releaseClaim function added by
the transient-flag commit had uncovered error paths and a few existing
readState malformed-payload branches were also untouched.

Added six small tests:
  - releaseClaim unlinks the claim so a follow-up tryClaim succeeds
  - releaseClaim is a silent no-op on ENOENT
  - releaseClaim logs + continues on non-ENOENT errors (claim path is
    a directory → EISDIR/EPERM, hits the fail-soft branch)
  - readState treats `false` JSON as empty (the !parsed branch)
  - readState treats a number JSON as empty (typeof !== object branch)
  - readState treats missing `shown` key as empty

state.ts:branches now 77.27% (over the 75% gate). No source changes.
@kaghni
Copy link
Copy Markdown
Collaborator Author

kaghni commented May 19, 2026

Re: the 6 CodeRabbit findings — scope note

All 6 findings are on lines from #bfc8e07 (Emanuele's feat(embeddings): hello handshake + stuck-daemon recycle + visible signal) — not on code this PR changes. This PR's scope is narrow: remove the user-visible "Hivemind embeddings disabled — deps missing" enqueueNotification call. The recycle/daemon-path/migration logic is unchanged from bfc8e07.

Per-finding triage:

# File:Line Finding Disposition
1 capture.js:1615 _recycledStuckDaemon one-shot — no retry after failed recycle Pre-existing in bfc8e07. Single-hook-process scope so per-process degradation is bounded. Not addressed here.
2 pre-tool-use.js:1666 Unset HIVEMIND_EMBEDDINGS migrates to enabled=false Real UX regression but per src/user-config.ts:67, this is the documented migration rule from bfc8e07. Disagreement with design, not a bug in my PR. Filing follow-up issue.
3 codex/bundle/pre-tool-use.js:1368 Recycle should fire on any daemon-path mismatch Same pre-existing path; self-heals at 10-min daemon idle timeout. Not addressed here.
4 codex/bundle/stop.js:1494 Same as #3 in stop hook Same.
5 hermes/bundle/shell/deeplake-shell.js:68008 Same one-shot recycle gate as #1 Same.
6 pi/bundle/wiki-worker.js:7 Pass bundled embed-daemon.js explicitly Pi installs without shared deps fall back; minor. Not addressed here.

Leaving these to a future PR scoped to the embeddings recycle/migration area. Tagging @emanuele-fenocchi-activeloop for visibility — these are findings on your bfc8e07 work that I'm explicitly deferring.

@kaghni kaghni merged commit b95204a into main May 19, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants