
fix(sync-gbrain): seed DATABASE_URL from ~/.gbrain/config.json into gbrain spawns #1508

Closed
thehashrocket wants to merge 1 commit into garrytan:main from thehashrocket:fix/sync-gbrain-database-url-collision

Conversation

@thehashrocket
Contributor

Summary

/sync-gbrain fails inside any project whose .env.local defines its own DATABASE_URL (Next.js, Prisma, Rails, etc.). gbrain auto-loads the project's .env.local from cwd via dotenv, picks up the app's local-Postgres URL, and tries to authenticate there with the password from ~/.gbrain/config.json. Auth fails. Code and memory stages crash; only the git-push brain-sync stage survives.

Repro

cd <Next.js project with DATABASE_URL=postgresql://postgres:***@localhost:5433/app in .env.local>
/sync-gbrain
# → ERR  code         source registration failed: gbrain not configured (run /setup-gbrain)
# → ERR  memory       gbrain import exited 1: password authentication failed for user "postgres"
# → OK   brain-sync   curated artifacts pushed

gbrain doctor run from the same cwd reports URL from env:DATABASE_URL — confirming gbrain is picking up the project's value, not the configured one. gbrain doctor from /tmp connects cleanly.

Fix

New helper buildGbrainEnv() in bin/gstack-gbrain-sync.ts:

  • Reads ~/.gbrain/config.json.
  • Returns { ...process.env, DATABASE_URL: cfg.database_url }.
  • Skipped when GSTACK_RESPECT_ENV_DATABASE_URL=1 is set (escape hatch for users whose brain genuinely lives in the project DB).
  • One-line log when it overrides a caller value.

Threaded through every gbrain spawn in runCodeImport, runMemoryIngest, and runBrainSyncPush. lib/gbrain-sources.ts already exposed an env option for exactly this case — wired up ensureSourceRegistered(... { env }) and sourcePageCount(id, env).
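
A minimal sketch of what such a helper could look like, based only on the bullets above. The signature is hypothetical: `baseEnv` and the injectable `readConfig` parameter are assumptions added so the sketch can be exercised without touching a real `~/.gbrain/config.json`. Returning a copy in every branch and writing the log line to stderr are also choices of this sketch, not confirmed details of the patch.

```typescript
import { readFileSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

type Env = Record<string, string | undefined>;

// Default reader: loads ~/.gbrain/config.json as text (throws if the file is missing).
const readDefaultConfig = (): string =>
  readFileSync(join(homedir(), ".gbrain", "config.json"), "utf8");

// Hypothetical signature: both parameters exist so the sketch is testable.
export function buildGbrainEnv(
  baseEnv: Env = process.env,
  readConfig: () => string = readDefaultConfig,
): Env {
  // Start from a copy so the result can be passed straight to spawnSync({ env }).
  const env: Env = { ...baseEnv };

  // Escape hatch: keep whatever DATABASE_URL the caller already has.
  if (baseEnv.GSTACK_RESPECT_ENV_DATABASE_URL === "1") return env;

  let configured: unknown;
  try {
    configured = JSON.parse(readConfig()).database_url;
  } catch {
    // Assumed behavior: a missing or unparseable config means nothing to seed.
    return env;
  }
  if (typeof configured !== "string" || configured === "") return env;

  // One-line log only when a caller-supplied value is actually replaced.
  if (baseEnv.DATABASE_URL !== undefined && baseEnv.DATABASE_URL !== configured) {
    console.error(
      "[gbrain-sync] seeded DATABASE_URL from ~/.gbrain/config.json (overrode value from caller env / .env.local)",
    );
  }
  env.DATABASE_URL = configured;
  return env;
}
```

Each spawn site would then pass the result as `env`, for example `spawnSync("gbrain", args, { env: buildGbrainEnv() })`.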

Why not just mutate process.env?

Tried that first. Doesn't work in Bun: child_process.spawnSync children receive Bun's startup env, not runtime mutations. Verified with a minimal repro — printenv in a Bun-spawned child returns the original .env.local value even after process.env.DATABASE_URL = .... Must pass env: explicitly. There's already a comment in lib/gbrain-sources.ts noting this exact caveat.
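
For illustration, a small sketch of the explicit-`env` pattern described above. `readVarInChild` is a hypothetical helper, and it spawns the current runtime (`process.execPath`) rather than `printenv` so it does not depend on a POSIX shell. It does not reproduce the Bun behavior reported here; it only shows that a value placed in the `env` option is what the child sees.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: runs a one-line script in a child process and
// returns the value the child sees for the given environment variable.
export function readVarInChild(
  name: string,
  env: Record<string, string | undefined>,
): string {
  const script = `process.stdout.write(process.env[${JSON.stringify(name)}] || "")`;
  const result = spawnSync(process.execPath, ["-e", script], { env, encoding: "utf8" });
  if (result.error) throw result.error;
  return result.stdout;
}

// Build the child's environment explicitly instead of mutating process.env.
const childEnv = { ...process.env, DATABASE_URL: "postgresql://brain.example/db" };
console.log(readVarInChild("DATABASE_URL", childEnv));
```

Because the child's environment is an ordinary object built at the call site, the override no longer depends on how the runtime snapshots `process.env`.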

Verification

End-to-end run from the repro project after the patch:

[gbrain-sync] seeded DATABASE_URL from ~/.gbrain/config.json (overrode value from caller env / .env.local)
[gbrain-sync] mode=incremental engine=supabase

gstack-gbrain-sync (incremental):
  OK    code         registered + synced gstack-code-org-fd8a2521-d5b972 (page_count=575) (289.7s)
  OK    memory       gbrain import: 9 imported, 0 unchanged, 0 failed (36.2s)
  OK    brain-sync   curated artifacts pushed (0.4s)

  3 ok, 0 error, 0 skipped

Upstream gbrain consideration

The underlying root cause is on gbrain's side: the CLI shouldn't let CWD .env.local override its own ~/.gbrain/config.json. Either config wins over CWD dotenv, or gbrain namespaces its connection var (GBRAIN_DATABASE_URL) so collisions are impossible. This patch is a gstack-side workaround; happy to also file a gbrain bug.
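
To make the two upstream options concrete, here is a hedged sketch of a precedence rule that combines them. This is not gbrain's current behavior or API: `resolveDatabaseUrl`, the `UrlSources` shape, and the ordering are all hypothetical, and `GBRAIN_DATABASE_URL` is only the name proposed above.

```typescript
// Hypothetical inputs a CLI could gather before connecting.
interface UrlSources {
  namespacedEnv?: string; // proposed GBRAIN_DATABASE_URL
  configFile?: string; // database_url from ~/.gbrain/config.json
  genericEnv?: string; // DATABASE_URL, possibly loaded from cwd .env.local
}

// Assumed precedence: namespaced var, then config file, then generic var.
export function resolveDatabaseUrl(
  sources: UrlSources,
): { url: string; source: string } | undefined {
  if (sources.namespacedEnv) {
    return { url: sources.namespacedEnv, source: "env:GBRAIN_DATABASE_URL" };
  }
  if (sources.configFile) {
    return { url: sources.configFile, source: "config" };
  }
  if (sources.genericEnv) {
    return { url: sources.genericEnv, source: "env:DATABASE_URL" };
  }
  return undefined;
}
```

Under this ordering a project's `DATABASE_URL` is used only when gbrain has no configuration of its own, which is the collision the workaround in this PR avoids from the gstack side.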

Test plan

  • bun run bin/gstack-gbrain-sync.ts --help parses
  • bun run bin/gstack-gbrain-sync.ts --dry-run emits the seed-env log line
  • Real incremental sync from inside a Next.js project: all 3 stages OK (575 pages indexed)
  • Escape hatch: GSTACK_RESPECT_ENV_DATABASE_URL=1 bun run ... returns process.env unchanged (verified by reading the buildGbrainEnv code path)

🤖 Generated with Claude Code

…brain spawns

gbrain auto-loads .env.local from cwd via dotenv. When /sync-gbrain runs
inside a Next.js / Prisma / Rails project whose .env.local defines its
own DATABASE_URL (pointing at the app's local DB), gbrain picked that up
instead of its own ~/.gbrain/config.json URL — auth failed, code + memory
stages crashed; only the brain-sync git push survived.

Fix: new buildGbrainEnv() helper reads ~/.gbrain/config.json once, builds
a child-env dict with DATABASE_URL set to the gbrain-configured URL, and
threads it through every gbrain spawn in runCodeImport, runMemoryIngest,
and runBrainSyncPush (plus ensureSourceRegistered and sourcePageCount via
the env option lib/gbrain-sources.ts already exposes).

Cannot just mutate process.env — Bun's child_process.spawnSync children
get the original startup env, not runtime mutations. Must pass env:
explicitly. Comment in the helper records this caveat.

Escape hatch: GSTACK_RESPECT_ENV_DATABASE_URL=1 returns process.env
unchanged for the (rare) case where the user really does want gbrain to
use the project's local DB.

Repro before patch:
  cd <Next.js project with DATABASE_URL=localhost:5432/app in .env.local>
  /sync-gbrain
  → code: ERR source registration failed: gbrain not configured
  → memory: ERR password authentication failed for user "postgres"

After patch: all three stages OK, code source registered, 575 pages
indexed in the test repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@garrytan
Owner

Thank you @thehashrocket. Cherry-picked into v1.40.0.0 with attribution preserved (Co-Authored-By: Jason Shultz). Restructured per codex review into a centralized lib/gbrain-exec.ts helper; widened scope to also thread env into the gstack-memory-ingest grandchild. See commit 0fb7fa6. Consolidated v1.40.0.0 fix wave lands as PR #1547. Closing this one as superseded — your contribution is recognized in CHANGELOG.md and the commit metadata.

@garrytan garrytan closed this May 16, 2026
garrytan added a commit that referenced this pull request May 17, 2026
…n) (#1547)

* fix(gbrain-sync): fold hostname into code-source id hash + migration (#1414)

Cherry-picked from #1468 by 0xDevNinja and extended with the
hostname-fold migration that codex review surfaced.

Pre-fix `deriveCodeSourceId` hashed the absolute repo path alone, so two
machines with identical home-dir layouts (chezmoi-managed dotfiles,
ansible-provisioned VMs) derived the same id and clobbered each other's
`local_path` in a federated brain. Last-writer-wins, with cryptic "Not a
git repository" errors on the loser.

Hash key is now `${hostname}::${path}`. Conductor worktrees on a single
host stay distinct (path entropy unchanged within a host); cross-machine
federations stop colliding.
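
A sketch of the hostname fold, assuming a hex digest truncated to a short suffix. `hostPathHash` is a hypothetical name, and SHA-256 plus the 8-character length are assumptions; the commit message only states the hash key.

```typescript
import { createHash } from "node:crypto";
import { hostname as osHostname } from "node:os";

// Hypothetical helper: hashes "<hostname>::<path>" instead of the path alone.
// SHA-256 and the 8-character truncation are assumptions of this sketch.
export function hostPathHash(repoPath: string, hostname: string = osHostname()): string {
  return createHash("sha256")
    .update(`${hostname}::${repoPath}`)
    .digest("hex")
    .slice(0, 8);
}

// Same path on two machines now yields two different suffixes.
console.log(hostPathHash("/home/dev/app", "laptop-a"));
console.log(hostPathHash("/home/dev/app", "laptop-b"));
```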

Migration (D1=B + codex refinements): every existing user has a
pre-#1468 path-only-hash source id in their brain that no longer matches
what `deriveCodeSourceId` produces. Without migration, the next sync
registers a fresh source and orphans the old one. This commit adds:

- `derivePathOnlyHashLegacyId`: separate helper for the pre-#1468 form.
  Distinct from `deriveLegacyCodeSourceId` (pre-pathhash v1.x form);
  both probes run.

- `planHostnameFoldMigration`: feature-checks `gbrain sources rename
  <old> <new>` (exact argument shape, not just `--help`), gates on
  path-drift (skip migration if old source's `local_path` differs from
  current repo root), and falls back to register-new + sync-OK +
  remove-old when rename is unsupported. As of gbrain 0.35.0.0 the
  rename subcommand does not exist, so users go through the cleanup
  path; the rename path stays dormant until gbrain ships it.

- `removeOrphanedSource`: called only AFTER new-source sync verifies
  page_count > 0. Closes the data-loss window codex flagged where
  "register new, remove old before sync" can wipe pages if sync fails.

- `sourceLocalPath`: looks up a source's `local_path` from
  `gbrain sources list --json` for the drift gate.

- Helpers accept an optional `env` parameter so tests can inject a
  gbrain shim via PATH without process-wide PATH mutation (Bun's
  spawnSync doesn't pick up runtime PATH changes). Pre-positions for
  commit 4's centralized gbrain-exec helper.

- `if (import.meta.main)` guard around `main()` so the helpers can be
  imported for in-process unit tests.
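
The decision flow and the cleanup ordering in the list above can be sketched roughly as follows. `planMigration`, `runCleanup`, and their input shapes are hypothetical stand-ins, not the helpers named in this commit, and the sketch leaves out the case where the rename call itself fails and falls back to cleanup.

```typescript
type MigrationPlan = "none" | "skip-path-drift" | "rename" | "cleanup";

interface MigrationInputs {
  legacyId: string;
  newId: string;
  legacyLocalPath: string | null; // null when no legacy source is registered
  repoRoot: string;
  renameSupported: boolean;
}

// Pure decision step: which migration path applies, if any.
export function planMigration(inputs: MigrationInputs): MigrationPlan {
  if (inputs.legacyId === inputs.newId) return "none"; // ids already match
  if (inputs.legacyLocalPath === null) return "none"; // nothing to migrate
  if (inputs.legacyLocalPath !== inputs.repoRoot) return "skip-path-drift";
  return inputs.renameSupported ? "rename" : "cleanup";
}

interface CleanupOps {
  registerNew(): void;
  syncNew(): number; // returns the new source's page count
  removeOld(): void;
}

// Cleanup path: the old source is removed only after the new one has pages.
export function runCleanup(ops: CleanupOps): boolean {
  ops.registerNew();
  const pageCount = ops.syncNew();
  if (pageCount <= 0) return false; // sync produced nothing: keep the old source
  ops.removeOld();
  return true;
}
```

Keeping the decision pure and passing the side effects in as callbacks is one way to unit-test the ordering without a real gbrain binary.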

Tests cover: pure derivation, ids-match degenerate case, no-legacy
short-circuit, path-drift skip path, rename path with shim, cleanup
fallback when rename unsupported, cleanup fallback when rename call
itself fails, source-lookup happy/missing/error paths.

`GSTACK_HOSTNAME` env var is a test-only knob; production uses
`os.hostname()`.

Fixes #1414

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(gbrain-sync): cut source-id slugs on hyphen boundaries (+ #1357)

Cherry-picked from #1481 by drummerms and extended with the explicit
HTTPS-remote regression case for #1357 (decision D2=A).

`constrainSourceId` truncated the slug with `slug.slice(-tailBudget)`,
which cut mid-word when the boundary fell inside a token. For a repo
where the combined `prefix-org-repo-pathhash` exceeded 32 chars, this
produced embarrassing artifacts like `gstack-code-kill-270c0001-c32152`
(from `drummerms-av-sow-wiz-skill-270c0001`).

Two changes carried from #1481, adapted for the #1468 hostpathhash:

1. `constrainSourceId` now walks hyphen-separated tokens from the right,
   accumulating whole tokens until adding the next would exceed
   `tailBudget`. When no token fits, falls through to the existing
   `${prefix}-${hash}` form.

2. `deriveCodeSourceId` now retries with `repo-only-hostpathhash`
   (dropping the org segment) when the full `org-repo-hostpathhash`
   triggers truncation. Keeps the repo name readable when it fits at all.
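
A sketch of the token walk from change 1, under the assumption that tokens are joined with single hyphens. `tailOnTokenBoundary` is a hypothetical name and covers only the tail selection, not the prefix or hash handling in `constrainSourceId`.

```typescript
// Hypothetical helper: keeps whole hyphen-separated tokens from the right
// for as long as the joined result fits within tailBudget characters.
export function tailOnTokenBoundary(slug: string, tailBudget: number): string {
  const tokens = slug.split("-").filter((token) => token.length > 0);
  const kept: string[] = [];
  let length = 0;
  for (const token of [...tokens].reverse()) {
    // The first kept token needs no separator; later ones cost one hyphen.
    const next = length === 0 ? token.length : length + 1 + token.length;
    if (next > tailBudget) break;
    kept.unshift(token);
    length = next;
  }
  // An empty string means no token fit; the caller falls back to prefix + hash.
  return kept.join("-");
}

const slug = "drummerms-av-sow-wiz-skill-270c0001";
console.log(slug.slice(-13)); // old behavior: "kill-270c0001"
console.log(tailOnTokenBoundary(slug, 13)); // token walk: "270c0001"
```

With a 13-character budget, `skill-270c0001` would need 14 characters, so the walk stops after `270c0001` instead of cutting `skill` in half.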

Plus a new test asserting the source id is period-free for the exact
HTTPS-with-.git remote shape from #1357 (`https://github.com/foo/bar.git`).
canonicalizeRemote strips `.git`; the sanitizer strips any residual
non-alnum. The test closes #1357 by pinning the property.

Closes #1357

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(gbrain): probe CLI without command builtin

* fix(gbrain-sync): centralize gbrain spawn surface + seed DATABASE_URL

Cherry-picked from #1508 by jasshultz, restructured per codex review #4
and #7 to widen scope and centralize the spawn surface.

The bug: gbrain auto-loads .env.local from cwd via dotenv. When
/sync-gbrain runs inside a Next.js / Prisma / Rails project whose
.env.local defines its own DATABASE_URL (pointing at the app's local
DB), gbrain reads that value instead of its own
~/.gbrain/config.json — auth fails, code + memory stages crash.

This commit:

- Adds lib/gbrain-exec.ts: buildGbrainEnv, spawnGbrain, execGbrainJson,
  execGbrainText, spawnGbrainAsync (the last one for memory-ingest's
  streaming gbrain import call). buildGbrainEnv seeds DATABASE_URL from
  ${GBRAIN_HOME:-$HOME/.gbrain}/config.json, returns a fresh env object
  (never the caller's by identity — codex review #11), and honors the
  GSTACK_RESPECT_ENV_DATABASE_URL=1 escape hatch.

- Routes every gbrain spawn in bin/gstack-gbrain-sync.ts and
  bin/gstack-memory-ingest.ts through the helpers. Both files now own
  zero direct spawnSync("gbrain"|spawn("gbrain"|execFileSync("gbrain"
  call sites.

- Threads buildGbrainEnv into the spawnSync("bun", [memory-ingest], ...)
  grandchild in runMemoryIngest (codex review #7). Without this, the
  parent fix is half-baked — the bun child inherits a clean env but
  needs DATABASE_URL pre-seeded too. spawnGbrainAsync inside
  memory-ingest provides defense in depth for standalone invocations.

- Adds GBRAIN_HOME support — aligns with detectEngineTier (already
  honors GBRAIN_HOME) so all gstack-side gbrain calls agree on which
  config file matters. Resolves baseEnv.HOME first, then homedir(), so
  test injection works without process-wide HOME mutation.

- Adds test/build-gbrain-env.test.ts: 10 unit tests covering the
  env-seeding branches and related guards (seed from config / override
  caller / GSTACK_RESPECT escape hatch / missing config / unparseable
  config / no database_url field / GBRAIN_HOME path / object-identity
  guard / unrelated-vars preservation / idempotent-when-matches).

- Adds test/gbrain-exec-invariant.test.ts: static-source check that
  greps both bin/gstack-gbrain-sync.ts and bin/gstack-memory-ingest.ts
  for direct spawnSync("gbrain"|spawn("gbrain"|execFileSync("gbrain"|
  execSync(...gbrain matches and fails the build if any are found.
  Refactor-proof against future contributors adding a new gbrain spawn
  without env threading.
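
Of the items above, the config-path resolution is compact enough to sketch. This assumes the `${GBRAIN_HOME:-$HOME/.gbrain}/config.json` rule and the "baseEnv.HOME first, then homedir()" order stated in the GBRAIN_HOME bullet; `gbrainConfigPath` is a hypothetical name.

```typescript
import { homedir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: resolves which config.json a gbrain spawn should use.
// An empty GBRAIN_HOME or HOME is treated like an unset one, mirroring ":-".
export function gbrainConfigPath(
  baseEnv: Record<string, string | undefined> = process.env,
): string {
  const gbrainHome = baseEnv.GBRAIN_HOME || join(baseEnv.HOME || homedir(), ".gbrain");
  return join(gbrainHome, "config.json");
}

// Tests can inject a fake HOME without mutating the real process environment.
console.log(gbrainConfigPath({ HOME: "/tmp/fake-home" }));
```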

The invariant is intentionally narrow — only the two files where the
DATABASE_URL bug actually hurts users are guarded. Migrating the
spawn sites in lib/gbrain-local-status.ts, lib/gstack-memory-helpers.ts,
and bin/gstack-brain-context-load.ts is a follow-up.
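
A rough sketch of how a static-source invariant like this could be checked. The regular expression and `findDirectGbrainSpawns` are assumptions based on the patterns listed above, not the contents of test/gbrain-exec-invariant.test.ts.

```typescript
// Assumed pattern: a direct spawn of the literal "gbrain" binary, or an
// execSync call whose argument mentions gbrain.
const DIRECT_GBRAIN_SPAWN =
  /\b(?:spawnSync|spawn|execFileSync)\(\s*["']gbrain["']|\bexecSync\([^)]*\bgbrain\b/;

// Hypothetical helper: returns every source line that bypasses the wrappers.
export function findDirectGbrainSpawns(source: string): string[] {
  return source.split("\n").filter((line) => DIRECT_GBRAIN_SPAWN.test(line));
}

const sample = [
  'spawnSync("gbrain", ["doctor"]);', // direct call: should be flagged
  'spawnGbrain(["doctor"]);', // goes through the helper: allowed
].join("\n");
console.log(findDirectGbrainSpawns(sample).length);
```

A real test would read each guarded file from disk and fail when the returned list is non-empty.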

Co-Authored-By: Jason Shultz <jasshultz@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>

* fix(gbrain-sync): add .gbrain-source to consumer repo .gitignore (#1384)

The v1.29.0.0 changelog promised .gbrain-source would be added to the
consuming repo's .gitignore so the per-worktree pin stays local, but the
change actually only added it to gstack's own .gitignore. Without the
consumer-side entry, the pin gets committed and Conductor sibling
worktrees of the same repo + branch step on each other's pin every time
anyone commits.

Add ensureGbrainSourceGitignored after a successful gbrain sources
attach in runCodeImport. Idempotent on repeat runs (line-trim match),
creates .gitignore if missing, logs a warning and continues on
permission errors so a read-only checkout doesn't fail the sync.
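
The idempotent append can be sketched as a pure function over the file's text. `withGitignoreEntry` is a hypothetical name; reading and writing `.gitignore`, and the warn-and-continue handling of permission errors, are left out of this sketch.

```typescript
// Hypothetical helper: returns .gitignore content that contains `entry`,
// leaving the input untouched when a trimmed line already matches.
export function withGitignoreEntry(content: string, entry = ".gbrain-source"): string {
  const present = content.split(/\r?\n/).some((line) => line.trim() === entry);
  if (present) return content;
  // Add a newline first only when the existing content does not end with one.
  const separator = content === "" || content.endsWith("\n") ? "" : "\n";
  return `${content}${separator}${entry}\n`;
}

console.log(JSON.stringify(withGitignoreEntry("node_modules")));
```

Passing an empty string covers the create-when-missing case, since the caller can treat a missing file as empty content.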

Gate the top-level main() call behind import.meta.main so tests can
import the helper without triggering a full sync run on module load.

Tests in test/gbrain-source-gitignore.test.ts cover: create-when-missing,
append-without-trailing-newline, append-with-trailing-newline,
idempotent on repeat, recognize whitespace-surrounded entry, no-throw
on read-only file. 6 pass.

* fix(gbrain-sources): bump gbrain sources list --json timeout 10s → 30s

Supabase free-tier cold-starts can push `gbrain sources list --json` past
10s (observed 14.5s in the wild), causing probeSource() to throw ETIMEDOUT
during /sync-gbrain code stage even though the underlying CLI was healthy.
Matches the 30s ceiling already used by `sources add` / `sources remove`
in the same file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(brain-allowlist): sync project-root eng-review-test-plan artifacts (#1452)

Cherry-picked from #1465 by genisis0x and extended with the v1.40.0.0
upgrade migration that codex review #5 surfaced.

#1465 alone only patches bin/gstack-artifacts-init, which means fresh
installs and re-inits pick up the new pattern. But existing users who
already ran v1.38.1.0 have a `.migrations/v1.38.1.0.done` marker — that
migration won't re-run no matter what we change. So their installed
`.brain-allowlist`, `.brain-privacy-map.json`, and `.gitattributes` stay
without the new pattern, and `/plan-eng-review` artifacts continue to
silently drop out of their federation queue.

This commit:

- bin/gstack-artifacts-init: adds projects/*/*-eng-review-test-plan-*.md
  to the three managed blocks. v1.38.1.0 covered design + test-plan; this
  completes the set for /plan-eng-review.

- gstack-upgrade/migrations/v1.40.0.0.sh: targeted in-place repair for
  existing installs. Same idempotent jq-based shape as v1.38.1.0. Adds
  the new pattern to .brain-allowlist (before the USER ADDITIONS marker),
  .brain-privacy-map.json (as class=artifact), and .gitattributes (as
  merge=union). NEVER commits + pushes — the user controls when the
  patches ship to their federated artifacts repo.

- test/artifacts-init-migration.test.ts: 5 new tests covering the
  v1.40.0.0 migration applied on top of a post-v1.38.1.0 state, jq
  patching, gitattributes append, idempotent re-run, and done-marker
  write when files are missing entirely.

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(gbrain-install): skip postinstall on Windows MSYS/MINGW + post-install probe

Cherry-picked from #1487 by genisis0x and extended with the post-install
subcommand probe per T6 / codex review #19.

`bun install` in $INSTALL_DIR fails on Windows MSYS/MINGW/Cygwin shells
because gbrain's native postinstall script mis-parses path arguments
and aborts with a non-zero exit, breaking gstack-gbrain-install for
Windows users running git-bash/MSYS2. The package installs cleanly
without scripts.

This commit:

- Adds Windows shell detection via `uname -s` matching
  MINGW*/MSYS*/CYGWIN*/Windows_NT (#1487's case statement already covers
  all four — codex review #18 confirmed MINGW* is included). Windows
  paths get `bun install --ignore-scripts`; macOS and Linux unchanged.

- Adds a post-install probe of `gbrain sources --help`. `gbrain --version`
  already runs (D19 PATH-shadowing validation), but version success
  doesn't prove the subcommand surface is reachable — and
  `--ignore-scripts` may have skipped artifacts that subcommands need.
  Probe failure logs a clear warning (with Windows-specific remediation
  pointing at re-running `bun install` outside MSYS) but does NOT exit
  non-zero; users may still get value from gbrain even if the probe
  fails transiently.
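
The installer itself is a shell script, but the detection rule can be sketched in TypeScript for illustration. `isWindowsPosixShell` and `bunInstallArgs` are hypothetical names; the matching mirrors the MINGW*/MSYS*/CYGWIN*/Windows_NT cases listed above.

```typescript
// Hypothetical helper: classifies the output of `uname -s`.
export function isWindowsPosixShell(unameS: string): boolean {
  const name = unameS.trim();
  return /^(?:MINGW|MSYS|CYGWIN)/.test(name) || name === "Windows_NT";
}

// Windows shells skip lifecycle scripts; macOS and Linux are unchanged.
export function bunInstallArgs(unameS: string): string[] {
  return isWindowsPosixShell(unameS) ? ["install", "--ignore-scripts"] : ["install"];
}

console.log(bunInstallArgs("MINGW64_NT-10.0-19045").join(" "));
console.log(bunInstallArgs("Darwin").join(" "));
```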

Refs #1271

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: v1.40.0.0 — gbrain sync hardening wave

Bumps VERSION 1.39.2.0 → 1.40.0.0 (MINOR — substantial gbrain capability
hardening across sync pipeline, install path, federation allowlist;
~600 net LOC added across 8 community PRs + plan-review refinements).

CHANGELOG entry follows the release-summary format: two-line headline,
lead paragraph, "numbers that matter" with before/after table across 8
user-visible surfaces, "what this means for builders" closer, itemized
Added/Changed/Fixed/NOT fixed/For contributors sections.

Per-commit contributor credits: 0xDevNinja, drummerms, Jayesh Betala,
Jason Shultz, genisis0x. Also names NikhileshNanduri and realcarsonterry
in the wave's "Fixed" section for independent submissions of the
.gbrain-source gitignore bug.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: 0xDevNinja <manmit0x@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: drummerms <mike@av2o.com>
Co-authored-by: Jayesh Betala <jayesh.betala7@gmail.com>
Co-authored-by: Jason Shultz <jasshultz@gmail.com>
Co-authored-by: genisis0x <manietdavv@gmail.com>
anbangr added a commit to anbangr/gstack that referenced this pull request May 18, 2026
….1.0

Resolved version-slot collision: upstream's v1.40.0.0 was renumbered to
v1.40.0.5 in the CHANGELOG so it slots cleanly between fork's v1.40.0.0
(supervised-restart sweep fix) and v1.40.1.0 (test-framework detection).
All upstream code, tests, and migrations land verbatim — only the version
header on the changelog entry changed.

Conflict resolutions:
- VERSION: kept fork 1.40.1.0
- package.json: kept fork 1.40.1.0
- CHANGELOG.md: kept fork entries (1.40.1.0, 1.40.0.0 supervised-restart)
  + added upstream's gbrain-sync entry renumbered to 1.40.0.5 with a note
  explaining the renumber
- bin/gstack-artifacts-init: took upstream's additive patterns
  (projects/*/*-design-*.md, *-test-plan-*.md, *-eng-review-test-plan-*.md)

Upstream PRs landed in this merge (credits in 1.40.0.5 CHANGELOG entry):
- 0xDevNinja (hostname fold garrytan#1468)
- drummerms (hyphen-boundary cut garrytan#1481)
- Jayesh Betala (probe CLI garrytan#1485)
- Jason Shultz (DATABASE_URL seeding garrytan#1508 + timeout garrytan#1507)
- genisis0x (consumer gitignore garrytan#1521, allowlist eng-review garrytan#1465, Windows postinstall garrytan#1487)