Skip to content

feat(deepinit): manifest-based incremental update to avoid full re-scan #1719

@anduinnn

Description

@anduinnn

Problem

Currently, /deepinit has no change detection mechanism. Every run performs a full scan of the entire project:

  1. explore agent scans all directories
  2. Creates todo items for every directory
  3. Reads files and existing AGENTS.md in every directory
  4. Relies on LLM judgment to decide what to update (non-deterministic)

This means even if you only added/deleted a few files, the entire project gets re-scanned, wasting significant tokens.

Token cost estimate (30-directory project, 2 files changed)

Step Current (full scan) With incremental
Directory scanning ~2K tokens ~200 tokens (manifest diff)
Read existing AGENTS.md 30 × ~500 = ~15K 2 × ~500 = ~1K
Read directory files 30 dirs = ~20K 2 dirs = ~1.5K
Content generation 30 × ~1K = ~30K 2 × ~1K = ~2K
Total ~67K tokens ~5K tokens

~90% token savings for incremental runs.

Proposed Solution

Manifest-based incremental deepinit

Use a manifest file to record directory snapshots at generation time. On subsequent runs, compare the current directory tree against the manifest to determine which directories need updating.

1. Manifest file (.omc/deepinit-manifest.json)

{
  "version": 1,
  "lastRun": "2026-03-17T10:00:00Z",
  "directories": {
    "src/hooks": {
      "files": ["bridge.ts", "index.ts", "codebase-map.ts"],
      "generatedAt": "2026-03-17T10:00:00Z"
    },
    "src/tools/lsp": {
      "files": ["index.ts", "client.ts", "manager.ts"],
      "generatedAt": "2026-03-17T10:00:00Z"
    }
  }
}

Each directory entry stores only the file name list (not content hashes). AGENTS.md describes directory-level structure and purpose — file content changes (code logic modifications) don't affect AGENTS.md accuracy.

2. Change detection (deterministic, code-level — not LLM judgment)

New TypeScript module: src/tools/deepinit-manifest.ts

function detectChanges(manifest: Manifest): DiffResult {
  const current = scanDirectoryTree();   // fs.readdirSync traversal
  const result = { added: [], deleted: [], modified: [], unchanged: [] };

  for (const [dir, files] of current) {
    const prev = manifest.directories[dir];
    if (!prev) {
      result.added.push(dir);
    } else if (!sameFileList(files.sort(), prev.files.sort())) {
      result.modified.push(dir);
    } else {
      result.unchanged.push(dir);
    }
  }

  for (const dir of Object.keys(manifest.directories)) {
    if (!current.has(dir)) result.deleted.push(dir);
  }

  return result;
}

Pure readdirSync traversal (excluding node_modules, .git, dist, etc.) — completes in hundreds of milliseconds, zero token cost.

3. Change categories

Category Detection Action
new Directory exists but not in manifest Generate new AGENTS.md
deleted In manifest but directory no longer exists Remove orphaned AGENTS.md
modified File list differs (files added/removed) Regenerate AGENTS.md
unchanged File list identical Skip entirely
ancestor Has child directory with changes Update Subdirectories table only

4. Run modes

Command Behavior
/deepinit Incremental mode (default) — only process changed directories
/deepinit --full Force full regeneration
/deepinit --dry-run Show which directories need updating, without executing
/deepinit --check Validate existing AGENTS.md consistency against directory state

5. Updated SKILL.md workflow

Before:
  Step 1: Full directory scan → Step 2: Plan all dirs → Step 3: Generate all → Step 4: Validate all

After:
  Step 1: Run manifest diff tool → Get list of changed directories
  Step 2: Create tasks only for changed directories
  Step 3: Process only changed directories → Generate/update AGENTS.md
  Step 4: Update manifest file

6. Edge cases

Case Handling
First run (no manifest) Full generation, create manifest afterward
Directory renamed Detected as deleted + new; old removed, new generated
Only code content changed, same file list Skipped (correct — AGENTS.md describes structure, not implementation)
Manifest file corrupted/deleted Treated as first run

Design Decisions

Why not use git diff?

In practice, users often run /deepinit after multiple commits. Git diff across many commits produces a final-state diff that may miss intermediate structural changes (e.g., file added in commit 1, deleted in commit 3 — git diff shows nothing). The manifest approach compares "state at last generation" vs "state now", which is always accurate regardless of how many commits occurred in between.

Why file lists instead of content hashes?

AGENTS.md documents describe directory-level purpose, file descriptions, and relationships. These don't change when code logic within a file is modified — only when files are added, removed, or renamed. File list comparison is sufficient and avoids the overhead of computing hashes.

Labels

enhancement, agents

Metadata

Metadata

Assignees

No one assigned

    Labels

    agentsRelated to agent definitions

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions