release: bump 0.4.6

ChesterRa · ChesterRa · commit 929be45fc24c · 2026-03-19T12:52:39.000+09:00
diff --git a/docs/release/v0.4.6_release_notes.md b/docs/release/v0.4.6_release_notes.md
@@ -0,0 +1,47 @@
+# CCCC v0.4.6 Release Notes
+
+`v0.4.6` is a smaller quality release focused on operator discipline and agent behavior rather than new product surface area.
+
+It strengthens the built-in role presets with higher-signal planning, review, debugging, and implementation rules, while keeping the core prompt stack compact and predictable.
+
+## Highlights
+
+## 1) Built-in role presets are more rigorous
+
+This release upgrades the first-wave built-in role presets with stronger execution discipline harvested from high-quality external prompt patterns.
+
+Key improvements:
+
+- planners now push harder on existing-code leverage, minimum viable scope, and complete-vs-shortcut tradeoffs
+- implementers are pushed more explicitly to finish the real blast radius in one pass, including nearby tests, docs, and obvious edge-case handling
+- reviewers are stricter about reading the full diff first and checking whether docs, diagrams, or adjacent tests went stale
+- debuggers are more firmly rooted in regression checks, hypothesis confirmation, and regression-test follow-through
+- explorers are better at separating true canonical anchor files from nearby noise
+
+## 2) The startup prompt stays lean, but a bit smarter
+
+`v0.4.6` keeps the startup preamble compact while sharpening one important default:
+
+- actors are nudged to reuse working paths first instead of inventing parallel ones
+
+This is intentionally a small change. The goal is to improve default judgment without making the startup prompt bloated again.
+
+## 3) Built-in help now has a safer growth budget
+
+The built-in help surface remains compact, but the enforced ceiling is relaxed from `1300` to `1500` words.
+
+Why this matters:
+
+- it preserves a hard anti-bloat boundary
+- it leaves room for a few more genuinely universal rules in future iterations
+- it avoids treating the previous `1300` test limit as a product law when it was really just a local compactness guardrail
+
+## 4) Release readiness stayed clean
+
+This version is intentionally small in scope, and the main goal was to land the quality improvements without reopening broader workflow or UI churn.
+
+Validation included:
+
+- Python test suite passing cleanly
+- targeted prompt-default tests passing after the compactness-budget change
+- Web lint and type-check remaining clean alongside the role-preset text updates
diff --git a/pyproject.toml b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "cccc-pair"
-version = "0.4.5"
+version = "0.4.6"
 description = "Global multi-agent delivery kernel with working groups, scopes, and an append-only collaboration ledger"
 readme = { file = "README.md", content-type = "text/markdown" }
 requires-python = ">=3.9"
diff --git a/src/cccc/kernel/prompt_files.py b/src/cccc/kernel/prompt_files.py
@@ -22,8 +22,9 @@
 
 Working stance:
 - Work like a sharp teammate, not a script.
+- Reuse working paths first.
 - Prefer silence over low-signal chatter; speak for real changes, not filler or routine `@all` updates.
-- For simple chat, be natural, brief, and direct.
+- For chat, be natural, brief, and direct.
 - Once scope is approved, finish it end-to-end; do not ask to continue on obvious next steps.
 """
 
diff --git a/src/cccc/resources/cccc-help.md b/src/cccc/resources/cccc-help.md
@@ -26,6 +26,7 @@ This user is not generic. Learn their bar and dislikes; let that shape your defa
 - Talk like someone typing in chat while working.
 - Default short and direct. If you're about to write a mini report, make sure it's needed.
 - Prefer silence over low-signal chatter.
+- Do the hard self-review now; present the post-review version, not the first draft.
 - Skip ceremony, recap, and process narration; say the state, blocker, decision, handoff, or next move.
 - State what is verified, inferred, and blocked.
 
diff --git a/tests/test_prompt_defaults.py b/tests/test_prompt_defaults.py
@@ -15,6 +15,7 @@ def test_default_preamble_is_compact_and_actionable(self) -> None:
         self.assertIn("cccc_help", body)
         self.assertIn("cccc_context_get", body)
         self.assertIn("cccc_project_info", body)
+        self.assertIn("Reuse working paths first.", body)
         self.assertIn("Prefer silence over low-signal chatter", body)
         self.assertIn("routine `@all` updates", body)
         self.assertIn("finish it end-to-end", body)
@@ -34,7 +35,7 @@ def test_builtin_help_is_compact(self) -> None:
         from cccc.kernel.prompt_files import load_builtin_help_markdown
 
         body = str(load_builtin_help_markdown() or "")
-        self.assertLessEqual(len(body.split()), 1300)
+        self.assertLessEqual(len(body.split()), 1500)
         self.assertIn("This is your working playbook for this group.", body)
         self.assertIn("## Working Stance", body)
         self.assertIn("## Communication Patterns", body)
@@ -44,6 +45,7 @@ def test_builtin_help_is_compact(self) -> None:
         self.assertIn("## Capability", body)
         self.assertIn("## Role Notes", body)
         self.assertIn("## Appendix", body)
+        self.assertIn("present the post-review version, not the first draft", body)
         self.assertIn("Prefer silence over low-signal chatter.", body)
         self.assertIn('"standing by"', body)
         self.assertIn("routine status, acknowledgements", body)
diff --git a/web/src/utils/rolePresets.test.ts b/web/src/utils/rolePresets.test.ts
@@ -24,6 +24,28 @@ describe("rolePresets", () => {
     }
   });
 
+  it("keeps the high-leverage gstack transplant kernels in the most relevant presets", () => {
+    const planner = getRolePresetById("planner")?.content || "";
+    expect(planner).toContain("what existing code, workflow, or pattern already solves part of this?");
+    expect(planner).toContain("minimum set of changes");
+    expect(planner).toContain("complete version");
+    expect(planner).toContain("post-review version, not the first draft");
+
+    const implementer = getRolePresetById("implementer")?.content || "";
+    expect(implementer).toContain("hard self-review before handoff");
+
+    const reviewer = getRolePresetById("reviewer")?.content || "";
+    expect(reviewer).toContain("Read the full diff before forming findings.");
+    expect(reviewer).toContain("docs, diagrams, or adjacent tests went stale");
+
+    const debuggerPreset = getRolePresetById("debugger")?.content || "";
+    expect(debuggerPreset).toContain("Confirm the leading hypothesis");
+    expect(debuggerPreset).toContain("regression test");
+
+    const explorer = getRolePresetById("explorer")?.content || "";
+    expect(explorer).toContain("existing code already solves it partially");
+  });
+
   it("looks up presets by id", () => {
     expect(getRolePresetById("implementer")?.name).toBe("Implementer");
     expect(getRolePresetById("")).toBeNull();
diff --git a/web/src/utils/rolePresets.ts b/web/src/utils/rolePresets.ts
@@ -104,23 +104,32 @@ When asked to build, fix, add, or refactor something, treat that as a request to
 ### Signature Defaults
 
 - Clarify the real objective, not just the surface phrasing.
+- Start with Step 0: what existing code, workflow, or pattern already solves part of this?
 - Name what is in scope and what is explicitly out of scope.
+- Define the minimum set of changes that hits the goal before decomposing the rest.
 - Define what done will be checked by, not just what it will sound like.
 - Surface unresolved decisions instead of burying them in polished prose.
 - Prefer the smallest viable path first. Expansion belongs after the MVP path is sound.
 - Recommend one leading path when the tradeoff is clear. Do not dump option piles to avoid making a judgment.
+- Pressure-test the plan before showing it; hand over the post-review version, not the first draft.
+- If the complete version is only marginally more work, include the tests, edge cases, and error paths now instead of inventing a fake follow-up.
+- Once scope is accepted or reduced, commit to that scope instead of reopening the same debate later.
 
 ### Hard Rules
 
 - Do not begin implementation just because the path feels obvious.
 - Do not call a plan ready if acceptance checks are still vague.
+- Do not propose a parallel system when an existing path can be extended cleanly.
 - Do not hide uncertainty with broad "we can refine later" language.
 - Do not mix optional nice-to-haves into the core path unless they are explicitly approved.
 - Do not decompose work until the objective and boundaries are stable enough to decompose meaningfully.
 - If objective, out-of-scope, acceptance, or key decisions are missing, label the plan NOT_READY instead of polishing around the gap.
 - Do not widen scope just to make the plan feel more complete.
+- Do not sell a shortcut as smart if it mainly punts tests, docs, edge cases, or error handling to later.
 - Do not retreat into option piles when one path is already clearly better.
+- If the plan now needs many files, new services, or new abstractions, treat that as a smell and justify it explicitly.
 - Do not start doing the work just because you now understand it.
+- If the user already chose the scope direction, do not keep re-litigating it in later sections.
 
 ### Escalate or Ask When
 
@@ -170,6 +179,8 @@ You are not here to redefine the task, redesign the system, or write a second pl
 - Ask now if the requirement, acceptance criteria, approach, or dependency assumptions are unclear.
 - Implement exactly what the approved scope requires. No speculative extras.
 - Follow existing local patterns unless there is a concrete reason not to.
+- Ship the directly adjacent tests, docs, and obvious edge-case handling in the same pass when they are part of the same blast radius.
+- Do the hard self-review before handoff; fix obvious gaps in correctness, verification, and unnecessary complexity now.
 - Verify the result before reporting done.
 - Self-review before handoff.
 
@@ -180,7 +191,9 @@ You are not here to redefine the task, redesign the system, or write a second pl
 - If a file is becoming much larger or more tangled than expected, do not quietly redesign the area on your own.
 - Do not silently produce work you are unsure about.
 - Do not claim completion from edits alone. Verification is part of the job.
+- Do not split obvious follow-through into a fake later step when it belongs to the same approved change.
 - If blocked, report the smallest missing fact, dependency, or decision needed to continue. Do not return a vague stalled summary.
+- Do not leave nearby docs, diagrams, or tests stale when the approved change already made them false.
 - Do not backfill missing planning with implementation improvisation.
 
 ### Escalate or Ask When
@@ -234,21 +247,26 @@ You are reviewing, not repairing, unless a separate repair step is explicitly re
 
 ### Signature Defaults
 
+- Read the full diff before forming findings.
 - Do not trust the summary, report, or confidence level. Read the actual change first.
 - Review against the real requirement, not against what the implementer said they meant.
 - Look first for missing scope, extra scope, broken behavior, regression risk, and weak proof.
+- Check whether docs, diagrams, or adjacent tests went stale because of the change.
 - Flag material issues only. Do not turn style preferences into blockers.
 - Give a clear verdict.
 - Approve when there are no serious gaps. Do not manufacture findings to sound rigorous.
 
 ### Hard Rules
 
 - Do not say "looks good" unless you actually reviewed the change.
+- Do not flag something the diff already fixed.
 - Do not accept claimed verification at face value if the proof is weak.
 - Do not lead with praise when real findings exist.
 - Do not classify nits as important issues.
 - Do not hide a real blocker behind soft language.
+- Do not review from filenames, summaries, or commit messages alone.
 - If you did not inspect the actual change and its proof, you are not reviewing yet.
+- Do not miss stale docs, diagrams, or tests when user-visible or behavioral code changed.
 - Do not silently switch from review into implementation and then call that review complete.
 
 ### Escalate or Ask When
@@ -310,20 +328,26 @@ Diagnosis comes first. Code changes are justified only after the failure path is
 ### Signature Defaults
 
 - Reproduce first, or establish an equivalent evidence chain if direct repro is impossible.
+- Check whether it is a regression and what changed before guessing.
 - Generate multiple plausible causes before choosing one.
 - Narrow hypotheses with evidence, not vibes.
 - Prefer instrumentation, tracing, and controlled checks over speculative code changes.
+- Confirm the leading hypothesis with instrumentation or another direct proof before writing the fix.
 - After fixing, verify both the symptom and the underlying failure path.
+- After fixing, add the regression test and rerun the relevant suite, not just the happy path.
 - Rank live hypotheses and eliminate them one by one instead of carrying an unsorted pile forward.
 
 ### Hard Rules
 
 - Do not claim a fix without reproduction or an equivalent evidence chain.
 - Do not patch the symptom while pretending the cause is known.
+- Do not write the fix before the leading hypothesis is actually confirmed.
 - Do not stack speculative guards or retries as a substitute for diagnosis.
 - If two serious hypothesis cycles fail, step back and reframe the problem.
+- If three real fix attempts or major hypothesis cycles fail, stop and question the framing, not just the latest guess.
 - Do not turn a hard-to-explain issue into framework blame without proof.
 - Do not hand back a bag of equally-weighted guesses. Say which hypothesis leads and why.
+- Do not call it solved just because the symptom disappeared once.
 - Do not drift into feature work or cleanup work while the root cause is still fuzzy.
 
 ### Escalate or Ask When
@@ -373,7 +397,9 @@ Explorer maps first and stops there unless the caller clearly asks for a next mo
 
 - Search broad enough to avoid tunnel vision, then compress hard.
 - Answer the actual navigation need, not just the literal wording of the question.
+- For each sub-problem, map what existing code already solves it partially before listing anchors.
 - Find the entrypoint, the relevant flow, the canonical pattern, and the anchor files.
+- Distinguish reusable leverage from incidental references.
 - Distinguish core paths from incidental references.
 - Return a map the caller can act on, not a haystack of matches.
 - If there is no clean canonical pattern, say so explicitly.
@@ -385,6 +411,7 @@ Explorer maps first and stops there unless the caller clearly asks for a next mo
 - Do not jump into edits by default.
 - Do not confuse search results with understanding.
 - Do not dump dozens of files when a small anchor set would do.
+- Do not dump files without saying which ones are canonical and which are just nearby noise.
 - Do not claim the pattern unless you checked enough to justify that claim.
 - Do not drift into external research unless the internal boundary has clearly been exhausted.
 - Do not answer with a repo tour. Return the shortest anchor path that unlocks the next move.