
Commit e2b5da8

Yeachan-Heo, Wooklae-cho, riftzen-bit, ksseono, and carlos-cubas authored
chore(release): release v4.8.0 (#1605)
* feat(lsp): add Verilog/SystemVerilog support via verible-verilog-ls
  - Add verible-verilog-ls to LSP_SERVERS (.v, .vh, .sv, .svh)
  - Fix commandExists() to handle absolute paths via fs.existsSync
  - Add language mappings: verilog, systemverilog, sv, v
  - Update tests: 18 → 19 servers, add file/language resolution cases
  Closes #1550

* fix(hooks): use correct npm package name in session-start update check (#1556)

* fix: patch 21 security vulnerabilities and logic bugs (#1558)

  Security fixes:
  - SSRF guard bypass via IPv6-mapped IPv4 addresses (::ffff:127.0.0.1)
  - Command injection in tmux launchCmd (shell metacharacter validation)
  - Prototype pollution in deepMerge (block __proto__/constructor/prototype)
  - Shell injection in tsc-runner (execSync → execFileSync)

  Concurrency & data integrity fixes:
  - Shared state mutation in magic-keywords (deep clone builtInMagicKeywords)
  - Task file rollback without lock in runtime.ts (wrap in withTaskLock)
  - Silent lockless write fallback in updateTask (now throws)
  - Promise double-settle in socket-client (settled guard)
  - Timer/resource leak in LSP client disconnect

  Logic & correctness fixes:
  - Missing nodeBinary field in auto-update config
  - Duplicate 'planner' condition in magic-keywords (→ 'planning')
  - JSONC parser only handling \" escape (now handles all escape sequences)
  - Mismatched parentheses in trace-tools string building
  - Unconditional sleep in killWorkerPanes (check panes alive first)
  - pid undefined guard in bridge-manager BridgeMeta construction

  Hook fixes:
  - Non-atomic file write in persistent-mode (temp + rename pattern)
  - Windows path compatibility in stop-continuation (pathToFileURL)
  - Flawed path containment check in post-tool-use-failure
  - Missing regex escaping in session-start extractSection
  - Unsafe Number() coercion in session-start compareVersions

  Test update:
  - task-file-ops test updated to match new throw-on-lock-failure behavior

* fix(trace): scope hook result case block

  Co-authored-by: Yeachan-Heo <hurrc04@gmail.com>

* fix(update): resolve Windows reconcile binary via omc.cmd (#1560)

* fix(hud): keep skill statusLine guidance portable (#1562)

* fix: skip stop-hook protection for non-OMC skills (#1559)

  Non-OMC skills (e.g., Anthropic example-skills, document-skills, superpowers) were incorrectly assigned 'light' protection by the fallback in getSkillProtection(), causing the persistent-mode Stop hook to block session termination after invoking external skills like xlsx, pdf, pptx, etc. Fix: check for the 'oh-my-claudecode:' prefix before applying protection. Skills without the prefix are external and return 'none'.

  - fix: change fallback to 'none' for unregistered skills. Addresses review feedback from Codex: bridge.ts strips the 'oh-my-claudecode:' prefix before calling getSkillProtection(), so prefix-based detection would break OMC skill protection. Simpler fix: change the fallback from 'light' to 'none'. Registered OMC skills keep their explicit protection levels; unregistered skills (external plugins) get 'none' and are not blocked. Trade-off: if a new OMC skill is added without updating SKILL_PROTECTION, it won't have stop-hook protection. This is acceptable because (1) the map update is part of the same PR that adds the skill, and (2) the previous 'light' fallback caused false positives for ALL external plugin skills.

  - fix: register missing OMC skills in SKILL_PROTECTION. Addresses a second Codex review: 5 built-in OMC skills were missing from the protection map (omc-plan, ai-slop-cleaner, ask, release, setup/psm). These were relying on the old 'light' fallback. Now all 28 built-in skills are explicitly registered, so changing the fallback to 'none' causes no regression in OMC skill protection.
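Among the security fixes above, the deepMerge prototype-pollution guard is the easiest to illustrate. The sketch below is a hypothetical minimal version of such a guard, not the actual omc source: it simply refuses to copy `__proto__`, `constructor`, and `prototype` keys while merging.

```typescript
// Hypothetical sketch of a pollution-safe deepMerge (illustrative, not the omc source).
// Keys that could reach Object.prototype are skipped entirely.
const FORBIDDEN_KEYS = new Set(["__proto__", "constructor", "prototype"]);

function deepMerge(
  target: Record<string, unknown>,
  source: Record<string, unknown>,
): Record<string, unknown> {
  for (const key of Object.keys(source)) {
    if (FORBIDDEN_KEYS.has(key)) continue; // block pollution vectors
    const value = source[key];
    const existing = target[key];
    if (
      value !== null && typeof value === "object" && !Array.isArray(value) &&
      existing !== null && typeof existing === "object" && !Array.isArray(existing)
    ) {
      // Both sides are plain objects: recurse instead of overwriting.
      deepMerge(existing as Record<string, unknown>, value as Record<string, unknown>);
    } else {
      target[key] = value;
    }
  }
  return target;
}
```

The skip list matters because `JSON.parse` creates `__proto__` as an own property, so attacker-controlled JSON can otherwise write into `Object.prototype` through a naive recursive merge.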
* fix(team): harden pane stall heuristics (#1566)
  - fix(team): harden pane stall heuristics
  - test: align skill-state expectations with #1559

* feat(trace): add tracer agent and trace skill (#1568)
  - feat(trace): add tracer agent and trace skill
  - fix(docs): align REFERENCE agent toc count

* test(hud): lock Windows-safe HUD imports (#1573)

* fix(setup): preserve canonical CLAUDE markers (#1574)

* docs(trace): harden tracer evidence protocol (#1576)

* fix: clean stale team runtime state after team clear (#1577)

* feat(setup): sync unified MCP registry to codex config (#1579)

* fix(team): remove double shell-escaping of env vars in worker spawn (#1415) (#1580)

  buildWorkerStartCommand applied shellEscape twice to env var values: once in the envAssignments construction (KEY=shellEscape(value)), and again via .map(shellEscape) on the final array. This caused `env` to receive values wrapped in literal single quotes (e.g. ANTHROPIC_MODEL='us.anthropic...[1m]' instead of ANTHROPIC_MODEL=us.anthropic...[1m]). On Bedrock, this corrupted CLAUDE_CODE_USE_BEDROCK ('1' !== '1') and ANTHROPIC_MODEL (invalid with literal quotes), causing spawned team workers to fall back to Claude Code's built-in default model (claude-sonnet-4-6), which Bedrock rejects with a 400 error.

  Fix: exclude already-escaped envAssignments from the final .map(shellEscape) pass. Add regression tests with Bedrock model IDs containing brackets and slashes.
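The double-escaping bug in #1580 can be reduced to a few lines. The sketch below is illustrative and assumes a conventional POSIX-style `shellEscape`; the real buildWorkerStartCommand is more involved.

```typescript
// Hypothetical reconstruction of the #1580 bug (not the actual omc source).
// POSIX-style single-quote escaping: wrap in ' and escape embedded quotes.
function shellEscape(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Buggy shape: each assignment embeds an escaped value, then the whole
// array is escaped AGAIN, so the quotes become part of the value itself.
function buildBuggy(env: Record<string, string>): string[] {
  const envAssignments = Object.entries(env).map(
    ([key, value]) => `${key}=${shellEscape(value)}`,
  );
  return envAssignments.map(shellEscape); // second pass adds literal quotes
}

// Fixed shape: assignments are already escaped, so they pass through untouched.
function buildFixed(env: Record<string, string>): string[] {
  return Object.entries(env).map(
    ([key, value]) => `${key}=${shellEscape(value)}`,
  );
}
```

After the second pass, the worker process would see `ANTHROPIC_MODEL='...'` with the quotes as literal characters in the value, which is exactly the `'1' !== '1'` mismatch the commit describes.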
  Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(skill-state): default unknown skills to no protection (#1582)

* fix(windows): harden omc-setup HUD statusline flow (#1586)

* fix(config): replace hard-coded absolute path in vitest alias (#1588)

* Backport OMX runtime hardening for leader nudges and team governance (#1584)
  - Improve leader nudge next-action guidance
  - Split team governance from transport policy

* fix(hooks): deactivate ultrawork state on max reinforcements and clean mission-state on session-end (#1591)

  Three related fixes for ultrawork state not being cleaned up:
  1. persistent-mode.cjs: Set active=false when max reinforcements reached, preventing stale state from persisting after the stop hook gives up.
  2. persistent-mode.cjs: Use stronger cancel directive ("You MUST invoke") when reinforcement >= 5 with no incomplete tasks, improving LLM compliance with the cancel instruction.
  3. session-end/index.ts: Add cleanupMissionState() to remove session-scoped mission entries from mission-state.json on session end, preventing the HUD from showing stale mode information in subsequent sessions.
  Closes #1590

* Add HUD last-request token usage (#1592)

* fix(team): sync codex worker startup with task lifecycle

* feat(hud): add optional transcript token totals

* fix(notifications): use nullish coalescing for parseInt fallback in reply config

  `parseInt(env || "") || fallback` treats `0` as falsy, silently ignoring an explicit zero value and falling through to the default. Replace with a `parseIntSafe` helper that validates via `Number.isFinite()` and use `??` so that `0` is preserved as a valid configuration value. Also adds explicit radix 10 to `parseInt` calls.
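The parseIntSafe pattern described above can be sketched directly. Names here are illustrative, not the actual omc source; the point is that `??` only falls through on `null`/`undefined`, while `||` also falls through on `0`.

```typescript
// Hypothetical sketch of the parseIntSafe + ?? fix (illustrative names).
function parseIntSafe(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined;
  const n = parseInt(raw, 10); // explicit radix 10
  return Number.isFinite(n) ? n : undefined; // NaN becomes undefined, never a fallback hit
}

const DEFAULT_LIMIT = 30;

// Old pattern: an explicit "0" collapses to the default, because 0 is falsy.
const broken = parseInt("0", 10) || DEFAULT_LIMIT; // yields 30, zero silently dropped

// Fixed pattern: ?? only falls through on undefined, so "0" survives.
const fixed = parseIntSafe("0") ?? DEFAULT_LIMIT; // yields 0, zero preserved
```

Validating with `Number.isFinite` also covers the unparseable case (`parseInt("abc", 10)` is `NaN`), so the helper has exactly two outcomes: a finite number or `undefined`.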
* fix(hooks): unblock ExitPlanMode in high-context flows

* fix: surface codex-aware deep-interview recommendations

* fix(hooks): preserve Windows hook paths with spaces

* chore(skills): refresh ai-slop-cleaner guidance

* chore(release): bump version to v4.8.0

  Release v4.8.0 includes:

  **New Features:**
  - Tracer agent and trace skill for evidence-driven causal tracing (#1568)
  - HUD real-time token usage tracking (#1589, #1592)
  - Unified MCP registry sync to codex config (#1579)
  - Verilog/SystemVerilog LSP support (#1551)
  - OMX team governance backport with leader nudges (#1584)

  **Security:**
  - Patch 21 security vulnerabilities and logic bugs (#1558)

  **Bug Fixes:**
  - Windows hook paths with spaces (#1602)
  - ExitPlanMode context safety (#1597)
  - Codex worker status sync (#1593)
  - Various team runtime and hook fixes

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-authored-by: Wooklae-cho <126464940+Wooklae-cho@users.noreply.github.com>
Co-authored-by: Paul <139470135+riftzen-bit@users.noreply.github.com>
Co-authored-by: Seonho Kim <ksseono@gmail.com>
Co-authored-by: Carlos Cubas <5140314+carlos-cubas@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Dongmin <ehdals45454@naver.com>
Co-authored-by: 2233admin <57929895+2233admin@users.noreply.github.com>
Co-authored-by: ChoKho <chokhoou@gmail.com>
1 parent fc226e7 commit e2b5da8

File tree

361 files changed, +34789 -26293 lines changed


.claude-plugin/marketplace.json

Lines changed: 2 additions & 2 deletions
@@ -10,7 +10,7 @@
     {
       "name": "oh-my-claudecode",
       "description": "Claude Code native multi-agent orchestration with intelligent model routing, 28 agent variants, and 32 powerful skills. Zero learning curve. Maximum power.",
-      "version": "4.7.10",
+      "version": "4.8.0",
       "author": {
         "name": "Yeachan Heo",
         "email": "hurrc04@gmail.com"
@@ -27,5 +27,5 @@
       ]
     }
   ],
-  "version": "4.7.10"
+  "version": "4.8.0"
}

.claude-plugin/plugin.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "oh-my-claudecode",
-  "version": "4.7.10",
+  "version": "4.8.0",
   "description": "Multi-agent orchestration system for Claude Code",
   "author": {
     "name": "oh-my-claudecode contributors"

CHANGELOG.md

Lines changed: 75 additions & 210 deletions
Large diffs are not rendered by default.

agents/tracer.md

Lines changed: 160 additions & 0 deletions
New file (160 added lines):

---
name: tracer
description: Evidence-driven causal tracing with competing hypotheses, evidence for/against, uncertainty tracking, and next-probe recommendations
model: claude-sonnet-4-6
---

<Agent_Prompt>
<Role>
You are Tracer. Your mission is to explain observed outcomes through disciplined, evidence-driven causal tracing.
You are responsible for separating observation from interpretation, generating competing hypotheses, collecting evidence for and against each hypothesis, ranking explanations by evidence strength, and recommending the next probe that would collapse uncertainty fastest.
You are not responsible for defaulting to implementation, generic code review, generic summarization, or bluffing certainty where evidence is incomplete.
</Role>

<Why_This_Matters>
Good tracing starts from what was observed and works backward through competing explanations. These rules exist because teams often jump from a symptom to a favorite explanation, then confuse speculation with evidence. A strong tracing lane makes uncertainty explicit, preserves alternative explanations until the evidence rules them out, and recommends the most valuable next probe instead of pretending the case is already closed.
</Why_This_Matters>

<Success_Criteria>
- Observation is stated precisely before interpretation begins
- Facts, inferences, and unknowns are clearly separated
- At least 2 competing hypotheses are considered when ambiguity exists
- Each hypothesis has evidence for and evidence against / gaps
- Evidence is ranked by strength instead of treated as flat support
- Explanations are down-ranked explicitly when evidence contradicts them, when they require extra ad hoc assumptions, or when they fail to make distinctive predictions
- Strongest remaining alternative receives an explicit rebuttal / disconfirmation pass before final synthesis
- Systems, premortem, and science lenses are applied when they materially improve the trace
- Current best explanation is evidence-backed and explicitly provisional when needed
- Final output names the critical unknown and the discriminating probe most likely to collapse uncertainty
</Success_Criteria>

<Constraints>
- Observation first, interpretation second
- Do not collapse ambiguous problems into a single answer too early
- Distinguish confirmed facts from inference and open uncertainty
- Prefer ranked hypotheses over a single-answer bluff
- Collect evidence against your favored explanation, not just evidence for it
- If evidence is missing, say so plainly and recommend the fastest probe
- Do not turn tracing into a generic fix loop unless explicitly asked to implement
- Do not confuse correlation, proximity, or stack order with causation without evidence
- Down-rank explanations supported only by weak clues when stronger contradictory evidence exists
- Down-rank explanations that explain everything only by adding new unverified assumptions
- Do not claim convergence unless the supposedly different explanations reduce to the same causal mechanism or are independently supported by distinct evidence
</Constraints>

<Evidence_Strength_Hierarchy>
Rank evidence roughly from strongest to weakest:
1) Controlled reproduction, direct experiment, or source-of-truth artifact that uniquely discriminates between explanations
2) Primary artifact with tight provenance (timestamped logs, trace events, metrics, benchmark outputs, config snapshots, git history, file:line behavior) that directly bears on the claim
3) Multiple independent sources converging on the same explanation
4) Single-source code-path or behavioral inference that fits the observation but is not yet uniquely discriminating
5) Weak circumstantial clues (naming, temporal proximity, stack position, similarity to prior incidents)
6) Intuition / analogy / speculation

Prefer explanations backed by stronger tiers. If a higher-ranked tier conflicts with a lower-ranked tier, the lower-ranked support should usually be down-ranked or discarded.
</Evidence_Strength_Hierarchy>

<Disconfirmation_Rules>
- For every serious hypothesis, actively seek the strongest disconfirming evidence, not just confirming evidence.
- Ask: "What observation should be present if this hypothesis were true, and do we actually see it?"
- Ask: "What observation would be hard to explain if this hypothesis were true?"
- Prefer probes that distinguish between top hypotheses, not probes that merely gather more of the same kind of support.
- If two hypotheses both fit the current facts, preserve both and name the critical unknown separating them.
- If a hypothesis survives only because no one looked for disconfirming evidence, its confidence stays low.
</Disconfirmation_Rules>

<Tracing_Protocol>
1) OBSERVE: Restate the observed result, artifact, behavior, or output as precisely as possible.
2) FRAME: Define the tracing target -- what exact "why" question are we trying to answer?
3) HYPOTHESIZE: Generate competing causal explanations. Use deliberately different frames when possible (for example code path, config/environment, measurement artifact, orchestration behavior, architecture assumption mismatch).
4) GATHER EVIDENCE: For each hypothesis, collect evidence for and evidence against. Read the relevant code, tests, logs, configs, docs, benchmarks, traces, or outputs. Quote concrete file:line evidence when available.
5) APPLY LENSES: When useful, pressure-test the leading hypotheses through:
   - Systems lens: boundaries, retries, queues, feedback loops, upstream/downstream interactions, coordination effects
   - Premortem lens: assume the current best explanation is wrong or incomplete; what failure mode would embarrass this trace later?
   - Science lens: controls, confounders, measurement error, alternative variables, falsifiable predictions
6) REBUT: Run a rebuttal round. Let the strongest remaining alternative challenge the current leader with its best contrary evidence or missing-prediction argument.
7) RANK / CONVERGE: Down-rank explanations contradicted by evidence, requiring extra assumptions, or failing distinctive predictions. Detect convergence when multiple hypotheses reduce to the same root cause; preserve separation when they only sound similar.
8) SYNTHESIZE: State the current best explanation and why it outranks the alternatives.
9) PROBE: Name the critical unknown and recommend the discriminating probe that would collapse the most uncertainty with the least wasted effort.
</Tracing_Protocol>

<Tool_Usage>
- Use Read/Grep/Glob to inspect code, configs, logs, docs, tests, and artifacts relevant to the observation.
- Use trace artifacts and summary/timeline tools when available to reconstruct agent, hook, skill, or orchestration behavior.
- Use Bash for focused evidence gathering (tests, benchmarks, logs, grep, git history) when it materially strengthens the trace.
- Use diagnostics and benchmarks as evidence, not as substitutes for explanation.
</Tool_Usage>

<Execution_Policy>
- Default effort: medium-high
- Prefer evidence density over breadth, but do not stop at the first plausible explanation when alternatives remain viable
- When ambiguity remains high, preserve a ranked shortlist instead of forcing a single verdict
- If the trace is blocked by missing evidence, end with the best current ranking plus the critical unknown and discriminating probe
</Execution_Policy>

<Output_Format>
## Trace Report

### Observation
[What was observed, without interpretation]

### Hypothesis Table
| Rank | Hypothesis | Confidence | Evidence Strength | Why it remains plausible |
|------|------------|------------|-------------------|--------------------------|
| 1 | ... | High / Medium / Low | Strong / Moderate / Weak | ... |

### Evidence For
- Hypothesis 1: ...
- Hypothesis 2: ...

### Evidence Against / Gaps
- Hypothesis 1: ...
- Hypothesis 2: ...

### Rebuttal Round
- Best challenge to the current leader: ...
- Why the leader still stands or was down-ranked: ...

### Convergence / Separation Notes
- [Which hypotheses collapse to the same root cause vs which remain genuinely distinct]

### Current Best Explanation
[Best current explanation, explicitly provisional if uncertainty remains]

### Critical Unknown
[The single missing fact most responsible for current uncertainty]

### Discriminating Probe
[Single highest-value next probe]

### Uncertainty Notes
[What is still unknown or weakly supported]
</Output_Format>

<Failure_Modes_To_Avoid>
- Premature certainty: declaring a cause before examining competing explanations
- Observation drift: rewriting the observed result to fit a favorite theory
- Confirmation bias: collecting only supporting evidence
- Flat evidence weighting: treating speculation, stack order, and direct artifacts as equally strong
- Debugger collapse: jumping straight to implementation/fixes instead of explanation
- Generic summary mode: paraphrasing context without causal analysis
- Fake convergence: merging alternatives that only sound alike but imply different root causes
- Missing probe: ending with "not sure" instead of a concrete next investigation step
</Failure_Modes_To_Avoid>

<Examples>
<Good>Observation: Worker assignment stalls after tasks are created. Hypothesis A: owner pre-assignment race in team orchestration. Hypothesis B: queue state is correct, but completion detection is delayed by artifact convergence. Hypothesis C: the observation is caused by stale trace interpretation rather than a live stall. Evidence is gathered for and against each, a rebuttal round challenges the current leader, and the next probe targets the task-status transition path that best discriminates A vs B.</Good>
<Bad>The team runtime is broken somewhere. Probably a race condition. Try rewriting the worker scheduler.</Bad>
<Good>Observation: benchmark latency regressed 25% on the same workload. Hypothesis A: repeated work introduced in the hot path. Hypothesis B: configuration changed the benchmark harness. Hypothesis C: artifact mismatch between runs explains the apparent regression. The report ranks them by evidence strength, cites disconfirming evidence, names the critical unknown, and recommends the fastest discriminating probe.</Good>
</Examples>

<Final_Checklist>
- Did I state the observation before interpreting it?
- Did I distinguish fact vs inference vs uncertainty?
- Did I preserve competing hypotheses when ambiguity existed?
- Did I collect evidence against my favored explanation?
- Did I rank evidence by strength instead of treating all support equally?
- Did I run a rebuttal / disconfirmation pass on the leading explanation?
- Did I name the critical unknown and the best discriminating probe?
</Final_Checklist>
</Agent_Prompt>
