Commit b36b93d

merge dev and bump to v1.6.0
2 parents 9ce60c0 + 11f56bc commit b36b93d

46 files changed: +4254 -556 lines changed

README.md

Lines changed: 26 additions & 24 deletions
@@ -4,9 +4,8 @@
 
 A continuous, real-time runtime diagnostics suite for ComfyUI featuring **LLM-powered analysis**, **interactive debugging chat**, and **50+ fix patterns**. Automatically intercepts all terminal output from startup, captures complete Python tracebacks, and delivers prioritized fix suggestions with node-level context extraction. Now supports **JSON-based pattern management** with hot-reload and **full i18n support** for 9 languages (en, zh_TW, zh_CN, ja, de, fr, it, es, ko).
 
-## Repository Structure (Quick Jump)
+## Table of Contents
 
-**README sections**
 - [Latest Updates](#latest-updates-jan-2026---click-to-expand)
 - [Features](#features)
 - [Installation](#installation)
@@ -19,27 +18,25 @@ A continuous, real-time runtime diagnostics suite for ComfyUI featuring **LLM-po
 - [CSP Compatibility](#csp-compatibility)
 - [Contributing](#contributing)
 
-**Key directories / files**
-- [`web/`](web) — Frontend extension (sidebar UI, Preact islands)
-- [`services/`](services) — PromptComposer, diagnostics (F14), token budget, log ring buffer
-- [`pipeline/`](pipeline) — Analysis pipeline stages + plugin system
-- [`patterns/`](patterns) — JSON error patterns (builtin/community/custom) + schema
-- [`tests/`](tests) — Pytest + Playwright E2E (`tests/e2e/`)
-- [`docs/`](docs) — Documentation + reference snapshots
-- [`scripts/`](scripts) — Local tooling (CI gates, validators, migration helpers)
-- [`ROADMAP.md`](ROADMAP.md) — Architecture diagram + development status
+## Latest Updates (Jan 2026) - Click to expand
 
-```text
-web/ Frontend extension (Doctor sidebar UI)
-services/ Backend services (prompt composer, diagnostics, token budget, log ring buffer)
-pipeline/ Analysis pipeline stages + plugins
-patterns/ JSON error patterns + schema
-tests/ Python tests + Playwright E2E tests
-docs/ Documentation + reference snapshots
-scripts/ Dev/CI tooling and helpers
-```
+<details>
+<summary><strong>New: F14 Proactive Diagnostics (Health Check + Intent Signature)</strong></summary>
 
-## Latest Updates (Jan 2026) - Click to expand
+- Added a **Diagnostics** section to the **Statistics** tab for proactive workflow troubleshooting (no LLM required).
+- **Health checks**: workflow lint + environment/deps + privacy/safety checks, with actionable issues.
+- **Intent Signature (ISS)**: deterministic intent inference with **top intents + evidence** to help triage what the workflow is “trying to do”.
+- Includes UX hardening: safe fallbacks (e.g. “No dominant intent detected”) and improved evidence sanitization.
+
+</details>
+
+<details>
+<summary><strong>(v1.5.8) QoL: Auto-open Right Error Report Panel Toggle</strong></summary>
+
+- Added a **dedicated toggle** in **Doctor → Settings** to control whether the **right-side error report panel** auto-opens when a new error is detected.
+- **Default: ON** for new installs, and the choice is persisted.
+
+</details>
 
 <details>
 <summary><strong> (v1.5.0) Smart Token Budget Management</strong></summary>
@@ -653,7 +650,7 @@ The **Statistics Dashboard** provides real-time insights into your ComfyUI error
 
 ## Settings
 
-You can customize ComfyUI-Doctor behavior via the ComfyUI Settings panel (Gear icon).
+You can customize ComfyUI-Doctor behavior via the **Doctor sidebar → Settings** tab.
 
 ### 1. Show error notifications
 
@@ -662,8 +659,9 @@ You can customize ComfyUI-Doctor behavior via the ComfyUI Settings panel (Gear i
 
 ### 2. Auto-open panel on error
 
-**Function**: Automatically expands the Doctor sidebar when a new error is detected.
-**Usage**: **Recommended**. Provides immediate access to diagnostic results without manual clicking.
+**Function**: Automatically opens the **right-side error report panel** when a new error is detected.
+**Default**: **ON** (recommended).
+**Usage**: Disable if you prefer to keep the panel closed and open it manually.
 
 ### 3. Error Check Interval (ms)
 
@@ -709,6 +707,10 @@ You can customize ComfyUI-Doctor behavior via the ComfyUI Settings panel (Gear i
 
 > Note: **Trust & Health** and **Anonymous Telemetry** have moved to the **Statistics** tab.
 
+> Note: **F14 Proactive Diagnostics** is accessed from the **Statistics** tab → **Diagnostics** section.
+> Use **Run / Refresh** to generate a report, review the issues list, and use any provided actions (e.g. locate node / acknowledge).
+> If you want the report text in another language, set **Suggestion Language** in Settings first.
+
 ---
 
 ## API Endpoints

ROADMAP.md

Lines changed: 90 additions & 9 deletions
@@ -30,6 +30,8 @@ graph TD
 C --> K[DoctorLogProcessor]
 %% R14/R15: Log context ring buffer
 C --> LRB[services/log_ring_buffer.py]
+%% R18: Canonical data directory resolver (Desktop/Portable compatibility)
+C --> DP[services/doctor_paths.py]
 
 %% A6 Pipeline Architecture
 D --> PIPE[pipeline/orchestrator.py]
@@ -106,7 +108,11 @@ graph TD
 AU --> BC["specs/telemetry.spec.js - 8 tests - requires ComfyUI backend"]
 AV --> AH
 AV --> AJ
-
+
+%% T14: Frontend unit tests (fast, no browser)
+ATU[tests/unit/] --> VIT[vitest.config.js]
+ATU --> WU[web/utils/]
+
 %% R14/R15: Prompt composition + canonical system info
 B --> PC[services/prompt_composer.py]
 AB --> PC
@@ -132,6 +138,7 @@ graph TD
 | `rate_limiter.py` | ~130 | R7: Token bucket RateLimiter + async ConcurrencyLimiter |
 | `llm_client.py` | ~290 | R6: Retry with exponential backoff, Idempotency-Key, timeout budget |
 | `services/` | ~670 | R12: Token estimation, budget management, workflow pruning |
+| `services/doctor_paths.py` | ~120 | R18: Canonical, Desktop-safe data directory resolver for persistence |
 | `services/prompt_composer.py` | ~260 | R14: Unified structured context → prompt formatting (summary-first) |
 | `services/log_ring_buffer.py` | ~190 | R14/R15: Bounded execution log capture for context building |
 | `pattern_loader.py` | 300+ | JSON-based pattern management with hot-reload capability |
@@ -247,7 +254,7 @@ graph TD
 -**R13-P2**: Metadata contract + dependency policy (shared with **R0-P3**)
 -**R13-P3**: Context extraction provenance metadata
 -**R13-OPT1**: Optional signature policy (HMAC) for allowlisted plugins (shared-secret integrity check, not public signing)
--**R13-OPT2**: Plugin trust UI + `/doctor/plugins` scan-only endpoint ✅ *Completed (2026-01-09)*
+-**R13-OPT2**: Trust & Health UI + `/doctor/plugins` scan-only endpoint ✅ *Completed (2026-01-09; moved to Statistics on 2026-01-15)*
 
 ### 3.1 Security (in progress)
 
@@ -297,15 +304,37 @@ graph TD
 - Config: `config.py` `telemetry_enabled` setting (default: false)
 - 6 API endpoints: `/doctor/telemetry/{status,buffer,track,clear,export,toggle}`
 - Security: Origin check (403 for cross-origin), 1KB payload limit, field whitelist
-- Frontend: `doctor_telemetry.js`, Settings UI controls
+- Frontend: `doctor_telemetry.js`, Statistics UI controls (moved from Settings on *2026-01-15*)
 - i18n: 81 strings (9 keys × 9 languages)
 - E2E tests: 8 tests in `telemetry.spec.js`
 - **Implementation**: `.planning/260109-S1_S3_IMPLEMENTATION_RECORD.md`
+- **UI Migration Record**: `.planning/260115-SETTINGS_TO_STATS_IMPLEMENTATION_RECORD.md`
 
 ### 3.2 Robustness (in progress)
 
 *Sorted by priority (High → Low):*
 
+- [x] **R18**: ComfyUI Desktop/Portable Compatibility Hardening - 🔴 High ✅ *Completed (2026-01-23)*
+- **Problem**: Desktop packaging changes directory layout and stdout/stderr behavior; edge-case stream/logging failures can trigger log storms or break persistence (especially on Windows).
+- **Scope**:
+- Introduce a single **Doctor data-dir resolver** (prefer ComfyUI `--user-directory` / `folder_paths.get_user_directory()`; safe fallback when unavailable)
+- Migrate all persisted files to the resolved data dir (avoid writing under extension install dir):
+- `error_history.json`, `comfyui_debug_*.log`, API operation logs, diagnostics history, etc.
+- One-time migration/compat read for legacy `custom_nodes/ComfyUI-Doctor/logs/` locations
+- Persistence hardening for JSON stores:
+- Atomic writes (tmp → rename), corruption recovery (rotate + rebuild), and safety guardrails (max size/entries)
+- Runtime self-protection:
+- Reusable circuit breaker + 60s rate-limit + aggregation for repeated identical errors
+- Drop/ignore known non-actionable Desktop log spam signatures (e.g. flush failures), while still surfacing a single aggregated health issue
+- Record install mode hints (Desktop vs portable/git clone) + resolved paths in `system_info`/health for debugging
+- **Acceptance**:
+- Doctor never writes inside Desktop app resources/install directories
+- Corrupt JSON stores self-heal without infinite error loops
+- Flush/log storms do not grow history unbounded (aggregation + breaker)
+- **Reference**: `docs/reference/desktop/` (ComfyUI Desktop packaging + launch args)
+- **Plan**: `.planning/260123-R18_DESKTOP_PORTABLE_COMPAT_HARDENING_PLAN.md`
+- **Implementation Record**: `.planning/260123-R18_IMPLEMENTATION_RECORD.md`
+- **Tests**: `tests/test_paths.py`, `tests/test_history_store.py`, `tests/test_r18_migration.py`
 - [x] **R14**: Error context extraction & prompt packaging optimization - 🔴 High ✅ *Completed (2026-01-14)*
 - **Problem**: LLM context is often dominated by raw tracebacks; log context capture is unreliable; env/pip list can waste tokens.
 - **Approach**:
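
Editorial sketch for the R18 "Persistence hardening" bullets in the hunk above (atomic writes via tmp → rename, corruption recovery via rotate + rebuild, max-entries guardrail). This is not the project's implementation: the function names, the `.corrupt-<timestamp>` suffix, and the `max_entries` default are assumptions made for illustration, and only standard-library calls are used.

```python
import json
import os
import tempfile
import time


def atomic_write_json(path, data):
    """Write JSON to a temp file in the target directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace is atomic when source and target are on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_json_with_recovery(path, default, max_entries=1000):
    """Load a JSON list; if the file is corrupt, rotate it aside and rebuild from default."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            data = json.load(handle)
        if not isinstance(data, list):
            raise ValueError("unexpected top-level JSON type")
    except FileNotFoundError:
        return list(default)
    except ValueError:
        # json.JSONDecodeError is a ValueError subclass.
        # Hypothetical rotation naming: keep the bad file for later inspection.
        os.replace(path, f"{path}.corrupt-{int(time.time())}")
        atomic_write_json(path, list(default))
        return list(default)
    # Guardrail: keep only the newest max_entries items.
    return data[-max_entries:]
```

Rotating the corrupt file instead of deleting it keeps evidence for debugging, and rebuilding the store immediately means the next read succeeds, which is one way to meet the "self-heal without infinite error loops" acceptance item.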
@@ -315,6 +344,26 @@ graph TD
 - Build structured LLM context via pipeline (`llm_builder.py`) + token budgets (R12) instead of ad-hoc string concatenation
 - **Plan**: `.planning/260113-R14_ERROR_CONTEXT_EXTRACTION_OPTIMIZATION_PLAN.md`
 - **Implementation Record**: `.planning/260113-R14_ERROR_CONTEXT_EXTRACTION_IMPLEMENTATION_RECORD.md`
+- [x] **R16**: Statistics Reset + Unbounded History - 🟡 Medium ✅ *Completed (2026-01-15)*
+- **Goal**: Let users reset local statistics on demand and remove hard history limits while keeping UI time windows (e.g. 30d/24h) meaningful.
+- **Scope**:
+- Add a Reset button in Statistics UI (with confirmation) and backend reset endpoint
+- Remove history maxlen cap (unbounded history) with guardrails and reset mechanism
+- **Plan**: `.planning/260115-R16_STATS_RESET_AND_UNBOUNDED_HISTORY_PLAN.md`
+- **Implementation Record**: `.planning/260115-R16_STATISTICS_RESET_IMPLEMENTATION_RECORD.md`
+- [ ] **R17 (P1)**: Limited Config Externalization (Compatibility Guardrails) - 🟡 Medium
+- **Goal**: Externalize a small, high-impact set of hardcoded guardrails so Doctor behaves consistently across ComfyUI Desktop / portable / git-clone installs (without turning everything into config).
+- **Scope**:
+- Consolidate runtime guardrails into a single config surface (prefer request-local settings; avoid global side-effects):
+- Error aggregation window (e.g. 60s), rate-limit thresholds, and breaker/backoff parameters
+- Persistence guardrails (max entries / max file size / rotation thresholds) for JSON stores
+- Path/persistence policy knobs used by **R16** (resolved data dir, migration toggles)
+- Support safe overrides via **ComfyUI settings** and/or env vars where appropriate (defaults unchanged).
+- Document which values are user-facing vs. developer-only (avoid accidental foot-guns).
+- **Acceptance**:
+- Default behavior remains unchanged for existing users
+- Desktop-only mitigations can be enabled/tuned without code edits
+- No new global CONFIG mutation patterns introduced
 - [x] **R15**: Canonicalize `system_info` + populate pipeline `execution_logs` - 🟡 Medium ✅ *Completed (2026-01-14)*
 - **Scope**:
 - Canonicalize `get_system_environment()` output into a PromptComposer-friendly schema (OS/Python/CUDA/PyTorch + capped packages)
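
Editorial sketch for the R17 item in the hunk above: one way a small set of guardrails could be externalized through env vars while defaults stay unchanged and no global config object is mutated. Everything here is hypothetical: the `Guardrails` fields, the default values other than the 60s window the roadmap mentions, and the `DOCTOR_*` variable names do not come from the repository.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Guardrails:
    """Immutable snapshot of runtime guardrails (field names are illustrative)."""
    aggregation_window_s: float = 60.0          # window mentioned in the roadmap
    max_history_entries: int = 1000             # hypothetical default
    max_history_bytes: int = 5 * 1024 * 1024    # hypothetical default


def _env_number(env, name, default, cast):
    """Read a positive number from env; fall back to the default on any bad value."""
    raw = env.get(name)
    if raw is None:
        return default
    try:
        value = cast(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def load_guardrails(env=None):
    """Build a Guardrails value from env vars without touching global state."""
    env = os.environ if env is None else env
    base = Guardrails()
    return Guardrails(
        aggregation_window_s=_env_number(
            env, "DOCTOR_AGGREGATION_WINDOW_S", base.aggregation_window_s, float),
        max_history_entries=_env_number(
            env, "DOCTOR_MAX_HISTORY_ENTRIES", base.max_history_entries, int),
        max_history_bytes=_env_number(
            env, "DOCTOR_MAX_HISTORY_BYTES", base.max_history_bytes, int),
    )
```

Returning a frozen value per call, rather than writing into a shared `CONFIG`, is one reading of the "prefer request-local settings; avoid global side-effects" scope line; invalid or non-positive overrides silently fall back so existing users keep the default behavior.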
@@ -373,15 +422,21 @@ graph TD
 
 *Sorted by priority (High → Low):*
 
-- [ ] **F14**: Proactive Diagnostics (Lint / Health Check + Intent Signature) - 🔴 High ⚠️ *Use dev branch*
+- [x] **F14**: Proactive Diagnostics (Lint / Health Check + Intent Signature) - 🔴 High *Completed (2026-01-23)*
 - **Goal**: Prevent failures before execution; Health Score is a core KPI
 - **Intent Signature (ISS)**: deterministic intent inference (signals + scoring), top intents with evidence
 - **Checks**: Workflow lint, environment/deps, model assets, runtime, privacy
 - **Outputs**: Actionable issues + node navigation; intent banner; health history
 - **i18n**: Health tab + intent banner strings across 9 languages
 - **Stats**: Top intents and intent-to-error correlation
 - **APIs**: `/doctor/health_check`, `/doctor/health_report`, `/doctor/health_history`, `/doctor/health_ack`
-- **Plan**: `.planning/260108-PROACTIVE_DIAGNOSTICS_PLAN.md`
+- **Plan**: `.planning/260122-F14_PROACTIVE_DIAGNOSTICS_AND_INTENT_SIGNATURE_PLAN.md`
+- **Implementation Records**:
+- `.planning/260122-F14_P0_IMPLEMENTATION_LOG.md`
+- `.planning/260122-F14_P1_IMPLEMENTATION_LOG.md`
+- `.planning/260123-F14_P2_IMPLEMENTATION_LOG.md`
+- `.planning/260123-F14_P3_IMPLEMENTATION_LOG.md`
+- `.planning/260123-F14_P3_FOLLOWUPS_LOG.md` (tracked as F14-P4 follow-ups)
 - [x] **F15**: Resolution Marking UI (Resolved / Unresolved / Ignored) - 🟡 Medium ✅ *Completed (2026-01-09)*
 - **Goal**: Let users update resolution status directly from UI
 - **Scope**: Statistics tab first; optional Chat tab parity
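
Editorial sketch for the F14 "Intent Signature (ISS)" line in the hunk above: deterministic inference from signals plus scoring, returning top intents with evidence and a safe fallback. The signal table, weights, thresholds, and function name are invented for illustration and are not the project's scoring rules. The workflow shape (node id mapped to a dict with a `class_type` key) is also an assumption about the input.

```python
from collections import defaultdict

# Hypothetical signal table: node class_type -> (intent, weight).
SIGNALS = {
    "EmptyLatentImage": ("txt2img", 2.0),
    "LoadImage": ("img2img", 2.0),
    "VAEEncodeForInpaint": ("inpaint", 3.0),
    "ControlNetApply": ("controlnet", 3.0),
}


def infer_intents(workflow, top_k=3, min_score=2.0):
    """Score intents from node types; same input always yields the same ranking."""
    scores = defaultdict(float)
    evidence = defaultdict(list)
    # Iterate in sorted key order so evidence lists are stable across runs.
    for node_id in sorted(workflow):
        class_type = workflow[node_id].get("class_type")
        if class_type in SIGNALS:
            intent, weight = SIGNALS[class_type]
            scores[intent] += weight
            evidence[intent].append(f"node {node_id}: {class_type}")
    # Sort by score descending, then by intent name to break ties deterministically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    top = [
        {"intent": intent, "score": score, "evidence": evidence[intent]}
        for intent, score in ranked[:top_k]
        if score >= min_score
    ]
    if not top:
        return {"top_intents": [], "message": "No dominant intent detected"}
    return {"top_intents": top, "message": None}
```

No randomness or model call is involved, which matches the "no LLM required" framing; the explicit tie-break on intent name is what keeps the ranking reproducible when two intents score equally.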
@@ -396,6 +451,15 @@ graph TD
 - **Conflict avoidance**: Append-only JSON files under `feedback/`
 - **Auth**: Server-side token (env var) or future device flow
 - **Plan**: `.planning/260108-F16_GITHUB_FEEDBACK_PR_PLAN.md`
+- [x] **F17**: Toggle Auto-Open for Right Error Report Panel - 🟡 Medium ✅ *Completed (2026-01-23)*
+- **Goal**: Add a user-facing switch to control whether the right-side error report panel auto-opens when new errors are detected (**default: ON**).
+- **Scope**:
+- Add toggle in Doctor Settings tab (Sidebar → Doctor → Settings)
+- Persist via ComfyUI settings (`Doctor.Behavior.AutoOpenOnError`)
+- Apply immediately (no restart required)
+- Full i18n across 9 languages
+- **Plan**: `.planning/260123-F17_AUTO_OPEN_RIGHT_PANEL_TOGGLE_PLAN.md`
+- **Implementation Record**: `.planning/260123-F17_IMPLEMENTATION_RECORD.md`
 - [x] **F7**: Enhanced Error Analysis (Multi-Language + Categorization) - 🔴 High ✅ *Completed (2026-01-01)*
 - **Phase 1**: Enhanced Error Context Collection
 - Python stack traces, execution logs (last 50 lines)
@@ -534,8 +598,19 @@ graph TD
 
 ### 3.5 Testing (in progress)
 
-*Sorted by priority (High → Low), then by item number:*
+*Sorted by priority (High → Low):*
 
+- [x] **T14 (P0)**: Frontend Unit Tests (Close E2E Gaps) - 🔴 High ✅ *Implemented (2026-01-23; CI wiring pending)*
+- **Goal**: Reduce UI regression risk by covering non-trivial logic with fast unit tests (E2E stays as end-to-end confidence, not the only safety net).
+- **Scope**:
+- Add a lightweight JS unit test runner (e.g. Vitest/Jest) focused on pure logic/helpers (formatters, guards, reducers, intent/diagnostics rendering decisions).
+- Cover critical edge-cases that are expensive/flaky in Playwright (empty/partial payloads, i18n key fallbacks, sanitizer behaviors, error state transitions).
+- Keep runtime fast (< ~10s) and CI-friendly (no browser required).
+- **Acceptance**:
+- `npm run test:unit` (or equivalent) runs in CI and locally
+- Unit tests catch at least 2-3 classes of prior regressions before E2E
+- No coupling to ComfyUI runtime APIs in unit tests (use small fixtures/mocks)
+- **Record**: `.planning/260123-T14_FRONTEND_UNIT_TESTS_LOG.md`
 - [x] **T11**: Phase 2 Release Readiness CI Gate (Plan 6.1) - 🔴 High ✅ *Completed (2026-01-09)*
 - **Goal**: Make Phase 2 hardening non-regressable (required checks before merge/release).
 - **Gate**: `pytest -q tests/test_plugins_security.py`, `tests/test_metadata_contract.py`, `tests/test_pipeline_dependency_policy.py`, `tests/test_outbound_payload_safety.py`, plus `npm test`.
@@ -601,9 +676,12 @@ graph TD
 - **Implementation Record**: `.planning/260103-T2_playwright_test_infrastructure.md`
 - **Follow-up Record**: `.planning/260109-PHASE2_FOLLOWUP_TRUST_HEALTH_UI_AND_E2E_RECORD.md`
 - **Foundation for**: CI/CD integration, UI regression detection
-- [x] **T10**: Playwright E2E Runner Hardening (WSL `/mnt/c`) - 🟢 Low ✅ *Completed (2026-01-09)*
-- **Goal**: Make `npm test` stable on WSL + Windows-mounted paths (transform cache / temp permissions + python shim).
-- **Implementation Record**: `.planning/260109-PHASE2_FOLLOWUP_TRUST_HEALTH_UI_AND_E2E_RECORD.md`
+- [ ] **T13**: Desktop-style Failure Injection Tests (Flush/OSError + Corrupt JSON) - 🟡 Medium
+- **Goal**: Prevent Desktop-only regressions (log storms, broken history) without requiring a real ComfyUI Desktop runtime in CI.
+- **Scope**:
+- Simulate stream flush failures (e.g. `OSError: [Errno 22] Invalid argument`) and assert rate-limit + aggregation behavior
+- Corrupt JSON recovery tests for history stores (`error_history.json`, diagnostics history)
+- Fixture corpus derived from `docs/reference/desktop/` (sanitized, no secrets)
 - [ ] **T9**: External Environment Test Coverage Expansion (Non-ComfyUI) - 🟡 Medium
 - **Goal**: Cover pipeline integration, SSE/REST contracts, and UI contracts without a live ComfyUI runtime
 - **Phases**:
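
Editorial sketch for the T13 item in the hunk above: injecting a flush failure and asserting that repeated identical errors within a window are folded into one surfaced entry. `ErrorAggregator`, `BrokenStream`, and `flush_and_report` are hypothetical names for illustration; the project's real breaker and aggregation code is not shown in this diff. An injectable clock keeps the behavior testable without sleeping.

```python
import time


class ErrorAggregator:
    """Fold repeated identical error signatures inside a time window into one entry."""

    def __init__(self, window_s=60.0, clock=time.monotonic):
        self.window_s = window_s
        self._clock = clock
        self._entries = {}  # signature -> {"first_seen": float, "count": int}

    def record(self, signature):
        """Return True if the error should be surfaced, False if it was aggregated."""
        now = self._clock()
        entry = self._entries.get(signature)
        if entry is not None and now - entry["first_seen"] < self.window_s:
            entry["count"] += 1
            return False
        # First occurrence, or the previous window has expired: start a new window.
        self._entries[signature] = {"first_seen": now, "count": 1}
        return True

    def count(self, signature):
        entry = self._entries.get(signature)
        return entry["count"] if entry else 0


class BrokenStream:
    """Test double that fails on flush the way the roadmap's example error does."""

    def flush(self):
        raise OSError(22, "Invalid argument")


def flush_and_report(stream, aggregator):
    """Flush a stream; return 'ok', 'surfaced', or 'suppressed'."""
    try:
        stream.flush()
    except OSError as exc:
        surfaced = aggregator.record(f"flush:{exc.errno}")
        return "surfaced" if surfaced else "suppressed"
    return "ok"
```

With a fake clock, a test can fire 100 failing flushes at the same instant and expect exactly one surfaced result, then advance the clock past the window and expect the next failure to surface again.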
@@ -617,6 +695,9 @@ graph TD
 - [ ] **T5**: Online API integration tests (OpenAI, DeepSeek, Anthropic) - 🟡 Medium
 - [ ] **T3**: End-to-end integration tests - 🟢 Low
 - [ ] **T4**: Stress tests - 🟢 Low
+- [x] **T10**: Playwright E2E Runner Hardening (WSL `/mnt/c`) - 🟢 Low ✅ *Completed (2026-01-09)*
+- **Goal**: Make `npm test` stable on WSL + Windows-mounted paths (transform cache / temp permissions + python shim).
+- **Implementation Record**: `.planning/260109-PHASE2_FOLLOWUP_TRUST_HEALTH_UI_AND_E2E_RECORD.md`
 
 ### 3.6 Documentation (in progress)
 
__init__.py

Lines changed: 7 additions & 1 deletion
@@ -60,6 +60,7 @@
 from .llm_client import llm_request_with_retry, RetryConfig, RetryResult
 from .services.token_budget import TokenBudgetService, BudgetConfig
 from .services.prompt_composer import get_prompt_composer, PromptComposerConfig
+from .services.doctor_paths import get_doctor_data_dir
 
 # Global R12 Service
 TOKEN_BUDGET_SERVICE = TokenBudgetService()
@@ -103,7 +104,8 @@ def is_anthropic(base_url: str) -> bool:
 
 # --- 1. Setup Log Directory (Local to Node) ---
 current_dir = os.path.dirname(os.path.abspath(__file__))
-log_dir = os.path.join(current_dir, "logs")
+# R18: Prefer canonical Doctor data dir for persisted logs (Desktop-safe)
+log_dir = os.path.join(get_doctor_data_dir(), "logs")
 
 if not os.path.exists(log_dir):
     try:
@@ -2333,6 +2335,10 @@ async def api_health(request):
     payload = {
         "logger": get_logger_metrics(),
         "ssrf": get_ssrf_metrics(),
+        "storage": {
+            "data_dir": get_doctor_data_dir(),
+            "history_size_bytes": getattr(CONFIG, "history_size_bytes", 0),
+        },
         "last_analysis": {
             "timestamp": last_analysis.get("timestamp"),
             "pipeline_status": analysis_meta.get("pipeline_status"),
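
Editorial sketch of what a data-dir resolver along the lines of `get_doctor_data_dir()` might look like. The body of `services/doctor_paths.py` is not part of the hunks shown, so this is an assumption-laden illustration: the function name `resolve_doctor_data_dir`, the `ComfyUI-Doctor` subfolder, the home-directory fallback, and the injectable `user_dir_provider` parameter are all hypothetical. Only `folder_paths.get_user_directory()` is taken from the roadmap text.

```python
import os


def resolve_doctor_data_dir(user_dir_provider=None, fallback_root=None):
    """Return a writable data directory, preferring the ComfyUI user directory."""
    if user_dir_provider is None:
        try:
            # folder_paths is importable only inside a ComfyUI process.
            import folder_paths
            user_dir_provider = folder_paths.get_user_directory
        except (ImportError, AttributeError):
            user_dir_provider = None

    base = None
    if user_dir_provider is not None:
        try:
            base = user_dir_provider()
        except Exception:
            # Any provider failure falls through to the safe fallback.
            base = None

    if not base:
        # Hypothetical fallback location, used when ComfyUI paths are unavailable.
        base = fallback_root or os.path.join(
            os.path.expanduser("~"), ".comfyui_doctor_fallback")

    data_dir = os.path.join(base, "ComfyUI-Doctor")  # hypothetical subfolder name
    os.makedirs(data_dir, exist_ok=True)
    return data_dir
```

Resolving through a single function means call sites such as the `log_dir` assignment and the `storage.data_dir` health field report the same location, and nothing needs to be written under the extension's install directory.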