Commit 47dec1c

chore(R0_R13): Phase 2 hardening, add trust/health UI, stabilize Playwright runner, and plan CI gates/migration tooling
1 parent f6ba93e commit 47dec1c

39 files changed

+1929
-171
lines changed

README.md

Lines changed: 25 additions & 0 deletions
@@ -716,6 +716,31 @@ Create `config.json` to customize behavior:
716716
- `enable_api`: Enable API endpoints
717717
- `privacy_mode`: PII sanitization level - `"none"`, `"basic"` (default), or `"strict"` (see Privacy Mode section above)
718718

719+
### Community Plugins (Advanced)
720+
721+
Community plugins extend pattern matching with custom Python logic. For safety, they are **disabled by default** and are only loaded if they pass the trust policy.
722+
723+
Enable via `config.json`:
724+
```json
{
  "enable_community_plugins": true,
  "plugin_allowlist": ["example.plugin"],
  "plugin_blocklist": [],
  "plugin_signature_required": false,
  "plugin_signature_key": "",
  "plugin_signature_alg": "hmac-sha256"
}
```
735+
736+
Notes:
737+
738+
- Plugins live under `pipeline/plugins/community/` and require a manifest (`*.json` next to the plugin, or `plugin.json` if there is only one plugin file).
739+
- Trust rules include allowlist, manifest/sha256 verification, filesystem hardening (containment, symlink rejection, size/scan limits), and optional HMAC verification.
740+
- **HMAC signature is a shared-secret integrity check**, not a public-key signature; keep `plugin_signature_key` secret and never commit it to Git.
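The filesystem and sha256 checks listed in the notes above can be pictured roughly as follows. This is a minimal sketch, not the project's loader: the function name `check_plugin_file`, the reason strings, and the `MAX_PLUGIN_BYTES` limit are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# Assumed size limit for illustration; the real limit is not stated here.
MAX_PLUGIN_BYTES = 256 * 1024


def check_plugin_file(plugin_path: Path, plugins_root: Path, expected_sha256: str) -> tuple[bool, str]:
    """Return (trusted, reason) for one plugin file (hypothetical helper)."""
    # Symlink rejection: inspect the path itself before resolving it.
    if plugin_path.is_symlink():
        return False, "symlink"
    root = plugins_root.resolve()
    resolved = plugin_path.resolve()
    # Containment: the resolved file must stay under the plugin root,
    # which also catches symlinked parent directories and "..".
    if not resolved.is_relative_to(root):  # Python 3.9+
        return False, "outside_root"
    if not resolved.is_file():
        return False, "not_a_file"
    # Size limit: refuse to read unexpectedly large files.
    if resolved.stat().st_size > MAX_PLUGIN_BYTES:
        return False, "too_large"
    # sha256 verification against the digest recorded in the manifest.
    digest = hashlib.sha256(resolved.read_bytes()).hexdigest()
    if digest != expected_sha256.strip().lower():
        return False, "sha256_mismatch"
    return True, "ok"
```

The checks run cheapest-first, so a file that fails containment or the size limit is never read into memory.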
741+
742+
See `docs/PLUGIN_GUIDE.md` for the manifest schema and details.
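To make the shared-secret point concrete, here is a minimal sketch of HMAC-SHA256 signing and verification using only the Python standard library. The helper names `sign_plugin` and `verify_plugin` are hypothetical, and how the project actually stores or encodes the signature is defined in `docs/PLUGIN_GUIDE.md`, not here.

```python
import hashlib
import hmac


def sign_plugin(plugin_bytes: bytes, key: bytes) -> str:
    """Compute a hex HMAC-SHA256 tag over the plugin bytes (hypothetical helper)."""
    return hmac.new(key, plugin_bytes, hashlib.sha256).hexdigest()


def verify_plugin(plugin_bytes: bytes, key: bytes, signature_hex: str) -> bool:
    """Check a tag in constant time; an empty key never verifies."""
    if not key:
        return False
    expected = sign_plugin(plugin_bytes, key)
    candidate = signature_hex.strip().lower()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), candidate.encode())
```

Because the same key both signs and verifies, anyone who can verify a plugin can also forge a valid tag for a modified one. That is why the key must stay out of the repository.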
743+
719744
---
720745

721746
## Supported Error Patterns

ROADMAP.md

Lines changed: 75 additions & 21 deletions
@@ -13,22 +13,27 @@ graph TD
1313
B --> F[config.py]
1414
B --> G[nodes.py]
1515
B --> HH[statistics.py]
16+
B --> SEC[security.py]
17+
B --> OUT[outbound.py]
1618
1719
C --> I[AsyncFileWriter]
1820
C --> J[SafeStreamWrapper]
1921
C --> K[DoctorLogProcessor]
2022
2123
%% A6 Pipeline Architecture
2224
D --> PIPE[pipeline/orchestrator.py]
25+
PIPE --> MC[pipeline/metadata_contract.py]
2326
PIPE --> S1[SanitizerStage]
2427
PIPE --> S2[PatternMatcherStage]
2528
PIPE --> S3[ContextEnhancerStage]
2629
PIPE --> S4[LLMBuilderStage]
2730
2831
%% Stage Dependencies
32+
S1 --> SAN[sanitizer.py]
2933
S2 --> H[pattern_loader.py]
3034
S2 --> PLUG[pipeline/plugins/]
3135
S4 --> SERV[services/workflow_pruner.py]
36+
OUT --> SAN
3237
3338
H --> N[patterns/builtin/]
3439
H --> O[patterns/community/]
@@ -53,8 +58,8 @@ graph TD
5358
X --> AG["API: /doctor/chat"]
5459
X --> AGS["API: /doctor/statistics"]
5560
X --> AGM["API: /doctor/mark_resolved"]
56-
X --> AGS["API: /doctor/statistics"]
57-
X --> AGM["API: /doctor/mark_resolved"]
61+
X --> AH2["API: /doctor/health"]
62+
X --> AH3["API: /doctor/plugins"]
5863
5964
AH[web/doctor.js] --> TM[Tab Manager]
6065
TM --> TC[Chat Tab]
@@ -74,10 +79,10 @@ graph TD
7479
AT[tests/e2e/] --> AU[Playwright Test Suite]
7580
AU --> AV[test-harness.html]
7681
AU --> AW[mocks/comfyui-app.js]
77-
AU --> AX[specs/settings.spec.js - 12 tests]
78-
AU --> AY[specs/sidebar.spec.js - 10 tests]
79-
AU --> AZ[specs/statistics.spec.js - 18 tests]
80-
AU --> BA[specs/preact-loader.spec.js - 8 tests]
82+
AU --> AX[specs/settings.spec.js - 13 tests]
83+
AU --> AY[specs/sidebar.spec.js - 15 tests]
84+
AU --> AZ[specs/statistics.spec.js - 19 tests]
85+
AU --> BA[specs/preact-loader.spec.js - 14 tests]
8186
AV --> AH
8287
AV --> AJ
8388
```
@@ -87,10 +92,14 @@ graph TD
8792
| Module | Lines | Function |
8893
|--------|-------|----------|
8994
| `prestartup_script.py` | 102 | Earliest log interception hook (before custom_nodes load) |
90-
| `__init__.py` | 1900+ | Main entry: full Logger install, 9 API endpoints, LLM integration, env var support |
95+
| `__init__.py` | 1900+ | Main entry: full Logger install, API endpoints, LLM integration, env var support |
9196
| `logger.py` | 400+ | SafeStreamWrapper + queue-based processing, DoctorLogProcessor background thread, async writes |
9297
| `analyzer.py` | 320+ | Wrapper for AnalysisPipeline, legacy API compatibility |
9398
| `pipeline/` | 400+ | A6: Error analysis pipeline (Sanitizer, Matcher, Context, LLMBuilder) |
99+
| `pipeline/metadata_contract.py` | ~120 | Metadata schema versioning + end-of-run validation/quarantine |
100+
| `security.py` | 300+ | SSRF hardening helpers + counters for health endpoint |
101+
| `outbound.py` | 150+ | Non-bypassable outbound sanitization boundary for remote requests |
102+
| `sanitizer.py` | 400+ | PII sanitization engine with `none/basic/strict` modes |
94103
| `services/` | 50+ | R12: Workflow pruning and pip validation services |
95104
| `pattern_loader.py` | 300+ | JSON-based pattern management with hot-reload capability |
96105
| `i18n.py` | 1400+ | Internationalization: 9 languages (en, zh_TW, zh_CN, ja, de, fr, it, es, ko), 57 pattern translations |
@@ -108,10 +117,10 @@ graph TD
108117
| `web/doctor_chat.js` | 600+ | Multi-turn chat interface, SSE streaming, markdown rendering |
109118
| `tests/e2e/test-harness.html` | 104 | Isolated test environment for Doctor UI (loads full extension without ComfyUI) |
110119
| `tests/e2e/mocks/comfyui-app.js` | 155 | Mock ComfyUI app/api objects for testing |
111-
| `tests/e2e/specs/settings.spec.js` | 217 | Settings panel tests (12 tests): toggle, selectors, inputs, persistence |
112-
| `tests/e2e/specs/sidebar.spec.js` | 190 | Chat interface tests (10 tests): messages, input, buttons, error context, sanitization status |
113-
| `tests/e2e/specs/statistics.spec.js` | 470+ | Statistics dashboard tests (18 tests): panel, cards, patterns, categories, i18n |
114-
| `tests/e2e/specs/preact-loader.spec.js` | 200+ | Preact loader tests (8 tests): module loading, flags, error handling |
120+
| `tests/e2e/specs/settings.spec.js` | 217 | Settings panel tests (13 tests): toggle, selectors, inputs, persistence, trust/health refresh |
121+
| `tests/e2e/specs/sidebar.spec.js` | 190 | Chat interface tests (15 tests): messages, input, buttons, error context, sanitization status |
122+
| `tests/e2e/specs/statistics.spec.js` | 470+ | Statistics dashboard tests (19 tests): panel, cards, patterns, categories, i18n, resolution actions |
123+
| `tests/e2e/specs/preact-loader.spec.js` | 200+ | Preact loader tests (14 tests): module loading, flags, error handling, vendor fallback |
115124
| `playwright.config.js` | 89 | Playwright configuration for E2E tests |
116125

117126
---
@@ -176,12 +185,30 @@ graph TD
176185

177186
### 3.0 Risk & Refactor Mitigation (Highest Priority)
178187

179-
- [ ] **R0**: Risk & Refactor Mitigation (Security + Logger + Pipeline + Sanitization) - 🔴 Highest ⚠️ *Use dev branch*
188+
- [x] **R0**: Risk & Refactor Mitigation (Security + Logger + Pipeline + Sanitization) - 🔴 Highest ✅ *Completed (2026-01-10)*
180189
- **Scope**: SSRF hardening, logger backpressure, pipeline health, prestartup handoff, sanitization boundary
181190
- **Plan**: `.planning/260108-RISK_REFACTOR_MITIGATION_PLAN.md`
182-
- [ ] **R13**: Pipeline + Plugin Hardening (Phase 2 Assessment) - 🔴 Highest ⚠️ *Use dev branch*
191+
- **Implementation Record**: `.planning/260110-R0_R13_PIPELINE_GOVERNANCE_IMPLEMENTATION_RECORD.md`
192+
- **Progress**:
193+
- **R0-P0**: SSRF hardening + redirect blocking for outbound requests
194+
- **R0-P1**: Single outbound payload sanitization funnel (privacy_mode=none only for verified local)
195+
- **R0-P2**: Logger backpressure + dropped-message counters
196+
- **R0-P3**: Metadata contract versioning + end-of-run validation + dependency-aware pipeline
197+
- **R0-P4**: Prestartup logger handoff (close/uninstall)
198+
- **R0-P5**: Observability/health endpoint (`/doctor/health`)
199+
- [x] **R13**: Pipeline + Plugin Hardening (Phase 2 Assessment) - 🔴 Highest ✅ *Completed (2026-01-10)*
183200
- **Scope**: plugin gating + manifest/allowlist, metadata contract, pipeline dependency policy, expanded sanitization boundary, context extraction provenance
184201
- **Plan**: `.planning/260109-PHASE2_PIPELINE_PLUGIN_HARDENING_PLAN.md`
202+
- **Implementation Record**: `.planning/260110-R0_R13_PIPELINE_GOVERNANCE_IMPLEMENTATION_RECORD.md`
203+
- **Follow-up Record**: `.planning/260109-PHASE2_FOLLOWUP_TRUST_HEALTH_UI_AND_E2E_RECORD.md`
204+
- **Progress**:
205+
- **R13-P0**: Plugin loader safe-by-default (default OFF + allowlist/manifest/sha256 + trust taxonomy + filesystem hardening)
206+
- **R13-P0-TESTS**: Dedicated security tests for plugin loader
207+
- **R13-P1**: Outbound payload safety funnel (shared with **R0-P1**)
208+
- **R13-P2**: Metadata contract + dependency policy (shared with **R0-P3**)
209+
- **R13-P3**: Context extraction provenance metadata
210+
- **R13-OPT1**: Optional signature policy (HMAC) for allowlisted plugins (shared-secret integrity check, not public signing)
211+
- **R13-OPT2**: Plugin trust UI + `/doctor/plugins` scan-only endpoint ✅ *Completed (2026-01-09)*
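
As an illustration of what the SSRF hardening in **R0-P0** could involve, the sketch below rejects non-HTTP schemes and any destination that resolves to a private, loopback, or link-local address. It is an assumption-laden example, not the contents of `security.py`: the function names and reason strings are hypothetical, the caller is assumed to supply already-resolved IPs, and the sketch deliberately has no exemption for local endpoints. Redirect blocking would be handled separately, for example by disabling redirect-following in the HTTP client.

```python
import ipaddress
from urllib.parse import urlsplit


def is_blocked_address(address: str) -> bool:
    """True if the IP is not a plain public unicast address."""
    ip = ipaddress.ip_address(address)
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def validate_outbound_url(url: str, resolved_ips: list[str]) -> tuple[bool, str]:
    """Return (allowed, reason); resolved_ips come from the caller's DNS lookup."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False, "scheme"
    if not parts.hostname:
        return False, "no_host"
    if not resolved_ips:
        return False, "unresolved"
    # Every resolved address must pass, so a mixed DNS answer is rejected.
    for address in resolved_ips:
        if is_blocked_address(address):
            return False, "blocked_ip"
    return True, "ok"
```

Validating the resolved addresses rather than the hostname string keeps a hostname that points at an internal address from slipping through.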
185212

186213
### 3.1 Security (in progress)
187214

@@ -193,6 +220,7 @@ graph TD
193220
- ✅ Sanitize Linux/macOS home: `/home/username/` → `<USER_HOME>/`
194221
- ✅ Email addresses, private IP addresses (regex-based)
195222
- ✅ Configurable sanitization levels: `none`, `basic`, `strict`
223+
- ✅ **S6-FP1**: Reduce false positives (BASIC no longer redacts arbitrary long hex; STRICT keeps long-hex redaction)
196224
- ✅ Zero runtime overhead, GDPR-friendly
197225
- **Frontend** (Privacy Controls):
198226
- ✅ Settings panel: "Privacy Mode" dropdown with 3 levels
@@ -391,14 +419,31 @@ graph TD
391419
- **Foundation for**: v2.0 advanced chat features, v3.0 multi-workspace features
392420
- **Design Reference**: See `.planning/ComfyUI-Doctor Architecture In-Depth Analysis and Optimization Blueprint.md`
393421
- [ ] **A5**: Create `LLMProvider` Protocol for unified LLM interface - 🟡 Medium ⚠️ *Use dev branch*
422+
- [ ] **A8**: Plugin Migration Tooling (Plan 6.3) - 🟡 Medium
423+
- **Goal**: Reduce configuration friction for safe-by-default plugin policy (manifest + allowlist helpers; optional HMAC signer).
424+
- **Deliverables**: `scripts/plugin_manifest.py`, `scripts/plugin_allowlist.py`, optional `scripts/plugin_hmac_sign.py`
425+
- **Acceptance**: Generate valid manifests + allowlist snippet in one command; safe defaults (`--dry-run`); never prints/writes secret keys.
426+
- **Plan Update Record**: `.planning/260109-PHASE2_CI_GATE_AND_MIGRATION_TOOLING_PLAN_UPDATE_RECORD.md`
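
A rough sketch of what the **A8** helpers might look like is shown below. Everything here is hypothetical: the planned `scripts/plugin_manifest.py` does not exist yet, the manifest field names (`id`, `file`, `sha256`) are placeholders rather than the schema in `docs/PLUGIN_GUIDE.md`, and the only behaviour taken from the acceptance criteria is the dry-run default and the absence of any secret-key handling.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(plugin_path: Path, plugin_id: str) -> dict:
    """Build a manifest dict; field names are placeholders, not the real schema."""
    data = plugin_path.read_bytes()
    return {
        "id": plugin_id,
        "file": plugin_path.name,
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def write_manifest(plugin_path: Path, plugin_id: str, dry_run: bool = True) -> str:
    """Return the manifest JSON; write `<plugin>.json` only when dry_run is False."""
    text = json.dumps(build_manifest(plugin_path, plugin_id), indent=2, sort_keys=True)
    if not dry_run:
        plugin_path.with_suffix(".json").write_text(text + "\n", encoding="utf-8")
    return text


def allowlist_snippet(plugin_ids: list[str]) -> str:
    """Render a config.json fragment the user can paste in."""
    return json.dumps({"plugin_allowlist": sorted(plugin_ids)}, indent=2)
```

Defaulting `dry_run` to `True` means a first run only prints what would be written, which matches the safe-defaults acceptance criterion.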
394427
- [ ] **A4**: Convert `NodeContext` to `@dataclass(frozen=True)` + validation - 🟡 Medium ⚠️ *Use dev branch*
395428
- [x] **A1**: Add `py.typed` marker + mypy config in pyproject.toml - 🟢 Low ✅ *Completed (Phase 3A)*
396429
- [x] **A2**: Integrate ruff linter (replace flake8/isort) - 🟢 Low ✅ *Completed (Phase 3A)*
397430
- [x] **A3**: Add pytest-cov with `--cov-report=term-missing` - 🟢 Low ✅ *Completed (Phase 3A)*
398431

399432
### 3.5 Testing (in progress)
400433

401-
*Sorted by priority (High → Low):*
434+
*Sorted by priority (High → Low), then by item number:*
435+
436+
- [ ] **T11**: Phase 2 Release Readiness CI Gate (Plan 6.1) - 🔴 High
437+
- **Goal**: Make Phase 2 hardening non-regressable (required checks before merge/release).
438+
- **Gate**: `pytest -q tests/test_plugins_security.py`, `tests/test_metadata_contract.py`, `tests/test_pipeline_dependency_policy.py`, `tests/test_outbound_payload_safety.py`, plus `npm test`.
439+
- **Acceptance**: Branch protection requires the gate; stable runtime (< ~3 minutes typical).
440+
- **Plan Update Record**: `.planning/260109-PHASE2_CI_GATE_AND_MIGRATION_TOOLING_PLAN_UPDATE_RECORD.md`
441+
442+
- [ ] **T12**: Outbound Funnel Static CI Gate (Plan 6.2) - 🟡 Medium
443+
- **Goal**: Fail CI if new outbound call paths bypass `outbound.py` / `sanitize_outbound_payload(...)`.
444+
- **Approach**: Grep guard (fast) or AST scan (precise) to detect suspicious raw-field usage outside `outbound.py`.
445+
- **Acceptance**: CI fails on bypass attempts; low false-positive rate.
446+
- **Plan Update Record**: `.planning/260109-PHASE2_CI_GATE_AND_MIGRATION_TOOLING_PLAN_UPDATE_RECORD.md`
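
The AST-scan option for **T12** could start from something like the sketch below. It is one possible variant under stated assumptions: it flags direct HTTP client calls rather than raw-field usage, and the `RAW_CALLS` set and the `find_raw_calls` name are hypothetical. Only `outbound.py` as the allowed location comes from the item above.

```python
import ast
from pathlib import PurePosixPath

# Hypothetical (object, method) pairs treated as raw outbound calls.
RAW_CALLS = {
    ("requests", "get"),
    ("requests", "post"),
    ("session", "get"),
    ("session", "post"),
}
# Files allowed to make outbound calls directly.
ALLOWED_FILES = {"outbound.py"}


def find_raw_calls(source: str, filename: str) -> list[tuple[str, int, str]]:
    """Return (filename, line, call) for each suspicious call outside the funnel."""
    if PurePosixPath(filename).name in ALLOWED_FILES:
        return []
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match simple `name.attr(...)` calls such as `requests.post(...)`.
        if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
            name = (func.value.id, func.attr)
            if name in RAW_CALLS:
                findings.append((filename, node.lineno, ".".join(name)))
    return sorted(findings)
```

A CI job would run this over every tracked `.py` file and fail when the combined findings list is non-empty. Matching on variable names is crude, so aliased imports would need extra handling to keep the false-positive rate low.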
402447

403448
- [x] **T8**: Pattern Validation CI - 🟡 Medium ✅ *Completed (2026-01-03)*
404449
- **Problem**: Pattern format errors and i18n gaps can break the system
@@ -422,12 +467,12 @@ graph TD
422467
- **Implementation**:
423468
- Test harness loads full Doctor UI without ComfyUI ✅
424469
- Mock ComfyUI environment (app, api, extensionManager) ✅
425-
- Settings panel tests (12 tests): toggle, language selector, provider selector, inputs
426-
- Chat interface tests (8 tests): messages area, input/send/clear buttons, error context ✅
427-
- Statistics dashboard tests (18 tests): panel, cards, patterns, categories, i18n ✅
428-
- Preact loader tests (8 tests): module loading, flags, error handling
470+
- Settings panel tests (13 tests): toggle, selectors, persistence, trust/health refresh
471+
- Chat interface tests (15 tests): messages, inputs, error context, sanitization status, analyze CTA
472+
- Statistics dashboard tests (19 tests): panel, cards, categories, i18n, resolution actions
473+
- Preact loader tests (14 tests): module loading, flags, vendor failure fallback
429474
- API endpoint mocks for backend calls ✅
430-
- **Test Results**: 100% pass rate (46/46 tests)
475+
- **Test Results**: 100% pass rate (61/61 tests)
431476
- **Execution time**: ~16 seconds for full test suite (Chromium, 10 workers)
432477
- **How to Run Tests**:
433478
<details>
@@ -449,7 +494,11 @@ graph TD
449494

450495
</details>
451496
- **Implementation Record**: `.planning/260103-T2_playwright_test_infrastructure.md`
497+
- **Follow-up Record**: `.planning/260109-PHASE2_FOLLOWUP_TRUST_HEALTH_UI_AND_E2E_RECORD.md`
452498
- **Foundation for**: CI/CD integration, UI regression detection
499+
- [x] **T10**: Playwright E2E Runner Hardening (WSL `/mnt/c`) - 🟢 Low ✅ *Completed (2026-01-09)*
500+
- **Goal**: Make `npm test` stable on WSL + Windows-mounted paths (transform cache / temp permissions + python shim).
501+
- **Implementation Record**: `.planning/260109-PHASE2_FOLLOWUP_TRUST_HEALTH_UI_AND_E2E_RECORD.md`
453502
- [ ] **T9**: External Environment Test Coverage Expansion (Non-ComfyUI) - 🟡 Medium
454503
- **Goal**: Cover pipeline integration, SSE/REST contracts, and UI contracts without a live ComfyUI runtime
455504
- **Phases**:
@@ -469,6 +518,10 @@ graph TD
469518
- [ ] **D1**: OpenAPI/Swagger spec - 🟡 Medium ⚠️ *Use dev branch*
470519
- [ ] **D2**: Architecture documentation - 🟢 Low
471520
- [ ] **D3**: Contribution guide - 🟢 Low
521+
- [x] **D4**: Plugin Trust/Signature Documentation - 🟢 Low ✅ *Completed (2026-01-09)*
522+
- **Scope**: clarify plugin trust model + HMAC threat model (shared-secret integrity, not public signing)
523+
- **Files**: `README.md`, `docs/PLUGIN_GUIDE.md`, `ROADMAP.md`
524+
- **Record**: `.planning/260109-PHASE2_FOLLOWUP_TRUST_HEALTH_UI_AND_E2E_RECORD.md`
472525

473526
> [Note]
474527
> Items marked with ⚠️ should be developed on a separate `dev` branch. Merge to `main` only after thorough testing.
@@ -611,10 +664,11 @@ graph TD
611664
**Completed Tasks**:
612665
613666
- [x] **T2** Frontend Interaction Tests (Playwright) ✅ *Completed (2026-01-04)*
614-
- 46 end-to-end tests for Doctor UI (settings panel, chat interface, statistics dashboard, preact loader)
667+
- 61 end-to-end tests for Doctor UI (settings panel, chat interface, statistics dashboard, preact loader, trust/health)
615668
- 100% pass rate, execution time ~16 seconds (Chromium, 10 workers)
616669
- Ready for CI/CD integration
617670
- See `.planning/260103-T2_playwright_test_infrastructure.md`
671+
- Runner hardening for WSL `/mnt/c` environments: `.planning/260109-PHASE2_FOLLOWUP_TRUST_HEALTH_UI_AND_E2E_RECORD.md`
618672
619673
**Pending UI i18n Completion** (from Phase 4B):
620674
@@ -636,7 +690,7 @@ graph TD
636690
- API: `/doctor/statistics` (GET) and `/doctor/mark_resolved` (POST)
637691
- Frontend: Collapsible statistics panel in sidebar with error trends, top patterns, category breakdown, and resolution tracking
638692
- Features: 24h/7d/30d time ranges, Top 5 error patterns, resolution rate tracking (resolved/unresolved/ignored)
639-
- Testing: 17/17 backend unit tests passed; statistics E2E tests 18/18 passed; full Playwright suite 46/46 passed
693+
- Testing: 159/159 Python tests passed; full Playwright suite 61/61 passed (latest)
640694
- i18n: Fully translated across all 9 languages
641695
- See `.planning/260104-F4_STATISTICS_RECORD.md` for implementation details
642696
- [ ] **R6-R7** Network reliability improvements
