chore: sync beads issue state

Dariusz Debowczyk · Dariusz Debowczyk · commit 75f180e1ee78 · 2026-02-22T06:56:47.000+01:00
diff --git a/.beads/issues.jsonl b/.beads/issues.jsonl
@@ -106,7 +106,7 @@
 {"id":"instructor-dgw","title":"Investigate and categorize remaining test failures","description":"After identifying the main error groups (16 HTTP Mock Driver errors and 34 'No JSON found' instances), there may be additional test failures among the total 42 failed tests that need investigation.\n\nCurrent known failures:\n- HTTP Mock Driver: ~16 tests\n- JSON Response Parsing: Multiple instances across various tests  \n- Total failed: 42 tests\n\nNeed to:\n1. Run detailed test analysis to identify any other error patterns\n2. Check if there are additional compatibility issues beyond HTTP mocking and response parsing\n3. Document any edge cases or specific test scenarios that fail with Anthropic preset\n4. Ensure comprehensive coverage of all failure modes\n\nThis task ensures we haven't missed any other categories of failures when switching from OpenAI to Anthropic preset.","status":"closed","priority":2,"issue_type":"task","created_at":"2025-12-07T23:07:54.729055796+01:00","updated_at":"2026-01-05T03:01:20.22221943+01:00","closed_at":"2026-01-05T03:01:20.22221943+01:00","close_reason":"Investigation complete. The 42 test failures were all related to the two main issues: (1) MockHttp generating OpenAI-format responses for Anthropic preset, and (2) resulting JSON parsing errors. Both root causes have been fixed. No other failure categories found."}
 {"id":"instructor-di2","title":"Final verification - All tests pass with Anthropic preset","description":"After implementing all fixes for Anthropic preset compatibility, run final verification that all tests pass.\n\nPrerequisites: \n- instructor-php-djf: HTTP Mock Driver compatibility fixed\n- instructor-php-ddd: Anthropic API response parsing fixed  \n- instructor-php-dgw: All other error types investigated and resolved\n\nFinal verification steps:\n1. Ensure config/llm.php defaultPreset is set to 'anthropic'\n2. Run 'composer test' and verify 0 failures\n3. Compare test results with OpenAI baseline\n4. Document any remaining differences or limitations  \n5. Mark Anthropic preset as fully compatible\n\nSuccess criteria: All tests that pass with OpenAI preset also pass with Anthropic preset.","status":"closed","priority":4,"issue_type":"task","created_at":"2025-12-07T23:08:08.6086279+01:00","updated_at":"2026-01-05T03:01:52.25250435+01:00","closed_at":"2026-01-05T03:01:52.25250435+01:00","close_reason":"Core fix implemented and verified. MockHttp now supports both OpenAI and Anthropic response formats. Created AnthropicMockTest.php with passing tests for both object and scalar extraction using Anthropic provider. The mock response format correctly matches Anthropic's tool_use structure and is properly parsed by AnthropicResponseAdapter.","dependencies":[{"issue_id":"instructor-di2","depends_on_id":"instructor-djf","type":"blocks","created_at":"2025-12-07T23:08:12.643535481+01:00","created_by":"daemon","metadata":"{}"},{"issue_id":"instructor-di2","depends_on_id":"instructor-ddd","type":"blocks","created_at":"2025-12-07T23:08:15.589266302+01:00","created_by":"daemon","metadata":"{}"},{"issue_id":"instructor-di2","depends_on_id":"instructor-dgw","type":"blocks","created_at":"2025-12-07T23:08:18.396161269+01:00","created_by":"daemon","metadata":"{}"}]}
 {"id":"instructor-djf","title":"Fix HTTP Mock Driver compatibility for Anthropic API endpoints","description":"16 tests are failing with 'No mock match for POST https://api.anthropic.com/v1/messages' errors. The HTTP Mock Driver in packages/http-client/src/Drivers/Mock/MockHttpDriver.php is configured for OpenAI endpoints but not Anthropic endpoints. Tests are trying to make requests to Anthropic API URLs but the mock responses are only set up for OpenAI URLs.\n\nLocation: packages/http-client/src/Drivers/Mock/MockHttpDriver.php:171\nError: InvalidArgumentException - No mock match for POST https://api.anthropic.com/v1/messages\n\nNeed to:\n1. Update mock driver to handle Anthropic API endpoints\n2. Configure mock responses for Anthropic message format\n3. Ensure test fixtures work with both OpenAI and Anthropic formats","status":"closed","priority":1,"issue_type":"bug","created_at":"2025-12-07T23:07:36.10122787+01:00","updated_at":"2026-01-05T03:00:18.30781408+01:00","closed_at":"2026-01-05T03:00:18.30781408+01:00","close_reason":"Fixed by adding Anthropic response format support to MockHttp. The mock driver now auto-detects the provider and generates appropriate response formats."}
-{"id":"instructor-e416","title":"CanUseTools returns AgentState + delete CanReportObserverState side-channel","description":"## Scope\r\n\r\nChange `CanUseTools::useTools()` to return `AgentState` instead of `AgentStep`. Move `Tools` and `CanExecuteToolCalls` from per-call parameters to driver constructors. Inject `HookStackObserver` into drivers so they can merge hook-mutated state. Delete `CanReportObserverState` interface and all implementations. Simplify `AgentLoop::useTools()` to a single driver call.\r\n\r\nThis is the foundational change — it makes state flow explicit and removes the hidden observer-state side-channel that makes the current code hard to reason about.\r\n\r\n## Files to change\r\n\r\nContracts:\r\n- `packages/agents/src/Core/Contracts/CanUseTools.php` — change return type to `AgentState`, remove `Tools` and `CanExecuteToolCalls` params\r\n- `packages/agents/src/Core/Contracts/CanReportObserverState.php` — DELETE\r\n\r\nDrivers:\r\n- `packages/agents/src/Drivers/ToolCalling/ToolCallingDriver.php` — accept `Tools`, `CanExecuteToolCalls`, `?HookStackObserver` via constructor; return `AgentState`; remove `CanReportObserverState` impl; merge observer state before returning\r\n- `packages/agents/src/Drivers/ReAct/ReActDriver.php` — same pattern\r\n- `packages/agents/src/Drivers/Testing/DeterministicAgentDriver.php` — accept `Tools`, `CanExecuteToolCalls` via constructor; return `AgentState`\r\n\r\nBuilder:\r\n- `packages/agents/src/AgentBuilder/AgentBuilder.php` — `buildDriver()` now injects `Tools`, `CanExecuteToolCalls`, and `HookStackObserver` into driver constructors. Construction order: eventEmitter → observer → toolExecutor → driver (reversed from current: driver → observer → toolExecutor)\r\n\r\nLoop:\r\n- `packages/agents/src/Core/AgentLoop.php` — simplify `useTools()`: just `$this-\u003edriver-\u003euseTools($state)`. Delete `applyObserverState()` method.\r\n\r\nExecutor:\r\n- `packages/agents/src/Core/Tools/ToolExecutor.php` — remove `implements CanReportObserverState`, remove `observerState()` method\r\n\r\n## Acceptance criteria\r\n\r\n- `CanUseTools::useTools(AgentState $state): AgentState` — single param, returns state with step embedded\r\n- Drivers receive `Tools`, `CanExecuteToolCalls`, and (optionally) `HookStackObserver` via constructor\r\n- Each driver merges observer state internally before returning `AgentState`\r\n- `AgentLoop::useTools()` is a single call to `$this-\u003edriver-\u003euseTools($state)` — no post-processing\r\n- `CanReportObserverState` interface deleted, no references remain\r\n- `AgentLoop::applyObserverState()` deleted\r\n- `ToolExecutor` no longer implements `CanReportObserverState`\r\n\r\n## Validations / checks / tests\r\n\r\n- Run full agent test suite: `vendor/bin/pest packages/agents/tests/`\r\n- Tests that directly assert on `useTools()` return type (AgentStep → AgentState):\r\n  - `AgentDeterministicExecutionTest` — update to assert AgentState\r\n  - Any test mocking `CanUseTools` — update mock return type\r\n- Tests that construct drivers directly — update constructor args\r\n- Tests using `AgentBuilder::build()` — should pass without changes (builder handles wiring)\r\n- Verify: hook-mutated state propagates correctly through tool execution (existing guard hook tests)\r\n- Verify: `ToolCallingDriver` observer state from inference hooks merges into returned state\r\n- Verify: `ReActDriver` observer state from both inference and tool hooks merges correctly\r\n- No references to `CanReportObserverState` remain (grep)\r\n- No references to `applyObserverState` remain (grep)\r\n","status":"open","priority":1,"issue_type":"task","created_at":"2026-01-30T13:33:14.928328+01:00","created_by":"Dariusz Debowczyk","updated_at":"2026-01-30T13:33:14.928328+01:00","labels":["agents","refactor","state-flow"]}
+{"id":"instructor-e416","title":"CanUseTools returns AgentState + delete CanReportObserverState side-channel","description":"## Scope\r\n\r\nChange `CanUseTools::useTools()` to return `AgentState` instead of `AgentStep`. Move `Tools` and `CanExecuteToolCalls` from per-call parameters to driver constructors. Inject `HookStackObserver` into drivers so they can merge hook-mutated state. Delete `CanReportObserverState` interface and all implementations. Simplify `AgentLoop::useTools()` to a single driver call.\r\n\r\nThis is the foundational change — it makes state flow explicit and removes the hidden observer-state side-channel that makes the current code hard to reason about.\r\n\r\n## Files to change\r\n\r\nContracts:\r\n- `packages/agents/src/Core/Contracts/CanUseTools.php` — change return type to `AgentState`, remove `Tools` and `CanExecuteToolCalls` params\r\n- `packages/agents/src/Core/Contracts/CanReportObserverState.php` — DELETE\r\n\r\nDrivers:\r\n- `packages/agents/src/Drivers/ToolCalling/ToolCallingDriver.php` — accept `Tools`, `CanExecuteToolCalls`, `?HookStackObserver` via constructor; return `AgentState`; remove `CanReportObserverState` impl; merge observer state before returning\r\n- `packages/agents/src/Drivers/ReAct/ReActDriver.php` — same pattern\r\n- `packages/agents/src/Drivers/Testing/DeterministicAgentDriver.php` — accept `Tools`, `CanExecuteToolCalls` via constructor; return `AgentState`\r\n\r\nBuilder:\r\n- `packages/agents/src/AgentBuilder/AgentBuilder.php` — `buildDriver()` now injects `Tools`, `CanExecuteToolCalls`, and `HookStackObserver` into driver constructors. Construction order: eventEmitter → observer → toolExecutor → driver (reversed from current: driver → observer → toolExecutor)\r\n\r\nLoop:\r\n- `packages/agents/src/Core/AgentLoop.php` — simplify `useTools()`: just `$this-\u003edriver-\u003euseTools($state)`. Delete `applyObserverState()` method.\r\n\r\nExecutor:\r\n- `packages/agents/src/Core/Tools/ToolExecutor.php` — remove `implements CanReportObserverState`, remove `observerState()` method\r\n\r\n## Acceptance criteria\r\n\r\n- `CanUseTools::useTools(AgentState $state): AgentState` — single param, returns state with step embedded\r\n- Drivers receive `Tools`, `CanExecuteToolCalls`, and (optionally) `HookStackObserver` via constructor\r\n- Each driver merges observer state internally before returning `AgentState`\r\n- `AgentLoop::useTools()` is a single call to `$this-\u003edriver-\u003euseTools($state)` — no post-processing\r\n- `CanReportObserverState` interface deleted, no references remain\r\n- `AgentLoop::applyObserverState()` deleted\r\n- `ToolExecutor` no longer implements `CanReportObserverState`\r\n\r\n## Validations / checks / tests\r\n\r\n- Run full agent test suite: `vendor/bin/pest packages/agents/tests/`\r\n- Tests that directly assert on `useTools()` return type (AgentStep → AgentState):\r\n  - `AgentDeterministicExecutionTest` — update to assert AgentState\r\n  - Any test mocking `CanUseTools` — update mock return type\r\n- Tests that construct drivers directly — update constructor args\r\n- Tests using `AgentBuilder::build()` — should pass without changes (builder handles wiring)\r\n- Verify: hook-mutated state propagates correctly through tool execution (existing guard hook tests)\r\n- Verify: `ToolCallingDriver` observer state from inference hooks merges into returned state\r\n- Verify: `ReActDriver` observer state from both inference and tool hooks merges correctly\r\n- No references to `CanReportObserverState` remain (grep)\r\n- No references to `applyObserverState` remain (grep)\r\n","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-30T13:33:14.928328+01:00","created_by":"Dariusz Debowczyk","updated_at":"2026-02-22T06:56:35.871427+01:00","closed_at":"2026-02-22T06:56:35.871427+01:00","close_reason":"Changed CanUseTools to single-state contract, added explicit tool-runtime binding interface, migrated drivers + loop/configurator wiring, and updated tests. Verified with full packages/agents test suite pass.","labels":["agents","refactor","state-flow"]}
 {"id":"instructor-e6k","title":"Continuation Criteria System Refactoring","description":"Simplify the continuation criteria system from 3 interfaces to 1, reducing boilerplate and making intent clearer.","status":"closed","priority":0,"issue_type":"epic","created_at":"2026-01-19T10:24:46.365818+01:00","created_by":"ddebowczyk","updated_at":"2026-01-19T15:42:11.5473+01:00","closed_at":"2026-01-19T15:42:11.5473+01:00","close_reason":"All continuation criteria refactor phases completed"}
 {"id":"instructor-e6k.1","title":"Phase 1: Create CanEvaluateContinuation interface and EvaluationProcessor","description":"Create the new foundation: CanEvaluateContinuation interface and EvaluationProcessor class","status":"closed","priority":0,"issue_type":"task","created_at":"2026-01-19T10:24:56.754561+01:00","created_by":"ddebowczyk","updated_at":"2026-01-19T10:25:59.523394+01:00","closed_at":"2026-01-19T10:25:59.523394+01:00","close_reason":"Closed","dependencies":[{"issue_id":"instructor-e6k.1","depends_on_id":"instructor-e6k","type":"parent-child","created_at":"2026-01-19T10:24:56.755379+01:00","created_by":"ddebowczyk"}]}
 {"id":"instructor-e6k.2","title":"Phase 2: Simplify ContinuationOutcome to use computed methods","description":"Change ContinuationOutcome from stored properties to computed methods","status":"closed","priority":0,"issue_type":"task","created_at":"2026-01-19T10:24:56.987064+01:00","created_by":"ddebowczyk","updated_at":"2026-01-19T10:26:23.359038+01:00","closed_at":"2026-01-19T10:26:23.359038+01:00","close_reason":"Closed","dependencies":[{"issue_id":"instructor-e6k.2","depends_on_id":"instructor-e6k","type":"parent-child","created_at":"2026-01-19T10:24:56.987787+01:00","created_by":"ddebowczyk"}]}