Skip to content

mng/fix-injection#902

Open
joshalbrecht wants to merge 33 commits intojosh/mind_readyfrom
mng/fix-injection
Open

mng/fix-injection#902
joshalbrecht wants to merge 33 commits intojosh/mind_readyfrom
mng/fix-injection

Conversation

@joshalbrecht
Copy link
Contributor

Automated PR created by Claude Code session.

joshalbrecht and others added 30 commits March 17, 2026 09:56
When creating a new mind, after cloning the repo:
1. Record the parent branch and commit hash
2. Checkout a new branch named minds/<mind_name>
3. Write a .parent file (git config format) tracking the parent URL,
   branch, and commit hash
4. Commit the .parent file

Add a "mind update <agent-name>" CLI command that:
A) Stops the mind (mng stop)
B) Fetches and merges the latest code from the parent repository
C) Updates all vendored git subtrees (git subtree pull)
D) Starts the mind back up (mng start)

Also makes run_git and ensure_git_identity public in vendor_mng.py
so they can be reused by parent_tracking.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… helper

- Use GitUrl, GitBranch, and new GitCommitHash primitives in ParentInfo
  instead of bare str fields (style guide compliance)
- Create MindAgentRecord FrozenModel for find_mind_agent return type
  instead of raw dict[str, object]
- Extract shared make_git_repo helper to forwarding_server conftest.py
  to eliminate duplication between parent_tracking_test and vendor_mng_test
- Find agents by name (--include 'name == "..."') instead of by label
- Deduplicate run_mng_stop/run_mng_start into _run_mng_command helper
- Remove unused _MNG_COMMAND_TIMEOUT_SECONDS constant
- Add logger.trace for swallowed JSONDecodeError in _parse_agents_from_output
- Fix pull_subtree docstring reference to private function

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rename _parse_agents_from_output to parse_agents_from_mng_output (public)
  and have testing.py's parse_mng_list_json delegate to it, eliminating
  the duplicate JSON parsing logic
- Move make_git_repo from forwarding_server/conftest.py to testing.py
  per project convention (conftest.py is for fixtures, testing.py is for
  explicitly imported helper functions)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix test function names: test_parse_* (was missing underscore after test)
- Move parse_agents_from_mng_output to config/data_types.py (shared module)
  so neither testing.py nor cli/update.py depends on the other

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The docstring referenced cli.update but the function was moved to
config.data_types.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Move parse_agents_from_mng_output tests from cli/update_test.py to
  config/data_types_test.py (where the function lives)
- Remove now-empty cli/update_test.py
- Restore CLAUDE_CODE_MAX_OUTPUT_TOKENS env var in .claude/settings.json

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add GitOperationError base class with ParentTrackingError subclass;
  VendorError now inherits from GitOperationError
- Add error_class parameter to run_git so callers can specify the
  appropriate error type (defaults to VendorError for compatibility)
- parent_tracking.py now raises ParentTrackingError instead of VendorError
- Use AgentName and GitUrl domain types in setup_mind_branch_and_parent
  and checkout_mind_branch signatures
- Extract parse_mind_agent_record as a testable function
- Add unit tests for MindAgentRecord and parse_mind_agent_record
- Restore CLAUDE_CODE_MAX_OUTPUT_TOKENS env var in .claude/settings.json

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ParentTrackingError inherits from GitOperationError (not VendorError),
so the except block needs to catch GitOperationError to handle failures
from both vendor operations and parent tracking operations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The web chat previously used SSE (Server-Sent Events) for streaming LLM
responses. This worked for user-initiated messages but could not push
messages created by the thinking agent via `llm inject` (chat.sh --reply),
since the SSE connection closed after each response.

This converts the communication to WebSocket, which maintains a persistent
bidirectional connection. A background polling thread checks the llm
database every 2 seconds for new messages and pushes them to the client
as `new_message` events. Messages sent by the user via WebSocket are
streamed as before (chunk/done events), and their response IDs are tracked
to avoid duplicate delivery from the poller.

Key changes:
- Add minimal RFC 6455 WebSocket protocol implementation (handshake,
  frame encode/decode, thread-safe send)
- Add _poll_db_for_new_messages() to detect injected messages
- Add _ws_poll_loop() background thread using Event.wait() for clean stop
- Add _handle_websocket() and _ws_handle_send() for the /ws endpoint
- Update frontend JS to use WebSocket instead of fetch+SSE
- Keep the legacy POST /api/chat/send endpoint for backward compatibility
- Move ConcurrencyGroup import to module level (reduces inline imports)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses all 6 blocking issues from the review agent:

1. CRITICAL: Fix reconnect duplicate messages - clear the messages
   container before rendering history on WebSocket reconnect
2. MAJOR: Fix silent sqlite3.Error - add _log() call in the except
   handler for response ID lookup in _ws_handle_send
3. MAJOR: Reduce _ws_handle_send length - extract _build_llm_prompt_cmd,
   _build_llm_env, _finalize_new_conversation, and _ws_mark_response_as_sent
   as shared helpers (also used by _handle_chat_send, reducing duplication)
4. MAJOR: Fix race condition - move sent_ids and last_polled_rowid updates
   into the finally block BEFORE clearing is_streaming, so the polling
   thread cannot see stale state
5. MAJOR: Reduce code duplication between SSE and WebSocket send handlers
   by extracting shared command-building and env-setup logic
6. MAJOR: Remove duplicate tests from web_server_test.py that were already
   covered (with proper coverage tracking) in web_server_direct_test.py

Also fixes two MINOR issues:
- Add stderr logging to _WsOutputCallback (matching _SseOutputCallback)
- Add try/catch around JSON.parse in frontend WebSocket handler

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Fix isStreaming not reset on WebSocket close: call finishStreaming()
   in ws.onclose so users can send messages after a reconnect
2. Extract _run_llm_prompt() and _LlmPromptResult to share the core
   llm prompt execution logic between SSE (_handle_chat_send) and
   WebSocket (_ws_handle_send) handlers, eliminating duplicated
   command building, ConcurrencyGroup usage, and error handling
3. Add WebSocket version negotiation (Sec-WebSocket-Version: 13),
   returning 426 for unsupported versions per RFC 6455 section 4.4
4. Document continuation frame limitation in _ws_read_frame docstring
5. Fix type error: handle nullable returncode from ConcurrencyGroup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
BaseHTTPRequestHandler.send_response() writes HTTP/1.0 by default,
but the websockets library used by the forwarding server proxy
requires HTTP/1.1 for the WebSocket upgrade. Write the status line
directly as bytes to ensure the correct protocol version.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The WebSocket magic GUID was wrong (258EAFA5-E914-47DA-95CA-5AB5A40E64BE
instead of 258EAFA5-E914-47DA-95CA-C5AB0DC85B11). This caused the
Sec-WebSocket-Accept header computation to produce incorrect values,
making the websockets library reject the handshake with
InvalidHeaderValue.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the is_streaming skip in the DB polling loop. Previously, the
poller paused during active LLM streaming and then advanced the rowid
cursor to the global max on completion. This meant injected messages
(from llm inject) that arrived during streaming with a lower rowid
than the streamed response were permanently skipped.

Now the poller runs continuously. The sent_ids set already prevents
duplicate delivery of streamed responses, so there is no risk of
sending the same message twice. This ensures injected messages from
the thinking agent are always delivered to the client, which is the
core purpose of this change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move _ws_mark_response_as_sent() after _finalize_new_conversation()
so the real conversation ID is used for the DB lookup instead of "NEW".
Previously, the SQL query used cid="NEW" which matched nothing in the
responses table, leaving the response unmarked in sent_ids and causing
the poller to re-deliver it as a duplicate.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract the WebSocket handler logic into a _WebSocketSession class that
encapsulates all shared state and provides methods for recv loop, poll
loop, message sending, and response tracking. The entry point
_handle_websocket now just does the HTTP upgrade handshake and delegates
to the session object.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
joshalbrecht and others added 3 commits March 17, 2026 13:15
Remove 19 tests from web_server_test.py that are exact duplicates of
tests in web_server_direct_test.py. The direct test file provides
proper coverage tracking (via direct imports) while the dynamic module
tests (via exec) do not contribute to coverage metrics.

Removed duplicates: _html_escape (3), _make_event_id (1),
_iso_timestamp (1), _get_default_chat_model (2),
_read_message_history (1), _render_header (4), _render_iframe_page (3),
_render_agents_page (1), render_conversations_page header links (1),
render_agents_page header links (1), _render_iframe_page header (1).

Kept in web_server_test.py: tests that depend on env-var-configured
module state (conversations DB, message history, server registration,
chat settings from file, sidebar rendering, agent list with state).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace _create_responses_table with create_test_llm_db from conftest
  to reuse the canonical LLM_RESPONSES_SCHEMA instead of a simplified
  duplicate schema
- Add comment explaining why WebSocket protocol is implemented with
  stdlib only: web_server.py is deployed as a standalone resource to
  agent hosts where websockets/wsproto are not installed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test_mng_create_with_build_args_on_modal failure was caused by
"rsync: kex_exchange_identification: read: Connection reset by peer"
from a Modal sandbox -- an infrastructure issue unrelated to this PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant