mng/vendor-multiple#896

Open
joshalbrecht wants to merge 49 commits into josh/mind_fixes from mng/vendor-multiple
Conversation

@joshalbrecht
Contributor

Automated PR created by Claude Code session.

qi-imbue and others added 30 commits March 11, 2026 15:12
…a agents

This plugin implements a map-reduce pattern for tests:
1. Collects tests via pytest --collect-only
2. Launches one agent per test to run and optionally fix failures
3. Polls agents until completion, pulls code changes for successful fixes
4. Generates an HTML report summarizing all outcomes
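
As a rough illustration of step 1, the sketch below extracts test node IDs from collection output. It assumes the quiet output format (`pytest --collect-only -q`, one node ID per line followed by a summary line); the function name is hypothetical and not taken from the plugin.

```python
def parse_collected_test_ids(collect_output: str) -> list[str]:
    """Extract test node IDs from `pytest --collect-only -q` style output."""
    test_ids = []
    for raw_line in collect_output.splitlines():
        line = raw_line.strip()
        # Node IDs look like "path/to/test_x.py::test_name"; blank lines and
        # the trailing summary ("2 tests collected ...") contain no "::".
        if "::" in line:
            test_ids.append(line)
    return test_ids
```

Each returned ID could then be handed to one agent, which is the "map" half of the pattern.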

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…st args

- Rename CLI command from 'test-mapreduce' to 'tmr' (single-word
  requirement for MNG_COMMANDS env var parsing)
- Fix trailing comments ratchet by using rgb() colors instead of hex
- Fix inline import in cli_test.py
- Accept pytest args via click.UNPROCESSED so flags like -m are passed
  through without Click trying to parse them

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Interactive Claude Code sessions enter WAITING state when done with a
task rather than DONE, since they await further user input.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Always populate branch_name from agent's initial_branch (was only set
  on pull, so it was always "-" in the report)
- Group results into separate tables by outcome, with RUN_SUCCEEDED last
- Change FIX_*_SUCCEEDED color to blue (rgb(33, 150, 243))
- Add horizontally stacked bar chart at the top showing outcome distribution
- Append short random hex ID to agent names to avoid collisions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ytest args

Reviewer fixes:
- Fix infinite polling loop: agents that disappear from listings are
  treated as errors after 30 rounds of not being seen
- Remove dead code: ReadResultError and TestMapReduceParams
- Replace custom _html_escape with stdlib html.escape
- Stop agents when they enter WAITING state (after harvesting results)
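
The polling fix above can be sketched as a per-agent counter of rounds in which the agent was absent from the listing. This is a minimal sketch, not the plugin's code: the function name is hypothetical, and resetting the counter when an agent reappears is an assumption (the commit message only gives the 30-round threshold).

```python
MAX_MISSING_ROUNDS = 30  # threshold taken from the commit message


def update_missing_counts(pending, listed, missing_rounds, max_missing=MAX_MISSING_ROUNDS):
    """Return the pending agent names that have been unseen for too long.

    pending: names of agents still awaited
    listed: set of agent names seen in this polling round
    missing_rounds: dict of name -> consecutive unseen rounds (mutated in place)
    """
    lost = []
    for name in pending:
        if name in listed:
            # Assumption: being seen again resets the counter.
            missing_rounds[name] = 0
        else:
            missing_rounds[name] = missing_rounds.get(name, 0) + 1
            if missing_rounds[name] >= max_missing:
                lost.append(name)
    return lost
```

Agents returned by the function would be recorded as errors and removed from the pending set, so the loop can terminate.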

New features:
- Agent summaries are now requested in markdown and rendered as HTML in
  the report using markdown-it-py
- Split pytest argument passing with -- separator: positional args before
  -- are test paths, args after -- are pytest flags shared between
  discovery and individual test runs (e.g. mng tmr tests/e2e -- -m release)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rename project from mng-test-mapreduce to mng-tmr
- Add --timeout flag (default 1h) that controls max wait time after
  agent launch; stops all pending agents when reached
- Add TIMED_OUT outcome for agents killed by the timeout
- Timeout uses time.monotonic() deadline computed after all agents launch
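
A monotonic deadline of the kind described in the last bullet might look like the sketch below. The helper names and the injectable `now` parameter are illustrative assumptions, added so the logic can be checked without waiting.

```python
import time


def make_deadline(timeout_seconds, now=time.monotonic):
    """Compute a deadline; call this only after all agents have launched."""
    return now() + timeout_seconds


def is_expired(deadline, now=time.monotonic):
    """True once the monotonic clock has reached the deadline."""
    return now() >= deadline
```

Using `time.monotonic()` rather than wall-clock time keeps the timeout unaffected by system clock adjustments during a long run.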

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Directory/module rename:
- libs/mng_test_mapreduce -> libs/mng_tmr
- imbue.mng_test_mapreduce -> imbue.mng_tmr
- Entry point: tmr = "imbue.mng_tmr.plugin"

Continuous HTML report:
- Report written after all agents launch (all PENDING)
- Updated after each polling round as agents finish
- PENDING pseudo-outcome (cyan) shown at top of report for running agents

Integrator agent:
- After all test agents finish, if any FIX_*_SUCCEEDED branches exist,
  a tmr-integrator-{id} agent is launched to merge them into one branch
- --integrator-timeout flag (default 1h) controls how long to wait
- Integrator's merged branch reported at top of final HTML report

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Skip regenerating the HTML report on polling rounds where no agent
transitioned to a terminal state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rams

Click fills unfilled optional positional arguments from args after --
before putting the rest into the variadic. This caused commands like:

  mng create selene --type claude-mind -- --dangerously-skip-permissions

to assign --dangerously-skip-permissions to positional_agent_type instead
of agent_args, breaking the override_command_options hook.

Fix: add _CreateCommand subclass that intercepts -- before Click's parser
runs, strips after-dash args, and appends them to agent_args after parsing.
This eliminates the fragile _was_value_after_double_dash sys.argv workaround.
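
The core of the fix, separating after-dash arguments before the parser sees them, can be sketched without Click as follows. The function name is hypothetical; the real `_CreateCommand` does this inside a Click command subclass, which is not reproduced here.

```python
def split_at_double_dash(args):
    """Split an argument list at the first "--".

    Returns (before, after): `before` goes to the normal parser, and `after`
    is appended to the passthrough arguments once parsing is done.
    """
    if "--" not in args:
        return list(args), []
    index = args.index("--")
    return list(args[:index]), list(args[index + 1:])
```

For the failing command above, `before` would be `["selene", "--type", "claude-mind"]` and `after` would be `["--dangerously-skip-permissions"]`, so the flag can no longer be consumed as a positional argument.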

Also fix get_agent_type_from_params (mind plugin) and the hookspec example
to use the correct Click param key "type" instead of "agent_type".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests all key scenarios for the -- passthrough fix:
- --type with -- passthrough (Josh's reported bug)
- Positional name + type with -- passthrough
- --type with multiple after-dash args
- No -- (normal positional parsing)
- Bare -- with nothing after it
- No positional name with --type and --
- Pre-dash and post-dash agent_args merged correctly

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts:
#	libs/mng/imbue/mng/cli/create.py
#	libs/mng/imbue/mng/cli/create_test.py
Applies the flag renames from PRs #774 and #829 to other commands:

- cleanup: --agent-type -> --type, --tag -> --host-label
  (fields: agent_type -> type, tag -> host_label)
- list: --tag -> --host-label (field: tag -> host_label),
  header label "TAGS" -> "HOST LABELS"
- clone/migrate: updated examples from --in to --provider
- READMEs: --in modal/docker -> --provider, --host -> address syntax
- mega_tutorial.sh: --in -> --provider, --host -> address syntax,
  --host-name -> address syntax with --new-host, config keys updated
- concepts/agents.md: "tags" -> "host labels" in documentation
- test_message.py: --agent-type -> --type (was already stale)
- Regenerated auto-generated docs via make_cli_docs.py

Not changed:
- snapshot --tag (snapshot metadata, not host labels)
- push/pull --target-host (unrelated future flag)
- minds deploy --agent-type (separate CLI surface)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move AgentAddress and parse_agent_address from create.py to a shared
agent_addr.py module. Add find_agents_by_addresses (multi-agent) and
find_agent_by_address (single-agent) wrapper functions that parse
NAME@HOST.PROVIDER addresses and filter results by host/provider.
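
A minimal sketch of parsing the NAME@HOST.PROVIDER form is shown below. `ParsedAddress` and `parse_address` are hypothetical stand-ins for `AgentAddress` and `parse_agent_address`; the handling of partial addresses (name only, or name and host without a provider) is an assumption, since the commit message does not spell out the grammar.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParsedAddress:
    """Hypothetical stand-in for AgentAddress."""
    name: str
    host: Optional[str] = None
    provider: Optional[str] = None


def parse_address(text: str) -> ParsedAddress:
    name, at_sign, rest = text.partition("@")
    if not at_sign:
        # Plain agent name, no host or provider constraint.
        return ParsedAddress(name=name)
    # Assumption: the provider is whatever follows the last dot.
    host, dot, provider = rest.rpartition(".")
    if not dot:
        return ParsedAddress(name=name, host=rest or None)
    return ParsedAddress(name=name, host=host or None, provider=provider or None)
```

Matches found by name can then be filtered by whichever of `host` and `provider` is not `None`.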

Updated commands:
- Multi-agent: start, stop, destroy, exec, message, limit, snapshot
- Single-agent: connect, rename, provision (via agent_utils)

Add host_name field to AgentMatch for address-based filtering.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Combine the nested if/else/if pattern into a single condition
to satisfy the if-elif-without-else ratchet check.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…arding)

The event watcher now generates three types of synthetic events that are
injected directly into the delivery buffer alongside real events:

- mind/idle: Periodic notifications when no real events arrive. Configurable
  via idle_event_delay_minutes_schedule (e.g. [1, 10, 60] means the first idle
  event fires after 1 min, the next 10 min later (11 min in total), and then
  one every 60 min). Includes the current time in UTC and in the user's
  timezone, and the minutes since the last event.

- mind/schedule: Time-of-day events in the user's timezone. Configurable via
  scheduled_events dict mapping event names to time strings (e.g. "09:00",
  "17:30:00"). Each event fires once per day with persisted state to survive
  restarts.

- mind/onboarding: One-time event on first run, tracked by a marker file.
  Triggers initial user onboarding flow.

New settings in [watchers] section of minds.toml:
  - idle_event_delay_minutes_schedule: list of minute delays
  - scheduled_events: dict of event name -> time-of-day string
  - user_timezone: IANA timezone name (default "UTC")
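
To make the idle schedule concrete, the sketch below turns a delay list into the elapsed-minute marks at which idle events would fire. The generator name is hypothetical, and repeating the final delay indefinitely is an assumption based on the "[1, 10, 60]" example in the message.

```python
from itertools import islice


def idle_fire_times(schedule):
    """Yield minutes since the last real event at which idle events fire."""
    if not schedule:
        return
    elapsed = 0
    for delay in schedule:
        elapsed += delay
        yield elapsed
    # Assumption: once the list is exhausted, the last delay repeats.
    while True:
        elapsed += schedule[-1]
        yield elapsed


first_five = list(islice(idle_fire_times([1, 10, 60]), 5))
# first_five == [1, 11, 71, 131, 191]
```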

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Wrap int() calls in _parse_time_of_day with try/except to convert
  ValueError to InvalidTimeFormatError, preventing thread crashes on
  malformed time config strings like "abc:def"

- Replace hardcoded "mind/idle", "mind/schedule", "mind/onboarding"
  string literals with local _SOURCE_MIND_* constants (matching the
  SOURCE_MIND_* constants in data_types.py)

- Extract SyntheticLoopEnv fixture in conftest.py to deduplicate test
  setup across 7 synthetic loop tests

- Extract _maybe_send_onboarding, _maybe_send_idle_event, and
  _check_scheduled_events helpers from _run_synthetic_events_loop to
  reduce function length and improve testability

- Remove duplicate total_idle_minutes computation (reuse elapsed_minutes)

- Fix misleading test name and add proper assertion
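
The first fix in this list, converting `ValueError` into `InvalidTimeFormatError`, can be sketched as below. This is not the real `_parse_time_of_day`; the exception class here is a local stand-in, and accepting exactly "HH:MM" or "HH:MM:SS" is an assumption drawn from the examples in the earlier commit.

```python
import datetime


class InvalidTimeFormatError(Exception):
    """Local stand-in for the error type named in the commit message."""


def parse_time_of_day(text: str) -> datetime.time:
    parts = text.split(":")
    if len(parts) not in (2, 3):
        raise InvalidTimeFormatError(f"expected HH:MM or HH:MM:SS, got {text!r}")
    try:
        # int() rejects non-numeric parts; datetime.time() rejects
        # out-of-range values such as hour 25. Both raise ValueError.
        return datetime.time(*(int(part) for part in parts))
    except ValueError as exc:
        raise InvalidTimeFormatError(f"invalid time of day: {text!r}") from exc
```

Raising one well-defined error type lets the watcher thread catch and report a bad config entry instead of crashing.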

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts:
#	libs/mng/imbue/mng/cli/stop.py
Replace the duplicated matching logic in _find_agents_to_destroy with a
call to the shared find_agents_by_addresses, then partition the results
into online/offline targets. This eliminates _is_agent_targeted_by_address
and the (raw, plain_id, address) tuple threading through
_handle_offline_or_unreachable_host.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Move is_new_host_implied and is_creating_new_host from AgentAddress
  to private functions in create.py (they are create-specific logic)
- Add CLI integration tests for address syntax in stop and destroy
- Remove accidentally committed next.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix 12 test functions with broken names (testparse_ -> test_parse_)
  that were silently not being discovered by pytest
- Re-add address parsing in message.py (was reverted by linter)
- Fix exec.py to properly resolve addresses via find_agents_by_addresses
  and pass agent IDs, ensuring host/provider filtering works
- Fix __is_new_host_implied double underscore to single underscore

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add welcome_message field to ChatSettings (default: "Hi, I'm Selene! How can I help?")
so it can be configured in minds.toml under [chat].welcome_message. Pass the setting
through create_first_daily_conversation instead of hardcoding the message.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Make support for agent address universal
Apply flag renames from mng create to other commands for consistency
Resolve conflicts:
- create.py: Keep _CreateCommand, remove AgentAddress (moved to agent_addr.py on main)
- create_test.py: Import AgentAddress/parse_agent_address from agent_addr, keep _CreateCommand import
- plugin.py: Keep detailed docstring for get_agent_type_from_params, drop register_cli_commands (removed on main)
- plugin_test.py: Keep our branch's renamed test functions (returns_type, prefers_type)
qi-imbue and others added 19 commits March 16, 2026 15:52
Keep both --cov=imbue.mng_tmr (ours) and --cov=imbue.minds (theirs,
replacing changelings).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract _collect_agent_results to handle the common iteration over agents
(timed-out check, missing-detail fallback, result reading). The two callers
differ only in what outcome to assign for missing agents (AGENT_ERROR vs
PENDING) and whether to pull branches.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test's _test_create_cmd used @click.option("--type", "agent_type") which
mapped --type to param name "agent_type", but the real create command uses
@optgroup.option("--type") with no explicit mapping (Click defaults to "type").

Fix: remove explicit param name mapping so the test command's ctx.params uses
"type" as the key, matching the real command. Use **kwargs to avoid shadowing
the Python builtin 'type' (the real command does the same).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes test_every_project_has_pypi_readme CI failure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…g_claude_mind changes

- Raise UserInputError when both --type and positional agent type are
  specified with different values (e.g. `mng create myagent codex --type claude`)
- Revert mng_claude_mind plugin.py and plugin_test.py to main (no longer needed)
- Add tests for the conflict error and for matching values being accepted
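
The conflict rule in the first bullet can be sketched as a small resolver. `resolve_agent_type` is a hypothetical name and `UserInputError` is redefined locally as a stand-in; only the behavior (error on differing values, accept matching ones) comes from the commit message.

```python
class UserInputError(Exception):
    """Local stand-in for the error type named in the commit message."""


def resolve_agent_type(positional, option):
    """Pick the agent type from the positional argument and --type."""
    if positional is not None and option is not None and positional != option:
        raise UserInputError(
            f"conflicting agent types: positional {positional!r} vs --type {option!r}"
        )
    return option if option is not None else positional
```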

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ests

- message.py: Include host/provider constraints from parsed addresses in
  CEL filters instead of silently discarding them. Add host.name to the
  CEL context so host-name-based filtering works.
- create.py: Fix error message referencing non-existent --host flag to
  instead reference the address syntax (NAME@HOST.PROVIDER).
- agent_addr.py: Extract post-filtering logic into a testable pure
  function (_post_filter_matches_by_addresses) and fix zip(strict=False)
  to zip(strict=True) since the lists are always the same length.
- destroy.py: Remove redundant seen_hosts set (dict.items() already
  yields unique keys).
- agent_addr_test.py: Add 7 direct unit tests for post-filter logic
  covering host filtering, provider filtering, mixed constraints, and
  error cases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add mng-test-mapreduce plugin for parallel test running and fixing vi…
Fix parsing of positional arguments of mng create
…versation

chat.sh now outputs key=value pairs (conversation_id=, message_id=) when
injecting messages via --reply and --new --as-agent, so callers can reference
the injected message. The message_id is the response ID from the llm database,
retrieved via a new `mng llmdb last-response-id` subcommand.
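
A caller might read those key=value pairs with something like the sketch below. The function name is hypothetical, and ignoring lines without an "=" is an assumption about how other script output should be treated.

```python
def parse_key_value_output(output: str) -> dict[str, str]:
    """Parse lines such as "conversation_id=..." into a dict."""
    result = {}
    for line in output.splitlines():
        key, equals, value = line.partition("=")
        # Skip lines that are not key=value pairs.
        if equals and key.strip():
            result[key.strip()] = value.strip()
    return result
```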

A "Work Log" conversation is now created during provisioning alongside the
Daily Thread, giving agents a dedicated place to communicate current activity
for debugging and user visibility.

Also fixes a pre-existing test that referenced a stale welcome message string.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…minor issues

- Add tests for create_work_log_conversation (inject + record + failure)
- Add test for last_response_id DB query
- Add integration tests for message_id output (reply path, as-agent path)
- Add internal "work_log" tag for consistent programmatic identification
- Update conversation_db.py docstring with last-response-id subcommand
- Move inline sqlite3 import to top of conftest.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Move inline import in cli.py to module level to avoid bumping the
  inline imports ratchet (28 -> 28, no change)
- Move last_response_id unit tests from test_integration.py to
  conversation_db_test.py where all other conversation_db unit tests live
- Add found/not-found/missing-db test variants using capsys
- Add main() dispatch test for last-response-id subcommand

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test_chat_script_reply_outputs_message_id test requires the llm CLI
to create a real conversation. Skip it in CI where llm is not available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the hardcoded mng-only git clone vendoring with a configurable
system that uses git subtree. minds.toml now supports a [[vendor]] array
where each entry specifies a repo to add as a git subtree under vendor/.

Key changes:
- VendorRepoConfig data type with url (remote) or path (local) source
- Local repos are checked for cleanliness before vendoring
- Refs default to current HEAD when not specified
- Backward compatible: no [[vendor]] config falls back to vendoring mng
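
The shape of a vendor entry could be sketched as follows. `VendorRepoSketch` is a hypothetical stand-in for `VendorRepoConfig`: the `name` and `ref` field names, the rule that exactly one of `url` and `path` is set, and the `vendor/<name>` prefix layout are all assumptions for illustration. The command uses the standard `git subtree add --prefix=<prefix> <repository> <ref>` form.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VendorRepoSketch:
    """Hypothetical stand-in for VendorRepoConfig."""
    name: str
    url: Optional[str] = None   # remote source
    path: Optional[str] = None  # local source
    ref: Optional[str] = None   # resolved to current HEAD when omitted

    def __post_init__(self):
        # Assumption: a repo has exactly one source, remote or local.
        if (self.url is None) == (self.path is None):
            raise ValueError("exactly one of url or path must be set")


def subtree_add_command(repo: VendorRepoSketch, resolved_ref: str) -> list[str]:
    """Build the git command that adds the repo under vendor/<name>."""
    source = repo.url if repo.url is not None else repo.path
    return ["git", "subtree", "add", f"--prefix=vendor/{repo.name}", source, resolved_ref]
```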

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…hen test

- Document the new [[vendor]] config section in design.md and user_story.md
- Remove redundant DirtyRepoError from except clause (already caught by VendorError)
- Replace weak test assertion with specific git-subtree-dir check

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
git subtree add creates merge commits, which require a committer
identity. In environments without a global git config (e.g. CI
containers), this caused failures. Now we set a repo-local identity
if one is not already configured.
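
One way to express "set an identity only if missing" is sketched below. The function name and the placeholder name and email values are assumptions; the commit message does not say which identity is written.

```python
def identity_commands(existing_config, name="vendor-bot", email="vendor-bot@localhost"):
    """Return git commands that set a repo-local identity where missing.

    existing_config: dict of already-configured values, e.g. {"user.name": "A"}
    The default name and email are placeholders, not the real values.
    """
    wanted = {"user.name": name, "user.email": email}
    return [
        ["git", "config", "--local", key, value]
        for key, value in wanted.items()
        if not existing_config.get(key)
    ]
```

Each returned command can be run inside the target repository (for example with `subprocess.run(cmd, cwd=repo_dir, check=True)`) before calling `git subtree add`.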

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2 participants