Add beam and MCTS search strategies with tests #10

tylerbessire · 2025-09-12T10:52:46Z

Summary

- Implement beam search with constraint propagation and metrics
- Add Monte Carlo Tree Search option with feature flag
- Cover advanced search strategies with property-based tests

Testing

pytest -q
python - <<'PY'
import numpy as np, time, logging
from arc_solver.grid import to_array
from arc_solver.beam_search import beam_search
logging.getLogger().setLevel(logging.ERROR)
inp = to_array([[1,2],[3,4]])
out = np.rot90(inp, -1)
start = time.time()
beam_search([(inp, out)], beam_width=5, depth=2)
print('elapsed_ms', round((time.time()-start)*1000,2))
PY

https://chatgpt.com/codex/tasks/task_e_68c3f8a67f748322b130f269263b894c

Summary by CodeRabbit

New Features
- Added beam search with constraint propagation and Monte Carlo Tree Search as additional program synthesis strategies.
- Updated enhanced search pipeline to prioritize these strategies, with an option to enable/disable beam search.
Documentation
- Updated README to highlight new beam search capability.
- Progress documented in AGENTS, including notes on beam search and MCTS.
Tests
- Added tests covering beam search, MCTS, rotation cases, and no-solution scenarios.
Chores
- Added Hypothesis as a dependency for property-based testing.
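
For readers unfamiliar with property-based testing, here is a minimal sketch of what a Hypothesis test for the rotation cases might look like. It is not the code in `tests/test_beam_search.py`: `find_rotation` is a hypothetical stand-in for the real search call, and the grid strategy (sizes up to 4x4, cell values 0 to 9) is an assumption chosen for illustration.

```python
import numpy as np
from hypothesis import given, settings, strategies as st
from hypothesis.extra.numpy import array_shapes, arrays


def find_rotation(inp, out):
    """Hypothetical stand-in for a search call.

    Returns k such that k clockwise quarter turns map inp to out, or None.
    """
    for k in range(4):
        if np.array_equal(np.rot90(inp, -k), out):
            return k
    return None


# Assumed grid strategy: small 2-D integer grids with values 0..9.
grids = arrays(
    dtype=np.int64,
    shape=array_shapes(min_dims=2, max_dims=2, min_side=1, max_side=4),
    elements=st.integers(0, 9),
)


@settings(max_examples=50, deadline=None)
@given(grid=grids, k=st.integers(0, 3))
def test_rotation_is_recovered(grid, k):
    target = np.rot90(grid, -k)
    found = find_rotation(grid, target)
    # Symmetric grids admit several valid k, so check the effect, not k itself.
    assert found is not None
    assert np.array_equal(np.rot90(grid, -found), target)
```

The property checks that the recovered program reproduces the target rather than that it equals the generating `k`, since a symmetric grid can be matched by more than one rotation.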

coderabbitai · 2025-09-12T10:52:54Z

Caution

Review failed

The pull request is closed.

Walkthrough

Adds beam search and MCTS modules, integrates them into EnhancedSearch with a reordered, gated synthesis pipeline, updates docs to mention beam search progress and capability, introduces tests for beam and MCTS rotation cases (including Hypothesis-based), and adds Hypothesis to requirements.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation updates: `AGENTS.md`, `README.md` | Updated progress/date and notes for beam search; added README bullet for beam search capability. |
| New search algorithms: `arc_solver/beam_search.py`, `arc_solver/mcts_search.py` | Introduced `beam_search` function with depth-limited expansion, scoring, beam pruning, and stats; added MCTS with Node class, UCB selection, rollout via scoring, and best-program selection. |
| Enhanced orchestrator: `arc_solver/enhanced_search.py` | Integrated beam search and MCTS into synthesis flow; added `enable_beam_search` flag; reordered pipeline; extended `search_stats` with beam-related metrics. |
| Tests: `tests/test_beam_search.py` | Added unit and property-based tests for beam search and MCTS solving rotations, no-solution case, and stats checks. |
| Dependencies: `requirements.txt` | Added `hypothesis==6.100.2`. |
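
The MCTS summary above (Node class, UCB selection, rollout via scoring, best-program selection) can be sketched roughly as follows. This is an illustrative toy, not the contents of `arc_solver/mcts_search.py`: the four-operation DSL, the `toy_mcts` name, the exploration constant, and the cell-match scoring function are all assumptions.

```python
import math
import random

import numpy as np

# Hypothetical toy DSL: each op maps a grid to a grid.
OPS = {
    "rot90_cw": lambda g: np.rot90(g, -1),
    "rot90_ccw": lambda g: np.rot90(g, 1),
    "flip_lr": np.fliplr,
    "flip_ud": np.flipud,
}


def score(program, train_pairs):
    """Mean fraction of matching cells; 0.0 when shapes differ."""
    total = 0.0
    for inp, out in train_pairs:
        pred = inp
        for name in program:
            pred = OPS[name](pred)
        if pred.shape == out.shape:
            total += float(np.mean(pred == out))
    return total / len(train_pairs)


class Node:
    def __init__(self, program, parent=None):
        self.program = program
        self.parent = parent
        self.children = {}
        self.visits = 0
        self.value = 0.0

    def ucb(self, c=1.4):
        # UCB1: mean reward plus an exploration bonus.
        if self.visits == 0:
            return math.inf
        explore = math.sqrt(math.log(self.parent.visits) / self.visits)
        return self.value / self.visits + c * explore


def toy_mcts(train_pairs, iterations=200, max_depth=2, seed=0):
    rng = random.Random(seed)
    root = Node(())
    best_program, best_score = (), score((), train_pairs)
    for _ in range(iterations):
        node = root
        # Selection: descend through fully expanded nodes by UCB.
        while len(node.program) < max_depth and len(node.children) == len(OPS):
            node = max(node.children.values(), key=Node.ucb)
        # Expansion: add one untried op if depth allows.
        if len(node.program) < max_depth:
            untried = [n for n in OPS if n not in node.children]
            name = rng.choice(untried)
            node.children[name] = Node(node.program + (name,), parent=node)
            node = node.children[name]
        # "Rollout" here is just scoring the node's own program.
        reward = score(node.program, train_pairs)
        if reward > best_score:
            best_program, best_score = node.program, reward
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return best_program, best_score
```

In this sketch the rollout is deterministic (it scores the partial program directly), which keeps the example short; a real implementation may instead complete the program randomly before scoring.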

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Caller
  participant EnhancedSearch
  participant Beam as BeamSearch
  participant MCTS as MCTS
  participant Neural as NeuralGuided
  participant Sketch as SketchSearch
  participant Adapt as TestTimeAdapt

  Caller->>EnhancedSearch: synthesize_enhanced(train_pairs, max_programs,...)
  Note over EnhancedSearch: Initialize candidates, search_stats

  alt enable_beam_search && candidates < max_programs
    EnhancedSearch->>Beam: beam_search(train_pairs, beam_width, depth)
    Beam-->>EnhancedSearch: beam_programs, {nodes_expanded}
    Note over EnhancedSearch: Update beam_candidates, beam_nodes_expanded
  end

  alt enable_beam_search && candidates < max_programs//2
    EnhancedSearch->>MCTS: mcts_search(train_pairs, iterations, max_depth)
    MCTS-->>EnhancedSearch: mcts_programs
    Note over EnhancedSearch: Update mcts_candidates
  end

  alt candidates < max_programs//4
    EnhancedSearch->>Neural: run()
    Neural-->>EnhancedSearch: neural_programs
  end

  alt candidates < max_programs//2
    EnhancedSearch->>Sketch: run()
    Sketch-->>EnhancedSearch: sketch_programs
  end

  opt candidates > 0
    EnhancedSearch->>Adapt: adapt(candidates)
    Adapt-->>EnhancedSearch: adapted_candidates
  end

  EnhancedSearch-->>Caller: final_candidates, search_stats
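
The gating in the diagram above, where each later strategy runs only if earlier ones left the candidate list below some fraction of `max_programs`, can be sketched as a simple loop. This is an assumption-laden illustration, not the code in `arc_solver/enhanced_search.py`: `gated_pipeline`, the stage tuple layout, and the stub strategies are hypothetical, and the test-time adaptation step is omitted.

```python
def gated_pipeline(train_pairs, stages, max_programs=8):
    """Run stages in order; each stage is (name, enabled, limit, run).

    A stage fires only if it is enabled and fewer than `limit`
    candidates have been collected so far.
    """
    candidates, stats = [], {}
    for name, enabled, limit, run in stages:
        if not enabled or len(candidates) >= limit:
            stats[f"{name}_candidates"] = 0
            continue
        found = run(train_pairs)
        stats[f"{name}_candidates"] = len(found)
        for program in found:
            if program not in candidates:  # drop duplicates across stages
                candidates.append(program)
    return candidates[:max_programs], stats


# Stub strategies standing in for the real searches (hypothetical outputs).
max_programs = 8
enable_beam_search = True
stages = [
    ("beam", enable_beam_search, max_programs, lambda tp: [("rot90_cw",), ("flip_lr",)]),
    ("mcts", enable_beam_search, max_programs // 2, lambda tp: [("rot90_cw",), ("flip_ud",)]),
    ("neural", True, max_programs // 4, lambda tp: [("transpose",)]),
    ("sketch", True, max_programs // 2, lambda tp: [("flip_ud", "flip_lr")]),
]
candidates, stats = gated_pipeline([], stages, max_programs)
```

With these stubs, beam and MCTS together yield three distinct candidates, so the neural stage (limit `max_programs // 4`, i.e. 2) is skipped while the sketch stage (limit 4) still runs.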

sequenceDiagram
  autonumber
  participant Caller
  participant Beam as beam_search
  participant DSL as DSL Ops

  Caller->>Beam: beam_search(train_pairs, beam_width, depth, max_expansions)
  loop depth levels
    Beam->>DSL: enumerate ops × parameter grid
    DSL-->>Beam: candidate programs
    loop candidates
      Beam->>Beam: score_candidate(program)
      alt score >= 0.999
        Note over Beam: store as complete
      else
        Note over Beam: keep for next beam
      end
    end
    Note over Beam: sort by score, truncate to beam_width
    alt no expansions or max_expansions hit
      Note over Beam: break
    end
  end
  Beam-->>Caller: complete_programs[:beam_width], {nodes_expanded}
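
As a rough companion to the diagram above, the loop can be sketched in a few lines. This is a toy under stated assumptions, not the code in `arc_solver/beam_search.py`: the four-operation DSL (with no parameter grid), the `toy_beam_search` name, and the cell-match scoring are hypothetical; only the overall shape (depth loop, scoring, the 0.999 completion threshold, truncation to `beam_width`, the `max_expansions` cap, and the `nodes_expanded` stat) follows the walkthrough.

```python
import numpy as np

# Hypothetical toy DSL: each op maps a grid to a grid.
OPS = {
    "rot90_cw": lambda g: np.rot90(g, -1),
    "rot90_ccw": lambda g: np.rot90(g, 1),
    "flip_lr": np.fliplr,
    "flip_ud": np.flipud,
}


def apply_program(program, grid):
    for name in program:
        grid = OPS[name](grid)
    return grid


def score_candidate(program, train_pairs):
    """Mean fraction of matching cells; 0.0 when shapes differ."""
    total = 0.0
    for inp, out in train_pairs:
        pred = apply_program(program, inp)
        if pred.shape == out.shape:
            total += float(np.mean(pred == out))
    return total / len(train_pairs)


def toy_beam_search(train_pairs, beam_width=5, depth=2, max_expansions=1000):
    beam = [()]  # start from the empty program
    complete = []
    nodes_expanded = 0
    for _ in range(depth):
        scored = []
        for program in beam:
            for name in OPS:
                if nodes_expanded >= max_expansions:
                    break
                candidate = program + (name,)
                nodes_expanded += 1
                s = score_candidate(candidate, train_pairs)
                if s >= 0.999:
                    complete.append(candidate)      # solves all pairs
                else:
                    scored.append((s, candidate))   # keep for next beam
        # Sort by score and truncate to the beam width.
        scored.sort(key=lambda item: item[0], reverse=True)
        beam = [c for _, c in scored[:beam_width]]
        if not beam or nodes_expanded >= max_expansions:
            break
    return complete[:beam_width], {"nodes_expanded": nodes_expanded}
```

Calling `toy_beam_search([(inp, out)], beam_width=5, depth=2)` with `out = np.rot90(inp, -1)` mirrors the timing snippet in the PR description; in this toy, solved programs are set aside rather than expanded further.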

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

In beams and trees I hop with glee,
Prune the branches, let winners be.
UCB whispers, “Try that turn,”
Beams glow bright where scores return.
Docs now note my plotted quest—
Hypothesis carrots pass the test.
Thump-thump: synthesis at its best! 🥕


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 63123e2 and aeab549.

📒 Files selected for processing (7)

AGENTS.md (1 hunks)
README.md (1 hunks)
arc_solver/beam_search.py (1 hunks)
arc_solver/enhanced_search.py (3 hunks)
arc_solver/mcts_search.py (1 hunks)
requirements.txt (1 hunks)
tests/test_beam_search.py (1 hunks)


test: cover beam and MCTS search [S:TEST v1] pass

aeab549

tylerbessire added the codex label Sep 12, 2025 — with ChatGPT Codex Connector

tylerbessire merged commit 3b1ee75 into main Sep 12, 2025
2 of 6 checks passed

tylerbessire deleted the codex/complete-phase-4-after-reviewing-agents.md branch September 12, 2025 10:53
