Fix array comparison and add solver logging #11

tylerbessire · 2025-09-12T12:10:00Z

Summary

add heuristic training on combined ARC training and evaluation sets
expose tools/train_guidance_on_arc.py CLI and mark multi-modal reasoning step complete
store trained guidance weights in models/guidance_arc.json

Testing

pip install -r requirements.txt
pytest
python tools/train_guidance_on_arc.py --epochs 5

https://chatgpt.com/codex/tasks/task_e_68c3fda055488322804ea540d87df916

Summary by CodeRabbit

New Features
- Added neural guidance training from episodes and tasks, plus a CLI and shell script to train and save models.
- Introduced model configuration for guidance and integrated operation priors into search to bias program selection.
- Added structured logging and an enhanced/baseline ensemble with automatic fallback.
Bug Fixes
- Improved array equality handling and prevented duplicate two-attempt predictions.
Tests
- Added tests for operation-prior influence, episodic storage/retrieval, and guidance training.
Documentation
- Updated progress documentation with completion dates, results, and notes.

Train heuristic classifier on combined training and evaluation tasks and expose CLI. [S:DESIGN v1] pass [S:ALG v1] train_from_task_pairs pass [S:TRAIN v1] dataset=train+eval pass [S:TEST v1] unit=103 pass

coderabbitai · 2025-09-12T12:10:07Z

Caution

Review failed

The pull request is closed.

Walkthrough

The PR updates documentation progress markers; adds operation priors to beam search and integrates them into enhanced search; refactors grid equality; adds deterministic initialization and training APIs for neural guidance; introduces structured logging and enhanced-vs-baseline fallback in the solver; adds training scripts and a model config; and expands tests.

Changes

Cohort / File(s)	Summary
Docs progress update `AGENTS.md`	Updates checkboxes, dates, and test notes for steps 1.3, 4.2–4.4; no code changes.
Beam search with operation priors `arc_solver/beam_search.py`, `tests/test_beam_search.py`	Adds optional `op_scores` to weight candidate scores by operation; unit test asserts penalized ops are excluded from programs.
Enhanced search integrates priors `arc_solver/enhanced_search.py`	Computes `op_scores` via neural guidance and passes them to `beam_search`; control flow now incorporates priors during search.
Grid equality safety `arc_solver/grid.py`	Refactors `eq` to handle non-array operands by checking shapes for arrays and falling back to Python equality otherwise.
Neural guidance training + determinism `arc_solver/neural/guidance.py`	Sets deterministic RNG; adds `SimpleClassifier.train`, training from episodes/tasks; exposes parallel helpers on `NeuralGuidance`.
Solver logging and fallback orchestration `arc_solver/solver.py`	Adds structured logger; reorganizes enhanced vs baseline prediction path with validation and fallback; replaces prints with logging.
Two-attempt dedup safeguard `arc_solver/search.py`	Ensures second attempt differs: if two attempts produce equal outputs, replace second with copied inputs.
Training CLIs and wrapper script `tools/train_guidance_from_episodes.py`, `tools/train_guidance_on_arc.py`, `scripts/train_from_episodes.sh`	Adds CLI to train from episodic DB; adds ARC dataset trainer; adds shell wrapper invoking episode-based training.
Model configuration data `models/guidance_arc.json`	Adds JSON model config: 17→32→7 network weights/biases and operations list.
Guidance and episodic tests `tests/test_guidance_from_tasks.py`, `tests/test_guidance_training.py`, `tests/test_episodic_integration.py`	Tests training from tasks and episodes (predicting “rotate”) and episodic storage/retrieval workflow.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant ARCSolver
  participant Enhanced as EnhancedSearch
  participant Baseline as BaselineSearch
  participant Logger

  Client->>ARCSolver: solve_task_two_attempts(task)
  ARCSolver->>Logger: info("start")
  alt enhancements enabled
    ARCSolver->>Enhanced: synthesize_with_enhancements(task)
    Enhanced-->>ARCSolver: enhanced_program?
    ARCSolver->>Enhanced: predict_two_enhanced(task)
    Enhanced-->>ARCSolver: enhanced_attempts?
    ARCSolver->>ARCSolver: _validate_solution(enhanced_attempts)
    alt valid
      ARCSolver->>Baseline: synthesize_baseline + predict_two_baseline(task)
      Baseline-->>ARCSolver: baseline_attempts
      ARCSolver->>Logger: info("enhanced success")
      ARCSolver-->>Client: [attempt1=enhanced, attempt2=baseline]
    else invalid or error
      ARCSolver->>Logger: exception("enhanced failed")
      ARCSolver->>Baseline: synthesize_baseline + predict_two_baseline(task)
      Baseline-->>ARCSolver: baseline_attempts
      ARCSolver->>Logger: info("fallback to baseline")
      ARCSolver-->>Client: baseline_attempts
    end
  else enhancements disabled
    ARCSolver->>Baseline: synthesize_baseline + predict_two_baseline(task)
    Baseline-->>ARCSolver: baseline_attempts
    ARCSolver-->>Client: baseline_attempts
  end

sequenceDiagram
  autonumber
  participant Enhanced as EnhancedSearch
  participant Guidance as NeuralGuidance
  participant Beam as beam_search

  Enhanced->>Guidance: score_operations(train_pairs)
  Guidance-->>Enhanced: op_scores {op: score}
  Enhanced->>Beam: beam_search(train_pairs, ..., op_scores=op_scores)
  Beam-->>Enhanced: programs/candidates (scores weighted by op_scores)

sequenceDiagram
  autonumber
  participant CLI as train_guidance_on_arc.py
  participant Loader as _load_tasks
  participant Guidance as NeuralGuidance
  participant FS as Filesystem

  CLI->>Loader: load training/eval tasks
  Loader-->>CLI: tasks [[(inp, out), ...], ...]
  CLI->>Guidance: train_from_task_pairs(tasks, epochs)
  Guidance-->>CLI: trained model in memory
  CLI->>FS: save_model(models/guidance_arc.json)
  FS-->>CLI: OK

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

fix: ensure solver returns predictions #7 — Touches arc_solver/solver.py with enhanced-vs-baseline fallback orchestration similar to this PR’s logging and validation path.
Unify solver and reorganize neural components #4 — Extends solver/enhanced search integration; this PR’s beam_search priors and enhanced call-site build on those pathways.
Implement numerical features and episodic memory #3 — Introduces/extends neural guidance and episodic-memory training; closely related to added training helpers and scripts here.

Poem

I twitch my ears at beams that glide,
With priors softly at my side.
I log my hops, enhanced then base,
If paths collide, I change my pace.
I train on tales the episodes spin—
Rotate the world, and somewhere, win. 🐇✨

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3b1ee75 and 6878f52.

📒 Files selected for processing (15)

AGENTS.md (4 hunks)
arc_solver/beam_search.py (4 hunks)
arc_solver/enhanced_search.py (1 hunks)
arc_solver/grid.py (1 hunks)
arc_solver/neural/guidance.py (3 hunks)
arc_solver/search.py (2 hunks)
arc_solver/solver.py (3 hunks)
models/guidance_arc.json (1 hunks)
scripts/train_from_episodes.sh (1 hunks)
tests/test_beam_search.py (1 hunks)
tests/test_episodic_integration.py (1 hunks)
tests/test_guidance_from_tasks.py (1 hunks)
tests/test_guidance_training.py (1 hunks)
tools/train_guidance_from_episodes.py (1 hunks)
tools/train_guidance_on_arc.py (1 hunks)

✨ Finishing touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch codex/complete-all-project-phases

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

chatgpt-codex-connector

Codex Review: Here are some suggestions.

Reply with @codex fix comments to fix any unresolved comments.

About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you open a pull request for review, mark a draft as ready, or comment "@codex review". If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex fix this CI failure" or "@codex address that feedback".

arc_solver/beam_search.py

feat: train guidance model on ARC data

6878f52

Train heuristic classifier on combined training and evaluation tasks and expose CLI. [S:DESIGN v1] pass [S:ALG v1] train_from_task_pairs pass [S:TRAIN v1] dataset=train+eval pass [S:TEST v1] unit=103 pass

tylerbessire added the codex label Sep 12, 2025 — with ChatGPT Codex Connector

tylerbessire merged commit edb8e65 into main Sep 12, 2025
2 of 6 checks passed

tylerbessire deleted the codex/complete-all-project-phases branch September 12, 2025 12:10

chatgpt-codex-connector bot reviewed Sep 12, 2025

View reviewed changes

arc_solver/beam_search.py Show resolved Hide resolved

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix array comparison and add solver logging #11

Fix array comparison and add solver logging #11

Uh oh!

tylerbessire commented Sep 12, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Sep 12, 2025 •

edited

Loading

Review failed

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Fix array comparison and add solver logging #11

Fix array comparison and add solver logging #11

Uh oh!

Conversation

tylerbessire commented Sep 12, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Sep 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

tylerbessire commented Sep 12, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Sep 12, 2025 •

edited

Loading