V2 #22

tylerbessire · 2025-09-21T03:18:40Z

Summary by CodeRabbit

New Features
- Added feature-flagged behavioral training with reward telemetry and episodic/neural updates.
- Introduced test-time adaptation flow with CLI evaluation and JSON reports.
- New pattern, relational, and placeholder engines to learn/apply transformations.
- Memory-augmented solver with checkpointing, persona summary, and improved postprocessing.
- Enhanced search with intraverbal scoring and episodic placeholders.
- New glyph extraction and tiling inference; richer human/object reasoning.
- Reinforcement-enabled neural guidance with online updates.
Documentation
- Expanded architecture docs and README with operant training instructions.
Tests
- Added comprehensive tests for behavioral, hypothesis, and relational reasoning.
Chores
- New evaluation tools (terminal/GUI), training/validation utilities, and scripts.
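
The feature-flagged training path mentioned in the summary can be pictured as an environment check that guards the training entry point. The sketch below is only an illustration: the variable name `PUMA_BEHAVIORAL_EXAMPLE` and the helpers `flag_enabled` and `train_if_enabled` are hypothetical, since this PR page does not state which variable gates training.

```python
import os

# Hypothetical flag name; the PR page does not name the real variable.
FLAG_NAME = "PUMA_BEHAVIORAL_EXAMPLE"


def flag_enabled(name: str = FLAG_NAME, env=None) -> bool:
    """Return True when the flag holds a truthy string (assumed convention)."""
    env = os.environ if env is None else env
    return env.get(name, "0").strip().lower() in {"1", "true", "yes", "on"}


def train_if_enabled(train_fn, env=None):
    """Run train_fn only when the flag is on; otherwise return None."""
    if not flag_enabled(env=env):
        return None
    return train_fn()
```

Passing `env` explicitly keeps the check testable without mutating the process environment.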

🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>

- Add PlaceholderCandidateGenerator with multiple reconstruction strategies
- Implement stripe pattern extraction from borders with extended context
- Add advanced mirroring strategies (symmetric, pattern completion, contextual)
- Implement recolor mapping derived from border/neighbor palette analysis
- Extend DSL with placeholder reconstruction operations
- Add PlaceholderTemplateEngine for template detection and application
- Create comprehensive test suite showing 44+ candidate generation strategies
- Integrate with existing HumanGradeReasoner for macro validation and storage

🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
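
As an illustration of the symmetric mirroring strategy listed above, a masked region can be filled from its left-right mirrored counterpart. This is a minimal sketch, assuming grids are lists of lists of ints and that a hypothetical sentinel `PLACEHOLDER = -1` marks the cells to reconstruct. `fill_by_horizontal_mirror` is not a function from this PR.

```python
from typing import List

Grid = List[List[int]]
PLACEHOLDER = -1  # hypothetical sentinel for cells that need reconstruction


def fill_by_horizontal_mirror(grid: Grid) -> Grid:
    """Fill placeholder cells from the mirrored cell in the same row.

    Cells whose mirror is also a placeholder are left untouched, so a
    caller could fall back to another strategy for them.
    """
    out = [row[:] for row in grid]  # copy so the input is not mutated
    for r, row in enumerate(grid):
        width = len(row)
        for c, value in enumerate(row):
            if value == PLACEHOLDER:
                mirrored = row[width - 1 - c]
                if mirrored != PLACEHOLDER:
                    out[r][c] = mirrored
    return out
```

A generator in this style would presumably emit one candidate per strategy and let a downstream validator pick among them.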

- Added checkpoint_path and enable_logging parameters to ARCSolver
- Implemented save_checkpoint/load_checkpoint with automatic saves every 10 tasks
- Added submission_results tracking for progress recovery
- Created logging toggle controlled by ARC_ENABLE_LOGGING environment variable
- All logging statements now respect enable_logging flag to reduce memory usage
- Checkpoint saves task results, stats, and progress metadata
- Supports resuming from checkpoint after memory crashes

🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
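
The checkpoint behavior described above (periodic saves, resuming after a crash, logging gated by `ARC_ENABLE_LOGGING`) might look roughly like the following. This is a sketch under assumptions: the class name `CheckpointedRunner`, the JSON layout, and the accepted truthy values for the environment variable are illustrative and are not taken from `ARCSolver`.

```python
import json
import os
from pathlib import Path


class CheckpointedRunner:
    """Illustrative stand-in for a solver with checkpointing (not the PR's code)."""

    def __init__(self, checkpoint_path, save_every=10, enable_logging=None):
        self.checkpoint_path = Path(checkpoint_path)
        self.save_every = save_every
        if enable_logging is None:
            # Assumed truthy values; the PR only names the variable.
            raw = os.environ.get("ARC_ENABLE_LOGGING", "0")
            enable_logging = raw.lower() in {"1", "true"}
        self.enable_logging = enable_logging
        self.submission_results = {}

    def log(self, message):
        if self.enable_logging:
            print(message)

    def save_checkpoint(self):
        payload = {
            "submission_results": self.submission_results,
            "tasks_done": len(self.submission_results),
        }
        # Write to a temp file first so a crash cannot leave a partial file.
        tmp = self.checkpoint_path.with_suffix(".tmp")
        tmp.write_text(json.dumps(payload))
        os.replace(tmp, self.checkpoint_path)

    def load_checkpoint(self):
        if not self.checkpoint_path.exists():
            return False
        payload = json.loads(self.checkpoint_path.read_text())
        self.submission_results = payload["submission_results"]
        return True

    def run(self, tasks, solve_fn):
        for task_id, task in tasks.items():
            if task_id in self.submission_results:
                continue  # already solved before the restart
            self.submission_results[task_id] = solve_fn(task)
            if len(self.submission_results) % self.save_every == 0:
                self.save_checkpoint()
                self.log(f"checkpoint after {len(self.submission_results)} tasks")
        self.save_checkpoint()
        return self.submission_results
```

After a crash, constructing a new runner with the same path and calling `load_checkpoint()` before `run()` skips tasks that already have results.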

korbit-ai · 2025-09-21T03:18:43Z

You've used up your 5 PR reviews for this month under the Korbit Starter Plan. You'll get 5 more reviews on October 18th, 2025 or you can upgrade to Pro for unlimited PR reviews and enhanced features in your Korbit Console.

coderabbitai · 2025-09-21T03:18:46Z

Caution

Review failed

The pull request is closed.

Walkthrough

Introduces extensive ARC solver expansions: a reinforcement-driven BehavioralEngine, relational (RFT) and pattern modules, placeholder templating, glyph extraction, intraverbal/tacting guidance, continuous self-memory, enhanced search/ranking, solver checkpointing/state, test-time adaptation, evaluation tools/GUI, utilities, tests, model weight updates, and documentation.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Behavioral Engine & RL plumbing: `arc_solver/behavioral_engine.py`, `arc_solver/neural/guidance.py`, `arc_solver/neural/episodic.py`, `arc_solver/intraverbal.py`, `arc_solver/tacting.py`, `arc_solver/utils/metrics.py`, `arc_solver/utils/__init__.py` | Adds feature-flagged BehavioralEngine, reward grading/telemetry, neural online-update and reinforcement, episodic reward tracking and template retrieval, intraverbal chaining, tacting system, and a MovingAverage utility. |
| Relational/Pattern modules: `arc_pattern_engine.py`, `arc_solver/rft.py`, `arc_solver/rft_engine/...` | Introduces ARCPatternEngine with learn/apply rules and solver integration; adds RFT utilities (facts/analyzer) and an RFT inference engine with function hints. |
| Placeholders, reasoning, hypotheses, glyphs, heuristics, objects: `arc_solver/placeholders.py`, `arc_solver/enhanced_search.py`, `arc_solver/human_reasoning.py`, `arc_solver/hypothesis.py`, `arc_solver/glyph_extraction.py`, `enhanced_glyph_extractor.py`, `arc_solver/object_reasoning.py`, `arc_solver/heuristics.py` | Adds placeholder templates/engine and (de)serialization; integrates episodic placeholders and intraverbal scoring in search; expands human and hypothesis engines with relational/template/memory paths and glyph/area/holes/expansion logic; adds glyph extractor and enhanced variant; updates heuristics with tiling; augments object descriptors. |
| Solver core, persistence, and data: `arc_solver/solver.py`, `checkpoint.json`, `episodes.json`, `models/guidance_arc.json` | Extends ARCSolver with memory, placeholders, analytics, postprocessing, coherence scoring, and checkpointing APIs; adds checkpoint and episodic data with reward fields; updates first-layer neural weights. |
| Test-time adaptation & evaluation tooling: `adapt_test_time.py`, `evaluate_first_20.py`, `evaluate_gui.py`, `run_evaluation.sh`, `tools/*` | Adds a test-time adapted solver with evaluation CLI; terminal/GUI evaluators; runner script; dataset validators, subset evaluator, memory builder, and guidance trainer with path resolution and batching. |
| Documentation: `README.md`, `docs/architecture.md`, `docs/functional_contextualist_architecture.md` | Adds operant training section to README; extends architecture docs with Functional Contextualist/RFT components and roadmap. |
| Tests: `tests/test_behavioral_engine.py`, `tests/test_hypothesis_engine.py` | Adds tests for BehavioralEngine, RFTEngine, neural reinforcement, relational facts, and expanded hypothesis/human reasoning paths. |
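
The table mentions a MovingAverage utility without describing it. One common form is a fixed-window mean, sketched below as a guess at the general idea. The class name `WindowedAverage`, the `window` parameter, and the choice of a windowed (rather than exponential) average are assumptions, not details from `arc_solver/utils/metrics.py`.

```python
from collections import deque


class WindowedAverage:
    """Mean over the most recent `window` values (illustrative only)."""

    def __init__(self, window: int = 10):
        if window <= 0:
            raise ValueError("window must be positive")
        self._values = deque(maxlen=window)
        self._total = 0.0

    def update(self, value: float) -> float:
        """Add a value and return the updated mean."""
        if len(self._values) == self._values.maxlen:
            # The oldest value is about to be evicted by append().
            self._total -= self._values[0]
        self._values.append(value)
        self._total += value
        return self.value

    @property
    def value(self) -> float:
        return self._total / len(self._values) if self._values else 0.0
```

A tracker like this could smooth per-task rewards for telemetry, which is presumably why it sits next to the reward-grading code.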

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant BE as BehavioralEngine
  participant DL as Dataset Loaders
  participant ES as EnhancedSearch
  participant RFT as RFTEngine
  participant NG as NeuralGuidance
  participant ER as EpisodicRetrieval
  participant RG as RewardGrader
  Note over BE: Feature flag checked (env)
  User->>BE: train(dataset_path, solutions_path, ...)
  BE->>DL: load_challenges/load_solutions
  loop per task
    BE->>RFT: analyse(train_pairs)
    RFT-->>BE: RFTInference (function_hints)
    BE->>ES: synthesize_enhanced(train_pairs)
    ES-->>BE: candidate programs
    BE->>RG: grade(predictions vs targets, program, behavioural_signal)
    RG-->>BE: RewardBreakdown
    alt best program found
      BE->>NG: reinforce(train_pairs, program, reward, inference)
      BE->>ER: add_successful_solution(..., reward, metadata)
    else no viable
      Note over BE: fallback/no update
    end
  end
  BE-->>User: BehaviouralMetrics (JSON-serializable)
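
The training flow in the diagram above (synthesize candidates, grade them, reinforce only when a best program is found) can be sketched with plain callables. Everything here is hypothetical scaffolding: `grade` uses an assumed reward (the fraction of training pairs reproduced exactly), and `train`, `Metrics`, and the `threshold` parameter are illustrative names rather than the BehavioralEngine API. The RFT analysis step is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Grid = List[List[int]]
Pair = Tuple[Grid, Grid]
Program = Callable[[Grid], Grid]


@dataclass
class Metrics:
    tasks_seen: int = 0
    tasks_reinforced: int = 0
    rewards: List[float] = field(default_factory=list)


def grade(program: Program, pairs: List[Pair]) -> float:
    """Assumed reward: fraction of pairs the program reproduces exactly."""
    if not pairs:
        return 0.0
    hits = sum(1 for inp, out in pairs if program(inp) == out)
    return hits / len(pairs)


def train(
    tasks: Dict[str, List[Pair]],
    synthesize: Callable[[List[Pair]], List[Program]],
    reinforce: Callable[[str, Program, float], None],
    threshold: float = 1.0,
) -> Metrics:
    metrics = Metrics()
    for task_id, pairs in tasks.items():
        metrics.tasks_seen += 1
        best: Optional[Program] = None
        best_reward = 0.0
        for program in synthesize(pairs):
            reward = grade(program, pairs)
            if reward > best_reward:
                best, best_reward = program, reward
        metrics.rewards.append(best_reward)
        if best is not None and best_reward >= threshold:
            # "best program found" branch: update guidance and memory.
            reinforce(task_id, best, best_reward)
            metrics.tasks_reinforced += 1
        # Otherwise: the "no viable" branch, no update.
    return metrics
```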

sequenceDiagram
  autonumber
  actor User
  participant TTA as TestTimeAdaptedSolver
  participant BS as Baseline ARCSolver
  participant TTT as TestTimeTrainer
  participant ES as EnhancedSearch
  User->>TTA: evaluate_with_adaptation(tasks, budget)
  loop per task
    TTA->>ES: initial candidates (search/synthesis)
    ES-->>TTA: programs + scores
    alt adapt enabled and candidates
      TTA->>TTT: adapt(train_pairs, programs)
      TTT-->>TTA: adapted scorers/models
      TTA->>ES: re-score/rank with adapted state
      ES-->>TTA: ranked programs
    else skip adaptation
    end
    alt programs viable
      TTA-->>User: predictions + timings
    else
      TTA->>BS: solve(task)
      BS-->>TTA: fallback predictions
      TTA-->>User: predictions + timings
    end
  end
  TTA-->>User: summary stats (success rates, medians)
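
The adaptation flow in the diagram above reduces to three steps per task: search, optionally re-score with adapted state, and fall back to the baseline when nothing is viable. The sketch below follows that shape under stated assumptions: candidates are `(name, score)` tuples, `min_score` is an invented viability cutoff, and `solve_with_adaptation` and `evaluate` are hypothetical names, not functions from `adapt_test_time.py`.

```python
import statistics
import time
from typing import Callable, Dict, List, Optional, Tuple

Candidate = Tuple[str, float]  # (program name, score), an assumed shape


def solve_with_adaptation(
    task,
    search: Callable[[object], List[Candidate]],
    adapt: Optional[Callable[[object, List[Candidate]], List[Candidate]]],
    baseline: Callable[[object], str],
    min_score: float = 0.5,
) -> Tuple[str, bool]:
    """Return (chosen program, used_fallback)."""
    candidates = search(task)
    if adapt is not None and candidates:
        # Re-score the initial candidates with the adapted state.
        candidates = adapt(task, candidates)
    viable = [c for c in candidates if c[1] >= min_score]
    if viable:
        return max(viable, key=lambda c: c[1])[0], False
    return baseline(task), True


def evaluate(tasks: Dict[str, object], **kwargs) -> Dict[str, float]:
    """Run every task and summarise fallback rate and median time."""
    timings, fallbacks = [], 0
    for task in tasks.values():
        start = time.perf_counter()
        _, used_fallback = solve_with_adaptation(task, **kwargs)
        timings.append(time.perf_counter() - start)
        fallbacks += used_fallback
    return {
        "tasks": len(tasks),
        "fallback_rate": fallbacks / len(tasks) if tasks else 0.0,
        "median_seconds": statistics.median(timings) if timings else 0.0,
    }
```

Keeping the baseline as an explicit callable makes the fallback rate easy to report alongside the timing medians.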

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120+ minutes

Possibly related PRs

feat: integrate neural guidance baseline #19 — Also modifies neural guidance: adds guidance metrics/tooling and tests; overlaps with reinforcement and episodic reward integrations in this PR.

Suggested labels

codex

Poem

I twitch my ears at patterns new,
Tiny grids and colors, too—
Tacts and tales in chains I weave,
Rewards in memory, I believe.
Hop, adapt, and solve on cue—
Checkpoints saved; on to task two! 🐇✨

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f494ffb and d42ec6f.

⛔ Files ignored due to path filters (16)

.DS_Store is excluded by !**/.DS_Store
arc_solver/__pycache__/beam_search.cpython-313.pyc is excluded by !**/*.pyc
arc_solver/__pycache__/canonical.cpython-313.pyc is excluded by !**/*.pyc
arc_solver/__pycache__/dsl.cpython-313.pyc is excluded by !**/*.pyc
arc_solver/__pycache__/enhanced_search.cpython-313.pyc is excluded by !**/*.pyc
arc_solver/__pycache__/grid.cpython-313.pyc is excluded by !**/*.pyc
arc_solver/__pycache__/heuristics.cpython-313.pyc is excluded by !**/*.pyc
arc_solver/__pycache__/hypothesis.cpython-313.pyc is excluded by !**/*.pyc
arc_solver/__pycache__/mcts_search.cpython-313.pyc is excluded by !**/*.pyc
arc_solver/__pycache__/search.cpython-313.pyc is excluded by !**/*.pyc
arc_solver/__pycache__/solver.cpython-313.pyc is excluded by !**/*.pyc
arc_solver/neural/__pycache__/__init__.cpython-313.pyc is excluded by !**/*.pyc
arc_solver/neural/__pycache__/episodic.cpython-313.pyc is excluded by !**/*.pyc
arc_solver/neural/__pycache__/guidance.cpython-313.pyc is excluded by !**/*.pyc
arc_solver/neural/__pycache__/sketches.cpython-313.pyc is excluded by !**/*.pyc
puma-arc-solver-v1.zip is excluded by !**/*.zip

📒 Files selected for processing (37)

README.md (1 hunks)
adapt_test_time.py (1 hunks)
arc_pattern_engine.py (1 hunks)
arc_solver/behavioral_engine.py (1 hunks)
arc_solver/continuous_learning.py (1 hunks)
arc_solver/enhanced_search.py (11 hunks)
arc_solver/glyph_extraction.py (1 hunks)
arc_solver/heuristics.py (3 hunks)
arc_solver/human_reasoning.py (16 hunks)
arc_solver/hypothesis.py (5 hunks)
arc_solver/intraverbal.py (1 hunks)
arc_solver/neural/episodic.py (8 hunks)
arc_solver/neural/guidance.py (8 hunks)
arc_solver/object_reasoning.py (7 hunks)
arc_solver/placeholders.py (1 hunks)
arc_solver/rft.py (1 hunks)
arc_solver/rft_engine/__init__.py (1 hunks)
arc_solver/rft_engine/engine.py (1 hunks)
arc_solver/solver.py (10 hunks)
arc_solver/tacting.py (1 hunks)
arc_solver/utils/__init__.py (1 hunks)
arc_solver/utils/metrics.py (1 hunks)
checkpoint.json (1 hunks)
docs/architecture.md (1 hunks)
docs/functional_contextualist_architecture.md (1 hunks)
enhanced_glyph_extractor.py (1 hunks)
episodes.json (16 hunks)
evaluate_first_20.py (1 hunks)
evaluate_gui.py (1 hunks)
models/guidance_arc.json (1 hunks)
run_evaluation.sh (1 hunks)
tests/test_behavioral_engine.py (1 hunks)
tests/test_hypothesis_engine.py (2 hunks)
tools/build_memory.py (3 hunks)
tools/evaluate_subset.py (1 hunks)
tools/train_guidance.py (5 hunks)
tools/validate_training_data.py (1 hunks)

tylerbessire and others added 14 commits September 19, 2025 09:43

Sync local changes to V2 branch

04af98f

🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>

Add functional contextualist architecture roadmap [S:DOC v1] pass [S:…

c8a9b97

…DOC-LINK v1] pass

Merge pull request #20 from tylerbessire/codex/integrate-behavioral-a…

3971000

…nalysis-into-puma-solver

Add behavioral reinforcement engine with RFT integration [S:BEHAVIOR …

7024ac9

…v1] pass

Merge branch 'V2' into codex/integrate-behavioral-analysis-into-puma-…

d675228

…solver

Merge pull request #21 from tylerbessire/codex/integrate-behavioral-a…

52c6e46

…nalysis-into-puma-solver Implement behavioral reinforcement engine with RFT integration

Improve neural guidance training stability

22cc19e

Add tacting and intraverbal modules

299850b

Write guidance model as JSON and refresh weights

0161f05

Resolve training tools paths relative to project root

01e0800

Enhance mini adaptation evaluation workflow

f699255

codex

dcba83a

tylerbessire merged commit 40104f3 into main Sep 21, 2025
2 of 6 checks passed

tylerbessire deleted the V2 branch September 21, 2025 03:18

tylerbessire restored the V2 branch September 21, 2025 03:20
