@tylerbessire tylerbessire commented Sep 21, 2025

Summary by CodeRabbit

  • New Features

    • Added feature-flagged behavioral training with reward telemetry and episodic/neural updates.
    • Introduced test-time adaptation flow with CLI evaluation and JSON reports.
    • New pattern, relational, and placeholder engines to learn/apply transformations.
    • Memory-augmented solver with checkpointing, persona summary, and improved postprocessing.
    • Enhanced search with intraverbal scoring and episodic placeholders.
    • New glyph extraction and tiling inference; richer human/object reasoning.
    • Reinforcement-enabled neural guidance with online updates.
  • Documentation

    • Expanded architecture docs and README with operant training instructions.
  • Tests

    • Added comprehensive tests for behavioral, hypothesis, and relational reasoning.
  • Chores

    • New evaluation tools (terminal/GUI), training/validation utilities, and scripts.

tylerbessire and others added 14 commits September 19, 2025 09:43
🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…nalysis-into-puma-solver

Implement behavioral reinforcement engine with RFT integration
- Add PlaceholderCandidateGenerator with multiple reconstruction strategies
- Implement stripe pattern extraction from borders with extended context
- Add advanced mirroring strategies (symmetric, pattern completion, contextual)
- Implement recolor mapping derived from border/neighbor palette analysis
- Extend DSL with placeholder reconstruction operations
- Add PlaceholderTemplateEngine for template detection and application
- Create comprehensive test suite showing 44+ candidate generation strategies
- Integrate with existing HumanGradeReasoner for macro validation and storage

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
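As a rough illustration of the symmetric mirroring strategy listed in the commit above, the sketch below fills placeholder cells from their left-right mirrored counterparts. The function name `fill_by_horizontal_mirror` and the `PLACEHOLDER` marker value are hypothetical; the API of `PlaceholderCandidateGenerator` is not shown in this conversation, so this should not be read as its implementation.

```python
from typing import List

Grid = List[List[int]]

PLACEHOLDER = -1  # hypothetical marker for cells that need reconstruction


def fill_by_horizontal_mirror(grid: Grid, placeholder: int = PLACEHOLDER) -> Grid:
    """Return a copy of `grid` where each placeholder cell takes the value of
    its left-right mirrored cell. Cells whose mirror is also a placeholder
    are left unchanged."""
    result = [row[:] for row in grid]
    for r, row in enumerate(grid):
        width = len(row)
        for c, value in enumerate(row):
            if value != placeholder:
                continue
            mirrored = row[width - 1 - c]
            if mirrored != placeholder:
                result[r][c] = mirrored
    return result


example = [
    [1, 2, -1, 1],
    [3, -1, 4, 3],
]
filled = fill_by_horizontal_mirror(example)
```

A real candidate generator would presumably emit several such reconstructions (vertical mirror, stripe continuation, recolor mapping) and let the search rank them; this sketch covers only one.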
- Added checkpoint_path and enable_logging parameters to ARCSolver
- Implemented save_checkpoint/load_checkpoint with automatic saves every 10 tasks
- Added submission_results tracking for progress recovery
- Created logging toggle controlled by ARC_ENABLE_LOGGING environment variable
- All logging statements now respect enable_logging flag to reduce memory usage
- Checkpoint saves task results, stats, and progress metadata
- Supports resuming from checkpoint after memory crashes

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
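The checkpointing behaviour described in this commit could look roughly like the following minimal sketch. The class `CheckpointedRunner`, its `record` method, the JSON layout, and the accepted values of `ARC_ENABLE_LOGGING` are all assumptions made for illustration; only the parameter names `checkpoint_path` and `enable_logging`, the method names `save_checkpoint`/`load_checkpoint`, the `submission_results` field, and the save interval of 10 tasks come from the commit message.

```python
import json
import os
import tempfile
from typing import Any, Dict, Optional


class CheckpointedRunner:
    """Hypothetical stand-in for the solver's checkpointing behaviour."""

    def __init__(self, checkpoint_path: str, save_every: int = 10,
                 enable_logging: Optional[bool] = None) -> None:
        self.checkpoint_path = checkpoint_path
        self.save_every = save_every
        if enable_logging is None:
            # Assumption: "1", "true" or "yes" switches logging on.
            flag = os.environ.get("ARC_ENABLE_LOGGING", "").lower()
            enable_logging = flag in {"1", "true", "yes"}
        self.enable_logging = enable_logging
        self.submission_results: Dict[str, Any] = {}

    def log(self, message: str) -> None:
        # Every log statement goes through the toggle.
        if self.enable_logging:
            print(message)

    def save_checkpoint(self) -> None:
        payload = {
            "submission_results": self.submission_results,
            "tasks_processed": len(self.submission_results),
        }
        # Write to a temporary file first so a crash cannot leave a
        # half-written checkpoint behind.
        directory = os.path.dirname(os.path.abspath(self.checkpoint_path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w") as handle:
            json.dump(payload, handle)
        os.replace(tmp_path, self.checkpoint_path)

    def load_checkpoint(self) -> bool:
        if not os.path.exists(self.checkpoint_path):
            return False
        with open(self.checkpoint_path) as handle:
            payload = json.load(handle)
        self.submission_results = payload.get("submission_results", {})
        return True

    def record(self, task_id: str, result: Any) -> None:
        self.submission_results[task_id] = result
        if len(self.submission_results) % self.save_every == 0:
            self.save_checkpoint()
            self.log(f"checkpoint saved after {len(self.submission_results)} tasks")
```

After a crash, a new runner pointed at the same path can call `load_checkpoint()` and skip any task whose ID is already in `submission_results`.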

korbit-ai bot commented Sep 21, 2025

You've used up your 5 PR reviews for this month under the Korbit Starter Plan. You'll get 5 more reviews on October 18th, 2025 or you can upgrade to Pro for unlimited PR reviews and enhanced features in your Korbit Console.


coderabbitai bot commented Sep 21, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

Introduces extensive ARC solver expansions: a reinforcement-driven BehavioralEngine, relational (RFT) and pattern modules, placeholder templating, glyph extraction, intraverbal/tacting guidance, continuous self-memory, enhanced search/ranking, solver checkpointing/state, test-time adaptation, evaluation tools/GUI, utilities, tests, model weight updates, and documentation.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Behavioral Engine & RL plumbing: `arc_solver/behavioral_engine.py`, `arc_solver/neural/guidance.py`, `arc_solver/neural/episodic.py`, `arc_solver/intraverbal.py`, `arc_solver/tacting.py`, `arc_solver/utils/metrics.py`, `arc_solver/utils/__init__.py` | Adds feature-flagged BehavioralEngine, reward grading/telemetry, neural online-update and reinforcement, episodic reward tracking and template retrieval, intraverbal chaining, tacting system, and a MovingAverage utility. |
| Relational/Pattern modules: `arc_pattern_engine.py`, `arc_solver/rft.py`, `arc_solver/rft_engine/...` | Introduces ARCPatternEngine with learn/apply rules and solver integration; adds RFT utilities (facts/analyzer) and an RFT inference engine with function hints. |
| Placeholders, reasoning, hypotheses, glyphs, heuristics, objects: `arc_solver/placeholders.py`, `arc_solver/enhanced_search.py`, `arc_solver/human_reasoning.py`, `arc_solver/hypothesis.py`, `arc_solver/glyph_extraction.py`, `enhanced_glyph_extractor.py`, `arc_solver/object_reasoning.py`, `arc_solver/heuristics.py` | Adds placeholder templates/engine and (de)serialization; integrates episodic placeholders and intraverbal scoring in search; expands human and hypothesis engines with relational/template/memory paths and glyph/area/holes/expansion logic; adds glyph extractor and enhanced variant; updates heuristics with tiling; augments object descriptors. |
| Solver core, persistence, and data: `arc_solver/solver.py`, `checkpoint.json`, `episodes.json`, `models/guidance_arc.json` | Extends ARCSolver with memory, placeholders, analytics, postprocessing, coherence scoring, and checkpointing APIs; adds checkpoint and episodic data with reward fields; updates first-layer neural weights. |
| Test-time adaptation & evaluation tooling: `adapt_test_time.py`, `evaluate_first_20.py`, `evaluate_gui.py`, `run_evaluation.sh`, `tools/*` | Adds a test-time adapted solver with evaluation CLI; terminal/GUI evaluators; runner script; dataset validators, subset evaluator, memory builder, and guidance trainer with path resolution and batching. |
| Documentation: `README.md`, `docs/architecture.md`, `docs/functional_contextualist_architecture.md` | Adds operant training section to README; extends architecture docs with Functional Contextualist/RFT components and roadmap. |
| Tests: `tests/test_behavioral_engine.py`, `tests/test_hypothesis_engine.py` | Adds tests for BehavioralEngine, RFTEngine, neural reinforcement, relational facts, and expanded hypothesis/human reasoning paths. |
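The summary mentions a MovingAverage utility in `arc_solver/utils/metrics.py` without showing it. As a hedged sketch of what such a helper might do when smoothing reward telemetry, here is a fixed-window running mean. The class name `WindowedAverage`, its `add` method, and the windowed (rather than exponential) semantics are assumptions, not the PR's code.

```python
from collections import deque
from typing import Deque


class WindowedAverage:
    """Mean of the most recent `window` values (assumed semantics)."""

    def __init__(self, window: int = 100) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        self._values: Deque[float] = deque(maxlen=window)
        self._total = 0.0

    def add(self, value: float) -> float:
        """Record a value and return the updated average."""
        if len(self._values) == self._values.maxlen:
            # The oldest value is about to be evicted by append().
            self._total -= self._values[0]
        self._values.append(value)
        self._total += value
        return self.value

    @property
    def value(self) -> float:
        return self._total / len(self._values) if self._values else 0.0


rewards = WindowedAverage(window=3)
for reward in (1.0, 2.0, 3.0, 4.0):
    rewards.add(reward)
```

Keeping a running total makes each update constant-time, at the cost of slow floating-point drift over very long runs.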

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant BE as BehavioralEngine
  participant DL as Dataset Loaders
  participant ES as EnhancedSearch
  participant RFT as RFTEngine
  participant NG as NeuralGuidance
  participant ER as EpisodicRetrieval
  participant RG as RewardGrader
  Note over BE: Feature flag checked (env)
  User->>BE: train(dataset_path, solutions_path, ...)
  BE->>DL: load_challenges/load_solutions
  loop per task
    BE->>RFT: analyse(train_pairs)
    RFT-->>BE: RFTInference (function_hints)
    BE->>ES: synthesize_enhanced(train_pairs)
    ES-->>BE: candidate programs
    BE->>RG: grade(predictions vs targets, program, behavioural_signal)
    RG-->>BE: RewardBreakdown
    alt best program found
      BE->>NG: reinforce(train_pairs, program, reward, inference)
      BE->>ER: add_successful_solution(..., reward, metadata)
    else no viable
      Note over BE: fallback/no update
    end
  end
  BE-->>User: BehaviouralMetrics (JSON-serializable)
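The training flow in the diagram above can be sketched in plain Python as follows. Everything here is illustrative: the flag name `EXAMPLE_BEHAVIORAL_FLAG`, the `Metrics` fields, the exact-match `grade` function, and the rule that a reward above zero counts as "best program found" are assumptions, since the conversation does not show the real `BehavioralEngine` signatures or the reward formula.

```python
import os
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence, Tuple

Grid = List[List[int]]
Pair = Tuple[Grid, Grid]
Program = Callable[[Grid], Grid]


@dataclass
class Metrics:
    tasks_seen: int = 0
    tasks_reinforced: int = 0
    rewards: List[float] = field(default_factory=list)


def grade(program: Program, pairs: Sequence[Pair]) -> float:
    """Fraction of training pairs the program reproduces exactly."""
    if not pairs:
        return 0.0
    hits = 0
    for grid_in, grid_out in pairs:
        try:
            if program(grid_in) == grid_out:
                hits += 1
        except Exception:
            pass  # a crashing candidate simply earns no credit
    return hits / len(pairs)


def train(tasks: Dict[str, List[Pair]],
          synthesize: Callable[[Sequence[Pair]], List[Program]],
          reinforce: Callable[[str, Program, float], None],
          flag_name: str = "EXAMPLE_BEHAVIORAL_FLAG") -> Metrics:
    # Feature flag check (hypothetical variable name).
    if os.environ.get(flag_name) != "1":
        raise RuntimeError(f"set {flag_name}=1 to enable training")
    metrics = Metrics()
    for task_id, pairs in tasks.items():
        metrics.tasks_seen += 1
        scored = [(grade(p, pairs), p) for p in synthesize(pairs)]
        if not scored:
            continue  # no candidates: nothing to update
        reward, best = max(scored, key=lambda item: item[0])
        metrics.rewards.append(reward)
        if reward > 0.0:
            # Stand-in for the guidance/episodic updates in the diagram.
            reinforce(task_id, best, reward)
            metrics.tasks_reinforced += 1
    return metrics
```

The `synthesize` and `reinforce` callables stand in for the search and memory components, so the loop can be exercised with simple lambdas before wiring in real ones.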
sequenceDiagram
  autonumber
  actor User
  participant TTA as TestTimeAdaptedSolver
  participant BS as Baseline ARCSolver
  participant TTT as TestTimeTrainer
  participant ES as EnhancedSearch
  User->>TTA: evaluate_with_adaptation(tasks, budget)
  loop per task
    TTA->>ES: initial candidates (search/synthesis)
    ES-->>TTA: programs + scores
    alt adapt enabled and candidates
      TTA->>TTT: adapt(train_pairs, programs)
      TTT-->>TTA: adapted scorers/models
      TTA->>ES: re-score/rank with adapted state
      ES-->>TTA: ranked programs
    else skip adaptation
    end
    alt programs viable
      TTA-->>User: predictions + timings
    else
      TTA->>BS: solve(task)
      BS-->>TTA: fallback predictions
      TTA-->>User: predictions + timings
    end
  end
  TTA-->>User: summary stats (success rates, medians)
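A minimal sketch of the adapt-then-fall-back flow in the second diagram, under stated assumptions: "adaptation" is reduced here to re-ranking candidates by their fit to the task's training pairs, a program counts as viable only if it reproduces every training pair, and the names `solve_task` and `evaluate` are hypothetical rather than taken from `adapt_test_time.py`.

```python
import statistics
import time
from typing import Callable, Dict, List, Sequence, Tuple

Grid = List[List[int]]
Pair = Tuple[Grid, Grid]
Program = Callable[[Grid], Grid]
Task = Tuple[List[Pair], Grid, Grid]  # (train pairs, test input, expected output)


def train_accuracy(program: Program, pairs: Sequence[Pair]) -> float:
    """Fraction of training pairs reproduced exactly; errors count as misses."""
    if not pairs:
        return 0.0
    hits = 0
    for grid_in, grid_out in pairs:
        try:
            if program(grid_in) == grid_out:
                hits += 1
        except Exception:
            pass
    return hits / len(pairs)


def solve_task(train_pairs: Sequence[Pair], test_input: Grid,
               candidates: Sequence[Program], baseline: Program,
               adapt: bool = True) -> Tuple[Grid, str]:
    ranked = list(candidates)
    if adapt and ranked:
        # Stand-in for test-time training: re-rank by fit to this task.
        ranked.sort(key=lambda p: train_accuracy(p, train_pairs), reverse=True)
    viable = [p for p in ranked if train_accuracy(p, train_pairs) == 1.0]
    if viable:
        return viable[0](test_input), "adapted"
    # No viable program: defer to the baseline solver.
    return baseline(test_input), "fallback"


def evaluate(tasks: Sequence[Task], candidates: Sequence[Program],
             baseline: Program) -> Dict[str, float]:
    timings: List[float] = []
    solved = 0
    for train_pairs, test_input, expected in tasks:
        start = time.perf_counter()
        prediction, _ = solve_task(train_pairs, test_input, candidates, baseline)
        timings.append(time.perf_counter() - start)
        solved += int(prediction == expected)
    return {
        "success_rate": solved / len(tasks) if tasks else 0.0,
        "median_seconds": statistics.median(timings) if timings else 0.0,
    }
```

Reporting the median rather than the mean keeps a single slow task from dominating the timing summary.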

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120+ minutes

Possibly related PRs

Suggested labels

codex

Poem

I twitch my ears at patterns new,
Tiny grids and colors, too—
Tacts and tales in chains I weave,
Rewards in memory, I believe.
Hop, adapt, and solve on cue—
Checkpoints saved; on to task two! 🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f494ffb and d42ec6f.

⛔ Files ignored due to path filters (16)
  • .DS_Store is excluded by !**/.DS_Store
  • arc_solver/__pycache__/beam_search.cpython-313.pyc is excluded by !**/*.pyc
  • arc_solver/__pycache__/canonical.cpython-313.pyc is excluded by !**/*.pyc
  • arc_solver/__pycache__/dsl.cpython-313.pyc is excluded by !**/*.pyc
  • arc_solver/__pycache__/enhanced_search.cpython-313.pyc is excluded by !**/*.pyc
  • arc_solver/__pycache__/grid.cpython-313.pyc is excluded by !**/*.pyc
  • arc_solver/__pycache__/heuristics.cpython-313.pyc is excluded by !**/*.pyc
  • arc_solver/__pycache__/hypothesis.cpython-313.pyc is excluded by !**/*.pyc
  • arc_solver/__pycache__/mcts_search.cpython-313.pyc is excluded by !**/*.pyc
  • arc_solver/__pycache__/search.cpython-313.pyc is excluded by !**/*.pyc
  • arc_solver/__pycache__/solver.cpython-313.pyc is excluded by !**/*.pyc
  • arc_solver/neural/__pycache__/__init__.cpython-313.pyc is excluded by !**/*.pyc
  • arc_solver/neural/__pycache__/episodic.cpython-313.pyc is excluded by !**/*.pyc
  • arc_solver/neural/__pycache__/guidance.cpython-313.pyc is excluded by !**/*.pyc
  • arc_solver/neural/__pycache__/sketches.cpython-313.pyc is excluded by !**/*.pyc
  • puma-arc-solver-v1.zip is excluded by !**/*.zip
📒 Files selected for processing (37)
  • README.md (1 hunks)
  • adapt_test_time.py (1 hunks)
  • arc_pattern_engine.py (1 hunks)
  • arc_solver/behavioral_engine.py (1 hunks)
  • arc_solver/continuous_learning.py (1 hunks)
  • arc_solver/enhanced_search.py (11 hunks)
  • arc_solver/glyph_extraction.py (1 hunks)
  • arc_solver/heuristics.py (3 hunks)
  • arc_solver/human_reasoning.py (16 hunks)
  • arc_solver/hypothesis.py (5 hunks)
  • arc_solver/intraverbal.py (1 hunks)
  • arc_solver/neural/episodic.py (8 hunks)
  • arc_solver/neural/guidance.py (8 hunks)
  • arc_solver/object_reasoning.py (7 hunks)
  • arc_solver/placeholders.py (1 hunks)
  • arc_solver/rft.py (1 hunks)
  • arc_solver/rft_engine/__init__.py (1 hunks)
  • arc_solver/rft_engine/engine.py (1 hunks)
  • arc_solver/solver.py (10 hunks)
  • arc_solver/tacting.py (1 hunks)
  • arc_solver/utils/__init__.py (1 hunks)
  • arc_solver/utils/metrics.py (1 hunks)
  • checkpoint.json (1 hunks)
  • docs/architecture.md (1 hunks)
  • docs/functional_contextualist_architecture.md (1 hunks)
  • enhanced_glyph_extractor.py (1 hunks)
  • episodes.json (16 hunks)
  • evaluate_first_20.py (1 hunks)
  • evaluate_gui.py (1 hunks)
  • models/guidance_arc.json (1 hunks)
  • run_evaluation.sh (1 hunks)
  • tests/test_behavioral_engine.py (1 hunks)
  • tests/test_hypothesis_engine.py (2 hunks)
  • tools/build_memory.py (3 hunks)
  • tools/evaluate_subset.py (1 hunks)
  • tools/train_guidance.py (5 hunks)
  • tools/validate_training_data.py (1 hunks)


@tylerbessire tylerbessire merged commit 40104f3 into main Sep 21, 2025
2 of 6 checks passed
@tylerbessire tylerbessire deleted the V2 branch September 21, 2025 03:18
@tylerbessire tylerbessire restored the V2 branch September 21, 2025 03:20