@tylerbessire (Owner) commented Sep 13, 2025

Summary

  • standardize recolor mapping parameter across heuristics and search
  • normalize DSL caching and episodic loading for recolor mappings
  • add regression tests for recolor mapping and episode serialization

Testing

  • pytest tests -q
  • python tools/train_guidance_on_arc.py --epochs 1

https://chatgpt.com/codex/tasks/task_e_68c489916a30832286cc8ff5ea78d65c

Summary by CodeRabbit

  • Documentation

    • Added update noting recolor parameter standardization and successful tests.
  • Refactor

    • Standardized recolor parameter name to “mapping” across the app; legacy “color_map” still accepted during data load.
  • Bug Fixes

    • Fixed caching for operations with dict parameters, improving determinism and performance.
    • Normalized recolor mappings during episode load to ensure integer keys/values and prevent training/serialization issues.
  • Tests

    • Added comprehensive tests for recolor detection and episode round‑trip serialization.
    • Adjusted a beam search test target to maintain expected behavior.
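One likely reason the loader has to coerce recolor mappings back to integers is that JSON object keys are always strings. The sketch below is not code from this PR: it assumes episodes are serialized through something JSON-like, and the op tuple shape is hypothetical.

```python
import json

# Hypothetical recolor op as a heuristic might emit it: an int -> int mapping.
op = ("recolor", {"mapping": {1: 2, 3: 4}})

# JSON object keys are always strings, so a round trip changes the key type.
restored_name, restored_params = json.loads(json.dumps(op))
raw_mapping = restored_params["mapping"]

# Coerce keys and values back to int, as the episode loader is described as doing.
normalized = {int(k): int(v) for k, v in raw_mapping.items()}
```

After the round trip, `raw_mapping` has string keys and would no longer match integer cell colors; `normalized` restores the original `{1: 2, 3: 4}`.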

coderabbitai bot commented Sep 13, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

Refactors recolor parameter naming from color_map to mapping across generation, heuristics, and signatures; adds episodic loader normalization to accept both keys; introduces parameter normalization for semantic cache keys in DSL; updates tests and adds a new recolor round‑trip test; adds a documentation note about the fix.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Docs update: AGENTS.md | Adds Step 4.3 UPDATE documenting recolor parameter standardization and episodic loader key normalization; notes test coverage and date. |
| DSL cache key normalization: arc_solver/dsl.py | Adds _norm_params to normalize dict-valued params for hashing; apply_op now uses normalized params for cache keys. |
| Recolor API resolution + signatures: arc_solver/dsl_complete.py | recolor gains optional mapping param; resolves effective mapping from mapping or color_map; get_operation_signatures switches recolor entry to ['mapping']. |
| Search parameter generation: arc_solver/enhanced_search_complete.py | Renames generated recolor param key from color_map to mapping in _generate_comprehensive_parameters. |
| Heuristics parameter key rename: arc_solver/heuristics_complete.py | Updates recolor invocations and emitted programs to use {"mapping": ...} instead of {"color_map": ...}. |
| Episodic program normalization: arc_solver/neural/episodic.py | from_dict rebuilds programs; for recolor, reads mapping from "mapping" or "color_map" and normalizes to int→int under "mapping". |
| Tests: tests/test_beam_search.py, tests/test_recolor_fix.py | Adjusts a target array in beam search test; adds new recolor tests validating heuristics output and Episode to_dict/from_dict round‑trip with integer mappings. |
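The recolor API resolution described for arc_solver/dsl_complete.py could look roughly like the following. This is a minimal sketch, not the code in the PR: the grid type (nested lists rather than arrays), the exact signature, and the rule that `mapping` wins when both names are passed are all assumptions.

```python
from typing import Dict, List, Optional

Grid = List[List[int]]  # assumed grid representation for this sketch


def recolor(
    grid: Grid,
    mapping: Optional[Dict[int, int]] = None,
    color_map: Optional[Dict[int, int]] = None,
) -> Grid:
    """Recolor cells; `mapping` is the standard name, `color_map` the legacy one."""
    # Prefer the new parameter and fall back to the legacy one (assumed precedence).
    effective = mapping if mapping is not None else color_map
    if effective is None:
        effective = {}
    # Colors absent from the mapping are left unchanged.
    return [[effective.get(cell, cell) for cell in row] for row in grid]
```

Under these assumptions, callers that still pass `color_map=` keep working while new code uses `mapping=`.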

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Caller
  participant DSL as DSL.apply_op
  participant Cache as _sem_cache
  Note over DSL,Cache: New: normalized parameter key (_norm_params)
  Caller->>DSL: apply_op(a, name, params)
  DSL->>DSL: key = (name, _norm_params(params))
  DSL->>Cache: check key
  alt cache hit
    Cache-->>DSL: result
    DSL-->>Caller: result
  else cache miss
    DSL->>DSL: compute op(name, params)
    DSL->>Cache: store key → result
    DSL-->>Caller: result
  end
sequenceDiagram
  autonumber
  participant Loader as Episode.from_dict
  participant Data as input dict
  participant Prog as Program builder
  Note over Loader,Prog: Normalizes recolor mapping from "mapping" or "color_map"
  Loader->>Data: read programs[]
  loop per program
    Data-->>Loader: ops[]
    loop per op
      alt op == "recolor"
        Loader->>Loader: m = op["mapping"] or op["color_map"]
        Loader->>Loader: normalize keys/values to int
        Loader->>Prog: append op {"mapping": m}
      else other ops
        Loader->>Prog: append op as-is
      end
    end
  end
  Loader-->>Prog: build Episode(programs=...)
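The per-op normalization in the loader diagram above might be written along these lines. This is a sketch under assumptions: `normalize_program` is a hypothetical helper standing in for the relevant part of `Episode.from_dict`, ops are assumed to be `[name, params]` pairs, and `rotate` with its `k` parameter is only a placeholder for a non-recolor op.

```python
from typing import Any, Dict, List, Tuple

Op = Tuple[str, Dict[str, Any]]


def normalize_program(raw_ops: List[List[Any]]) -> List[Op]:
    """Rebuild a program, standardizing recolor params to {"mapping": {int: int}}."""
    program: List[Op] = []
    for name, params in raw_ops:
        if name == "recolor":
            # Accept the new key first, then the legacy one.
            raw = params.get("mapping", params.get("color_map", {}))
            # Coerce keys and values to int in case they arrived as strings.
            params = {"mapping": {int(k): int(v) for k, v in raw.items()}}
        program.append((name, params))
    return program


loaded = normalize_program([
    ["recolor", {"color_map": {"1": "2", "3": 4}}],
    ["rotate", {"k": 1}],
])
```

Here the legacy `color_map` entry comes back as `("recolor", {"mapping": {1: 2, 3: 4}})`, and the other op passes through unchanged.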

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I hopped through maps from color_map to mapping,
Keys aligned, no more frantic tapping.
Caches hum with tidy params in tow,
Episodes learn the paths they know.
Tests nibble greens—round trips done!
Carrot commits: another one. 🥕✨


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fba8911 and 02b3608.

📒 Files selected for processing (8)
  • AGENTS.md (1 hunks)
  • arc_solver/dsl.py (1 hunks)
  • arc_solver/dsl_complete.py (2 hunks)
  • arc_solver/enhanced_search_complete.py (1 hunks)
  • arc_solver/heuristics_complete.py (2 hunks)
  • arc_solver/neural/episodic.py (1 hunks)
  • tests/test_beam_search.py (1 hunks)
  • tests/test_recolor_fix.py (1 hunks)


@tylerbessire merged commit cb3a2b9 into main on Sep 13, 2025
2 of 6 checks passed
@tylerbessire deleted the codex/outline-arc-training-and-evaluation-process branch on September 13, 2025 01:22
@tylerbessire restored the codex/outline-arc-training-and-evaluation-process branch on September 13, 2025 05:10
@tylerbessire deleted the codex/outline-arc-training-and-evaluation-process branch on September 13, 2025 05:10