feat: add hierarchical episodic memory #9

tylerbessire · 2025-09-12T10:39:19Z

Summary

- Introduce hierarchical episodic memory buckets for faster retrieval
- Add a consolidation method to merge duplicate episodes
- Document Phase 3 completion in AGENTS.md

Testing

pytest

https://chatgpt.com/codex/tasks/task_e_68c3f49e0fe08322adb65dd63806b8f9

Summary by CodeRabbit

New Features
- Introduced a hierarchical episodic memory index for faster, bucketed retrieval of similar tasks.
- Added automatic consolidation of duplicate episodes to keep memory clean and consistent.
Performance
- Improved candidate suggestion via a two-stage lookup that prioritizes fast, coarse matches.
- Reduced memory growth by organizing episodes into coarse-grained buckets.
Bug Fixes
- Ensured episode removals stay consistent across all indexes.
Documentation
- Updated progress markers to mark Phase 3 completed with passing tests and notes on hierarchical episodic memory.

coderabbitai · 2025-09-12T10:39:26Z

Caution

Review failed

The pull request is closed.

Walkthrough

Adds a hierarchical episodic memory index with bucketed retrieval, consolidation of duplicate episodes, and two-stage candidate program selection. Updates removal and storage to maintain the new index. Rebuilds the hierarchy on load. Documentation reflects Phase 3 completion and notes hierarchical episodic memory.
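The walkthrough does not say how duplicate episodes are detected or merged. As a rough illustration only, the sketch below assumes two episodes are duplicates when they share a task signature and the same program tuple, and that merging keeps the lowest id and sums a success counter. The `Episode` fields and this `consolidate` signature are hypothetical and may not match `arc_solver/neural/episodic.py`.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Episode:
    """Hypothetical episode record; the real class in episodic.py may differ."""
    episode_id: int
    signature: str              # assumed task signature used to detect duplicates
    programs: Tuple[str, ...]   # assumed hashable program representation
    success_count: int = 1


def consolidate(episodes: Dict[int, Episode]) -> Dict[int, Episode]:
    """Merge episodes sharing (signature, programs); keep the lowest id per group."""
    merged: Dict[Tuple[str, Tuple[str, ...]], Episode] = {}
    for ep in sorted(episodes.values(), key=lambda e: e.episode_id):
        key = (ep.signature, ep.programs)
        if key in merged:
            # Duplicate: fold its counter into the surviving episode.
            merged[key].success_count += ep.success_count
        else:
            # Copy so the caller's episodes are not mutated.
            merged[key] = Episode(ep.episode_id, ep.signature, ep.programs, ep.success_count)
    return {ep.episode_id: ep for ep in merged.values()}
```

Returning a new mapping rather than editing in place makes it easier to update every index from one consistent result, which is the consistency concern the summary raises for removals.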

Changes

| Cohort / File(s) | Summary of Changes |
| --- | --- |
| Documentation status updates `AGENTS.md` | Updated MetaCognition Phase 3 markers to completed with concrete dates, test results, and notes indicating hierarchical episodic memory readiness for Phase 4. |
| Episodic memory hierarchy and retrieval `arc_solver/neural/episodic.py` | Introduced `hierarchy_index` for coarse-grained episode bucketing; added `query_hierarchy` and `consolidate` APIs; updated `get_candidate_programs` to prefer hierarchical lookup with fallback to similarity search; ensured removal updates hierarchy; rebuilt hierarchy on load; minor formatting and doc updates. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant DB as EpisodeDatabase
  participant Hier as Hierarchy Index
  participant Vec as Similarity Index

  Caller->>DB: get_candidate_programs(train_pairs)
  DB->>DB: query_hierarchy(train_pairs, τ=0.5, k=5)
  alt bucket has results
    DB->>Hier: Retrieve bucketed episodes
    DB->>DB: Filter by cosine similarity on task features
    note right of DB: Use hierarchical results
  else no bucketed results
    DB->>Vec: query_by_similarity(train_pairs)
    note right of DB: Fallback to prior similarity search
  end
  DB->>Caller: Aggregated candidate programs
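
The two-stage lookup in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the `EpisodeIndex` class, the binarizing `hierarchy_key` function, and the use of plain float vectors as task features are all assumptions. Only the threshold of 0.5, the limit of 5, and the fallback order come from the diagram.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hierarchy_key(features: Vector) -> Tuple[int, ...]:
    """Assumed coarse bucket key: binarize each feature at 0.5."""
    return tuple(int(x >= 0.5) for x in features)


class EpisodeIndex:
    """Hypothetical stand-in for EpisodeDatabase, keeping only feature vectors."""

    def __init__(self) -> None:
        self.features: Dict[int, Vector] = {}
        self.buckets: Dict[Tuple[int, ...], List[int]] = defaultdict(list)

    def add(self, episode_id: int, features: Vector) -> None:
        self.features[episode_id] = features
        self.buckets[hierarchy_key(features)].append(episode_id)

    def query_hierarchy(self, query: Vector, threshold: float = 0.5, k: int = 5) -> List[int]:
        # Stage 1: cheap bucket lookup, then filter by cosine similarity.
        ids = self.buckets.get(hierarchy_key(query), [])
        scored = [(cosine(query, self.features[i]), i) for i in ids]
        hits = sorted((s for s in scored if s[0] >= threshold), reverse=True)
        return [i for _, i in hits[:k]]

    def query_by_similarity(self, query: Vector, k: int = 5) -> List[int]:
        # Stage 2: full scan over every stored episode.
        scored = sorted(((cosine(query, f), i) for i, f in self.features.items()), reverse=True)
        return [i for _, i in scored[:k]]

    def get_candidates(self, query: Vector, k: int = 5) -> List[int]:
        hits = self.query_hierarchy(query, k=k)
        return hits if hits else self.query_by_similarity(query, k=k)
```

The full scan only runs when the bucket path returns nothing, which is what makes the common case fast and also what lets a sparsely populated bucket return very few candidates.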

sequenceDiagram
  autonumber
  actor Trainer
  participant DB as EpisodeDatabase
  participant Hier as Hierarchy Index
  participant Store as Persistent Storage

  Trainer->>DB: store_episode(episode)
  DB->>DB: _hierarchy_key(episode.features)
  DB->>Hier: Insert episode.id into bucket
  DB->>Store: Persist episode (existing flow)

  Trainer->>DB: load()
  DB->>DB: Clear hierarchy_index
  DB->>Store: Load episodes
  loop for each episode
    DB->>DB: _hierarchy_key(...)
    DB->>Hier: Insert episode.id into bucket
  end

  Trainer->>DB: remove_episode(id)
  DB->>Hier: Remove id from bucket
  DB->>Store: Remove episode (existing flow)
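
The store, load, and remove flows above all come down to keeping the bucket map in step with the set of stored episodes. A minimal sketch of that bookkeeping, under the assumption that each episode maps to exactly one hashable bucket key, might look like this. The `HierarchyIndex` class and its method names are hypothetical; the PR describes a `hierarchy_index` attribute and a `_hierarchy_key` helper but does not show their shape here.

```python
from collections import defaultdict
from typing import Dict, Hashable, Set


class HierarchyIndex:
    """Hypothetical bucket index: bucket key -> set of episode ids."""

    def __init__(self) -> None:
        self.buckets: Dict[Hashable, Set[int]] = defaultdict(set)
        self.key_of: Dict[int, Hashable] = {}  # reverse map for cheap removal

    def insert(self, episode_id: int, key: Hashable) -> None:
        self.remove(episode_id)  # re-inserting moves the id instead of duplicating it
        self.buckets[key].add(episode_id)
        self.key_of[episode_id] = key

    def remove(self, episode_id: int) -> None:
        if episode_id not in self.key_of:
            return
        key = self.key_of.pop(episode_id)
        bucket = self.buckets[key]
        bucket.discard(episode_id)
        if not bucket:
            del self.buckets[key]  # drop empty buckets so they do not accumulate

    def rebuild(self, keys_by_id: Dict[int, Hashable]) -> None:
        """Clear and repopulate, mirroring the load() flow in the diagram."""
        self.buckets.clear()
        self.key_of.clear()
        for episode_id, key in keys_by_id.items():
            self.insert(episode_id, key)
```

The reverse map is a design choice for this sketch: it lets removal find the bucket without recomputing the key from episode features, so removal still works if the key function changes between versions.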

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Poem

I burrowed through memories, tier by tier,
Buckets of tales now neatly appear.
If the fast path’s empty, I sniff the old trail,
Consolidate crumbs, no duplicate tale.
With whiskers twitching, tests hop and pass—
Phase 3’s done; on to the next patch of grass! 🐇✨

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 339520d and e2e21e4.

📒 Files selected for processing (2)

AGENTS.md (2 hunks)
arc_solver/neural/episodic.py (7 hunks)

chatgpt-codex-connector

Codex Review: Here are some suggestions.

https://github.com/tylerbessire/PUMA/blob/e2e21e4b160ae6753b4a6b8d3e2b2cf80d48d12d/arc_solver/neural/episodic.py#L286-L298
[P1] Fill candidate pool when hierarchy yields too few matches

The new get_candidate_programs now queries query_hierarchy and only falls back to query_by_similarity when the hierarchical bucket is completely empty. If the bucket contains even one unrelated episode, the method returns only the programs from that small bucket and never supplements them with other episodes, so callers can receive far fewer than max_programs candidates despite the database containing many more useful examples. This is a regression from the previous implementation, which always filled the result list up to the requested size. Consider topping up from query_by_similarity whenever the hierarchical results do not supply enough programs.
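
One way to implement the suggested top-up is sketched below, assuming programs can be compared as hashable values (strings here). The helper name `fill_candidates` and its callable-based fallback argument are hypothetical and not part of the PR; in the real method the two sources would be `query_hierarchy` and `query_by_similarity`.

```python
from typing import Callable, Iterable, List, Set


def fill_candidates(
    hierarchical: Iterable[str],
    similarity_fallback: Callable[[], Iterable[str]],
    max_programs: int,
) -> List[str]:
    """Prefer hierarchical hits, then top up from the similarity search."""
    seen: Set[str] = set()
    result: List[str] = []

    def take(programs: Iterable[str]) -> None:
        for program in programs:
            if len(result) >= max_programs:
                return
            if program not in seen:  # skip programs already selected
                seen.add(program)
                result.append(program)

    take(hierarchical)
    if len(result) < max_programs:
        # Only pay for the slower search when the bucket came up short.
        take(similarity_fallback())
    return result
```

For example, `fill_candidates(["a"], lambda: ["a", "b", "c"], 3)` returns `["a", "b", "c"]`, while a bucket that already supplies `max_programs` entries never invokes the fallback.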

feat: add hierarchical episodic memory

e2e21e4

tylerbessire added the codex label Sep 12, 2025 — with ChatGPT Codex Connector

tylerbessire merged commit 63123e2 into main Sep 12, 2025
2 of 6 checks passed

tylerbessire deleted the codex/complete-phase-3-after-reviewing-agents.md branch September 12, 2025 10:39

chatgpt-codex-connector bot reviewed Sep 12, 2025
