feat(alaya): lay the groundwork for standalone short-term memory planner/query #1216

Open
freezinlove wants to merge 4 commits into moeru-ai:main from freezinlove:alaya


freezinlove commented Mar 9, 2026

Refs #879

Context

This PR is a first implementation step toward the Alaya direction discussed for AIRI's memory system.

My intention here was not to model Alaya as a thin search/save/update/forget wrapper around vectors, but to start shaping it as a decoupled memory layer with room for:

  • planner / filter style ingestion
  • query-time recall assembly
  • multiple memory spaces
  • separable storage backends
  • later evolution toward richer agent memory, retention, and self-updating behavior

So this PR is intentionally a groundwork PR: it implements a usable first slice, while trying to stay aligned with the broader architecture discussed around Alaya.

What this PR includes

1. Standalone packages/memory-alaya

This PR adds packages/memory-alaya as a standalone package instead of baking memory logic directly into the app layer.

The package is organized around contracts, ports, engines, and use-cases so that memory orchestration is separated from:

  • app UI
  • concrete storage
  • concrete LLM providers
  • concrete embedding providers
  • chat-specific wiring

The current shape is closer to a memory orchestration layer than a simple search driver.
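
As a rough illustration of the contracts / ports / engines split described above, here is a minimal sketch. Every name in it (`MemoryRecord`, `ShortTermMemoryReaderPort`, `createQueryEngine`, `createInMemoryReader`) is hypothetical and simplified for illustration; it is not the package's actual interface, and the port is synchronous only to keep the example short.

```typescript
// Contract: a plain data shape shared across layers (hypothetical, simplified).
interface MemoryRecord {
  id: string
  workspaceId: string
  summary: string
}

// Port: what the engine needs from storage, with no mention of IndexedDB.
// A real port would most likely return a Promise.
interface ShortTermMemoryReaderPort {
  listActive: (workspaceId: string) => MemoryRecord[]
}

// Engine: orchestration that depends only on ports.
function createQueryEngine(deps: { reader: ShortTermMemoryReaderPort }) {
  return {
    execute(input: { workspaceId: string, query: string }): MemoryRecord[] {
      const records = deps.reader.listActive(input.workspaceId)
      const needle = input.query.toLowerCase()
      // Placeholder relevance rule: case-insensitive substring match.
      return records.filter(record => record.summary.toLowerCase().includes(needle))
    },
  }
}

// Adapter: an in-memory reader standing in for a concrete backend.
function createInMemoryReader(records: MemoryRecord[]): ShortTermMemoryReaderPort {
  return {
    listActive: workspaceId => records.filter(record => record.workspaceId === workspaceId),
  }
}
```

The point of the layering is that the engine can be exercised in tests with the in-memory adapter, while the app layer supplies the real one.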

2. First short-term memory pipeline

This PR implements the first MVP pipeline for short-term memory (STM):

  • collect workspace conversation turns
  • trigger planner periodically by round threshold
  • run planner extraction
  • write selected STM entries
  • generate/store embeddings for STM entries
  • run query-time recall from STM
  • assemble selected memory into prompt context

At this stage, the implemented memory scope is short-term memory only.
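
The "trigger planner periodically by round threshold" step can be sketched as below. This is only an assumption about how such a trigger might work: the `Turn`, `PlannerCheckpoint`, and `TriggerPolicy` shapes, the function names, and the rule that one assistant reply counts as one round are all hypothetical.

```typescript
interface Turn {
  index: number
  role: 'user' | 'assistant'
}

interface PlannerCheckpoint {
  // Index of the last conversation turn the planner has already processed.
  lastProcessedTurn: number
}

interface TriggerPolicy {
  // Number of completed rounds required before the planner runs again.
  roundThreshold: number
}

// Assumption: one assistant reply after the checkpoint closes one round.
function countPendingRounds(turns: Turn[], checkpoint: PlannerCheckpoint): number {
  return turns.filter(
    turn => turn.index > checkpoint.lastProcessedTurn && turn.role === 'assistant',
  ).length
}

function shouldTriggerByRounds(
  turns: Turn[],
  checkpoint: PlannerCheckpoint,
  policy: TriggerPolicy,
): boolean {
  return countPendingRounds(turns, checkpoint) >= policy.roundThreshold
}
```

Keeping a checkpoint rather than a running counter means the decision can be recomputed from stored turns after a reload.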

3. Planner-first extraction flow

The planner is designed in a planner/filter style rather than as a raw storage API.

Current behavior:

  • planner reads conversation batches from workspace memory
  • planner is LLM-first
  • planner uses a dedicated system prompt for extraction
  • only memories judged worth storing are emitted
  • extracted entries are written as structured STM records
  • embeddings are generated after memory extraction
  • planner runtime is separated from chat runtime configuration

This is intended as a first step toward the more job-oriented / filter-oriented memory pipeline discussed in the design thread.
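
To make the "only memories judged worth storing are emitted" behavior concrete, here is a minimal sketch of a filter applied to planner LLM output. It assumes the LLM returns a JSON array and that each candidate carries an `importance` score between 0 and 1; both the output format and the `ExtractedCandidate` fields are hypothetical, and the real package validates its contracts with Valibot rather than the hand-written guard used here.

```typescript
interface ExtractedCandidate {
  summary: string
  importance: number // hypothetical 0..1 score assigned by the planner LLM
}

// Hand-written type guard so the example has no dependencies.
function isCandidate(value: unknown): value is ExtractedCandidate {
  if (typeof value !== 'object' || value === null)
    return false
  const candidate = value as Record<string, unknown>
  return typeof candidate.summary === 'string'
    && candidate.summary.trim().length > 0
    && typeof candidate.importance === 'number'
    && candidate.importance >= 0
    && candidate.importance <= 1
}

function parsePlannerOutput(raw: string, minImportance: number): ExtractedCandidate[] {
  let parsed: unknown
  try {
    parsed = JSON.parse(raw)
  }
  catch {
    // Malformed output stores nothing rather than guessing.
    return []
  }
  if (!Array.isArray(parsed))
    return []
  // Drop invalid shapes first, then anything below the importance cutoff.
  return parsed.filter(isCandidate).filter(candidate => candidate.importance >= minImportance)
}
```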

4. Query engine MVP

This PR also adds the first query-side recall path for STM.

Current query engine responsibilities:

  • read STM records
  • score and select relevant entries
  • assemble a compact recall block
  • inject recall context into the chat system prompt flow

This is still a lightweight MVP query engine, but it establishes the boundary for future expansion toward:

  • broader filters
  • reranking
  • keyword/BM25 style scoring
  • edge/link-aware recall
  • time-based and retention-aware traversal
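
The score / select / assemble responsibilities listed above can be sketched as follows. This is an illustrative assumption, not the PR's scoring logic: the 0.8 / 0.2 weights, the one-week half-life, the character budget (standing in for a token budget), and all names are hypothetical.

```typescript
interface StmEntry {
  id: string
  summary: string
  embedding: number[]
  createdAt: number // epoch milliseconds
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length)
    return 0
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0
    const y = b[i] ?? 0
    dot += x * y
    normA += x * x
    normB += y * y
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB)
}

// Blend similarity with an exponential recency decay (illustrative weights).
function scoreEntry(entry: StmEntry, queryEmbedding: number[], now: number, halfLifeMs = 7 * 24 * 3600 * 1000): number {
  const recency = 0.5 ** (Math.max(0, now - entry.createdAt) / halfLifeMs)
  return 0.8 * cosineSimilarity(entry.embedding, queryEmbedding) + 0.2 * recency
}

function buildRecallBlock(
  entries: StmEntry[],
  queryEmbedding: number[],
  now: number,
  options: { minScore: number, maxChars: number },
): string {
  const ranked = entries
    .map(entry => ({ entry, score: scoreEntry(entry, queryEmbedding, now) }))
    .filter(item => item.score >= options.minScore)
    .sort((left, right) => right.score - left.score)

  // Greedily add the best entries until the budget is exhausted.
  const lines: string[] = []
  let used = 0
  for (const { entry } of ranked) {
    const line = `- ${entry.summary}`
    if (used + line.length > options.maxChars)
      break
    lines.push(line)
    used += line.length
  }
  return lines.length === 0 ? '' : ['Relevant memories:', ...lines].join('\n')
}
```

Returning an empty string when nothing qualifies lets the caller skip prompt injection entirely instead of adding an empty recall header.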

5. Local-first storage for current MVP

For the current MVP, STM records and embeddings are stored locally in IndexedDB.

This choice is mainly for the current local/self-hosted usage pattern, but I tried to avoid coupling Alaya to IndexedDB as a permanent assumption.

The goal is that later storage implementations can be added as separate backends/drivers, for example:

  • PostgreSQL / pgvector
  • DuckDB / PGLite style local stores
  • Redis vector search
  • Milvus
  • other specialized memory storage implementations

So in this PR, IndexedDB is the current runtime storage, not the intended final storage story.
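
A minimal sketch of what "storage hidden behind a port" could look like, assuming a hypothetical `ShortTermMemoryStorePort` shape. The in-memory adapter below stands in for IndexedDB; a pgvector or Redis adapter would implement the same three methods. None of these names are taken from the package.

```typescript
interface StmRecord {
  id: string
  workspaceId: string
  summary: string
  embedding: number[]
}

// Port: the only storage surface the core logic sees (hypothetical shape).
interface ShortTermMemoryStorePort {
  upsert: (record: StmRecord) => Promise<void>
  listByWorkspace: (workspaceId: string) => Promise<StmRecord[]>
  remove: (id: string) => Promise<void>
}

// One possible adapter, backed by a Map instead of a real database.
function createInMemoryStore(): ShortTermMemoryStorePort {
  const records = new Map<string, StmRecord>()
  return {
    async upsert(record) {
      records.set(record.id, record)
    },
    async listByWorkspace(workspaceId) {
      return Array.from(records.values()).filter(record => record.workspaceId === workspaceId)
    },
    async remove(id) {
      records.delete(id)
    },
  }
}

// Core logic depends on the port, so swapping the backend does not touch it.
async function countMemories(store: ShortTermMemoryStorePort, workspaceId: string): Promise<number> {
  return (await store.listByWorkspace(workspaceId)).length
}
```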

6. Dedicated settings and developer tooling

This PR adds Alaya-related settings and a developer page so the memory pipeline is observable during iteration.

Included UI/devtools work:

  • memory settings page
  • short-term memory list / inspection
  • planner provider configuration
  • planner embedding provider configuration
  • planner logs
  • query logs
  • recall preview/debug information

This follows the idea of exposing memory experimentation through the developer tooling surface while the design is still evolving.

Current scope of this PR

This PR currently implements:

  • standalone Alaya package structure
  • short-term memory only
  • user memory only
  • planner ingestion / extraction MVP
  • query / recall MVP
  • embedding generation and local vector storage
  • provider separation for planner LLM vs planner embedding
  • developer tooling / settings integration

What is intentionally not implemented yet

This PR does not claim that the full Alaya vision is complete.

Not implemented yet:

  • long-term memory
  • mutate engine
  • voyager engine
  • memory queue / MQ backend
  • cross-source ingestion beyond current chat/workspace flow
  • graph-style memory linking / edge traversal
  • dream process / memory stitching
  • diary / calendar style memory spaces
  • external memory volumes / shareable memory mounts
  • agent self-memory
  • agent state
  • self-evolving or self-learning inner memory loops
  • richer emotional memory modeling beyond the current STM metadata direction
  • advanced recall ranking such as BM25 / Jieba / reranker / edge score composition
  • fully user-adjustable scoring weights
  • pluggable storage backend packages like memory-storage-pgvector

Why I structured it this way

From the design discussion, my understanding is that Alaya should move toward:

  • multiple pipelines instead of one fixed CRUD abstraction
  • multiple memory layers/spaces
  • multiple backends
  • support for richer event sources than just user chat
  • future agent-oriented memory rather than only assistant-style user preference memory

This PR does not solve those bigger problems yet, but it tries to avoid blocking them.

Concretely, I tried to keep:

  • memory package decoupled from app code
  • planner/query responsibilities separated
  • provider integration separated from chat provider configuration
  • storage hidden behind ports
  • UI/runtime wiring outside the core package

So this is meant as a practical first slice that is already usable, while still leaving room for the broader Alaya architecture to grow.

Validation

Ran during implementation:

pnpm -F @proj-airi/memory-alaya test
pnpm -F @proj-airi/memory-alaya build
pnpm -F @proj-airi/stage-ui typecheck
pnpm typecheck

Notes

A few implementation choices in this PR are MVP-oriented:

  • current memory focus is user memory, not agent memory
  • current storage is IndexedDB for local deployment convenience
  • current query engine is intentionally simple compared with the larger design space
  • current planner works on chat/workspace turns, not the wider future event universe

I would treat this PR as the first usable foundation for Alaya, not the final memory architecture.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request lays the foundational architecture for the Alaya memory system, introducing a new, decoupled package that orchestrates short-term memory planning and querying. It integrates LLM-first extraction and embedding generation for conversation turns, enabling intelligent recall, and provides comprehensive developer tools for observability and configuration.

Highlights

  • New Alaya Memory Package: Introduced a standalone @proj-airi/memory-alaya package, establishing a decoupled architecture for the AI's memory layer with clear contracts, ports, engines, and use-cases.
  • Short-Term Memory (STM) Pipeline: Implemented the initial MVP pipeline for short-term memory, covering conversation turn collection, LLM-driven planner extraction, embedding generation, and query-time recall assembly into prompt contexts.
  • Planner-First Extraction Flow: Designed the memory ingestion around a planner/filter style, where an LLM evaluates conversation batches and only stores memories deemed valuable, generating embeddings post-extraction.
  • Query Engine MVP: Added a lightweight query-side recall path for STM, responsible for reading, scoring, selecting relevant entries, and assembling a compact recall block for prompt injection.
  • Developer Tooling and Settings: Integrated dedicated settings pages and developer tools to inspect planner and query logs, manage short-term memory entries, and configure specific LLM and embedding providers for Alaya.
  • Local-First Storage: Utilized IndexedDB for local storage of STM records and embeddings in the current MVP, ensuring local/self-hosted usage patterns are supported while maintaining extensibility for future storage backends.

Changelog

  • apps/stage-pocket/src/pages/settings/system/developer.vue
    • Added a new menu item for 'Memory Alaya' to the developer settings page.
  • apps/stage-tamagotchi/src/renderer/layouts/settings.vue
    • Updated the settings layout to support a 'disableBackButton' meta property for routes.
  • apps/stage-tamagotchi/src/renderer/pages/devtools/index.vue
    • Removed the explicit 'layout: settings' meta property from the devtools index route.
  • apps/stage-tamagotchi/src/renderer/pages/settings/system/developer.vue
    • Added a new menu item for 'Memory Alaya' to the developer settings page.
  • apps/stage-web/src/pages/settings/system/developer.vue
    • Added a new menu item for 'Memory Alaya' to the developer settings page.
  • packages/memory-alaya/README.md
    • Added a comprehensive README detailing the purpose, scope, and usage of the new @proj-airi/memory-alaya package.
  • packages/memory-alaya/package.json
    • Added the package definition for @proj-airi/memory-alaya, including dependencies and scripts.
  • packages/memory-alaya/src/contracts/v1.ts
    • Added core data structures and types for Alaya memory, including schemas for planner input/output, memory records, and query engine input/output using Valibot.
  • packages/memory-alaya/src/engines/planner-engine.ts
    • Added the PlannerEngine interface and its implementation, providing run and shouldTrigger methods.
  • packages/memory-alaya/src/engines/query-engine.ts
    • Added the QueryEngine interface and its implementation with an execute method.
  • packages/memory-alaya/src/index.ts
    • Added an index file to export all public modules from the memory-alaya package.
  • packages/memory-alaya/src/ports/embedding-provider.ts
    • Added the MemoryEmbeddingProvider interface.
  • packages/memory-alaya/src/ports/llm-provider.ts
    • Added the MemoryLlmProvider interface.
  • packages/memory-alaya/src/ports/planner-observer.ts
    • Added the PlannerObserver interface for monitoring planner runs.
  • packages/memory-alaya/src/ports/short-term-memory-activity-store.ts
    • Added the ShortTermMemoryActivityStore interface for tracking memory access.
  • packages/memory-alaya/src/ports/short-term-memory-reader.ts
    • Added the ShortTermMemoryReader interface for retrieving short-term memories.
  • packages/memory-alaya/src/ports/short-term-memory-store.ts
    • Added the ShortTermMemoryStore interface for managing short-term memory persistence.
  • packages/memory-alaya/src/ports/token-estimator.ts
    • Added the MemoryTokenEstimator interface.
  • packages/memory-alaya/src/ports/workspace-memory-source.ts
    • Added the WorkspaceMemorySource interface for abstracting conversation turn retrieval.
  • packages/memory-alaya/src/use-cases/assemble-recall-context.ts
    • Added logic to assemble recalled memory into a formatted context string.
  • packages/memory-alaya/src/use-cases/recall-short-term-candidates.ts
    • Added logic to recall short-term memory candidates based on query and scope.
  • packages/memory-alaya/src/use-cases/run-planner-batch.test.ts
    • Added unit tests for the runPlannerBatch use-case.
  • packages/memory-alaya/src/use-cases/run-planner-batch.ts
    • Added the core logic for running the planner batch process, including candidate extraction, embedding, and storage.
  • packages/memory-alaya/src/use-cases/run-query-engine.test.ts
    • Added unit tests for the runQueryEngine use-case.
  • packages/memory-alaya/src/use-cases/run-query-engine.ts
    • Added the core logic for running the query engine, including scoring and selecting candidates.
  • packages/memory-alaya/src/use-cases/score-short-term-candidates.ts
    • Added logic to score short-term memory candidates based on similarity, time, and emotion.
  • packages/memory-alaya/src/use-cases/select-short-term-candidates.ts
    • Added logic to select short-term memory candidates based on budget and recall thresholds.
  • packages/memory-alaya/src/use-cases/should-trigger-planner.test.ts
    • Added unit tests for the shouldTriggerPlanner use-case.
  • packages/memory-alaya/src/use-cases/should-trigger-planner.ts
    • Added logic to determine if the planner should be triggered based on various policies.
  • packages/memory-alaya/src/utils/hash.ts
    • Added a utility function for FNV1a string hashing.
  • packages/memory-alaya/tsconfig.json
    • Added TypeScript configuration for the new memory-alaya package.
  • packages/memory-alaya/tsdown.config.ts
    • Added tsdown configuration for the new memory-alaya package.
  • packages/memory-alaya/vitest.config.ts
    • Added Vitest configuration for the new memory-alaya package.
  • packages/stage-layouts/src/layouts/settings.vue
    • Updated the settings layout to support a disableBackButton meta property for routes.
  • packages/stage-pages/src/pages/devtools/memory-alaya.vue
    • Added a new developer tools page to inspect Alaya planner and query engine logs, including runtime details and errors.
  • packages/stage-pages/src/pages/settings/memory/index.vue
    • Implemented the UI for displaying and managing short-term memory entries for the active workspace, replacing a placeholder.
  • packages/stage-pages/src/pages/settings/modules/memory-short-term.vue
    • Implemented the UI for configuring Alaya planner LLM and embedding providers, including model selection, timeouts, and system prompts, replacing a placeholder.
  • packages/stage-pages/src/pages/settings/providers/index.vue
    • Extended the providers list to include new categories for 'Planner' and 'Planner Embedding' providers.
  • packages/stage-pages/src/pages/settings/providers/planner-embedding/[providerId].vue
    • Added a new page for configuring specific planner embedding providers, including API keys and base URLs.
  • packages/stage-pages/src/pages/settings/providers/planner/[providerId].vue
    • Added a new page for configuring specific planner LLM providers, including API keys, base URLs, and account IDs (for Cloudflare).
  • packages/stage-ui/package.json
    • Added @proj-airi/memory-alaya as a new dependency.
  • packages/stage-ui/src/composables/use-modules-list.ts
    • Integrated the new useMemoryShortTermStore to reflect the configuration status of the memory module in the modules list.
  • packages/stage-ui/src/composables/use-planner-embedding-provider-validation.ts
    • Added a composable for real-time validation of planner embedding provider configurations.
  • packages/stage-ui/src/composables/use-planner-provider-validation.ts
    • Added a composable for real-time validation of planner LLM provider configurations.
  • packages/stage-ui/src/database/repos/alaya-short-term-memory.repo.test.ts
    • Added unit tests for the alaya-short-term-memory.repo.
  • packages/stage-ui/src/database/repos/alaya-short-term-memory.repo.ts
    • Implemented a repository for managing Alaya short-term memory records and checkpoints using IndexedDB.
  • packages/stage-ui/src/stores/chat.ts
    • Integrated useChatAlayaPlannerStore and useChatAlayaQueryStore.
    • Modified the message sending logic to incorporate Alaya's recall context into system messages and trigger the planner after chat turns.
  • packages/stage-ui/src/stores/chat/alaya-planner.ts
    • Created a Pinia store to manage the Alaya planner's execution, logging, and state, including handling schema versions and integrating LLM/embedding providers.
  • packages/stage-ui/src/stores/chat/alaya-query.ts
    • Created a Pinia store to manage Alaya's query engine, responsible for building recall context and logging query runs.
  • packages/stage-ui/src/stores/chat/alaya/heuristic-planner-llm-provider.test.ts
    • Added unit tests for the heuristic planner LLM provider.
  • packages/stage-ui/src/stores/chat/alaya/heuristic-planner-llm-provider.ts
    • Implemented a heuristic-based LLM provider for planner extraction, serving as a fallback when a dedicated LLM is unavailable or fails.
  • packages/stage-ui/src/stores/chat/alaya/planner-embedding-presets.ts
    • Defined presets for common planner embedding models (e.g., OpenAI, Qwen).
  • packages/stage-ui/src/stores/chat/alaya/planner-embedding-provider.test.ts
    • Added unit tests for the planner embedding provider.
  • packages/stage-ui/src/stores/chat/alaya/planner-embedding-provider.ts
    • Implemented the MemoryEmbeddingProvider interface, handling embedding requests, batching, and timeouts.
  • packages/stage-ui/src/stores/chat/alaya/planner-llm-provider.test.ts
    • Added unit tests for the planner LLM provider.
  • packages/stage-ui/src/stores/chat/alaya/planner-llm-provider.ts
    • Implemented the MemoryLlmProvider interface, handling LLM calls for extraction, including system prompt management, timeouts, and fallback logic.
  • packages/stage-ui/src/stores/chat/alaya/planner-system-prompt.ts
    • Defined the detailed default system prompt used by the Alaya planner LLM for memory extraction.
  • packages/stage-ui/src/stores/chat/alaya/short-term-memory-reader.test.ts
    • Added unit tests for the short-term memory reader.
  • packages/stage-ui/src/stores/chat/alaya/short-term-memory-reader.ts
    • Implemented the ShortTermMemoryReader interface to retrieve active short-term memory records.
  • packages/stage-ui/src/stores/chat/alaya/workspace-memory-source.test.ts
    • Added unit tests for the workspace memory source.
  • packages/stage-ui/src/stores/chat/alaya/workspace-memory-source.ts
    • Implemented the WorkspaceMemorySource interface, responsible for extracting and mapping chat session messages into a format consumable by the Alaya planner.
  • packages/stage-ui/src/stores/chat/maintenance.ts
    • Updated cleanupMessages to be an asynchronous function.
  • packages/stage-ui/src/stores/modules/index.ts
    • Exported new memory-related stores (memory-short-term and memory-short-term-records).
  • packages/stage-ui/src/stores/modules/memory-short-term-records.ts
    • Created a Pinia store to manage the loading, deleting, and clearing of short-term memory records for display in the UI.
  • packages/stage-ui/src/stores/modules/memory-short-term.ts
    • Created a Pinia store to manage the configuration settings for Alaya's short-term memory, including planner and embedding provider selections, thresholds, and timeouts.
  • packages/stage-ui/src/stores/planner-embedding-providers.ts
    • Created a Pinia store to manage the configuration and state of dedicated embedding providers for the Alaya planner.
  • packages/stage-ui/src/stores/planner-providers.ts
    • Created a Pinia store to manage the configuration and state of dedicated LLM providers for the Alaya planner.

Ignored Files

  • Ignored by pattern: packages/i18n/src/** (2)
    • packages/i18n/src/locales/en/settings.yaml
    • packages/i18n/src/locales/zh-Hans/settings.yaml

Activity

  • The author freezinlove has introduced a significant new feature: the Alaya memory system.
  • Validation included running unit tests for the new memory-alaya package, building the package, and type-checking the stage-ui and overall project.

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces the Alaya memory system as a standalone package, featuring a well-thought-out architecture and a comprehensive UI for developer inspection and settings management. While the foundation is solid, the integration of LLM-generated memory into the prompt flow introduces standard LLM safety risks, such as indirect prompt injection. Additionally, the current local storage implementation has a potential performance bottleneck when listing keys, and a few medium-severity issues related to correctness, maintainability, and efficiency have been identified that should be addressed.

Comment on lines +177 to +180
if (
  existingMemory.contentHash === record.contentHash
  && existingMemory.summary === record.summary
) {
Contributor

medium

The contentHash is generated from the lowercased summary in run-planner-batch.ts, making it case-insensitive. However, this check includes a case-sensitive comparison of the summary field. This means memories differing only in case will not be skipped, which may be unintended. If the goal is case-insensitive deduplication, relying solely on contentHash would be clearer and more consistent. If case sensitivity is desired, the hash should be generated from the original summary.

Author

Keeping this as-is for now.

contentHash is intentionally case-insensitive as a coarse content identity signal, while the direct summary comparison is retained as a stricter equality check before skipping writes. In the current flow this branch is already gated by the idempotency key path, so I’d prefer not to broaden dedup semantics in this PR without revisiting the full merge/update behavior.
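
The interaction the reviewer describes can be reproduced with a small sketch. It assumes a 32-bit FNV-1a over UTF-16 code units (which matches the byte-wise algorithm for ASCII input only) and a hash taken over the lowercased summary, as stated in the review comment. The names `fnv1a`, `contentHashOf`, and `shouldSkipWrite` are hypothetical and not taken from `utils/hash.ts` or the repo code.

```typescript
// 32-bit FNV-1a, returned as 8 lowercase hex characters.
function fnv1a(input: string): string {
  let hash = 0x811C9DC5 // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i)
    hash = Math.imul(hash, 0x01000193) // FNV prime, multiplication kept in 32 bits
  }
  return (hash >>> 0).toString(16).padStart(8, '0')
}

interface StoredMemory {
  summary: string
  contentHash: string
}

// Case-insensitive identity signal: hash of the lowercased summary.
function contentHashOf(summary: string): string {
  return fnv1a(summary.toLowerCase())
}

// Mirrors the check under review: skip only when the hash AND the exact summary match.
function shouldSkipWrite(existing: StoredMemory, incoming: StoredMemory): boolean {
  return existing.contentHash === incoming.contentHash
    && existing.summary === incoming.summary
}
```

With this shape, two summaries that differ only in letter case share a `contentHash` but are not skipped, which is the stricter behavior the author chose to keep.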

Comment on lines +239 to +245
const idempotencyKeys = await listKeysByPrefix(storage, `${idempotencyKeyPrefix(workspaceId)}/`)
for (const key of idempotencyKeys) {
  const mappedMemoryId = await storage.getItemRaw<string>(key)
  if (mappedMemoryId === memoryId) {
    await storage.removeItem(key)
  }
}
Contributor

medium

This loop appears to be redundant and inefficient. The idempotency key is already removed on line 236 using targetRecord.idempotencyKey. This second loop iterates over all idempotency keys for the workspace, which could be slow if there are many memories. If targetRecord.idempotencyKey is reliable and unique per memory, this loop is unnecessary and can be removed to improve performance.

Author

Keeping this fallback cleanup for now.

The direct idempotency mapping removal is still the primary path. This extra scan is a defensive cleanup step for stale or duplicated mappings in the current IndexedDB-backed MVP, so it is intentionally conservative. I agree it is not ideal at larger scale, but I’d prefer to optimize this together with a broader storage/index cleanup pass rather than changing the deletion semantics in this PR.
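
The trade-off discussed in this thread (a direct removal as the primary path, plus a defensive prefix scan for stale mappings) can be sketched as below. The `KeyValueStorage` interface and the `createMapStorage` and `removeIdempotencyMappings` helpers are hypothetical stand-ins written for this example; they are not the storage API or the repo functions used in the PR.

```typescript
// Minimal async key-value surface, assumed for illustration only.
interface KeyValueStorage {
  getItem: (key: string) => Promise<string | null>
  removeItem: (key: string) => Promise<void>
  getKeys: (prefix: string) => Promise<string[]>
}

function createMapStorage(initial: Record<string, string>): KeyValueStorage {
  const data = new Map<string, string>(Object.entries(initial))
  return {
    getItem: async key => data.get(key) ?? null,
    removeItem: async (key) => { data.delete(key) },
    getKeys: async prefix => Array.from(data.keys()).filter(key => key.startsWith(prefix)),
  }
}

// Returns how many idempotency mappings pointing at memoryId were removed.
async function removeIdempotencyMappings(
  storage: KeyValueStorage,
  prefix: string,
  memoryId: string,
  knownKey?: string,
): Promise<number> {
  let removed = 0
  // Primary path: single lookup through the key stored on the record.
  if (knownKey !== undefined && await storage.getItem(knownKey) === memoryId) {
    await storage.removeItem(knownKey)
    removed++
  }
  // Fallback: linear scan for stale or duplicated mappings under the prefix.
  for (const key of await storage.getKeys(prefix)) {
    if (await storage.getItem(key) === memoryId) {
      await storage.removeItem(key)
      removed++
    }
  }
  return removed
}
```

The scan costs one read per key under the prefix, which is the scaling concern raised in the review; the return value makes it easy to log how often the fallback actually finds anything before deciding to drop it.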

Contributor

github-actions bot commented Mar 9, 2026

⏳ Approval required for deploying to Cloudflare Workers (Preview) for stage-web.

Name: 🔭 Waiting for approval
Link: For maintainers, approve here

Hey, @nekomeowww, @sumimakito, @luoling8192, @LemonNekoGH, kindly take some time to review and approve this deployment when you are available. Thank you! 🙏

@freezinlove
Author

I think the latest workflow runs are currently waiting for maintainer approval on the fork PR.

When convenient, could a maintainer please approve the pending workflows for the latest head commit? Thank you.

@nekomeowww, @sumimakito, @luoling8192, @LemonNekoGH
