beilu
Make AI truly remember.
beilu-always accompany is an AI companion platform that unifies companionship and productivity. It combines an IDE editing environment, a multi-AI collaboration engine, an original layered memory algorithm, a semantic retrieval system, and a chat system compatible with the SillyTavern ecosystem, tackling the two fundamental bottlenecks of current LLMs head-on: limited context windows and attention degradation as context grows.
English | 中文
The entire project, from design and architecture to development, was completed independently by a university student using AI-assisted programming, drawing on algorithm design, biomimicry principles, framework architecture, and logical reasoning.
Chat interface with fine-tuned controls, adaptable to various beautification styles
Whether it's AI coding tools (Cursor, Copilot), AI chat applications (ChatGPT, Claude), or AI roleplay platforms (SillyTavern), they all face the same underlying limitations:
| Problem | Current State | Consequence |
|---|---|---|
| Limited context window | Even 128K-1M tokens overflow in long conversations | Early messages get truncated; AI loses critical information |
| Attention degradation | The longer the context, the less the model focuses on each segment | Even if information exists in context, AI may "overlook" it |
| No persistent memory | Closing a conversation = forgetting everything | Every new session starts from zero |
Don't stuff all memories into the context. Let a dedicated AI retrieve them on demand.
Traditional: [All historical memory + current chat] → Single AI → Attention scattered
↓
Our approach: [Index] → Retrieval AI (focused on finding) → [Selected memory + current chat] → Reply AI (focused on quality)
The Reply AI only sees precisely filtered memory fragments from the Retrieval AI. The context is clean, the signal-to-noise ratio is extremely high, and attention never degrades.
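This separation can be sketched in a few lines of plain JavaScript. Both "AI" calls are stubbed with trivial logic purely to show the data flow (real calls would be async LLM requests); the function names are illustrative, not the project's actual API:

```javascript
// Stage 1: a cheap Retrieval AI sees only the compact index, never the full history.
function retrievalAI(memoryIndex, message) {
  // Real system: a low-cost model searches the index (up to 3 rounds);
  // here: a trivial keyword filter as a stand-in.
  return memoryIndex.filter((m) => message.toLowerCase().includes(m.topic));
}

// Stage 2: the Chat AI sees only the filtered fragments plus the current turn.
function chatAI(selectedMemories, message) {
  // Real system: the main model replies with a clean, high-signal context.
  return `reply using ${selectedMemories.length} memory fragment(s)`;
}

function respond(message, memoryIndex) {
  const selected = retrievalAI(memoryIndex, message); // find
  return chatAI(selected, message);                   // answer
}
```

The point of the split: the Reply AI's context never grows with history, because the Retrieval AI absorbs all of the searching.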
Modeled on the memory-formation mechanism of the human hippocampus and on the Ebbinghaus forgetting curve, the design achieves theoretically unlimited AI memory.
🔥 Hot Memory Layer — Injected every turn
User profile / Permanent memories Top-100 / Pending tasks / Recent memories about user
🌤️ Warm Memory Layer — On-demand retrieval, last 30 days
Daily summaries / Archived temporary memories / Monthly index
❄️ Cold Memory Layer — Deep retrieval, beyond 30 days
Monthly summaries / Historical daily summaries / Yearly index
Additionally, an L0 Memory Table Layer (10 highly customizable tables, fully injected every turn as CSV) provides structured immediate context.
| Metric | Value |
|---|---|
| Hot layer injection per turn | ~7,000-11,000 tokens (only 5-9% of a 128K window) |
| Retrieval AI context | <5,000 tokens (100% attention focused on retrieval) |
| P1 retrieval efficiency | Max 3 rounds to hit target (BM25 pre-filtering + regex exact match) |
| Retrieval tech stack | BM25 + Regex Search (dual-engine collaboration, zero external deps) |
| Storage cost | Zero (pure JSON files, no database dependency) |
| Single-character sustained operation | 12+ years (at 5,000 files) |
| Theoretical duration | 260+ years (at 100,000 files; NTFS/ext4 support far exceeds this) |
score = weight × (1 / (1 + days_since_triggered × 0.1))
Inspired by the Ebbinghaus forgetting curve: important and recently triggered memories are prioritized for injection, rather than simple chronological order.
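In code, the scoring rule above is a one-liner (a direct transcription of the formula, not the project's internal implementation):

```javascript
// Ebbinghaus-style decay score, per the formula:
// score = weight × (1 / (1 + days_since_triggered × 0.1))
function memoryScore(weight, daysSinceTriggered) {
  return weight * (1 / (1 + daysSinceTriggered * 0.1));
}

// A memory triggered today keeps its full weight; after 10 days it halves.
memoryScore(10, 0);   // 10
memoryScore(10, 10);  // 5
```

Sorting candidate memories by this score favors important, recently triggered entries over a plain chronological order.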
The most critical design feature of the memory system: all memory injection, retrieval, archival, and summarization operations are performed by AI through prompts, not traditional hardcoded logic.
This means:
- Table meanings and purposes can be changed anytime: Simply modify the prompt descriptions for tables, and the AI will interpret and operate them accordingly — no code changes needed
- Archival strategies are instantly adjustable: P2-P6 behaviors are entirely defined by prompts; modifying prompts changes archival rules, summary formats, and retrieval strategies
- Zero technical barrier for migration: Users can edit prompts themselves to adapt to different scenarios (roleplay / coding assistant / game NPC) without programming skills
- Naturally avoids technical debt: No complex parsers or state machines to maintain — the AI itself is the most flexible "parser"
10 fully customizable structured tables, injected every turn as CSV. Table meanings and purposes are entirely defined by prompts — change the prompt descriptions and the same table system serves completely different scenarios:
| Scenario | Example Table Usage |
|---|---|
| AI Roleplay | Space-time settings / Character status / Social relations / Quest progress / Inventory — AI only knows what's recorded in the tables, perfectly solving the god's-eye problem |
| Programming | Architecture decisions / Code conventions / Module dependencies / Bug tracking / TODO lists |
| Work Management | Project progress / Meeting notes / Contacts / To-do items / Knowledge accumulation |
| Gaming | Character attributes / Equipment list / Skill trees / World state / NPC relationships |
Table contents are automatically maintained by the Chat AI via `<tableEdit>` tags; the AI autonomously decides when to update which data during conversation, with no manual intervention needed.
Why does this solve the god's-eye problem? In traditional AI RP, the AI can "see" all conversation history, including information the character shouldn't know. With the table system, the AI acts only based on information explicitly recorded in the tables — if something isn't recorded in the character's cognition table, the AI simply doesn't "know" it. This is an information isolation mechanism that makes AI behavior more authentic and immersive.
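As a rough illustration of the per-turn CSV injection, a table can be serialized like this (the column names and quoting rules are illustrative, not the project's actual spec):

```javascript
// Serialize one L0 table to CSV for injection into the prompt.
// Quoting rule (illustrative): wrap values containing commas, quotes, or
// newlines in double quotes, doubling embedded quotes.
function tableToCSV(name, rows) {
  if (rows.length === 0) return `# ${name}\n`;
  const headers = Object.keys(rows[0]);
  const escape = (v) =>
    /[",\n]/.test(String(v)) ? `"${String(v).replace(/"/g, '""')}"` : String(v);
  const lines = [headers.join(",")].concat(
    rows.map((r) => headers.map((h) => escape(r[h])).join(","))
  );
  return `# ${name}\n${lines.join("\n")}`;
}
```

Because only the serialized tables reach the prompt, anything not recorded in them is simply invisible to the AI, which is the information-isolation property described above.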
World book entries support dynamic injection — deciding whether to inject a world setting based on real-time data from memory tables:
World book entry trigger conditions can read values and states from tables:
→ When affection > 80 in Table #1 "Character Status", inject "Special dialogue unlocked" setting
→ When main quest = "Chapter 3" in Table #4 "Quest Progress", inject corresponding world description
→ When a specific item exists in Table #5 "Inventory", inject the item's usage effects
This table-driven dynamic world-building mechanism is easy to learn and turns world settings from static text into a living system that evolves with conversation progress and character state.
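The trigger conditions above can be modeled as predicates over the live table state. A minimal sketch, with illustrative table and field names:

```javascript
// Each world book entry carries a condition evaluated against current tables.
function activeEntries(entries, tables) {
  return entries.filter((e) => e.condition(tables));
}

const entries = [
  {
    text: "Special dialogue unlocked",
    condition: (t) => t["Character Status"].affection > 80,
  },
  {
    text: "Chapter 3 world description",
    condition: (t) => t["Quest Progress"].mainQuest === "Chapter 3",
  },
];

const tables = {
  "Character Status": { affection: 85 },
  "Quest Progress": { mainQuest: "Chapter 1" },
};
// With this state, only the affection-gated entry is injected this turn.
```

As the Chat AI updates the tables, the set of active entries changes automatically, which is what makes the world book "living" rather than static text.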
Newly introduced dual-engine retrieval capability for fast and precise information finding:
- Classic & efficient: BM25, a TF-IDF-family statistical ranking algorithm, quickly filters the most relevant candidates from massive memory files
- Pure JS implementation: Zero external dependencies — no vector database or Embedding API needed, ready out of the box
- Precise matching: After BM25 coarse filtering, regex provides exact targeting — supports pattern matching, keyword combinations, fuzzy search
- Cross-file search: A single search scans all memory files and project files, rapidly locating target content
- IDE deep integration: Regex search is available directly in the file editing environment for quick project-wide file and content search
- P1 Retrieval AI dramatically enhanced: BM25 pre-filtering + regex exact matching dual-engine collaboration reduces retrieval rounds from 5 to max 3 — 40% faster, 40% cheaper on API costs
- Significantly higher accuracy: Regex search compensates for pure semantic retrieval's weakness in exact matching — keywords, dates, names and other structured info can be found in one shot
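The coarse-then-exact idea can be sketched as follows. The scoring function here is a plain term-frequency stand-in for real BM25, kept deliberately short; all names are illustrative:

```javascript
// Stage 1 (coarse): score files by how often query terms occur, keep top K.
function shortlist(files, queryTerms, topK = 3) {
  const scored = files.map((f) => {
    const text = f.text.toLowerCase();
    const score = queryTerms.reduce(
      (s, term) => s + (text.split(term.toLowerCase()).length - 1), 0
    );
    return { ...f, score };
  });
  return scored.sort((a, b) => b.score - a.score).slice(0, topK);
}

// Stage 2 (exact): run a regex only over the shortlisted files.
function exactMatch(files, pattern) {
  const re = new RegExp(pattern);
  return files.filter((f) => re.test(f.text));
}

function dualSearch(files, queryTerms, pattern) {
  return exactMatch(shortlist(files, queryTerms), pattern);
}
```

The cheap statistical pass keeps the expensive exact pass small, which is why structured tokens like dates or names can be pinned down in one shot.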
The system has 7 built-in AI roles, each with a dedicated responsibility. Each conversation only calls 2 AIs (Retrieval AI + Chat AI); the rest trigger on demand — no need to worry about usage:
| AI | Role | Trigger | Usage Notes |
|---|---|---|---|
| Chat AI | Conversation with users, file operations | User sends a message | Called every conversation |
| P1 Retrieval AI | Search relevant history from memory (up to 3 rounds) + Smart Preset Switching | Automatic per turn | Called every turn, can use free AI |
| P2 Archive AI | Summarize and archive when temporary memories exceed threshold | ~50 conversations per trigger | Extremely infrequent |
| P3 Daily Summary AI | Generate detailed daily summary | Manual trigger | Only when user clicks |
| P4 Hot→Warm AI | Move expired hot-layer memories to warm layer | Manual trigger | Only when user clicks |
| P5 Monthly Summary AI | Warm→Cold archival, generate monthly summaries | Manual trigger | Only when user clicks |
| P6 Repair AI | Check and fix memory file format issues | Manual trigger | Only when user clicks |
Key insight: The Retrieval AI (P1) only needs to "find memories" — it doesn't require a high-intelligence model. It can run entirely on free or ultra-low-cost AI (e.g., Gemini 2.0/2.5 Flash free tier).
AI calls per conversation:
① P1 Retrieval AI — Find memories + determine preset switching (can use free AI like Gemini Flash)
② Chat AI — Generate reply based on selected memories (use any model you prefer)
Infrequent AI:
③ P2 Archive — Triggers roughly once per 50 conversations, barely noticeable
④ P3-P6 — All manual trigger, zero usage unless you click
Bottom line: If P1 uses free AI (Gemini Flash free tier is more than enough), then the actual cost per conversation = only one Chat AI call. The memory system runs at virtually zero cost.
Major breakthrough: P1 Retrieval AI doesn't just retrieve memories — it analyzes conversation intent in real-time and automatically switches to the most suitable prompt preset.
- Multi-mode adaptation: Casual chat, roleplay, coding, prompt engineering… the AI automatically switches to the optimal preset based on conversation content, with prompts and COT (Chain of Thought) changing accordingly
- Seamless experience: No manual intervention needed — just say "help me write code" and the AI's behavior mode adjusts in real-time
- Cooldown anti-oscillation: Built-in cooldown counter prevents rapid repeated switching
- Fully customizable: Switching logic is guided by COT in prompts; users can define their own switching conditions and strategies
- Manual quick switch: Also supports one-click manual preset switching from the chat interface
This means AI is no longer "one preset fits all" — it dynamically adapts to the optimal behavior mode based on the current context, making it a truly multi-mode intelligent agent.
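The cooldown anti-oscillation mechanism can be sketched as a small state machine. The real threshold and switching decisions live in the P1 prompts; this is only an illustration:

```javascript
// After a preset switch, further switch requests are suppressed for N turns,
// so a borderline conversation cannot flip modes back and forth every message.
function makePresetSwitcher(cooldownTurns = 3) {
  let current = "default";
  let cooldown = 0;
  return {
    tick() { if (cooldown > 0) cooldown--; },   // call once per conversation turn
    request(preset) {
      if (preset === current || cooldown > 0) return current; // suppressed
      current = preset;
      cooldown = cooldownTurns;                 // start the cooldown window
      return current;
    },
    get current() { return current; },
  };
}
```

Usage: `request("coding")` switches immediately, but a `request("roleplay")` one turn later is ignored until the cooldown counter has ticked down to zero.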
Fully implemented cross-platform Bot capability: Deploy your AI companion to Discord channels and chat with your AI anytime, anywhere:
- Full memory access: The Discord Bot shares the local memory system — your AI remembers your history even on Discord
- Visual management panel: Bot Token, Owner, message depth and other common settings displayed as form controls — no manual JSON editing needed
- Real-time message log: View Bot's sent/received messages in real-time on the management interface (user messages / AI replies / error logs)
- Multi-channel support: Bot can work in multiple Discord channels simultaneously, maintaining independent context per channel
- One-click context clear: Clear Bot's channel chat memory anytime to start fresh
VSCode-style three-panel layout:
- Left panel: Preset management / World book binding / Persona selection / Character editing
- Center panel: Chat / File editor / Memory management — three-tab switching
- Right panel: Character info / Feature toggles / Memory AI operation panel
IDE includes built-in BM25 + Regex Search dual-engine file retrieval — enter keywords or regex expressions to search across all project files instantly.
Preset engine / Memory system / File operations / Desktop screenshot / Logger / Feature toggles / Multi-AI collaboration / Regex beautification / World book / Web search / System info / Browser awareness
The management home page (beilu-home) supports 4 languages via a "translation overlay" approach — no restructuring of existing code, just adding data-i18n attributes to DOM elements for automatic translation.
| Code | Language |
|---|---|
| zh-CN | Simplified Chinese (default) |
| en-UK | English |
| ja-JP | 日本語 (Japanese) |
| zh-TW | 繁體中文 (Traditional Chinese) |
- Language preference is auto-saved to `localStorage` and persists across refreshes
- Dynamic content (JS-generated text) is translated via the `t(key)` function
- A language switch triggers a `beilu-lang-change` event; all modules respond automatically
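The key-lookup-with-fallback behavior can be sketched as follows; the dictionaries and this `t` implementation are tiny illustrative stand-ins for the real module:

```javascript
// Per-language dictionaries; keys missing from a language fall back to the
// default language (zh-CN in this project), and unknown keys fall back to
// the key itself so untranslated UI never goes blank.
const dict = {
  "zh-CN": { greeting: "你好", memory: "记忆" },
  "en-UK": { greeting: "Hello" },            // "memory" missing → falls back
};

function t(key, lang, fallback = "zh-CN") {
  return (dict[lang] && dict[lang][key]) ?? dict[fallback][key] ?? key;
}

// In the browser, the overlay would apply this to every tagged element:
// document.querySelectorAll("[data-i18n]").forEach((el) => {
//   el.textContent = t(el.dataset.i18n, currentLang);
// });
```

This is what makes the approach an "overlay": existing markup only gains `data-i18n` attributes, and all language logic stays in one lookup function.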
Built-in full-stack diagnostic framework for rapid troubleshooting:
- Module-level toggle: Enable/disable diagnostics per module (chat engine, memory, preset, etc.) — zero overhead when disabled
- Console interception: Automatically captures all `console.log` / `warn` / `error` / `info` output from both frontend (browser) and backend (Deno), stored in a 500-entry ring buffer without affecting normal output
- Error capture: Automatically catches `window.onerror` and `unhandledrejection` events
- One-click log export: Click "📦 One-Click Pack Logs" in the Debug tab, or call `beiluDiag.pack()` from the browser console to generate a single JSON file
When reporting issues, attach this JSON file for complete context — no need to manually copy console output or describe steps.
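The interception-plus-ring-buffer idea can be sketched like this. The 500-entry capacity matches the description above; the API names are sketches, not the real `beiluDiag` interface:

```javascript
// Fixed-size ring buffer for log entries: oldest entries are dropped once
// the capacity is reached, so memory use is bounded no matter how long
// the session runs.
function makeRingLogger(capacity = 500) {
  const buffer = [];
  return {
    push(level, args) {
      buffer.push({ level, args, ts: Date.now() });
      if (buffer.length > capacity) buffer.shift(); // drop the oldest entry
    },
    pack() { return JSON.stringify(buffer); },      // one-click export payload
    size() { return buffer.length; },
  };
}

// Wrap console.log so normal output is unaffected while entries are recorded.
const log = makeRingLogger(3);
const original = console.log;
console.log = (...args) => { log.push("log", args); original(...args); };
```

Exporting is then just serializing the buffer, which is why a single JSON attachment can carry the full recent console history.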
- Direct import of SillyTavern format character cards, presets, and world books
- Support for Risu formats (ccv3 / charx / rpack)
- 14 AI service generators (proxy / gemini / claude / ollama / grok, etc.)
| Dimension | ChatGPT etc. | beilu-always accompany |
|---|---|---|
| Memory | Simple summaries / conversation history | Three-layer graded + BM25/Regex dual-engine retrieval + multi-AI collaboration, theoretically unlimited |
| Attention | Degrades as context grows | Retrieval AI pre-filters; Reply AI attention stays focused |
| Customization | Limited System Prompt | Full preset system + 10 customizable memory tables + dynamic world book injection |
| Data ownership | Server-side storage | Local JSON files, fully self-owned |
| Cross-platform | Official clients only | Web + Discord Bot, AI serves you on multiple platforms |
| Dimension | Cursor etc. | beilu-always accompany |
|---|---|---|
| Project memory | Based on current file context | Cross-session persistent memory (architecture decisions, code conventions, historical discussions) |
| Multi-AI collaboration | Single model | 7 AIs with dedicated roles; retrieval/summary/reply separated |
| Memory cost | Relies on large context windows | ~10K tokens covers the hot layer |
| File search | IDE built-in | BM25 + Regex Search dual engine + IDE file tree |
| Dimension | SillyTavern | beilu-always accompany |
|---|---|---|
| Memory | No built-in memory system | Original three-layer memory + BM25/Regex dual-engine retrieval + 6 auxiliary AIs |
| God's-eye problem | No solution | Memory table information isolation — AI only knows what's in the tables |
| File operations | None | Built-in IDE file management + AI file operations |
| Desktop capability | None | beilu-eye desktop screenshot → AI recognition |
| Cross-platform | Web only | Web + Discord Bot |
| Preset compatibility | Native | Fully compatible with ST presets/character cards/world books |
Even when context windows expand to 10M+ tokens, layered memory remains valuable:
- Attention problems won't disappear: No matter how large the window, model attention on massive text will still degrade. Pre-filtering + precise injection will always outperform "stuff everything in."
- Cost efficiency: Larger windows = higher costs. Replacing 100K+ tokens of full history with ~10K tokens of selected memory reduces API call costs by 10x or more.
- Structured > Unstructured: Tabular memory is easier for AI to accurately read and update than information scattered across conversations.
Layered memory is not a temporary workaround for limited context windows — it is a superior paradigm for information organization.
- Original three-layer memory algorithm (pure prompt-driven) — Permanent memory, theoretically unlimited
- Multi-AI collaboration engine (Memory AI + Reply AI)
- Smart Preset Switching System — P1 real-time context analysis with auto preset switching, multi-mode adaptive COT
- 🆕 Smart Retrieval System — BM25 + Regex Search dual engine, P1 retrieval from 5 rounds down to max 3 (40% faster, 40% cheaper), IDE file quick search
- 🆕 Discord Bot — Cross-platform AI service with visual management panel + real-time message log
- 🆕 Browser Page Awareness — Passive + on-demand architecture, userscript monitors DOM changes and reports page snapshots
- IDE-style interface with file operations (including BM25 + Regex Search dual-engine file retrieval)
- Highly customizable memory tables (10 tables, adaptable to RP / coding / work / gaming and any scenario)
- World book dynamic injection (conditional triggers based on table data)
- Desktop screenshot system (beilu-eye)
- Rendering engine
- Management home page i18n (Chinese / English / Japanese / Traditional Chinese)
- 12 feature plugins
- Full-stack diagnostic framework with one-click log export
- More platform Bot integrations
- Plugin ecosystem (Workshop-style high extensibility)
- Live2D integration + AI-controlled models
- AI game engine (chat interface = game interface, code-compatible, userscript-friendly)
- TTS / Text-to-image integration
- VSCode extension compatibility
- Highly extensible core architecture
- Deno runtime
- Modern browser (Chrome / Edge / Firefox)
- At least one AI API key (Gemini API recommended — free tier available)
```shell
# Clone the project
git clone https://github.com/beilusaiying/always-accompany.git
cd always-accompany

# Launch (Windows)
run.bat

# Launch (Linux/macOS)
chmod +x run.sh
./run.sh
```

After launch, open your browser and navigate to http://localhost:1314
- Configure AI source: Home → System Settings → Add AI service source (proxy / gemini, etc.)
- Import character card: Home → Usage → Import (supports SillyTavern PNG/JSON format)
- Configure memory presets: Home → Memory Presets → Set up API for P1-P6 (recommend P1 using Gemini Flash free tier)
- Start chatting: Click a character card to enter the chat interface
- Automatic operation: Memory tables are automatically maintained by the Chat AI (via `<tableEdit>` tags); the Retrieval AI (P1) triggers automatically each turn
- Manual operations: Chat interface right panel → Memory AI Operations → P2-P6 manual buttons
- Daily archival: At the end of each day, click the "End Today" button to trigger the 9-step daily archival process
- Memory browsing: Chat interface → Memory Tab → Browse/edit/import/export memory files
- Create a Bot application on Discord Developer Portal
- In beilu-chat interface → Bot tab at the top → Enter Bot Token and Owner username
- Click "Start Bot" → @your Bot in a Discord channel to start chatting
| Component | Technology |
|---|---|
| Runtime | Deno (with fount module system) |
| Backend | Node.js compatibility layer + Express-style routing |
| Frontend | Vanilla JavaScript (ESM modules) |
| AI integration | 14 ServiceGenerators |
| Smart retrieval | BM25 + Regex Search dual engine (pure JS, zero deps) |
| Desktop screenshot | Python (mss + tkinter + pystray) |
| Cross-platform | discord.js v14 |
| Storage | Pure JSON file system |
Discussion, resource sharing, prompt exchange, bug reports — come join us!
The project includes a carefully crafted P1-P6 Memory AI prompt preset, ready to use out of the box:
beilu-presets_2026-02-23.json — Complete prompt configurations for P1 Retrieval AI, P2 Archive AI, P3 Daily Summary AI, P4 Hot→Warm AI, P5 Monthly Summary AI, and P6 Repair AI
How to use: Home → Memory Presets → Click "Import" → Select this JSON file to import all presets in one click.
We welcome everyone to participate in building this project! You can:
- 🃏 Share character cards — Create and publish your character cards to enrich the community
- 📝 Publish prompt presets — Share your tuned memory presets and chat presets to help others
- 🌍 Contribute world books — Build world settings for other users to import
- 🐛 Report bugs — Use the one-click log export feature and attach the diagnostic report
- 💡 Suggest features — Feature requests, UI improvements, plugin ideas — all welcome
- 🔧 Contribute code — Fork & PR, let's build together
The community has many more great prompts and character card resources — feel free to explore and share!
This project would not be possible without the contributions of the following open-source projects and communities:
- fount — The foundational framework providing AI message handling, service source management, module loading, and other core infrastructure, saving significant development time on low-level implementation
- SillyTavern — The pioneering project in AI roleplay, whose preset format, character card specification, and world book system have become community standards. This project is fully compatible with its ecosystem
- SillyTavern Plugin Community — Thanks to all open-source plugin authors for their exploration and sharing. Their work on rendering engines, memory enhancement, and feature extensions provided valuable references and inspiration for this project's design
🖥️ IDE AI Editor — VSCode-inspired, easy to get started
IDE-style AI coding and file editing interface, inspired by VSCode for a familiar experience. Plugin integration and management coming soon.
If you're new to AI coding or still a beginner, please use the designated sandbox space for the AI file capabilities: 📖 Read / ✏️ Write / 🗑️ Delete / 🔄 Retry / 🔌 MCP / ❓ Questions / 📋 Todo. You can disable write and delete for safety.
🧠 Memory Files — View and edit memory data in real-time
Manually edit content anytime, observe memory AI operations in real-time. You can also make requests to the memory AI directly.
🎨 Regex Editor — Sandbox & Free modes
Manage regex rules at different levels and modify conversation text, with separate Sandbox and Free modes. The Sandbox mode protects against potentially malicious scripts from unknown character cards.
⚠️ We cannot guarantee effectiveness against all malicious scripts. Please review character card code for malicious content before use. We are not responsible for any damages.
📋 Commander-Level Prompts — Full control over all sent content
Commander-level prompts that control all sent content, maximizing prompt effectiveness.
🧠 Memory Presets P1-P6 — Fully prompt-driven, zero technical barrier
P2-P6 behaviors can all be modified through prompts — no coding required, highly adaptable.
📖 System Guide — Detailed documentation for quick onboarding
Detailed system documentation to help you get started quickly.
🔬 System Diagnostics — One-click log export for rapid troubleshooting
Comprehensive system self-diagnosis with one-click log packaging. Captures both browser console and server logs into a single JSON file — just attach it when reporting issues.
This project is built on the fount framework, with direct authorization from the original author.







