Skip to content

Latest commit

 

History

History
334 lines (205 loc) · 9.09 KB

File metadata and controls

334 lines (205 loc) · 9.09 KB

OpenClaw Integration Playbook

This guide turns the integration and test findings for memory-lancedb-pro into a reusable checklist for two audiences:

  • OpenClaw users who want a stable first deployment
  • Maintainers who need a practical regression guide for future iterations

It is intentionally generic. It focuses on verifiable behavior, failure signatures, and operating rules instead of local machine paths.

0. Start with the correct user path

Before following the rest of this guide, classify your current state:

  • new OpenClaw user or first-time memory setup
  • existing OpenClaw user adding this plugin later
  • existing memory-lancedb-pro user upgrading from a pre-v1.1.0 release

The operating rule is:

  • upgrade is for older memory-lancedb-pro data
  • migrate is for built-in memory-lancedb data
  • reembed is for embedding rebuilds, not routine upgrades

If you mix those paths, debugging becomes much harder because storage-format issues and retrieval-quality issues get conflated.

1. Recommended Deployment Modes

Choose one of these modes explicitly.

Mode A: Retrieval-first memory

Use this when you want:

  • memory_store / memory_recall
  • hybrid search (vector + BM25)
  • auto-capture / auto-recall
  • smart extraction and lifecycle ranking

Keep plugin session summaries disabled unless you have a concrete retrieval need for prior sessions.

Mode B: Retrieval + session-summary search

Use this when you also want /new to write a searchable session summary into LanceDB.

In this mode:

  • enable plugin sessionMemory.enabled
  • decide whether OpenClaw built-in session-memory should also remain enabled

If both are enabled, /new can produce two outputs:

  • built-in workspace/session summary files
  • LanceDB session-summary memories written by memory-lancedb-pro

That is valid, but it is a double-write design. If you do not want duplicated session summarization paths, keep only one.

2. Recommended Session Memory Strategy

For most users, use one of these patterns.

Option 1: Built-in only

Choose this when you mainly want transcript persistence and workspace summaries.

  • plugin sessionMemory.enabled = false
  • OpenClaw built-in hooks.internal.entries.session-memory.enabled = true

Option 2: Plugin only

Choose this when you want session summaries to participate in LanceDB retrieval, dedupe, and lifecycle scoring.

  • plugin sessionMemory.enabled = true
  • OpenClaw built-in hooks.internal.entries.session-memory.enabled = false

Option 3: Dual write

Choose this only if you explicitly want both:

  • workspace markdown/session artifacts
  • LanceDB-searchable session memories

If you use dual write, document it for your team. Otherwise it will look like duplicate behavior during debugging.

3. Baseline Verification Checklist

Run this before debugging retrieval quality.

openclaw config validate
openclaw status
openclaw gateway status
openclaw plugins info memory-lancedb-pro
openclaw hooks list --json

Confirm:

  • plugin is loaded from the expected path
  • plugins.slots.memory points to memory-lancedb-pro
  • the expected hooks are enabled
  • the gateway has been restarted after config changes

If you use plugin session memory, openclaw hooks list --json should show the plugin hook:

  • memory-lancedb-pro-session-memory

If you want plugin-only session summaries, also confirm:

  • built-in session-memory is disabled

4. Fresh-Agent Bootstrap Checks

When a newly created agent fails on its first real turn, do not start with retrieval debugging. Verify agent bootstrap first.

Typical symptom:

  • Unknown model: openai-codex/gpt-5.4

Common root cause:

  • the new agent has not been initialized with agent-local model/auth indexes

Verify these files exist under the new agent directory:

  • ~/.openclaw/agents/<agentId>/agent/models.json
  • ~/.openclaw/agents/<agentId>/agent/auth-profiles.json

Also verify the agent can resolve at least one usable model before testing memory behavior.

Practical rule:

  • if a fresh agent cannot complete a plain text turn, memory tests on that agent are not meaningful yet

5. Retrieval Quality Rules

Short CJK keyword queries need explicit validation

Short Chinese keywords are a common false-negative case in hybrid retrieval systems.

Observed pattern during testing:

  • unique identifiers or full sentences recall well
  • short CJK keywords may depend on BM25/lexical matching quality
  • lowering minScore can improve recall but often increases noise

Recommended tuning order:

  1. verify BM25 / lexical fallback behavior first
  2. verify hybrid fusion behavior second
  3. tune minScore / hardMinScore last

Do not start by aggressively lowering thresholds. That often converts a retrieval-quality problem into a relevance-noise problem.

Lifecycle scoring must not suppress strong fresh hits

If you use lifecycle decay and tiering:

  • apply relevance filtering before lifecycle/time decay demotion
  • ensure new working memories are not pushed below hardMinScore simply because decay was applied too early

Regression target:

  • a fresh, high-relevance working-tier memory must remain retrievable

6. Functional Smoke Tests

These tests cover the core closed loop.

CLI and storage

openclaw memory-pro stats
openclaw memory-pro list --scope global --limit 5
openclaw memory-pro search "your test keyword" --scope global --limit 5

Validate:

  • stats returns usable counts
  • list returns stored items in the expected scope
  • search returns deterministic results for at least one exact identifier and one natural-language query

Tool loop

Validate at least once:

  • memory_store
  • memory_recall
  • memory_update
  • memory_forget
  • memory_list
  • memory_stats

Scope isolation

Validate at least:

  • main -> main hits
  • main -> work miss
  • work -> global allowed if intended
  • life -> work miss unless explicitly permitted

Smart extraction stability

Validate the three semantic branches:

  • create
  • merge
  • skip

Then run a multi-turn sequence such as:

  • create -> skip -> merge -> skip

Expected result:

  • one stable memory record
  • duplicates suppressed
  • new facts merged without unbounded duplication

7. Real /new Session Test

If plugin session memory is enabled, run one real /new validation after the basic smoke tests.

Check for three facts:

  1. the active session changes
  2. the expected hook fires
  3. the summary lands in the intended storage path

Plugin evidence should look like:

  • a log entry that the session summary was stored for the previous session

If built-in session memory is also enabled, you should additionally see the built-in workspace/session artifact.

If built-in session memory is disabled, do not treat the absence of a workspace session-summary markdown file as a plugin failure.

8. Recommended Regression Matrix

Run this matrix before release candidates or after major retrieval changes.

Integration

  • plugin loads successfully
  • gateway restart preserves plugin registration
  • hooks list shows expected hook state

Retrieval

  • exact identifier recall
  • short CJK keyword recall
  • full sentence semantic recall
  • rerank fallback behavior when reranker is unavailable

Memory lifecycle

  • fresh working memory remains retrievable
  • tier promotion/demotion does not erase useful recall

Extraction

  • create
  • merge
  • skip
  • multi-turn duplicate suppression

Session flow

  • /new triggers the intended hook path
  • plugin-only mode does not rely on built-in session artifacts
  • dual-write mode is explicit and understood

Agent bootstrap

  • newly added agent can complete its first turn
  • new agent has usable model/auth indexes

9. Troubleshooting Patterns

Search returns empty, but the store contains data

Check in this order:

  1. scope mismatch
  2. minScore / hardMinScore
  3. BM25 / lexical fallback availability
  4. rerank endpoint health
  5. lifecycle decay ordering

/new appears to do nothing

Check:

  • plugin session memory is enabled
  • plugin hook is actually registered and named
  • gateway has been restarted after the hook/config change
  • built-in hook state matches your intended design

Results became noisy after tuning

Likely cause:

  • threshold reduction solved recall at the cost of precision

Preferred fix order:

  1. lexical/BM25 improvement
  2. hybrid-fusion tuning
  3. rerank tuning
  4. threshold change

10. Upgrade and Maintenance Notes

If you maintain local patches against the OpenClaw installation itself, treat them as temporary operational fixes, not durable plugin behavior.

After any openclaw update, re-check:

  • fresh-agent bootstrap
  • model provider resolution for newly added agents
  • hook registration state
  • /new behavior

If your team depends on local installation patches, keep them in a repeatable patch or automation script outside the plugin repo.

11. Recommended Documentation Policy for Future Changes

When changing retrieval or hook behavior, update all three artifacts together:

  • user-facing README summary
  • this integration playbook
  • regression tests that prove the changed behavior

That keeps operational guidance aligned with actual behavior and prevents README drift.