Stabilize memory context output (roadmap Sprint 1)#123

Merged · besfeng23 merged 1 commit into main from fix/pandora-context-stabilization · Jul 3, 2026
@besfeng23 besfeng23 commented Jul 3, 2026

What

Roadmap Sprint 1 (#1 — "stabilize output"). Sharper retrieval, less dumping. Deterministic; no gated features enabled.

Problem (observed live)

extractPeopleMentions treated every capitalized word as a person and pushed the event id once per match occurrence — so people_map filled with junk (The, Do, She, He, Keep, User) and a single event's id repeated dozens of times, ballooning get_memory_context to ~50k chars.

Changes — lib/services/memory-distillation-service.ts

  • Stoplist junk capitalized sentence-openers / pronouns / imperatives / domain labels (The/Do/She/He/Keep/User/Taglish/...).
  • Dedup — each event contributes a given name at most once (no per-occurrence id duplication).
  • Alias canonicalization — merge a single-token name into its full name (Janine → Janine Tan); keep genuinely distinct aliases (Jana) separate.
  • Caps — max 12 people, max 8 event_ids per person.
  • Payload slimming: compactContextResponse enforces a deterministic max_payload_chars budget (default 12000), trimming the heaviest fields first (people ids/notes → list lengths → summary). debug: true returns the full payload.
  • New optional max_payload_chars / debug inputs on memoryContextInputSchema.
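
The stoplist, per-event dedup, and cap rules above can be sketched roughly as follows. This is a minimal illustration, not the code in memory-distillation-service.ts: the `MemoryEvent` shape, the `extractPeopleSketch` name, the single-token regex, and the "first seen wins" policy for the caps are all assumptions, and the stoplist holds only the examples quoted in the description.

```typescript
// Hypothetical shapes and names for illustration only.
type MemoryEvent = { id: string; text: string };
type PeopleMap = Record<string, string[]>;

// Only the examples named in the PR; the real list is longer.
const STOPLIST = new Set(["The", "Do", "She", "He", "Keep", "User", "Taglish"]);
const MAX_PEOPLE = 12; // caps quoted in the PR description
const MAX_EVENT_IDS = 8;

function extractPeopleSketch(events: MemoryEvent[]): PeopleMap {
  const people: PeopleMap = {};
  for (const event of events) {
    // Assumption: candidates are single capitalized tokens.
    const candidates = event.text.match(/\b[A-Z][a-z]+\b/g) ?? [];
    // A Set gives per-event dedup: repeated mentions contribute the id once.
    const namesInEvent = new Set(candidates.filter((name) => !STOPLIST.has(name)));
    for (const name of namesInEvent) {
      const ids = people[name];
      if (ids === undefined) {
        // Assumption: once the people cap is reached, new names are skipped.
        if (Object.keys(people).length >= MAX_PEOPLE) continue;
        people[name] = [event.id];
      } else if (ids.length < MAX_EVENT_IDS && !ids.includes(event.id)) {
        ids.push(event.id);
      }
    }
  }
  return people;
}
```

With this shape, an event that mentions the same name many times adds its id once, which is the duplication the Problem section describes.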

Schema changes

None.

Tests — tests/unit/context-stabilization.test.ts (5/5)

Junk removal, per-event dedup, alias merge vs. distinct alias, caps, and payload-budget slimming (+ debug bypass).

Verification

  • typecheck ✅ · lint ✅ (0 errors) · build ✅
  • tests: 554 passed (5 new). The 1 failure (first-reviewed-memory-fixture: spawnSync npm ENOENT) is the pre-existing sandbox-only flake in an untouched file.

Hard-rules compliance

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added a size limit option for generated context responses, with an optional debug mode to return the full payload.
    • Improved people-name detection so results are more consistent, deduplicated, and capped when many mentions are present.
  • Bug Fixes

    • Reduced noisy or irrelevant capitalized phrases from being treated as people names.
    • Prevented duplicate event entries for the same person and improved alias matching for name variants.
  • Tests

    • Added coverage for context trimming, deduplication, alias handling, and payload-size behavior.

Sharper retrieval, less dumping. Deterministic; no gated features.

- people_map: stoplist junk capitalized sentence-openers (The/Do/She/He/
  Keep/User/Taglish/...), so they stop being treated as "people".
- Dedup: each event contributes a name at most once (kills per-occurrence
  event_id duplication that ballooned payloads).
- Alias canonicalization: merge a single-token name into its full name
  (Janine -> Janine Tan); keep distinct aliases (Jana) separate.
- Cap sizes: max 12 people, max 8 event_ids per person.
- Payload slimming: compactContextResponse now enforces a deterministic
  max_payload_chars budget (default 12000), trimming heaviest fields first;
  new max_payload_chars + debug inputs on the memory-context schema.
- Tests: junk removal, dedup, alias merge, caps, and payload-budget slimming.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

vercel Bot commented Jul 3, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| memory | Ready | Preview, Comment | Jul 3, 2026 1:34pm |

coderabbitai Bot commented Jul 3, 2026

Walkthrough

Reworks extractPeopleMentions to filter capitalized names via stopwords/length rules, dedupe per event, canonicalize aliases, and cap output counts via new opts. Extends compactContextResponse with max_payload_chars/debug options and a slimContextResponse trimming routine. Updates MCP schema and adds unit tests.

Changes

Context stabilization

| Layer / File(s) | Summary |
| --- | --- |
| Person-name filtering and extraction rules (lib/services/memory-distillation-service.ts) | Adds stopword set and isLikelyPersonName helper, canonicalizePeople alias merging, and reworks extractPeopleMentions for event-scoped dedup, filtering, and configurable caps via opts. |
| Payload size trimming and compaction options (lib/services/memory-distillation-service.ts, lib/services/pandora-mcp-tools.ts) | Adds payloadChars, DEFAULT_MAX_PAYLOAD_CHARS, and slimContextResponse for staged trimming; extends compactContextResponse with max_payload_chars/debug; adds matching optional fields to memoryContextInputSchema. |
| Unit tests for extraction and compaction (tests/unit/context-stabilization.test.ts) | Adds tests for junk filtering, per-event dedup, alias merging, count caps, and payload trimming/debug bypass. |
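
The alias merging mentioned for canonicalizePeople might look something like the sketch below. The function name `canonicalizePeopleSketch`, the `PeopleMap` shape, and the rule "merge only when exactly one full name starts with the token" are assumptions made for illustration; the PR states only that Janine merges into Janine Tan while Jana stays separate.

```typescript
// Hypothetical shape: person name -> event ids.
type PeopleMap = Record<string, string[]>;

function canonicalizePeopleSketch(people: PeopleMap): PeopleMap {
  const names = Object.keys(people);
  const fullNames = names.filter((name) => name.includes(" "));
  const merged: PeopleMap = {};
  for (const name of names) {
    // Assumption: a single-token name is an alias only when exactly one full
    // name starts with that exact token; ambiguous tokens stay separate.
    const owners = name.includes(" ")
      ? []
      : fullNames.filter((full) => full.split(" ")[0] === name);
    const target = owners.length === 1 ? owners[0] : name;
    const ids = merged[target] ?? [];
    for (const id of people[name]) {
      if (!ids.includes(id)) ids.push(id); // keep ids unique after the merge
    }
    merged[target] = ids;
  }
  return merged;
}
```

Matching on the whole first token rather than a prefix is what keeps Jana from being folded into Janine Tan in this sketch.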

Estimated code review effort: 3 (Moderate) | ~25 minutes

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant compactContextResponse
    participant slimContextResponse

    Caller->>compactContextResponse: pack, events, input(max_payload_chars, debug)
    compactContextResponse->>compactContextResponse: build default payload
    alt debug true
        compactContextResponse-->>Caller: full payload
    else payload oversized
        compactContextResponse->>slimContextResponse: payload, maxChars
        slimContextResponse->>slimContextResponse: trim people_map, lists, summary
        slimContextResponse-->>compactContextResponse: trimmed payload
        compactContextResponse-->>Caller: trimmed payload
    end
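
The branching in the diagram can be sketched as below. This is an illustration under assumptions, not the real compactContextResponse: the `ContextPayload` and `CompactOptions` shapes are hypothetical, payload size is assumed to be the serialized JSON length, and the trimming routine is passed in as a parameter so the control flow stays visible.

```typescript
// Hypothetical shapes for illustration only.
type ContextPayload = {
  summary: string;
  key_points: string[];
  people_map: Record<string, string[]>;
};
type CompactOptions = { max_payload_chars?: number; debug?: boolean };
type Slimmer = (payload: ContextPayload, maxChars: number) => ContextPayload;

const DEFAULT_MAX_PAYLOAD_CHARS = 12000; // default quoted in the PR description

// Assumption: size is measured on the serialized JSON.
const payloadChars = (payload: unknown): number => JSON.stringify(payload).length;

function compactSketch(payload: ContextPayload, opts: CompactOptions, slim: Slimmer): ContextPayload {
  // debug: true bypasses the budget and returns the full payload.
  if (opts.debug === true) return payload;
  const maxChars = opts.max_payload_chars ?? DEFAULT_MAX_PAYLOAD_CHARS;
  // Payloads already within budget are returned untouched.
  if (payloadChars(payload) <= maxChars) return payload;
  return slim(payload, maxChars);
}
```

Keeping the debug check ahead of the size check means a debugging caller sees the untrimmed payload regardless of the budget passed in.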

Suggested labels: codex

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The description does not follow the repository template and omits the required env var/security checklist items. | Rewrite the PR description to use the checklist template and explicitly state whether env vars or secrets were added, cataloged, or exposed. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title is concise and accurately summarizes the main change: stabilizing memory context output. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7c35f37234


Comment on lines +165 to +168
response.key_points = list(response.key_points).slice(0, 6);
response.open_loops = list(response.open_loops).slice(0, 6);
response.risks = list(response.risks).slice(0, 6);
response.active_projects = list(response.active_projects).slice(0, 6);


P2: Include decisions in payload slimming

When an active context pack is decision-heavy (or has older persisted decisions entries with large text), the new max_payload_chars cap can still be exceeded because slimming truncates key_points, open_loops, risks, and active_projects but leaves decisions unbounded. In that scenario get_memory_context can continue returning oversized payloads despite the new budget, so decisions needs to be capped/trimmed as part of the same slimming pass or the function should re-check and keep trimming until it is actually under budget.
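
One way to read the suggestion is a slimming pass that includes decisions and re-checks the size after every trim. The sketch below is a hedged illustration, not the slimContextResponse in the PR: the `SlimmablePayload` shape, the `slimUntilUnderBudget` name, the "trim the heaviest list first" policy, and measuring size as serialized JSON length are all assumptions.

```typescript
// Hypothetical payload shape covering the lists named in the review.
type SlimmablePayload = {
  summary: string;
  key_points: string[];
  open_loops: string[];
  risks: string[];
  active_projects: string[];
  decisions: string[];
};

const LIST_FIELDS = ["key_points", "open_loops", "risks", "active_projects", "decisions"] as const;

// Assumption: size is measured on the serialized JSON.
const sizeOf = (payload: SlimmablePayload): number => JSON.stringify(payload).length;

function slimUntilUnderBudget(input: SlimmablePayload, maxChars: number): SlimmablePayload {
  const out: SlimmablePayload = { ...input };
  // Stage 1: the cap of 6 from the reviewed lines, extended to decisions.
  for (const field of LIST_FIELDS) out[field] = input[field].slice(0, 6);
  // Stage 2: re-check after every trim and drop one item from the heaviest list.
  while (sizeOf(out) > maxChars) {
    const heaviest = [...LIST_FIELDS].sort(
      (a, b) => JSON.stringify(out[b]).length - JSON.stringify(out[a]).length,
    )[0];
    if (out[heaviest].length === 0) break; // every list is already empty
    out[heaviest] = out[heaviest].slice(0, -1);
  }
  // Stage 3: as a last resort, shorten the summary by the remaining overflow.
  const overflow = sizeOf(out) - maxChars;
  if (overflow > 0) {
    out.summary = out.summary.slice(0, Math.max(0, out.summary.length - overflow));
  }
  return out;
}
```

For ordinary text this lands under the budget whenever the budget exceeds the size of the empty skeleton; a budget smaller than that cannot be met by trimming alone.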

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
lib/services/memory-distillation-service.ts (1)

157-173: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win

Add a final cap before returning. slimContextResponse still leaves decisions and operating_rules untouched, so the last return can exceed maxChars; trim those fields or hard-cap the remaining lists before returning.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/services/memory-distillation-service.ts` around lines 157 - 173,
slimContextResponse still can return payloads over maxChars because the final
fallback only trims people_map and summary while leaving fields like decisions
and operating_rules untouched. Update slimContextResponse to apply a last-pass
cap before the final return by trimming those remaining large arrays/collections
or otherwise hard-capping the remaining content after the existing payloadChars
checks, so the function can guarantee the response stays within maxChars.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b659656e-26e4-48e5-a308-0ed890171a16

📥 Commits

Reviewing files that changed from the base of the PR and between 9115389 and 7c35f37.

📒 Files selected for processing (3)
  • lib/services/memory-distillation-service.ts
  • lib/services/pandora-mcp-tools.ts
  • tests/unit/context-stabilization.test.ts

  const namespaceSchema = z.enum(["real_life", "au"]);
  export const latestContextPackInputSchema = z.object({ namespace: namespaceSchema, pack_type: z.enum(["daily", "master"]).optional() });
- export const memoryContextInputSchema = z.object({ namespace: namespaceSchema, query: z.string().optional(), current_task: z.string().optional(), max_items: z.number().int().positive().max(100).optional(), include_risks: z.boolean().optional(), include_people: z.boolean().optional(), include_projects: z.boolean().optional() });
+ export const memoryContextInputSchema = z.object({ namespace: namespaceSchema, query: z.string().optional(), current_task: z.string().optional(), max_items: z.number().int().positive().max(100).optional(), include_risks: z.boolean().optional(), include_people: z.boolean().optional(), include_projects: z.boolean().optional(), max_payload_chars: z.number().int().positive().max(200000).optional(), debug: z.boolean().optional() });


🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm the server registration schema and whether args are passed through untouched.
rg -nP 'registerTool\(\s*"get_memory_context"' -A3 lib/services/pandora-mcp-server.ts
rg -nP 'max_payload_chars|debug' lib/services/pandora-mcp-server.ts

Repository: besfeng23/Memory

Length of output: 1775


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the tool implementation and the surrounding validation flow.
ast-grep outline lib/services/pandora-mcp-tools.ts --view expanded
sed -n '1,260p' lib/services/pandora-mcp-tools.ts | cat -n

echo
echo '---'
echo

# Find the exact handler signature and any re-parsing of args.
rg -n "getMemoryContextTool|memoryContextInputSchema|parse\(|safeParse\(|args: unknown|inputSchema" lib/services/pandora-mcp-tools.ts lib/services/pandora-mcp-server.ts

Repository: besfeng23/Memory

Length of output: 47229


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n '"`@modelcontextprotocol/sdk`"|mcp' package.json package-lock.json pnpm-lock.yaml yarn.lock

Repository: besfeng23/Memory

Length of output: 388


🌐 Web query:

@modelcontextprotocol/sdk Server.registerTool inputSchema unknown properties handler args stripped or preserved

💡 Result:

In the Model Context Protocol (MCP) TypeScript SDK, the handling of input arguments depends on how you define your inputSchema and whether the schema effectively validates the incoming data.

1. Schema Validation and Stripping: The inputSchema provided to registerTool is used to validate incoming JSON-RPC tool call arguments [1][2]. If a schema is defined but does not explicitly include specific properties, or if validation fails or strips them, those properties will not reach your tool handler [3]. The SDK is designed to ensure that the arguments passed to your handler strictly conform to the provided inputSchema [1][4].
2. Standard Schema and Zod Support: The SDK (as of v2) natively supports any "Standard Schema" library (e.g., Zod) [1][4]. When you pass a proper z.object({...}) schema, the SDK uses it to parse and validate the arguments [1][2]. If your handler receives an empty object {} even when the client sends data, it often indicates that the schema provided was incorrectly constructed (e.g., passing a raw object shape instead of a wrapped Zod object in older versions) or that the validation logic stripped the unexpected fields [3].
3. Compatibility for Raw Shapes: While earlier versions of the SDK experienced issues where raw Zod shapes were incorrectly handled or caused arguments to be stripped [3][5], recent versions include compatibility layers that auto-wrap raw Zod shapes into z.object({...}) to prevent this behavior [6][7][8].

Recommendation: To ensure all properties are preserved, always wrap your input definitions in z.object({...}) [1][7][2]. If you continue to see arguments being stripped, verify that your schema definition matches the structure of the incoming data exactly; the SDK will discard any properties not defined in the schema to ensure strict adherence to the protocol's type safety requirements [1][3].


🌐 Web query:

site:github.com modelcontextprotocol sdk registerTool inputSchema args unknown properties

💡 Result:

In the Model Context Protocol (MCP) SDK, handling of "unknown" or additional properties within a tool's inputSchema is determined by the underlying JSON Schema validation settings. MCP specifications (via SEP-2106 and related initiatives) align tool schemas with JSON Schema 2020-12 [1][2]. Under this specification, tool inputs are not strictly prohibited from containing unknown properties by default, but the behavior depends on how you define your schema [3][4]:

1. Default Behavior: If you define your inputSchema simply as an object with properties, the JSON Schema default behavior allows for additional properties [3][4].
2. Restricting Unknowns: To explicitly reject unknown properties, you should define your schema with "additionalProperties": false [3][5].
3. TypeScript SDK Behavior: The MCP TypeScript SDK uses Zod for validation [6][7]. When using Zod schemas for inputSchema:
   - If you want validation to strictly reject unknown properties, use z.object({...}).strict() [8].
   - If you want to preserve or allow extra properties, you can use .passthrough() [8].
   - Be aware that in some SDK versions, if you do not explicitly set strictness, there may be nuances in how Zod translates to JSON Schema regarding the additionalProperties field [8].

If you are encountering issues where extra properties are being dropped or blocked, verify if your Zod schema is configured with .strict() (which blocks extras) or if the SDK version you are using is applying default JSON Schema behaviors that differ from your expectations [8]. As of recent updates, the SDK is designed to be more compliant with passing through full JSON Schema 2020-12 keywords [2][9].


🌐 Web query:

"@modelcontextprotocol/sdk" registerTool inputSchema handler args validate preserve unknown keys

💡 Result:

In the MCP TypeScript SDK (v2), the registerTool method associates a tool with a name, metadata, and a handler [1]. The inputSchema parameter defines the argument structure and handles validation using the Standard Schema interface [2][3]. Key details regarding your query:

1. Input Schema and Validation: The registerTool method expects an inputSchema that conforms to the Standard Schema specification (e.g., Zod, ArkType, or Valibot) [2][3]. This schema is used by the SDK to advertise the tool's capabilities via tools/list and to validate incoming arguments during tools/call [3].
2. Handling Unknown Keys: The behavior regarding unknown keys in the input depends on the underlying schema library configuration (e.g., Zod) [4]. By default, Zod objects often "strip" unknown keys [4]. To change this behavior:
   - To reject unknown keys: Use .strict() on your Zod schema [4].
   - To preserve unknown keys: Use .passthrough() on your Zod schema [4].
3. Handler Arguments: When an inputSchema is provided, the handler is invoked with (args, ctx) [5]. The args object contains the validated and processed parameters [5].
4. Compatibility: While v2 natively requires a full Standard Schema object (e.g., z.object({...})), recent updates (as of April 2026) added compatibility to automatically wrap "raw" Zod shapes (e.g., { key: z.string() }) into a z.object for convenience [6][1].

Important Note on Schema Types: Passing a plain JSON Schema object (instead of a Standard Schema compatible object) can cause the SDK to misinterpret the schema or silently drop it [7]. If you need to use a raw JSON Schema, you must wrap it using a tool like fromJsonSchema (available in @modelcontextprotocol/core) with an appropriate validator [8].


🌐 Web query:

site:github.com/modelcontextprotocol/typescript-sdk registerTool inputSchema zod.parse handler

💡 Result:

In the Model Context Protocol (MCP) TypeScript SDK, the registerTool method allows you to define a tool with a schema for its arguments (inputSchema) and a handler function to execute the tool's logic [1][2].

Standard Pattern: The recommended, native way to define the inputSchema is by passing a Standard Schema object, such as a Zod schema created with z.object({...}) [3][4][5]. When defined this way, the SDK automatically handles validation, and the handler receives the validated arguments as an object [3][6]. Example:

    server.registerTool(
      'calculate-bmi',
      {
        title: 'BMI Calculator',
        description: 'Calculate Body Mass Index',
        inputSchema: z.object({ weightKg: z.number(), heightM: z.number() }),
      },
      async ({ weightKg, heightM }) => {
        // Handler logic
        const bmi = weightKg / (heightM * heightM);
        return {
          content: [{ type: 'text', text: `BMI is ${bmi}` }],
        };
      }
    );

Compatibility for Raw Shapes: For backward compatibility, the SDK also accepts raw Zod shapes (e.g., { weightKg: z.number(), heightM: z.number() }) [4][6]. If you pass a raw object shape, the SDK internally auto-wraps it with z.object() at runtime [4][6]. This usage is deprecated and intended to be removed in future major versions [6].

Key Notes:
- Validation: The SDK uses the provided schema to validate the tool input before it reaches your handler [7].
- JSON Schema: The SDK uses these schemas to inform clients about the tool's expected input via the tools/list protocol message [7][5].
- Union Types: Support for complex schemas like z.union or z.discriminatedUnion is implemented to ensure they are correctly exposed to clients and validated [7].
- Library Agnostic: With the adoption of Standard Schema, the SDK is no longer strictly coupled to Zod; other libraries that support the Standard Schema interface can also be used [5][8].


🌐 Web query:

site:github.com/modelcontextprotocol/typescript-sdk registerTool inputSchema zod.parse handler args unknown keys

💡 Result:

When working with Zod schemas in the Model Context Protocol (MCP) TypeScript SDK, the handling of unknown keys depends on the Zod configuration and the SDK version [1][2]. Default Zod object validation behavior is to strip unknown keys [1]. If you want your tool handlers or schema validation to behave differently, you can use the following Zod methods:

1. .strip(): This is the default Zod behavior. Any keys present in the input that are not defined in your schema are removed during validation [1].
2. .passthrough(): Use this method if you want to preserve unknown keys in the validated output [1][3].
3. .strict(): Use this method if you want validation to fail when unknown keys are present [1].

In the context of the MCP SDK, there has been a transition toward using default (stripping) behavior to improve type safety and consistency, moving away from older SDK versions that heavily utilized .passthrough() [2]. If you are encountering issues where arguments are missing or being stripped unexpectedly:

- Ensure you are passing the schema object correctly to the tool registration method [4].
- If you need to allow additional parameters that are not explicitly defined in your schema, chain .passthrough() to your Zod object definition [1][3].
- When working with Zod v4 compatibility in newer SDK versions, be aware of how internal parsing methods are handled, as the SDK has undergone changes to maintain cross-version compatibility [5][6].

For general Zod parsing (outside of the direct tool handler flow), you can use Zod's .parse or .safeParse methods, which respect the .strip/.passthrough/.strict configuration defined on the schema [7][8].


Mirror max_payload_chars and debug in get_memory_context. lib/services/pandora-mcp-server.ts:12 omits those fields from the tool registration schema, so MCP validation can drop or reject them before getMemoryContextTool sees them.
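
The effect described here can be illustrated without the SDK. The sketch below is a hand-rolled stand-in for "strip unknown keys" validation, not Zod or MCP SDK code; `stripUnknownKeys` and `staleRegistrationFields` are hypothetical names, and the field list is copied from memoryContextInputSchema in this PR.

```typescript
// Hand-rolled stand-in for validation that strips keys a schema does not declare.
function stripUnknownKeys(
  args: Record<string, unknown>,
  allowedKeys: readonly string[],
): Record<string, unknown> {
  const kept: Record<string, unknown> = {};
  for (const key of allowedKeys) {
    if (Object.prototype.hasOwnProperty.call(args, key)) kept[key] = args[key];
  }
  return kept;
}

// Fields declared by memoryContextInputSchema after this PR.
const toolFields = [
  "namespace", "query", "current_task", "max_items", "include_risks",
  "include_people", "include_projects", "max_payload_chars", "debug",
];

// Hypothetical registration schema that was not updated alongside the tool schema.
const staleRegistrationFields = toolFields.filter(
  (key) => key !== "max_payload_chars" && key !== "debug",
);

const incoming = { namespace: "real_life", max_payload_chars: 4000, debug: true };

// With the stale list, the two new options never reach the handler.
const seenByHandler = stripUnknownKeys(incoming, staleRegistrationFields);
```

If the registration layer behaves this way, a caller's max_payload_chars and debug would be dropped silently and the defaults would apply, which is why the review asks for the two schemas to match.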

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/services/pandora-mcp-tools.ts` at line 19, The `get_memory_context` tool
registration schema in `pandora-mcp-server` is missing `max_payload_chars` and
`debug`, even though `memoryContextInputSchema` already defines them. Update the
schema used when registering `getMemoryContextTool` so it mirrors
`memoryContextInputSchema` exactly and accepts these two optional fields,
ensuring they reach `getMemoryContextTool` without MCP validation stripping or
rejecting them.

@besfeng23 besfeng23 merged commit 7c35f37 into main Jul 3, 2026
5 checks passed