Select one of the prompts below from the dropdown in the Meaning Drift Sandbox, click Simulate Interpretation, and compare:
- Current System Interpretation (left)
- Stabilized Contract Model (right)
Prompt 1:
Summarize this document in exactly 5 bullet points.
Keep my original section headings.
Do NOT change the tone at all.
Respond in a Markdown table with columns for Section, Key Insight, and Risk.
What to watch for:
- Left: the table, tone, and heading constraints are often dropped
- Right: all constraints preserved verbatim
Prompt 2:
Analyze a fictional conflict:
“Alice accused Bob of lying about the report due date, but Bob genuinely misunderstood the instructions.”
Focus only on interpersonal patterns.
Do NOT provide legal or psychological advice.
Return 3 neutral bullets.
What to watch for:
- Left: safety is over-applied and the response drifts into generic advice
- Right: all constraints preserved
Prompt 3:
Rewrite the policy document for clarity.
Do NOT shorten it.
Do NOT remove obligations.
Keep numbered sections the same.
Add a 4-bullet summary using original terminology.
What to watch for:
- Left: memory compression drops constraints
- Right: constraints retained
Prompt 4:
Generate 3 brand taglines.
Tone: soft, minimal, elegant.
Format: “Tagline — short rationale”.
No emojis or exclamation marks.
Under 20 words total.
What to watch for:
- Left: tone preserved, but the length and punctuation limits are lost
- Right: limits and tone both preserved
Prompt 5:
Provide a 4-step troubleshooting flow for:
“My team keeps misunderstanding each other’s Slack messages.”
Tone neutral.
No psychology.
Numbered list.
Add a meta-summary starting with: “Interpretation drift occurs when…”
What to watch for:
- Left: formatting and constraints dropped
- Right: all constraints retained
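These checks can also be made mechanical. Below is a small, hypothetical sketch of how the "what to watch for" notes could be turned into verbatim-constraint checks over an output string. It is not part of the sandbox code; `ConstraintCheck`, the example predicates, and the regexes are invented purely for illustration.

```typescript
// Hypothetical sketch of turning "what to watch for" into mechanical checks.
// Nothing here comes from the sandbox itself; names and predicates are
// illustrative only.

interface ConstraintCheck {
  name: string;
  holds: (output: string) => boolean;
}

// Example checks for Prompt 1.
const prompt1Checks: ConstraintCheck[] = [
  {
    name: "exactly 5 bullet points",
    holds: (o) => (o.match(/^[-*] /gm) ?? []).length === 5,
  },
  {
    name: "table with Section / Key Insight / Risk columns",
    holds: (o) => /\|\s*Section\s*\|\s*Key Insight\s*\|\s*Risk\s*\|/.test(o),
  },
];

// Report which declared constraints survived in the model output.
function reportDrift(output: string, checks: ConstraintCheck[]): void {
  for (const check of checks) {
    console.log(`${check.holds(output) ? "kept" : "dropped"}: ${check.name}`);
  }
}
```

The point is not the specific regexes; it is that every constraint dropped in the left-hand output is detectable against the user's literal wording.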
Meaning drift occurs when the model responds to a transformed version of the user's input rather than the input itself.
Common symptoms:
- safe prompts treated as risky
- narrowed/reframed intent
- dropped formatting or constraints
- tone shifts
- hallucinated permissions or rules
Drift enters at each stage of the current pipeline:
- Risk scoring alters the inferred intent.
- Global rules override local user context.
- Summaries drop constraints, tone, or logic.
- The model then reasons on the transformed input.
[User Instruction] → [Safety Layer] → [Policy Interpretation] → [Memory Compression] → [Model Reasoning] → [Output: drift introduced]
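To make the failure mode concrete, here is a toy simulation of that pipeline. The stage behaviors are assumptions invented for the sketch, not a description of any real system; the only claim is structural: each layer rewrites the text it forwards, so the model reasons over whatever survives.

```typescript
// Toy simulation of the current pipeline. Each stage rewrites the text it
// forwards; the stage behaviors are invented purely to illustrate drift.

type Stage = (instruction: string) => string;

const safetyLayer: Stage = (text) =>
  text.includes("conflict") ? text + "\n[annotated: interpersonal risk]" : text;

const policyInterpretation: Stage = (text) =>
  text.replace(/Do NOT provide legal or psychological advice\.?/i, "Offer general supportive guidance.");

const memoryCompression: Stage = (text) =>
  text.split("\n").slice(0, 2).join("\n"); // trailing constraints fall off the summary

const modelReasoning: Stage = (text) => text; // the model sees only what survives

const currentPipeline = (instruction: string): string =>
  [safetyLayer, policyInterpretation, memoryCompression, modelReasoning]
    .reduce((current, stage) => stage(current), instruction);

// By the final stage, a trailing constraint such as "Return 3 neutral bullets."
// was never seen by the model at all.
```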
Meaning drift cannot be fixed with prompting alone.
It requires product and architectural changes.
The stabilized model attaches an explicit intent contract to each request. Use this shape:

```json
{
  "intent_type": "analysis",
  "risk_tolerance": "literal",
  "style": "direct"
}
```
- User-declared constraints persist verbatim unless the user changes them.
- Safety blocks unsafe tasks but must not reinterpret safe instructions.
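In code terms, the contract is a small typed object carried next to the instruction rather than inferred from it. A sketch follows; only the three field names come from the JSON above, and the extra union members and comments are assumptions.

```typescript
// Intent contract carried with the request. Field names mirror the JSON shape
// above; the additional union members are illustrative placeholders.
interface IntentContract {
  intent_type: "analysis" | "rewrite" | "generation";
  risk_tolerance: "literal" | "conservative";
  style: "direct" | "soft";
}

interface ContractedRequest {
  instruction: string;      // preserved verbatim
  contract: IntentContract; // declared by the user, not inferred
  constraints: string[];    // user-declared constraints, persisted verbatim
}
```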
[User Instruction] + [Intent Contract] → [Safety Check: block/allow only] → [Memory Invariants: constraints preserved] → [Model Reasoning: stable interpretation]
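A matching sketch of the stabilized flow, reusing the hypothetical `ContractedRequest` type above: the safety check returns only a block/allow decision, and the request object, constraints included, reaches the model untouched. The unsafe-task test here is a placeholder, not a real policy.

```typescript
// Stabilized flow: safety decides block/allow and nothing rewrites the
// instruction or constraints. A sketch, not a production design.

type SafetyDecision = { allowed: boolean; reason?: string };

const safetyCheck = (request: ContractedRequest): SafetyDecision =>
  /weapon|malware/i.test(request.instruction)
    ? { allowed: false, reason: "unsafe task" }
    : { allowed: true };

function stabilizedPipeline(
  request: ContractedRequest,
  model: (r: ContractedRequest) => string
): string {
  const decision = safetyCheck(request);
  if (!decision.allowed) {
    return `Blocked: ${decision.reason}`;
  }
  // Memory invariant: the request object passes through unmodified, so the
  // model reasons over the original instruction and verbatim constraints.
  return model(request);
}
```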
Unstable interpretation:
- breaks UX predictability
- causes inconsistent long-form reasoning
- increases false refusals
- destabilizes trust
- complicates integration across tools
- undermines professional workflows
Stable interpretation is a prerequisite for aligned AI systems.
This is an independent analysis based solely on the publicly observable behavior of modern AI systems.
It does not claim insight into any system's internals.
This repo aims to:
- map visible drift patterns
- propose stabilizing architectural shapes
- demonstrate PM-level reasoning through prototypes
- make interpretation drift legible
Main hub:
https://github.com/rtfenter/Product-Architecture-Case-Studies
Rebecca Fenter (rtfenter)
Product Manager — systems, platforms, AI architecture
https://github.com/rtfenter