Commit 1cb90cc

Fix calibration using DB message count instead of compressed window count
After a compressed turn (layers 1-4), calibrate() was called with withParts.length — the total number of messages in the DB — instead of the number of messages actually sent to the model (the compressed window). On the next turn, newMsgCount = dbCount - dbCount ≈ 0, so expectedInput ≈ lastKnownInput (the compressed prompt size, e.g. 114K). Since 114K < maxInput (168K), layer 0 fires and sends all messages uncompressed → overflow (405K on a 200K-limit model).

Fix: transform() now sets lastTransformedCount = result.messages.length via a thin public wrapper around the renamed transformInner(). The event handler uses getLastTransformedCount() for calibration instead of withParts.length. On layer 0 these are equal; on layers 1-4, the delta on the next turn is now computed relative to the compressed window.
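The failure mode can be sketched with a toy version of the next-turn input estimate. All names and numbers below are hypothetical stand-ins; the real calibrate() in src/gradient.ts keeps more state than this.

```typescript
// Toy model of the expected-input estimate (hypothetical names/numbers;
// the real calibrate() in src/gradient.ts keeps more state).
function expectedInput(
  lastKnownInput: number, // actual input tokens of the previous turn
  lastCount: number,      // message count recorded by the previous calibrate()
  currentCount: number,   // messages now in the DB
  avgTokensPerMsg: number,
): number {
  const newMsgCount = currentCount - lastCount;
  return lastKnownInput + newMsgCount * avgTokensPerMsg;
}

const maxInput = 168_000;

// Bug: after a compressed turn, calibrate() recorded the full DB count
// (400), so the next turn sees only 2 "new" messages and the estimate
// stays near the compressed prompt size -> layer 0 fires and overflows.
const buggy = expectedInput(114_000, 400, 402, 800); // 115_600 < maxInput

// Fix: calibrate() records the compressed window count (40), so the next
// turn sees 362 messages beyond that window -> layer 0 is skipped.
const fixed = expectedInput(114_000, 40, 402, 800); // 403_600 > maxInput
```

With the bug, the estimate stays under the limit and layer 0 passes everything through; with the fix, the same session correctly trips compression on the next turn.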
1 parent 6af2b88 commit 1cb90cc

3 files changed (+39, −3)


package.json

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,6 +1,6 @@
 {
   "name": "opencode-lore",
-  "version": "0.2.2",
+  "version": "0.2.3",
   "type": "module",
   "license": "MIT",
   "description": "Three-tier memory architecture for OpenCode — distillation, not summarization",
```

src/gradient.ts

Lines changed: 32 additions & 1 deletion
```diff
@@ -51,6 +51,18 @@ let lastKnownLtm = 0;
 let lastKnownSessionID: string | null = null;
 let lastKnownMessageCount = 0;
 
+// Number of messages in the most recent transform() output — i.e. how many
+// messages were actually sent to the model. On layer 0 this equals the full
+// session length. On layers 1-4 it equals the compressed window size.
+// Calibration must use this count (not the total DB message count) so that
+// the delta on the next turn reflects only messages added since the last
+// compressed window, not since the last DB snapshot.
+let lastTransformedCount = 0;
+
+export function getLastTransformedCount(): number {
+  return lastTransformedCount;
+}
+
 // --- Force escalation ---
 // Set when the API returns "prompt is too long" — forces the transform to skip
 // layer 0 (and optionally layer 1) on the next call to ensure the context is
@@ -139,6 +151,7 @@ export function resetCalibration() {
   lastKnownLtm = 0;
   lastKnownSessionID = null;
   lastKnownMessageCount = 0;
+  lastTransformedCount = 0;
   forceMinLayer = 0;
 }
 
@@ -691,7 +704,7 @@ export function needsUrgentDistillation(): boolean {
   return v;
 }
 
-export function transform(input: {
+function transformInner(input: {
   messages: MessageWithParts[];
   projectPath: string;
   sessionID?: string;
@@ -890,6 +903,24 @@ export function transform(input: {
   };
 }
 
+// Public wrapper: records the compressed message count for calibration.
+// Calibration needs to know how many messages were SENT to the model (the
+// compressed window), not the total DB count. On layer 0 these are equal;
+// on layers 1-4 the compressed window is smaller, and the delta on the next
+// turn must be computed relative to the compressed count — otherwise the
+// expected input on the next turn is anchored to the compressed input token
+// count but the "new messages" delta is computed against the full DB count,
+// making newMsgCount ≈ 0 and causing layer 0 passthrough on an overflowing session.
+export function transform(input: {
+  messages: MessageWithParts[];
+  projectPath: string;
+  sessionID?: string;
+}): TransformResult {
+  const result = transformInner(input);
+  lastTransformedCount = result.messages.length;
+  return result;
+}
+
 // Compute our message-only estimate for a set of messages (for calibration use)
 export function estimateMessages(messages: MessageWithParts[]): number {
   return messages.reduce((sum, m) => sum + estimateMessage(m), 0);
```
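The public-wrapper pattern used here can be sketched standalone. This is a minimal sketch with simplified types: Message and Result stand in for the real MessageWithParts and TransformResult, and the "compression" is a toy slice that keeps only the last two messages.

```typescript
// Minimal sketch of the public-wrapper pattern (simplified types; Message
// and Result stand in for the real MessageWithParts and TransformResult).
type Message = { text: string };
type Result = { messages: Message[] };

let lastTransformedCount = 0;

export function getLastTransformedCount(): number {
  return lastTransformedCount;
}

// Private implementation: a toy "compression" that keeps only the last
// two messages, standing in for layers 1-4.
function transformInner(input: { messages: Message[] }): Result {
  return { messages: input.messages.slice(-2) };
}

// Public wrapper: the single entry point, so the output size is recorded
// before the result reaches any caller.
export function transform(input: { messages: Message[] }): Result {
  const result = transformInner(input);
  lastTransformedCount = result.messages.length;
  return result;
}
```

After transform() runs on five messages, getLastTransformedCount() reports 2 (the compressed window), not 5 (the full input). Because the recording lives in the wrapper rather than in each caller, no call site can forget to update the count.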

src/index.ts

Lines changed: 6 additions & 1 deletion
```diff
@@ -14,6 +14,7 @@ import {
   setLtmTokens,
   getLtmBudget,
   setForceMinLayer,
+  getLastTransformedCount,
 } from "./gradient";
 import { formatKnowledge } from "./prompt";
 import { createRecallTool } from "./reflect";
@@ -219,7 +220,11 @@ export const LorePlugin: Plugin = async (ctx) => {
       const msgEstimate = estimateMessages(withParts);
       const actualInput =
         msg.tokens.input + msg.tokens.cache.read + msg.tokens.cache.write;
-      calibrate(actualInput, msgEstimate, msg.sessionID, withParts.length);
+      // Use the compressed message count (from the last transform output),
+      // not the total DB count. On layer 0 these are equal. On layers 1-4,
+      // the model only saw the compressed window — calibrate must track that
+      // count so the next turn's delta is computed correctly.
+      calibrate(actualInput, msgEstimate, msg.sessionID, getLastTransformedCount() || withParts.length);
     }
   }
 }
```
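One subtlety in the calibrate() call: getLastTransformedCount() returns 0 before any transform has run (and after a calibration reset), and in TypeScript `0 || n` evaluates to n, so calibration falls back to the full DB count in that case. A sketch of just that fallback, using a hypothetical helper (the real call inlines the expression):

```typescript
// Fallback used in the calibrate() call: 0 is falsy, so before any
// transform has run (recorded count 0) the full DB count is used instead.
// countForCalibration is a hypothetical helper for illustration.
function countForCalibration(lastTransformed: number, dbCount: number): number {
  return lastTransformed || dbCount;
}
```

So countForCalibration(0, 57) falls back to 57, while countForCalibration(40, 402) keeps the recorded compressed-window count of 40.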

0 commit comments
