feat(core): Add byte size limit and oldest first truncation for gen_ai messages #17863
base: develop
Thank you for contributing 🎉! Let's address some comments before merging this.
// For models.generateContent: ContentListUnion: Content | Content[] | PartUnion | PartUnion[]
if ('contents' in params) {
  span.setAttributes({ [GEN_AI_REQUEST_MESSAGES_ATTRIBUTE]: JSON.stringify(params.contents) });
  const contents = params.contents;
Can you revert the comment removal to help others understand the request structure? This could also be a string.
  return new TextEncoder().encode(str).length;
}

export function truncateMessagesByBytes(messages: unknown[], maxBytes: number): unknown[] {
Can you add JSDocs for this file?
 * This is only recorded if recordInputs is true.
 * Handles different parameter formats for different Google GenAI methods.
 */
function addPrivateRequestAttributes(span: Span, params: Record<string, unknown>): void {
Can you also revert this JSDoc comment?
}

// Extract and record AI request inputs, if present. This is intentionally separate from response attributes.
function addRequestAttributes(span: Span, params: Record<string, unknown>): void {
Can you also revert this JSDoc comment?
6d97149 to 50d1d6d
@RulaKhaled
Thanks for resolving the comments, it already looks better! I've pushed a quick commit with the truncation functionality and some added JSDocs here: 4d047e5 (this is a different branch). Could you give this a shot? :) Also, can you take care of the tests and the quick comments left?
} else {
  i++;
  bytes += 4;
}
It's OK to keep the previous getByteSize, where you use TextEncoder directly. The binary search is meant for:
- truncating string messages
- truncating arrays of messages
e.g.:
export function truncateMessagesByBytes(messages: unknown[], maxBytes: number): unknown[] {
  if (!Array.isArray(messages) || messages.length === 0) {
    return messages;
  }
  const fullSize = getByteSize(JSON.stringify(messages));
  if (fullSize <= maxBytes) {
    return messages;
  }
  // Binary search for the minimum startIndex where remaining messages fit (works for single or multiple messages)
  let left = 0;
  let right = messages.length - 1;
  let bestStartIndex = messages.length;
  while (left <= right) {
    const mid = Math.floor((left + right) / 2);
    const remainingMessages = messages.slice(mid);
    const remainingSize = getByteSize(JSON.stringify(remainingMessages));
    if (remainingSize <= maxBytes) {
      bestStartIndex = mid;
      right = mid - 1; // Try to keep more messages
    } else {
      // If we're down to a single message and it doesn't fit, break and handle content truncation
      if (remainingMessages.length === 1) {
        bestStartIndex = mid; // Use this single message
        break;
      }
      left = mid + 1; // Need to remove more messages
    }
  }
  const remainingMessages = messages.slice(bestStartIndex);
  // SPECIAL CASE: Single message handling (either started with 1, or reduced to 1 after binary search)
  if (remainingMessages.length === 1) {
    const singleMessage = remainingMessages[0];
    const singleMessageSize = getByteSize(JSON.stringify(singleMessage));
    // If single message fits, return it
    if (singleMessageSize <= maxBytes) {
      return remainingMessages;
    }
    // Single message is too large, try to truncate its content
    if (
      typeof singleMessage === 'object' &&
      singleMessage !== null &&
      'content' in singleMessage &&
      typeof (singleMessage as { content: unknown }).content === 'string'
    ) {
      const originalContent = (singleMessage as { content: string }).content;
      const messageWithoutContent = { ...singleMessage, content: '' };
      const otherMessagePartsSize = getByteSize(JSON.stringify(messageWithoutContent));
      const availableContentBytes = maxBytes - otherMessagePartsSize;
      if (availableContentBytes <= 0) {
        return [];
      }
      const truncatedContent = truncateStringByBytes(originalContent, availableContentBytes);
      return [{ ...singleMessage, content: truncatedContent }];
    } else {
      return [];
    }
  }
  // Multiple messages remain and fit within limit
  return remainingMessages;
}
where truncateStringByBytes also does a quick binary search.
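For reference, here is a minimal sketch of what such a truncateStringByBytes helper could look like. This is an assumption for illustration, not the code from the commit mentioned above: the surrogate-pair handling and the local getByteSize helper are hypothetical choices, and the real implementation may differ.

```typescript
// Hypothetical sketch: not the PR's actual implementation.
const encoder = new TextEncoder();

// UTF-8 byte length of a string (assumed to mirror the PR's getByteSize).
function getByteSize(str: string): number {
  return encoder.encode(str).length;
}

// Returns the longest prefix of `str` whose UTF-8 encoding fits in `maxBytes`.
function truncateStringByBytes(str: string, maxBytes: number): string {
  if (maxBytes <= 0) {
    return '';
  }
  if (getByteSize(str) <= maxBytes) {
    return str;
  }
  // Binary search over the prefix length in UTF-16 code units.
  // Invariant: the prefix of length `low` always fits.
  let low = 0;
  let high = str.length;
  while (low < high) {
    const mid = Math.ceil((low + high) / 2);
    if (getByteSize(str.slice(0, mid)) <= maxBytes) {
      low = mid; // This prefix fits, try a longer one
    } else {
      high = mid - 1; // Too large, try a shorter one
    }
  }
  // Avoid cutting a surrogate pair in half: drop a trailing high surrogate.
  if (low > 0) {
    const last = str.charCodeAt(low - 1);
    if (last >= 0xd800 && last <= 0xdbff) {
      low -= 1;
    }
  }
  return str.slice(0, low);
}
```

The search works because the encoded size of a prefix never decreases as the prefix grows, so "fits within maxBytes" is a monotonic predicate over the prefix length.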
3ab7059 to c91de63
Description:
fixes: #17809
Adds a byte size limit to gen_ai.request.messages and vercel.ai.prompt to prevent oversized payloads from exceeding ingestion limits.
Changes:
- messageTruncation.ts utility with byte size calculation and oldest-first truncation