@naaa760 naaa760 commented Oct 5, 2025

Description:

fixes: #17809

  • Implements message truncation for gen_ai.request.messages and vercel.ai.prompt to prevent oversized payloads from exceeding ingestion limits.

Changes:

  • Add messageTruncation.ts utility with byte size calculation and oldest-first truncation
  • Apply truncation to all AI integrations (OpenAI, Google GenAI, Anthropic, Vercel AI)
  • Set 100KB default limit for gen_ai messages
  • Preserve recent message context by removing oldest messages first
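The oldest-first behavior listed above can be illustrated with a minimal sketch. It is not the code from messageTruncation.ts: the names `jsonByteSize` and `dropOldestUntilFits` are hypothetical, the limit is passed in as a parameter rather than hard-coded, and the sketch simply returns an empty array when even the newest message is too large.

```typescript
// Hypothetical helper: UTF-8 byte length of a value once serialized to JSON.
function jsonByteSize(value: unknown[]): number {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}

// Hypothetical helper: drop messages from the front (oldest) until the
// serialized remainder fits within maxBytes. Linear scan for clarity.
function dropOldestUntilFits<T>(messages: T[], maxBytes: number): T[] {
  let start = 0;
  while (start < messages.length && jsonByteSize(messages.slice(start)) > maxBytes) {
    start += 1;
  }
  // Empty when not even the newest message fits on its own.
  return messages.slice(start);
}

const exampleMessages = [
  { role: 'user', content: 'aaaa' },
  { role: 'assistant', content: 'bbbb' },
  { role: 'user', content: 'cc' },
];

// Keeps only the most recent messages that fit in 80 bytes.
const kept = dropOldestUntilFits(exampleMessages, 80);
console.log(kept.map(m => m.role));
```

Each step re-serializes the remaining slice, so this costs one JSON.stringify per dropped message; it trades speed for readability.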

cursor[bot]

This comment was marked as outdated.

@RulaKhaled RulaKhaled self-requested a review October 6, 2025 08:38
@RulaKhaled RulaKhaled changed the title from "add byte size limit and oldest first truncation for gen_ai messages" to "feat(core): Add byte size limit and oldest first truncation for gen_ai messages" Oct 6, 2025
Member

@RulaKhaled RulaKhaled left a comment

Thank you for contributing 🎉! Let's address some comments before merging this.

// For models.generateContent: ContentListUnion: Content | Content[] | PartUnion | PartUnion[]
if ('contents' in params) {
span.setAttributes({ [GEN_AI_REQUEST_MESSAGES_ATTRIBUTE]: JSON.stringify(params.contents) });
const contents = params.contents;
Member

Can you revert the comment removal to help others understand the request structure? This could also be a string.

return new TextEncoder().encode(str).length;
}

export function truncateMessagesByBytes(messages: unknown[], maxBytes: number): unknown[] {
Member

Can you add JSDoc comments to this file?

* This is only recorded if recordInputs is true.
* Handles different parameter formats for different Google GenAI methods.
*/
function addPrivateRequestAttributes(span: Span, params: Record<string, unknown>): void {
Member

Can you also restore this JSDoc comment?

}

// Extract and record AI request inputs, if present. This is intentionally separate from response attributes.
function addRequestAttributes(span: Span, params: Record<string, unknown>): void {
Member

Can you also restore this JSDoc comment?

cursor[bot]

This comment was marked as outdated.

@naaa760
Author

naaa760 commented Oct 7, 2025

@RulaKhaled I've addressed the comments. Let me know!

cursor[bot]

This comment was marked as outdated.

cursor[bot]

This comment was marked as outdated.

Member

@RulaKhaled RulaKhaled left a comment

Thanks for resolving the comments, it already looks better! I've pushed a quick commit with the truncation functionality and some JSDocs in 4d047e5 (on a different branch). Could you give it a shot? :) Also, can you take care of the tests and the remaining quick comments?

} else {
i++;
bytes += 4;
}
Member

It's OK to keep the previous getByteSize, where you use TextEncoder directly. The binary search is meant for:

  1. truncating string messages
  2. truncating arrays of messages

For example:

export function truncateMessagesByBytes(messages: unknown[], maxBytes: number): unknown[] {
  if (!Array.isArray(messages) || messages.length === 0) {
    return messages;
  }

  const fullSize = getByteSize(JSON.stringify(messages));

  if (fullSize <= maxBytes) {
    return messages;
  }

  // Binary search for the minimum startIndex where remaining messages fit (works for single or multiple messages)
  let left = 0;
  let right = messages.length - 1;
  let bestStartIndex = messages.length;

  while (left <= right) {
    const mid = Math.floor((left + right) / 2);
    const remainingMessages = messages.slice(mid);
    const remainingSize = getByteSize(JSON.stringify(remainingMessages));

    if (remainingSize <= maxBytes) {
      bestStartIndex = mid;
      right = mid - 1; // Try to keep more messages
    } else {
      // If we're down to a single message and it doesn't fit, break and handle content truncation
      if (remainingMessages.length === 1) {
        bestStartIndex = mid; // Use this single message
        break;
      }
      left = mid + 1; // Need to remove more messages
    }
  }

  const remainingMessages = messages.slice(bestStartIndex);

  // SPECIAL CASE: Single message handling (either started with 1, or reduced to 1 after binary search)
  if (remainingMessages.length === 1) {
    const singleMessage = remainingMessages[0];
    const singleMessageSize = getByteSize(JSON.stringify(singleMessage));

    // If single message fits, return it
    if (singleMessageSize <= maxBytes) {
      return remainingMessages;
    }

    // Single message is too large, try to truncate its content
    if (
      typeof singleMessage === 'object' &&
      singleMessage !== null &&
      'content' in singleMessage &&
      typeof (singleMessage as { content: unknown }).content === 'string'
    ) {
      const originalContent = (singleMessage as { content: string }).content;
      const messageWithoutContent = { ...singleMessage, content: '' };
      const otherMessagePartsSize = getByteSize(JSON.stringify(messageWithoutContent));
      const availableContentBytes = maxBytes - otherMessagePartsSize;

      if (availableContentBytes <= 0) {
        return [];
      }

      const truncatedContent = truncateStringByBytes(originalContent, availableContentBytes);
      return [{ ...singleMessage, content: truncatedContent }];
    } else {
      return [];
    }
  }

  // Multiple messages remain and fit within limit
  return remainingMessages;
}

Here, truncateStringByBytes also does a quick binary search.
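The body of truncateStringByBytes is not shown in this thread. One way it could be written, as a rough sketch only, is a binary search over the prefix length. The surrogate-pair guard and the exact signature below are assumptions, not the code from commit 4d047e5.

```typescript
const encoder = new TextEncoder();

// UTF-8 byte length of a string, using TextEncoder directly.
function getByteSize(str: string): number {
  return encoder.encode(str).length;
}

// Sketch (assumed implementation): longest prefix of `str` whose UTF-8
// encoding fits within `maxBytes`.
function truncateStringByBytes(str: string, maxBytes: number): string {
  if (maxBytes <= 0) {
    return '';
  }
  if (getByteSize(str) <= maxBytes) {
    return str;
  }

  // Invariant: str.slice(0, low) always fits within maxBytes.
  let low = 0;
  let high = str.length;
  while (low < high) {
    const mid = Math.ceil((low + high) / 2);
    if (getByteSize(str.slice(0, mid)) <= maxBytes) {
      low = mid; // Prefix fits, try a longer one
    } else {
      high = mid - 1; // Prefix too large, shorten it
    }
  }

  // Do not end on a lone high surrogate (half of an emoji, for example).
  if (low > 0) {
    const lastCode = str.charCodeAt(low - 1);
    if (lastCode >= 0xd800 && lastCode <= 0xdbff) {
      low -= 1;
    }
  }

  return str.slice(0, low);
}
```

The search relies on the byte size of a prefix never shrinking as the prefix grows, which holds for UTF-8 as produced by TextEncoder. It needs roughly log2(length) encodings rather than one per character.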

Successfully merging this pull request may close these issues.

Implement Byte Size Limit and Oldest First Truncation for gen_ai.*.messages
2 participants