
Conversation


@devin-ai-integration bot commented Nov 2, 2025

Add 800 token output cap to Ask AI to reduce AWS Bedrock costs

This PR adds a `maxTokens: 800` parameter to the `streamText()` call in the Anthropic streaming handler to cap output token usage and reduce AWS Bedrock costs.

Context & Motivation

The current Ask AI implementation has no `maxTokens` parameter, while the Slack/Discord bots cap output at 1,000–2,000 tokens. This change adds an 800-token cap to prevent runaway costs from long responses.

Changes Made

  • Added `maxTokens: 800` to the `streamText()` call in `packages/fern-docs/search-server/ask-fern/src/ask-fern/stream-anthropic.ts:162`
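
As a sketch of what the change amounts to (the real call lives in `stream-anthropic.ts` and uses the AI SDK; here `streamText` is a local stub and the model id and messages are placeholders, not the actual implementation):

```typescript
// Hypothetical sketch of the change. `streamText` here is a stub standing in
// for the AI SDK function so the example is self-contained; only the
// `maxTokens: 800` line reflects the actual edit in this PR.
interface StreamTextOptions {
  model: string;
  messages: { role: string; content: string }[];
  maxTokens?: number; // output cap added by this PR
}

function streamText(options: StreamTextOptions): StreamTextOptions {
  // Stub: the real AI SDK function streams a completion; this just echoes
  // the options so the cap can be inspected.
  return options;
}

const result = streamText({
  model: "anthropic.claude", // placeholder model id
  messages: [{ role: "user", content: "How do I get started?" }],
  maxTokens: 800, // cap output tokens to reduce Bedrock spend
});

console.log(result.maxTokens); // 800
```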

Testing

⚠️ This change has NOT been tested yet. Please verify:

  1. The parameter name maxTokens is correct for AI SDK v5.0.0-beta.2 (not maxOutputTokens)
  2. The 800 token cap is appropriate for your use case
  3. Consider a phased rollout starting with high-cost domains (ElevenLabs)
  4. Monitor answer quality metrics after deployment

Human Review Checklist

  • Verify maxTokens is the correct parameter name for AI SDK version 5.0.0-beta.2
  • Confirm 800 tokens is an appropriate cap (not too restrictive for quality)
  • Consider if this should be rolled out to all domains at once or phased
  • Plan to monitor both cost reduction AND answer quality/satisfaction metrics post-deployment
  • Verify lint checks passed

Devin session: https://app.devin.ai/sessions/d3c90a389a754e37932c1850826ece6b
Requested by: [email protected] (@sahil485)

Add maxTokens: 800 parameter to streamText() call in stream-anthropic.ts
to cap output token usage and reduce costs. This prevents unbounded response
lengths which were contributing to high AWS Bedrock bills.

Co-Authored-By: [email protected] <[email protected]>

vercel bot commented Nov 2, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| dev.ferndocs.com | Ready | Preview | Nov 2, 2025 3:08am |
| fern-dashboard | Ready | Preview | Nov 2, 2025 3:08am |
| fern-dashboard-dev | Ready | Preview | Nov 2, 2025 3:08am |
| ferndocs.com | Ready | Preview | Nov 2, 2025 3:08am |
| preview.ferndocs.com | Ready | Preview | Nov 2, 2025 3:08am |
| prod-assets.ferndocs.com | Ready | Preview | Nov 2, 2025 3:08am |
| prod.ferndocs.com | Ready | Preview | Nov 2, 2025 3:08am |

1 Skipped Deployment

| Project | Deployment | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| fern-platform | Ignored | Ignored | Nov 2, 2025 3:08am |

@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

Fix TypeScript error by changing `maxTokens` to `maxOutputTokens`, which is
the correct parameter name for AI SDK v5.0.0-beta.2. Also apply the same
cap to `stream-cohere.ts` for consistency and add a shared `MAX_OUTPUT_TOKENS`
constant to `stream-constants.ts`.
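
The shared constant described in this commit might look like the sketch below. The file layout follows the commit message, but the helper function and its name are assumptions added for illustration; only the constant name and the v5 option name `maxOutputTokens` come from the commit itself:

```typescript
// Sketch based on the commit message above; anything beyond the constant
// name and the maxOutputTokens option name is a hypothetical illustration.

// stream-constants.ts: one shared output cap for all streaming handlers.
export const MAX_OUTPUT_TOKENS = 800;

// In stream-anthropic.ts and stream-cohere.ts, the AI SDK v5 call would then
// pass the constant under the v5 option name (maxOutputTokens, not maxTokens).
// withOutputCap is a hypothetical helper showing how both handlers could
// share the cap:
export function withOutputCap<T extends object>(options: T) {
  return { ...options, maxOutputTokens: MAX_OUTPUT_TOKENS };
}

console.log(withOutputCap({ temperature: 0 }));
// { temperature: 0, maxOutputTokens: 800 }
```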

Co-Authored-By: [email protected] <[email protected]>