As of #1994, the default for Anthropic `max_tokens` is 4096. I had previously made it default to the models' actual max-token limits in #1979, but that tripped the SDK's long-request check, which raises "a ValueError if a non-streaming request is expected to be above roughly 10 minutes long": https://github.com/anthropics/anthropic-sdk-python#long-requests
I don't like this, though, as it's pretty surprising to users. I'd rather have it intelligently use the highest value that fits under the ValueError limit for non-streaming requests, and the model's actual token limit for streaming requests.
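For illustration, here's a minimal sketch of what that default could look like. It assumes the anthropic SDK's documented heuristic: a non-streaming request is rejected when its expected duration exceeds roughly 10 minutes, with duration scaling at roughly 128k tokens per hour. The throughput constant and `default_max_tokens` helper are assumptions for the sketch, not existing APIs:

```python
SECONDS_PER_HOUR = 60 * 60
NONSTREAMING_LIMIT_SECONDS = 60 * 10  # the SDK's ~10 minute cutoff
TOKENS_PER_HOUR = 128_000  # assumed throughput behind the SDK's heuristic

# Highest max_tokens whose expected duration stays under the cutoff.
NONSTREAMING_MAX_TOKENS = (
    NONSTREAMING_LIMIT_SECONDS * TOKENS_PER_HOUR // SECONDS_PER_HOUR
)  # 21_333


def default_max_tokens(model_max_output_tokens: int, *, streaming: bool) -> int:
    """Pick the largest max_tokens that won't trip the SDK's long-request check.

    Streaming requests can use the model's full output limit; non-streaming
    requests are capped so the ValueError is never hit.
    """
    if streaming:
        return model_max_output_tokens
    return min(model_max_output_tokens, NONSTREAMING_MAX_TOKENS)
```

So e.g. for a model with a 64k output limit, a streaming request would get 64000 and a non-streaming request would get 21333, staying under the 10-minute cutoff.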
Note that we'd need something like #2067 to avoid hitting this error on Vertex AI:
{"type":"error","error":{"type":"invalid_request_error","message":"input length and `max_tokens` exceed context limit: 189127 + 16000 > 200000, decrease
input length or `max_tokens` and try again"}
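Once we can count input tokens (which is what something like #2067 would give us), handling the Vertex AI case could be as simple as clamping `max_tokens` to the remaining context window. A sketch, where `clamp_to_context` is a hypothetical helper and `input_tokens` is assumed to come from a token-counting call:

```python
def clamp_to_context(max_tokens: int, input_tokens: int, context_limit: int) -> int:
    # Keep input + output inside the model's context window so the request
    # never exceeds the invalid_request_error limit shown above.
    return min(max_tokens, max(context_limit - input_tokens, 1))


# With the numbers from the error above:
# clamp_to_context(16000, 189127, 200000) -> 10873, which fits the 200k window.
```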