Anthropic max_tokens is 4096 by default #2553

@DouweM

Description

As of #1994, the default for Anthropic max_tokens is 4096. I had previously made it use the model's actual max-token limit by default in #1979, but that ran into the SDK's long-request check, which raises "a ValueError if a non-streaming request is expected to be above roughly 10 minutes long": https://github.com/anthropics/anthropic-sdk-python#long-requests

I don't like this, though, as it's pretty surprising to users. I'd rather intelligently use the highest number that fits under the ValueError limit for non-streaming requests, and the model's actual token limit for streaming requests.
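The selection logic described above could be sketched roughly as follows. This is only an illustration with hypothetical names: `MODEL_MAX_OUTPUT_TOKENS`, `NON_STREAMING_CAP`, and the specific numbers are assumptions, not values from the SDK or pydantic-ai.

```python
# Hypothetical per-model output-token limits (illustrative values only).
MODEL_MAX_OUTPUT_TOKENS: dict[str, int] = {
    "claude-3-5-sonnet-latest": 8192,
}

# Assumed safe ceiling for non-streaming requests, below the SDK's
# ~10-minute long-request check. The current default of 4096 is used here.
NON_STREAMING_CAP = 4096


def effective_max_tokens(model: str, stream: bool) -> int:
    """Pick max_tokens: the model's real limit when streaming,
    otherwise the highest value under the non-streaming ceiling."""
    model_limit = MODEL_MAX_OUTPUT_TOKENS.get(model, 4096)
    if stream:
        return model_limit
    return min(model_limit, NON_STREAMING_CAP)
```

With these assumed numbers, a streaming request to the sonnet model would get 8192 tokens while a non-streaming one would be capped at 4096.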

Note that we'd need something like #2067 to avoid hitting this error on Vertex AI:

{"type":"error","error":{"type":"invalid_request_error","message":"input length and `max_tokens` exceed context limit: 189127 + 16000 > 200000, decrease input length or `max_tokens` and try again"}}
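The error above implies a second constraint: input tokens plus `max_tokens` must fit under the model's context limit (200000 in the example). A minimal sketch of clamping `max_tokens` to whatever room remains, with a hypothetical `clamp_max_tokens` helper:

```python
def clamp_max_tokens(requested: int, input_tokens: int, context_limit: int) -> int:
    """Shrink the requested max_tokens so that input + output
    fits inside the model's context window."""
    remaining = max(0, context_limit - input_tokens)
    return min(requested, remaining)


# The numbers from the Vertex AI error: 189127 input + 16000 requested > 200000.
clamp_max_tokens(16000, 189127, 200000)  # → 10873, which would fit
```

Doing this would require knowing the prompt's token count up front, which is presumably what something like #2067 would provide.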
