Conversation

@b3nw b3nw commented Jan 28, 2026

Fixes metadata token counting for Anthropic-format API responses (used by /v1/messages endpoint).

The _log_metadata method in TransactionLogger only supported OpenAI format usage keys (prompt_tokens, completion_tokens) but Anthropic responses use different keys (input_tokens, output_tokens). This caused null token counts in metadata.json for providers like dedaluslabs and firmware when using the Anthropic-compatible /v1/messages endpoint.

Changes:

- Add fallback from OpenAI-format to Anthropic-format token keys (prompt_tokens → input_tokens, completion_tokens → output_tokens)
- Use explicit None checks instead of `or` so that legitimate 0 values are handled correctly
- Calculate total_tokens when it is missing from Anthropic responses (sum of input and output tokens)
- Handle stop_reason (Anthropic format) as well as finish_reason (OpenAI format)
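The fallback logic above can be sketched roughly as follows. This is a minimal illustration, not the actual `_log_metadata` code; `extract_token_counts` is a hypothetical helper name:

```python
def extract_token_counts(usage: dict) -> dict:
    """Map OpenAI- or Anthropic-format usage keys to a common shape.

    Explicit `is None` checks (rather than `or`) ensure that a
    legitimate count of 0 is kept instead of triggering the fallback.
    """
    # Prefer OpenAI keys, fall back to Anthropic keys.
    prompt = usage.get("prompt_tokens")
    if prompt is None:
        prompt = usage.get("input_tokens")

    completion = usage.get("completion_tokens")
    if completion is None:
        completion = usage.get("output_tokens")

    total = usage.get("total_tokens")
    if total is None and prompt is not None and completion is not None:
        # Anthropic responses omit a total; derive it from the parts.
        total = prompt + completion

    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": total,
    }
```

With `or` instead of the None checks, `{"input_tokens": 0}` would have been treated as missing data, which is exactly the zero-token edge case the fix addresses.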

Testing Done:

- Verified that the dedaluslabs and firmware providers now log token counts correctly in metadata.json when using the /v1/messages endpoint
- Confirmed that OpenAI-format responses continue to work unchanged
- Tested the edge case where token counts are 0 (now correctly logged as 0 instead of falling back)


Important

Fixes token counting for Anthropic-format API responses in TransactionLogger by adding support for Anthropic keys and handling zero token counts.

  • Behavior:
    • Fixes token counting for Anthropic-format API responses in _log_metadata() of transaction_logger.py.
    • Adds fallback from OpenAI format keys (prompt_tokens, completion_tokens) to Anthropic format keys (input_tokens, output_tokens).
    • Uses explicit None checks to handle zero token counts correctly.
    • Calculates total_tokens if missing in Anthropic responses by summing input_tokens and output_tokens.
    • Handles stop_reason (Anthropic) alongside finish_reason (OpenAI).
  • Testing:
    • Verified correct logging of token counts for dedaluslabs and firmware providers using /v1/messages endpoint.
    • Confirmed unchanged behavior for OpenAI format responses.
    • Tested edge case for zero token counts, ensuring correct logging as 0.

This description was created by Ellipsis for 6a5f601.

@b3nw b3nw requested a review from Mirrowel as a code owner January 28, 2026 01:12
@b3nw b3nw force-pushed the fix/metadata-token-counting branch from 6a5f601 to 9c2436d on January 28, 2026 03:16
Pre-calculate input tokens before streaming starts so message_start
contains accurate input_tokens. This matches Anthropic's native API
behavior where input_tokens is provided upfront.

Previously, message_start always had input_tokens=0 because OpenAI-format
streams only provide usage data in the final chunk. Claude Code reads
current_usage from message_start, causing used_percentage to always be 0.

Changes:
- Add precalculated_input_tokens parameter to anthropic_streaming_wrapper
- Calculate tokens via token_count() before streaming in AnthropicHandler
- Use precalculated value for message_start usage dict
- Include input_tokens in message_delta for total_input_tokens accumulation
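The message_start change can be sketched as below. The function and parameter names mirror the commit message, but the exact event structure here is an assumption based on Anthropic's streaming event shape, not the project's actual code:

```python
def build_message_start(precalculated_input_tokens: int) -> dict:
    """Build a message_start event whose usage dict carries the
    pre-calculated input tokens, matching Anthropic's native API
    where input_tokens is known before streaming begins."""
    return {
        "type": "message_start",
        "message": {
            "usage": {
                # Previously hard-coded to 0 because OpenAI-format
                # streams only report usage in the final chunk.
                "input_tokens": precalculated_input_tokens,
                # Output tokens accumulate via later message_delta events.
                "output_tokens": 0,
            }
        },
    }
```

Because Claude Code reads current_usage from message_start, emitting the real input count here is what fixes the used_percentage always showing 0.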
@b3nw b3nw force-pushed the fix/metadata-token-counting branch from 9c2436d to 0ba756b on January 28, 2026 04:36
@mirrobot-agent
Contributor

I'm starting my review of the fix for Anthropic metadata token counting. I'll be looking closely at how token counts are handled in both streaming and non-streaming responses. Back soon with the results! 🔍

@mirrobot-agent mirrobot-agent bot left a comment

Overall Assessment

This is a solid fix for the token counting issues observed with Anthropic-format responses. The approach of providing fallback keys and handling zero values explicitly is robust and correctly addresses the underlying problem where token counts would appear as null for certain providers.

The addition of precalculated_input_tokens for streaming responses is a significant UX improvement. By calculating tokens upfront, the message_start event now reports accurate input tokens immediately, matching Anthropic's native API behavior and avoiding the confusion of seeing 0 tokens until the end of the stream.

Architectural Feedback

The integration of pre-calculated tokens into the AnthropicHandler and anthropic_streaming_wrapper is well-placed. It leverages the existing token_count infrastructure effectively without introducing unnecessary complexity.

Key Suggestions

  • Streaming Usage Fallbacks: While the TransactionLogger was updated to handle both OpenAI and Anthropic usage keys, the anthropic_streaming_wrapper (around lines 223-224) still only checks for prompt_tokens and completion_tokens in individual chunks. For full consistency, consider adding fallback checks for input_tokens and output_tokens there as well. This ensures that if a provider returns Anthropic-style usage in an OpenAI-formatted stream, it is correctly captured and updated from the chunks.
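The suggested chunk-level fallback could look roughly like this. The chunk usage structure and the helper name are illustrative, not taken from the codebase:

```python
def update_totals_from_chunk(chunk_usage: dict, totals: dict) -> None:
    """Update running usage totals from a stream chunk, accepting
    either OpenAI-style or Anthropic-style usage keys."""
    prompt = chunk_usage.get("prompt_tokens")
    if prompt is None:
        prompt = chunk_usage.get("input_tokens")
    if prompt is not None:
        totals["prompt_tokens"] = prompt

    completion = chunk_usage.get("completion_tokens")
    if completion is None:
        completion = chunk_usage.get("output_tokens")
    if completion is not None:
        totals["completion_tokens"] = completion
```

This keeps the chunk path consistent with the TransactionLogger fix: a provider emitting Anthropic-style keys inside an OpenAI-formatted stream would still be captured.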

Questions for the Author

None. The implementation is clear and the testing covers the relevant edge cases well.

This review was generated by an AI assistant.
