Token Usage #971
Replies: 1 comment
Had the same experience with multi-agent setups. V3's orchestration layer adds significant overhead because each agent pass includes the full conversation context plus coordination prompts. A few things that helped me get token usage under control:

1. Instrument first, optimize second. If you're using the Node.js SDK directly, burn0 can give you per-request cost breakdowns with a single import: it intercepts HTTP calls and logs exactly what each agent step costs. That helped me identify that one summarization step was re-sending the entire research context (~80k tokens) when it only needed the conclusions.

2. Context window management

4. Check for retry loops. The weekly limit issue suggests your usage jumped 3-4x, which lines up with what I've seen when moving to more sophisticated orchestration patterns. The tokens-per-task metric is the one to watch.
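To make the "instrument first" step concrete, here's a minimal sketch of the kind of per-step cost breakdown that surfaces an outlier like the 80k-token summarization call. This is not burn0's actual API: the price constants, the `usage` field names (OpenAI-style), and the `costByStep` helper are all hypothetical, and the HTTP interception itself is omitted — the sketch just aggregates usage records you've already logged per request.

```javascript
// Hypothetical per-step cost accounting. Prices are made-up placeholders
// in dollars per million tokens; swap in your provider's real rates.
const PRICE_PER_MTOK = { prompt: 3.0, completion: 15.0 };

// Cost of a single request, given its token usage.
function costOf(usage) {
  return (
    (usage.promptTokens / 1e6) * PRICE_PER_MTOK.prompt +
    (usage.completionTokens / 1e6) * PRICE_PER_MTOK.completion
  );
}

// Aggregate logged requests by agent step so outliers stand out.
function costByStep(entries) {
  const totals = {};
  for (const e of entries) {
    totals[e.step] = (totals[e.step] || 0) + costOf(e.usage);
  }
  return totals;
}

// Example log: a summarization step re-sending ~80k tokens of context
// even though it only needs the conclusions.
const logged = [
  { step: "research",  usage: { promptTokens: 12000, completionTokens: 800 } },
  { step: "summarize", usage: { promptTokens: 80000, completionTokens: 400 } },
];

const totals = costByStep(logged);
console.log(totals); // the "summarize" step dominates the spend
```

Once the totals are broken out per step, the fix is usually obvious: the expensive step gets a trimmed context (conclusions only) instead of the full conversation history.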
Since using V3, my token usage has increased significantly. I'm burning through my weekly limit in two days. Is anyone else experiencing this? Any ideas on how to improve it? At this rate I'll have to stop using V3.