Commit 1bf9cee

feat: Add cost tracking for API usage (#324)
Adds comprehensive cost tracking functionality to the CLI. Includes:

- Real-time cost calculation based on token usage
- Default pricing for 30+ models across all major providers
- New /cost command to view session breakdowns
- Status bar integration showing current session cost
- Configurable pricing with support for custom models

This helps users monitor and manage their API spending directly within the CLI.

## Key Changes

### 1. **New Pricing Configuration** (`config/pricing.go`)

- Default pricing for 30+ models across Anthropic, OpenAI, Google, DeepSeek, Groq, Mistral, etc.
- Configurable via `config.yaml` with support for custom models
- Regular updates to match current provider rates

### 2. **Cost Tracking Service** (`internal/services/pricing_service.go`)

- Real-time cost calculation based on token usage (input/output)
- Session-level and conversation-level cost aggregation
- Integration with existing conversation and state management

### 3. **New `/cost` Command**

- Displays detailed cost breakdown by model and session
- Shows total cost and per-model usage statistics
- Accessible via chat interface

### 4. **Status Bar Integration**

- Real-time cost display in the input status bar
- Shows current session cost as users interact with models
- Configurable display options

### 5. **Database Schema Updates**

- Extended storage to persist cost statistics
- Backward-compatible migration
- Historical cost tracking for conversations

### 6. **Configuration Options**

- Enable/disable cost tracking via `cost_tracking.enabled`
- Custom pricing configuration for specific models
- Default pricing that's regularly updated

## Technical Details

- **Files Changed**: 34 files, 1099 insertions(+), 125 deletions(-)
- **New Files**:
  - `config/pricing.go` - Default pricing configuration
  - `internal/domain/pricing.go` - Pricing domain models
  - `internal/services/pricing_service.go` - Core cost calculation service
- **Integration**: Works with all existing providers and models
- **Backward Compatibility**: Existing conversations show $0.00 cost until new usage is recorded
- **Testing**: Includes comprehensive unit tests for cost calculation

## Usage

1. **Enable/Disable**: Configure via `cost_tracking.enabled` in config
2. **View Costs**: Use the `/cost` command in the chat interface
3. **Custom Pricing**: Add custom model pricing in config
4. **Real-time Display**: See costs in the status bar during conversations

This feature provides valuable visibility into API spending, helping users manage costs effectively while using the CLI.

Closes #80

---------

Signed-off-by: Eden Reich <eden.reich@gmail.com>
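The cost-calculation path described above can be sketched as follows. The type and function names here are illustrative, not the actual identifiers from `internal/services/pricing_service.go`; only the formula and the per-million-token pricing shape come from this commit:

```go
package main

import "fmt"

// ModelPrice mirrors the per-model pricing shape from this change:
// prices are expressed per million tokens, input and output separately.
type ModelPrice struct {
	InputPricePerMTok  float64
	OutputPricePerMTok float64
}

// CalculateCost applies the formula used throughout this feature:
// (tokens / 1,000,000) × price_per_million_tokens, summed over input and output.
func CalculateCost(p ModelPrice, inputTokens, outputTokens int64) float64 {
	return float64(inputTokens)/1_000_000*p.InputPricePerMTok +
		float64(outputTokens)/1_000_000*p.OutputPricePerMTok
}

func main() {
	// Example rates matching the README override for "openai/gpt-4o".
	gpt4o := ModelPrice{InputPricePerMTok: 2.50, OutputPricePerMTok: 10.00}
	cost := CalculateCost(gpt4o, 12_000, 3_500)
	fmt.Printf("$%.4f\n", cost) // $0.0650
}
```

A model with no pricing data simply uses the zero value of `ModelPrice`, which yields the $0.00 behavior the commit describes for unpriced models.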
1 parent fd9b053 commit 1bf9cee

37 files changed: +1300, -154 lines

.infer/config.yaml

Lines changed: 9 additions & 0 deletions
@@ -12,6 +12,10 @@ gateway:
     - ollama_cloud/kimi-k2:1t
     - ollama_cloud/kimi-k2-thinking
     - ollama_cloud/deepseek-v3.1:671b
+    - groq/whisper-large-v3
+    - groq/whisper-large-v3-turbo
+    - groq/playai-tts
+    - groq/playai-tts-arabic
   vision_enabled: true
   client:
     timeout: 200
@@ -529,6 +533,7 @@ chat:
       mcp: true
       context_usage: true
       session_tokens: true
+      cost: true
       git_branch: true
 a2a:
   enabled: true
@@ -565,6 +570,10 @@ mcp:
   liveness_probe_enabled: true
   liveness_probe_interval: 10
   servers: []
+pricing:
+  enabled: true
+  currency: USD
+  custom_prices: {}
 init:
   prompt: |-
     Please analyze this project and generate a comprehensive AGENTS.md file. Start by using the Tree tool to understand the project structure.

README.md

Lines changed: 87 additions & 0 deletions
@@ -30,6 +30,7 @@ and management of inference services.
 - [Commands](#commands)
 - [Tools for LLMs](#tools-for-llms)
 - [Configuration](#configuration)
+- [Cost Tracking](#cost-tracking)
 - [Tool Approval System](#tool-approval-system)
 - [Shortcuts](#shortcuts)
 - [Global Flags](#global-flags)
@@ -56,6 +57,7 @@ and management of inference services.
 - **Auto-Accept Mode**: All tools auto-approved for rapid execution (YOLO mode)
 - Toggle between modes with **Shift+Tab**
 - **Token Usage Tracking**: Accurate token counting with polyfill support for providers that don't return usage metrics
+- **Cost Tracking**: Real-time cost calculation for API usage with per-model breakdown and configurable pricing
 - **Inline History Auto-Completion**: Smart command history suggestions with inline completion
 - **Customizable Keybindings**: Fully configurable keyboard shortcuts for the chat interface
 - **Extensible Shortcuts System**: Create custom commands with AI-powered snippets - [Learn more →](docs/shortcuts-guide.md)
@@ -379,6 +381,90 @@ Example: `agent.model` → `INFER_AGENT_MODEL`
 
 For complete configuration documentation, including all options and environment variables, see [Configuration Reference](docs/configuration-reference.md).
 
+## Cost Tracking
+
+The CLI automatically tracks API costs based on token usage for all providers and models.
+Costs are calculated in real-time with support for both aggregate totals and per-model breakdowns.
+
+### Viewing Costs
+
+Use the `/cost` command in any chat session to see the cost breakdown:
+
+```bash
+# In chat, use the /cost shortcut
+/cost
+```
+
+This displays:
+
+- **Total session cost** in USD
+- **Input/output costs** separately
+- **Per-model breakdown** when using multiple models
+- **Token usage** for each model
+
+**Status Bar**: Session costs are also displayed in the status bar (e.g., `💰 $0.0234`) if enabled.
+
+### Configuring Pricing
+
+The CLI includes hardcoded pricing for 30+ models across all major providers
+(Anthropic, OpenAI, Google, DeepSeek, Groq, Mistral, Cohere, etc.).
+Prices are updated regularly to match current provider pricing.
+
+**Override pricing** for specific models or add pricing for custom models:
+
+```yaml
+# .infer/config.yaml
+pricing:
+  enabled: true
+  currency: "USD"
+  custom_prices:
+    # Override existing model pricing
+    "openai/gpt-4o":
+      input_price_per_mtoken: 2.50 # Price per million input tokens
+      output_price_per_mtoken: 10.00 # Price per million output tokens
+
+    # Add pricing for custom/local models
+    "ollama/llama3.2":
+      input_price_per_mtoken: 0.0
+      output_price_per_mtoken: 0.0
+
+    "custom-fine-tuned-model":
+      input_price_per_mtoken: 5.00
+      output_price_per_mtoken: 15.00
+```
+
+**Via environment variables:**
+
+```bash
+# Disable cost tracking entirely
+export INFER_PRICING_ENABLED=false
+
+# Override specific model pricing (use underscores in model names)
+export INFER_PRICING_CUSTOM_PRICES_OPENAI_GPT_4O_INPUT_PRICE_PER_MTOKEN=3.00
+export INFER_PRICING_CUSTOM_PRICES_OPENAI_GPT_4O_OUTPUT_PRICE_PER_MTOKEN=12.00

+
+# Hide cost from status bar
+export INFER_CHAT_STATUS_BAR_INDICATORS_COST=false
+```
+
+**Status Bar Configuration:**
+
+```yaml
+# .infer/config.yaml
+chat:
+  status_bar:
+    enabled: true
+    indicators:
+      cost: true # Show/hide cost indicator
+```
+
+### Cost Calculation
+
+- Costs are calculated as: `(tokens / 1,000,000) × price_per_million_tokens`
+- Prices are per million tokens (input and output priced separately)
+- Models without pricing data (Ollama, free tiers) show $0.00
+- Token counts use actual usage from providers or polyfilled estimates
+
 ## Tool Approval System
 
 The CLI includes a comprehensive approval system for sensitive tool operations, providing security and
@@ -449,6 +535,7 @@ The CLI provides an extensible shortcuts system for quickly executing common com
 - `/help [shortcut]` - Show available shortcuts
 - `/switch [model]` - Switch to different model
 - `/theme [name]` - Switch chat theme
+- `/cost` - Show session cost breakdown with per-model details
 - `/compact` - Compact conversation
 - `/export [format]` - Export conversation
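The per-model breakdown and status-bar total described in the README changes above can be illustrated with a small aggregation sketch. The names here are hypothetical, not the CLI's actual internals; the rates and the zero-cost behavior for local models come from the documentation in this commit:

```go
package main

import "fmt"

// usage is a per-model token tally for one session.
type usage struct {
	model                 string
	inputToks, outputToks int64
	inRate, outRate       float64 // USD per million tokens
}

// sessionCost sums per-model costs into the session total shown
// in the status bar (e.g. 💰 $0.0234).
func sessionCost(entries []usage) float64 {
	total := 0.0
	for _, u := range entries {
		total += float64(u.inputToks)/1e6*u.inRate + float64(u.outputToks)/1e6*u.outRate
	}
	return total
}

func main() {
	session := []usage{
		{"openai/gpt-4o", 8_000, 1_000, 2.50, 10.00},
		{"ollama/llama3.2", 50_000, 20_000, 0, 0}, // local models cost $0.00
	}
	fmt.Printf("💰 $%.4f\n", sessionCost(session)) // 💰 $0.0300
}
```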
cmd/chat.go

Lines changed: 2 additions & 0 deletions
@@ -74,6 +74,7 @@ func StartChatSession(cfg *config.Config, v *viper.Viper) error {
 	toolService := services.GetToolService()
 	fileService := services.GetFileService()
 	imageService := services.GetImageService()
+	pricingService := services.GetPricingService()
 	shortcutRegistry := services.GetShortcutRegistry()
 	stateManager := services.GetStateManager()
 	messageQueue := services.GetMessageQueue()
@@ -97,6 +98,7 @@ func StartChatSession(cfg *config.Config, v *viper.Viper) error {
 		toolService,
 		fileService,
 		imageService,
+		pricingService,
 		shortcutRegistry,
 		stateManager,
 		messageQueue,

cmd/export.go

Lines changed: 2 additions & 1 deletion
@@ -48,7 +48,8 @@ func runExport(sessionID string) error {
 
 	toolRegistry := tools.NewRegistry(cfg, nil, nil, nil)
 	toolFormatterService := services.NewToolFormatterService(toolRegistry)
-	persistentRepo := services.NewPersistentConversationRepository(toolFormatterService, storageBackend)
+	pricingService := services.NewPricingService(&cfg.Pricing)
+	persistentRepo := services.NewPersistentConversationRepository(toolFormatterService, pricingService, storageBackend)
 
 	ctx := context.Background()
 	if err := persistentRepo.LoadConversation(ctx, sessionID); err != nil {

config/config.go

Lines changed: 9 additions & 1 deletion
@@ -37,6 +37,7 @@ type Config struct {
 	Chat    ChatConfig    `yaml:"chat" mapstructure:"chat"`
 	A2A     A2AConfig     `yaml:"a2a" mapstructure:"a2a"`
 	MCP     MCPConfig     `yaml:"mcp" mapstructure:"mcp"`
+	Pricing PricingConfig `yaml:"pricing" mapstructure:"pricing"`
 	Init    InitConfig    `yaml:"init" mapstructure:"init"`
 	Compact CompactConfig `yaml:"compact" mapstructure:"compact"`
 }
@@ -359,6 +360,7 @@ type StatusBarIndicators struct {
 	MCP           bool `yaml:"mcp" mapstructure:"mcp"`
 	ContextUsage  bool `yaml:"context_usage" mapstructure:"context_usage"`
 	SessionTokens bool `yaml:"session_tokens" mapstructure:"session_tokens"`
+	Cost          bool `yaml:"cost" mapstructure:"cost"`
 	GitBranch     bool `yaml:"git_branch" mapstructure:"git_branch"`
 }
 
@@ -486,6 +488,7 @@ func GetDefaultStatusBarConfig() StatusBarConfig {
 			MCP:           true,
 			ContextUsage:  true,
 			SessionTokens: true,
+			Cost:          true,
 			GitBranch:     true,
 		},
 	}
@@ -510,6 +513,10 @@ func DefaultConfig() *Config { //nolint:funlen
 			"ollama_cloud/kimi-k2:1t",
 			"ollama_cloud/kimi-k2-thinking",
 			"ollama_cloud/deepseek-v3.1:671b",
+			"groq/whisper-large-v3",
+			"groq/whisper-large-v3-turbo",
+			"groq/playai-tts",
+			"groq/playai-tts-arabic",
 		},
 		VisionEnabled: true,
 	},
@@ -851,7 +858,8 @@ Respond with ONLY the title, no quotes or explanation.`,
 			},
 		},
 	},
-	MCP: *DefaultMCPConfig(),
+	MCP:     *DefaultMCPConfig(),
+	Pricing: GetDefaultPricingConfig(),
 	Init: InitConfig{
 		Prompt: `Please analyze this project and generate a comprehensive AGENTS.md file. Start by using the Tree tool to understand the project structure.
 Use your available tools to examine configuration files, documentation, build systems, and development workflow.
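The config.go diff references `PricingConfig` and `GetDefaultPricingConfig` from `config/pricing.go` without showing them. A plausible sketch, with field names assumed but kept consistent with the YAML keys added in this commit (`enabled`, `currency`, `custom_prices`, `input_price_per_mtoken`, `output_price_per_mtoken`), might look like:

```go
package main

import "fmt"

// ModelPricing holds per-million-token rates, matching the
// input_price_per_mtoken / output_price_per_mtoken YAML keys.
type ModelPricing struct {
	InputPricePerMToken  float64 `yaml:"input_price_per_mtoken" mapstructure:"input_price_per_mtoken"`
	OutputPricePerMToken float64 `yaml:"output_price_per_mtoken" mapstructure:"output_price_per_mtoken"`
}

// PricingConfig mirrors the pricing: block added to .infer/config.yaml.
type PricingConfig struct {
	Enabled      bool                    `yaml:"enabled" mapstructure:"enabled"`
	Currency     string                  `yaml:"currency" mapstructure:"currency"`
	CustomPrices map[string]ModelPricing `yaml:"custom_prices" mapstructure:"custom_prices"`
}

// GetDefaultPricingConfig returns the defaults visible in the config.yaml diff:
// tracking enabled, USD currency, and no custom price overrides.
func GetDefaultPricingConfig() PricingConfig {
	return PricingConfig{
		Enabled:      true,
		Currency:     "USD",
		CustomPrices: map[string]ModelPricing{},
	}
}

func main() {
	cfg := GetDefaultPricingConfig()
	fmt.Println(cfg.Enabled, cfg.Currency, len(cfg.CustomPrices)) // true USD 0
}
```

The `mapstructure` tags would let Viper bind the `INFER_PRICING_*` environment variables shown in the README section to the same fields.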
