
Commit f1ee425

feat(usage): normalize and report token usage metrics
- Add `UsageExtractor` to normalize raw usage data according to vendor mappings in `resources/models.yaml`
- Update `ModelDriverInterface` signature and all drivers (`Anthropic`, `Google`, `OpenAi`, `Yandex`, `DeepL`) to return both `text` and `usage`
- Adapt `Bblslug::translate()` to extract and include usage metrics in its result
- Enhance `runFromCli()` to display a structured “Usage metrics” section (total + breakdown) after translation
- Update and slightly restructure `README.md` text and CLI sections to document the new usage-metrics feature
1 parent 7d8b3f6 commit f1ee425

File tree

10 files changed, +399 −164 lines changed


README.md

Lines changed: 155 additions & 108 deletions
@@ -7,34 +7,35 @@ It leverages LLM-based APIs to translate plain text or HTML while preserving str
 APIs supported:
 
 - Anthropic (Claude):
-  - `anthropic:claude-haiku-3.5` - Claude Haiku 3.5 (latest)
-  - `anthropic:claude-opus-4` - Claude Opus 4 (20250514)
-  - `anthropic:claude-sonnet-4` - Claude Sonnet 4 (20250514)
-- DeepL
-  - `deepl:free` - DeepL free tier
-  - `deepl:pro` - DeepL pro tier
-- Google (Gemini)
-  - `google:gemini-2.0-flash` - Gemini 2.0 Flash
-  - `google:gemini-2.5-flash` - Gemini 2.5 Flash
-  - `google:gemini-2.5-flash-lite` - Gemini 2.5 Flash Lite
-  - `google:gemini-2.5-pro` - Gemini 2.5 Pro
-- OpenAI (GPT)
-  - `openai:gpt-4` - OpenAI GPT-4
-  - `openai:gpt-4-turbo` - OpenAI GPT-4 Turbo
-  - `openai:gpt-4o` - OpenAI GPT-4o
-  - `openai:gpt-4o-mini` - OpenAI GPT-4o Mini
+  - `anthropic:claude-haiku-3.5` - Claude Haiku 3.5 (latest)
+  - `anthropic:claude-opus-4` - Claude Opus 4 (20250514)
+  - `anthropic:claude-sonnet-4` - Claude Sonnet 4 (20250514)
+- DeepL:
+  - `deepl:free` - DeepL free tier
+  - `deepl:pro` - DeepL pro tier
+- Google (Gemini):
+  - `google:gemini-2.0-flash` - Gemini 2.0 Flash
+  - `google:gemini-2.5-flash` - Gemini 2.5 Flash
+  - `google:gemini-2.5-flash-lite` - Gemini 2.5 Flash Lite
+  - `google:gemini-2.5-pro` - Gemini 2.5 Pro
+- OpenAI (GPT):
+  - `openai:gpt-4` - OpenAI GPT-4
+  - `openai:gpt-4-turbo` - OpenAI GPT-4 Turbo
+  - `openai:gpt-4o` - OpenAI GPT-4o
+  - `openai:gpt-4o-mini` - OpenAI GPT-4o Mini
 - Yandex:
   - `yandex:gpt-lite` - YandexGPT Lite
   - `yandex:gpt-pro` - YandexGPT Pro
   - `yandex:gpt-32k` - YandexGPT Pro 32K
 
 ## Features
 
-- Supports **html** and **plain text** (`--format=text|html`)
+- Supports **HTML** and **plain text** (`--format=text|html`)
 - Placeholder-based protection with filters: `html_pre`, `html_code`, `url`, etc.
-- Model selection via `--model=vendor:name` (`deepl:pro`, `google:gemini-2.5-flash`, `openai:gpt-4o`, …)
-- Fully configurable backend registry
+- Model selection via `--model=vendor:name`
+- Fully configurable backend registry (via `resources/models.yaml`)
 - **Dry-run** mode to preview placeholders without making API calls
+- **Variables** (`--variables`) to send or override model-specific options
 - **Verbose** mode (`--verbose`) to print request previews
 - Can be invoked as a CLI tool or embedded in PHP code
 
@@ -47,42 +48,45 @@ chmod +x vendor/bin/bblslug
 
 ## CLI Usage
 
+### Prepare
+
 1. **Always specify a model** with `--model=vendor:name` option.
 
 2. **Export your API key(s)** before running:
 
-   ```bash
-   export ANTHROPIC_API_KEY=...
-   export DEEPL_FREE_API_KEY=...
-   export DEEPL_PRO_API_KEY=...
-   export GOOGLE_API_KEY=...
-   export OPENAI_API_KEY=...
-   export YANDEX_API_KEY=... && export YANDEX_FOLDER_ID=...
-   ```
+   ```bash
+   export ANTHROPIC_API_KEY=...
+   export DEEPL_FREE_API_KEY=...
+   export DEEPL_PRO_API_KEY=...
+   export GOOGLE_API_KEY=...
+   export OPENAI_API_KEY=...
+   export YANDEX_API_KEY=... && export YANDEX_FOLDER_ID=...
+   ```
 
-   **NB!** Some vendors require additional parameters for client authentication (like Yandex)
+   **NB!** Some vendors require additional parameters, e.g. `YANDEX_FOLDER_ID`.
 
 3. **Input / output**:
 
-   - If `--source` is omitted, Bblslug reads from **STDIN**.
-   - If `--translated` is omitted, Bblslug writes to **STDOUT**.
+   - If `--source` is omitted, Bblslug reads from **STDIN**.
+   - If `--translated` is omitted, Bblslug writes to **STDOUT**.
 
 4. **Optional proxy**:
 
-   To route requests through a proxy (e.g. HTTP or SOCKS5), use the `--proxy` option or set the `BBLSLUG_PROXY` environment variable:
-
-   ```bash
-   # using CLI flag
-   vendor/bin/bblslug --proxy="http://localhost:8888" ...
+   To route requests through a proxy (e.g. HTTP or SOCKS5), use the `--proxy` option or set the `BBLSLUG_PROXY` environment variable:
 
-   # or set it globally
-   export BBLSLUG_PROXY="socks5h://127.0.0.1:9050"
-   ```
+   ```bash
+   # using CLI flag
+   vendor/bin/bblslug --proxy="http://localhost:8888" ...
+
+   # or set it globally
+   export BBLSLUG_PROXY="socks5h://127.0.0.1:9050"
+   ```
 
-   This works for all HTTP requests and supports authentication (`http://user:pass@host:port`).
+   This works for all HTTP requests and supports authentication (`http://user:pass@host:port`).
 
 
 ### Show available models
+
 ```bash
 vendor/bin/bblslug --list-models
 ```
@@ -121,32 +125,17 @@ vendor/bin/bblslug \
   --context="Translate as a professional technical translator"
 ```
 
-### Pipe STDIN → file
+### Pass model-specific variables
 
 ```bash
 vendor/bin/bblslug \
   --model=vendor:name \
   --format=text \
-  --source=input.txt
-```
-
-### Pipe STDIN → file
-
-```bash
-cat input.txt | vendor/bin/bblslug \
-  --model=vendor:name \
-  --format=text \
+  --variables=foo=bar,foo2=bar2 \
+  --source=in.txt \
   --translated=out.txt
 ```
 
-### Pipe STDIN → STDOUT
-
-```bash
-echo "Hello world" | vendor/bin/bblslug \
-  --model=vendor:name \
-  --format=text
-```
-
 ### Dry-run placeholders only
 
 ```bash
@@ -169,69 +158,127 @@ vendor/bin/bblslug \
   --translated=out.html
 ```
 
-### Use additional model/vendor-specific options
+### Pipe STDIN → file
 
 ```bash
-vendor/bin/bblslug \
+cat input.txt | vendor/bin/bblslug \
   --model=vendor:name \
-  --variables=some=XXX,other=YYY \
-  --format=html \
-  --verbose \
-  --source=input.html \
-  --translated=out.html
+  --format=text \
+  --translated=out.txt
+```
+
+### Pipe STDIN → STDOUT
+
+```bash
+echo "Hello world" | vendor/bin/bblslug \
+  --model=vendor:name \
+  --format=text > translated.out
 ```
 
+### Statistics
+
+- **Usage metrics**
+
+  After each translation (when not in dry-run), Bblslug prints to stderr a summary of consumed usage metrics, for example:
+
+  ```
+  Usage metrics:
+    Tokens:
+      Total: 1074
+      -----------------
+      Prompt: 631
+      Completion: 443
+  ```
+
 ## PHP Library Usage
 
-You can embed Bblslug in your PHP project:
+You can embed Bblslug in your PHP project.
+
+### Quickstart
+
+1. **Install:**
+
+   ```bash
+   composer require habr/bblslug
+   ```
+
+2. **Require & Import:**
+
+   ```php
+   require 'vendor/autoload.php';
+   use Bblslug\Bblslug;
+   ```
+
+3. **Translate:**
+
+   ```php
+   $text = file_get_contents('input.html');
+   $result = Bblslug::translate(
+       apiKey: getenv('MODEL_API_KEY'),  // API key for the chosen model
+       format: 'html',                   // 'text' or 'html'
+       modelKey: 'vendor:model',         // Model identifier (e.g. deepl:free, openai:gpt-4o, etc.)
+       text: $text,                      // Source text or HTML
+       // optional:
+       // Additional context/prompt passed to the model
+       context: 'Translate as a professional technical translator',
+       filters: ['url', 'html_code'],    // List of placeholder filters
+       proxy: getenv('BBLSLUG_PROXY'),   // Optional proxy URI (http://..., socks5h://...)
+       sourceLang: 'DE',                 // Source language code (optional; autodetect if null)
+       targetLang: 'EN',                 // Target language code (optional; default from driver settings)
+       variables: ['foo' => 'bar'],      // Model-specific overrides
+       verbose: true,                    // If true, returns debug request/response
+   );
+   echo $result['result'];
+   ```
+
+### Result structure
 
 ```php
-<?php
-require 'vendor/autoload.php';
-
-use Bblslug\Bblslug;
-
-// Load input text or HTML from file
-$text = file_get_contents('input.html');
-
-// Call library translate method
-$result = Bblslug::translate(
-    apiKey: getenv('DEEPL_PRO_API_KEY'),  // API key for the chosen model
-    format: 'html',                       // 'text' or 'html'
-    modelKey: 'deepl:pro',                // Model identifier (e.g. deepl:free, deepl:pro, openai:gpt-4o)
-    text: $text,                          // Source text or HTML
-
-    // optional parameters:
-    context: null,                        // Additional context/prompt (DeepL: context)
-    dryRun: false,                        // If true, only prepare placeholders, no API call
-    filters: ['url', 'html_code'],        // List of placeholder filters
-    proxy: getenv('BBLSLUG_PROXY'),       // Optional proxy URI (http://..., socks5h://...)
-    sourceLang: null,                     // Source language code (optional; autodetect if null)
-    targetLang: null,                     // Target language code (optional; default from driver settings)
-    variables: TBD,                       // TBD
-    verbose: true,                        // If true, prints debug request/response to stderr
-);
-
-// Result output example
-// $result = [
-//     'original' => '...',   // Original input
-//     'prepared' => '...',   // After placeholder filters
-//     'result' => '...',     // Translated result
-//     'lengths' => [         // Character counts
-//         'original' => 123,
-//         'prepared' => 100,
-//         'translated' => 130
-//     ],
-//     'filterStats' => [     // Placeholder stats
-//         ['filter'=>'url','count'=>5], …
-//     ];
-
-echo $result['result'];
+[
+    'original' => string,        // Original input
+    'prepared' => string,        // After placeholder filters
+    'result' => string,          // Translated result
+    'httpStatus' => int,         // HTTP status
+    'debugRequest' => string,    // Request debug
+    'debugResponse' => string,   // Response debug
+    'rawResponseBody' => string, // Response body
+    'consumed' => [              // Normalized usage metrics
+        'tokens' => [
+            'total' => int,      // Total tokens consumed
+            'breakdown' => [     // Per-type breakdown
+                'prompt' => int,     // Name depends on the model
+                'completion' => int, // Name depends on the model
+            ],
+        ],
+        // additional categories if supported by the model...
+    ],
+    'lengths' => [               // Text length statistics
+        'original' => int,       // - original text
+        'prepared' => int,       // - after placeholder filters
+        'translated' => int,     // - returned translated text
+    ],
+    'filterStats' => [           // Placeholder stats
+        ['filter' => 'url', 'count' => 3], …
+    ],
 ]
 ```
 
-## Examples
+### Error handling
+
+```php
+try {
+    $res = Bblslug::translate(...);
+} catch (\InvalidArgumentException $e) {
+    // invalid model, missing API key, etc.
+} catch (\RuntimeException $e) {
+    // HTTP error, parse failure, driver-specific error
+}
+```
+
+
+## Samples
 
-You can find sample input files under the `examples/` directory.
+You can find sample input files under the `samples/` directory.
 
 ## License

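For readers wiring this into their own code: a minimal sketch of consuming the new `consumed` block from a `Bblslug::translate()` result. The `$result` array here is example data shaped like the documented result structure, not real API output, and the printing helper is hypothetical — the library only returns the array; formatting is up to the caller.

```php
<?php
// Example data only: a translate() result in the documented shape.
$result = [
    'result' => 'Hallo Welt',
    'consumed' => [
        'tokens' => [
            'total' => 1074,
            'breakdown' => ['prompt' => 631, 'completion' => 443],
        ],
    ],
];

// Print a usage summary to stderr, mirroring the CLI's "Usage metrics" section.
$tokens = $result['consumed']['tokens'] ?? null;
if ($tokens !== null) {
    fwrite(STDERR, "Usage metrics:\n  Tokens:\n");
    fwrite(STDERR, sprintf("    Total: %d\n", $tokens['total']));
    foreach ($tokens['breakdown'] as $name => $count) {
        fwrite(STDERR, sprintf("    %s: %d\n", ucfirst($name), $count));
    }
}
```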
resources/models.yaml

Lines changed: 24 additions & 0 deletions
@@ -21,6 +21,12 @@ anthropic:
   limits:
     estimated_max_chars: 400000
   http_error_handling: true
+  usage:
+    tokens:
+      total: total_tokens
+      breakdown:
+        prompt: prompt_tokens
+        completion: completion_tokens
 
   models:
     claude-haiku-3.5:
@@ -107,6 +113,12 @@ google:
     - system_instruction
     - contents
     - generationConfig
+  usage:
+    tokens:
+      total: totalTokenCount
+      breakdown:
+        prompt: promptTokenCount
+        candidates: candidatesTokenCount
 
   models:
     gemini-2.0-flash:
@@ -180,6 +192,12 @@ openai:
     max_tokens: 128000
   token_estimator: gpt
   estimated_max_chars: 512000
+  usage:
+    tokens:
+      total: total_tokens
+      breakdown:
+        prompt: prompt_tokens
+        completion: completion_tokens
 
   models:
     gpt-4:
@@ -232,6 +250,12 @@ yandex:
   token_estimator: gpt
   estimated_max_chars: 32768
   http_error_handling: true
+  usage:
+    tokens:
+      total: totalTokens
+      breakdown:
+        input: inputTextTokens
+        completion: completionTokens
 
   models:
     gpt-lite:

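The `usage` mappings added above are what the commit's `UsageExtractor` reads. As a rough sketch — the function name and exact behavior here are assumed for illustration, not the library's actual API — normalization boils down to renaming each vendor's counter keys into the common `total` + `breakdown` shape:

```php
<?php
// Hypothetical sketch of the mapping step; the real UsageExtractor may differ.
// $mapping is a vendor's `usage.tokens` block from resources/models.yaml,
// $raw is the vendor's raw usage object from the API response.
function normalizeTokens(array $mapping, array $raw): array
{
    $breakdown = [];
    foreach ($mapping['breakdown'] as $name => $rawKey) {
        if (isset($raw[$rawKey])) {
            $breakdown[$name] = (int) $raw[$rawKey];
        }
    }
    return [
        // Fall back to summing the breakdown if the vendor omits a total.
        'total' => (int) ($raw[$mapping['total']] ?? array_sum($breakdown)),
        'breakdown' => $breakdown,
    ];
}

// OpenAI-style mapping and payload, matching the diff above:
$mapping = [
    'total' => 'total_tokens',
    'breakdown' => ['prompt' => 'prompt_tokens', 'completion' => 'completion_tokens'],
];
$raw = ['prompt_tokens' => 631, 'completion_tokens' => 443, 'total_tokens' => 1074];
print_r(normalizeTokens($mapping, $raw));
```

The same function then covers Google's `promptTokenCount`/`candidatesTokenCount` and Yandex's `inputTextTokens`/`completionTokens` purely through configuration, which is presumably why the mapping lives in `models.yaml` rather than in each driver.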