Commit 5d1e0c5

Add metadata tokens CLI and token_count tool
Introduce a new "instrmcp metadata tokens" subcommand and a standalone tools/token_count.py utility to count tokens used by MCP metadata (tools and resource templates). The CLI wiring and handler are added in instrmcp/cli.py; the token counter supports tiktoken offline estimation and Anthropic's messages.count_tokens API (auto-fallback, with --offline to force tiktoken). Documentation and README/installation notes updated to document the command and the new "analysis" extra (tiktoken) in pyproject.toml. This helps measure and optimize metadata token usage for context budgeting.
1 parent 8fa83cc commit 5d1e0c5
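
The auto-fallback behaviour described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in `tools/token_count.py` (which is not shown on this page): the function `count_tokens`, the injected counter callables, and the crude `rough_estimate` stand-in for tiktoken are all hypothetical names. The tri-state `use_api` mirrors the value computed in `instrmcp/cli.py`; treating `use_api=True` as "API only, no fallback" is an assumption.

```python
from typing import Callable, Optional, Tuple

Counter = Callable[[str], int]


def rough_estimate(text: str) -> int:
    """Crude offline stand-in for tiktoken: about four characters per token."""
    return (len(text) + 3) // 4


def count_tokens(
    text: str,
    api_counter: Counter,
    offline_counter: Counter = rough_estimate,
    use_api: Optional[bool] = None,
) -> Tuple[int, str]:
    """Return (token_count, method_used).

    use_api=None  -> try the API counter, fall back to offline on any error
    use_api=False -> skip the API entirely (the --offline case)
    use_api=True  -> API only; errors propagate (assumed semantics)
    """
    if use_api is False:
        return offline_counter(text), "offline"
    try:
        return api_counter(text), "api"
    except Exception:
        if use_api is True:
            raise
        return offline_counter(text), "offline"


def unavailable_api(text: str) -> int:
    """Simulates an API counter that cannot be reached."""
    raise RuntimeError("API unavailable")


print(count_tokens("hello world", unavailable_api))  # prints (3, 'offline')
```

Injecting the two counters keeps the fallback logic independent of any particular tokenizer or network client, so it can be exercised without credentials.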

8 files changed (+785, -5 lines)

CLAUDE.md

Lines changed: 1 addition & 1 deletion
@@ -125,7 +125,7 @@ tools:
 param1: Description of param1
 ```

-**CLI commands**: `instrmcp metadata init|edit|list|show|path|validate`
+**CLI commands**: `instrmcp metadata init|edit|list|show|path|validate|tokens`

 **Validation** (`instrmcp metadata validate`) tests the full STDIO proxy path:
 ```

README.md

Lines changed: 4 additions & 3 deletions
@@ -112,9 +112,10 @@ In a Jupyter notebook cell:
 #### CLI Utilities

 ```bash
-instrmcp config # Show configuration paths
-instrmcp version # Show version
-instrmcp --help # Show all commands
+instrmcp config # Show configuration paths
+instrmcp version # Show version
+instrmcp metadata tokens # Count tokens in metadata descriptions
+instrmcp --help # Show all commands
 ```

 ## Documentation

docs/ARCHITECTURE.md

Lines changed: 23 additions & 0 deletions
@@ -318,6 +318,7 @@ Manage metadata configuration via the CLI:
 | `instrmcp metadata show <name>` | Show specific tool/resource override |
 | `instrmcp metadata path` | Show config file path |
 | `instrmcp metadata validate` | Validate config against running server (via STDIO proxy) |
+| `instrmcp metadata tokens` | Count tokens in tool/resource descriptions (requires `tiktoken`) |

 #### Validation via STDIO Proxy

@@ -345,6 +346,28 @@ instrmcp metadata validate --timeout 30
 instrmcp metadata validate --launcher-path /path/to/claude_launcher.py
 ```

+#### Token Counting
+
+The `tokens` command counts tokens used by metadata descriptions to help optimize context budget.
+By default it uses the Anthropic API (`messages.count_tokens`) for exact counts and falls back
+to tiktoken offline estimation if the API is unavailable.
+
+```bash
+# Count tokens (API by default, auto-fallback to tiktoken)
+instrmcp metadata tokens
+
+# Force offline estimation (no API calls)
+instrmcp metadata tokens --offline
+
+# Count tokens in merged config (baseline + user overrides)
+instrmcp metadata tokens --source merged
+
+# Output as JSON for programmatic use
+instrmcp metadata tokens --format json
+```
+
+The standalone script is also available: `python tools/token_count.py`
+
 ### Validation Modes

 - **Strict mode** (`strict: true`): Errors on unknown tools/resources - catches typos

docs/source/api/cli.rst

Lines changed: 15 additions & 0 deletions
@@ -166,6 +166,21 @@ Then in notebook:

    python agentsetting/claudedesktopsetting/claude_launcher.py

+metadata tokens
+~~~~~~~~~~~~~~~
+
+Count tokens used by tool/resource metadata descriptions. Useful for optimizing context budget.
+By default uses the Anthropic API for exact counts, with automatic fallback to tiktoken.
+
+.. code-block:: bash
+
+   instrmcp metadata tokens # API (auto-fallback to tiktoken)
+   instrmcp metadata tokens --offline # Force tiktoken offline estimation
+   instrmcp metadata tokens --source merged # Include user overrides
+   instrmcp metadata tokens --format json # JSON output
+
+The standalone script is also available: ``python tools/token_count.py``
+
 Development Workflow
 ~~~~~~~~~~~~~~~~~~~~

docs/source/installation.rst

Lines changed: 6 additions & 0 deletions
@@ -57,6 +57,12 @@ InstrMCP provides several optional dependency groups:

    pip install -e .[docs]

+**Analysis Tools** (token counting for metadata optimization):
+
+.. code-block:: bash
+
+   pip install -e .[analysis]
+
 **All Features**:

 .. code-block:: bash

instrmcp/cli.py

Lines changed: 57 additions & 1 deletion
@@ -77,6 +77,30 @@ def _setup_metadata_subcommands(subparsers):
         help="Timeout for proxy communication in seconds (default: 15)",
     )

+    # tokens - Count tokens in metadata descriptions
+    tokens_parser = metadata_subparsers.add_parser(
+        "tokens",
+        help="Count tokens in tool/resource metadata descriptions",
+    )
+    tokens_parser.add_argument(
+        "--source",
+        choices=["baseline", "user", "merged"],
+        default="baseline",
+        help="Config source to analyze (default: baseline)",
+    )
+    tokens_parser.add_argument(
+        "--format",
+        choices=["table", "csv", "json"],
+        default="table",
+        dest="output_format",
+        help="Output format (default: table)",
+    )
+    tokens_parser.add_argument(
+        "--offline",
+        action="store_true",
+        help="Force tiktoken offline estimation (skip API)",
+    )
+
     return metadata_parser

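
To see how the three new options parse, the wiring in the hunk above can be reproduced in a small standalone parser. This is a sketch under assumptions: `build_parser` and the top-level `dest="command"` are hypothetical, since the rest of `instrmcp/cli.py` is not shown here. The option names, choices, defaults, `dest="output_format"`, and the `metadata_command` attribute come from the diff.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Standalone stand-in for the instrmcp parser (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="instrmcp")
    commands = parser.add_subparsers(dest="command")  # dest name is assumed
    metadata = commands.add_parser("metadata")
    metadata_commands = metadata.add_subparsers(dest="metadata_command")

    tokens = metadata_commands.add_parser("tokens")
    tokens.add_argument(
        "--source", choices=["baseline", "user", "merged"], default="baseline"
    )
    # "--format" is stored as args.output_format, as in the diff.
    tokens.add_argument(
        "--format",
        choices=["table", "csv", "json"],
        default="table",
        dest="output_format",
    )
    tokens.add_argument("--offline", action="store_true")
    return parser


args = build_parser().parse_args(
    ["metadata", "tokens", "--format", "json", "--offline"]
)
# Same tri-state mapping the handler uses: False forces offline, None means auto.
use_api = False if args.offline else None
print(args.source, args.output_format, use_api)  # prints: baseline json False
```

Storing `--format` under `output_format` avoids an attribute that reads like the `format` builtin when accessed on `args`.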

@@ -240,12 +264,44 @@ def _handle_metadata_command(args):
     elif args.metadata_command == "validate":
         return _handle_metadata_validate(args)

+    elif args.metadata_command == "tokens":
+        return _handle_metadata_tokens(args)
+
     else:
         print("Usage: instrmcp metadata <command>")
-        print("Commands: init, edit, list, show, path, validate")
+        print("Commands: init, edit, list, show, path, validate, tokens")
         return 1


+def _handle_metadata_tokens(args):
+    """Handle metadata tokens subcommand."""
+    # Load token_count module from tools/ directory (not part of installed package)
+    import importlib.util
+
+    tools_dir = Path(__file__).resolve().parent.parent / "tools"
+    token_count_path = tools_dir / "token_count.py"
+
+    if not token_count_path.exists():
+        print(f"Error: token_count.py not found at {token_count_path}")
+        print("This command requires a source checkout of instrMCP.")
+        return 1
+
+    spec = importlib.util.spec_from_file_location("token_count", token_count_path)
+    token_count = importlib.util.module_from_spec(spec)
+    spec.loader.exec_module(token_count)
+
+    # Default: API with auto-fallback. --offline forces tiktoken only.
+    use_api = False if getattr(args, "offline", False) else None
+
+    output = token_count.run_token_count(
+        source=args.source,
+        output_format=args.output_format,
+        use_api=use_api,
+    )
+    print(output)
+    return 0
+
+
 def _handle_metadata_validate(args):
     """Handle metadata validate subcommand.

pyproject.toml

Lines changed: 3 additions & 0 deletions
@@ -70,6 +70,9 @@ redpitaya = [
     "scipy>=1.10.0",
     "numpy>=1.24.0"
 ]
+analysis = [
+    "tiktoken>=0.5.0"
+]
 dev = [
     "pytest>=7.0.0",
     "pytest-asyncio>=0.21.0",
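
Because `tiktoken` sits in an optional extra, code that wants offline estimation may need to check whether it is installed. One way to do that without importing the package is sketched below. The helper names `has_module` and `offline_backend_hint` are hypothetical and do not come from the commit. The install command is the one documented in `docs/source/installation.rst`.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def offline_backend_hint(module_name: str = "tiktoken") -> str:
    """Describe whether the optional offline backend looks usable."""
    if has_module(module_name):
        return f"{module_name} is available; offline estimation can be used."
    return f"{module_name} is not installed; try: pip install -e .[analysis]"


print(offline_backend_hint())
```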
