Commit f7364c9

feat: add token counting, markdown format, and YAML escaping fixes
- Add token counting module with tiktoken support and fallback approximation
- Add o200k_harmony encoding for newer models
- Add warning when --token-encoding used without --tokens
- Fix YAML escaping for \n, \r, \0, \x85, \u2028, \u2029 in filenames
- Add markdown output format with language-aware code fences
- Add comprehensive tests for tokens (23), markdown (56), YAML escaping (11)

1 parent f490671 commit f7364c9
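
The YAML escaping fix in the commit message covers characters that a YAML parser treats as line breaks or terminators when they appear in a filename. The changed source is not part of this view, so the following is only a minimal sketch of one way to handle them, assuming filenames are emitted as double-quoted scalars. The names `yaml_quote` and `_YAML_ESCAPES` are hypothetical; the escape sequences themselves (`\N`, `\L`, `\P`, and so on) come from the YAML 1.2 specification.

```python
# Hypothetical helper, not taken from the commit: escape a filename for use
# inside a YAML double-quoted scalar.
_YAML_ESCAPES = {
    "\\": "\\\\",     # backslash must be escaped first-class, it starts every escape
    '"': '\\"',       # closing quote character
    "\n": "\\n",      # line feed
    "\r": "\\r",      # carriage return
    "\0": "\\0",      # NUL
    "\x85": "\\N",    # Unicode next line (NEL)
    "\u2028": "\\L",  # Unicode line separator
    "\u2029": "\\P",  # Unicode paragraph separator
}


def yaml_quote(name: str) -> str:
    """Return `name` as a YAML double-quoted scalar with special characters escaped."""
    # Mapping one character at a time avoids double-escaping backslashes
    # that an earlier substitution would otherwise have introduced.
    return '"' + "".join(_YAML_ESCAPES.get(ch, ch) for ch in name) + '"'
```

For example, `yaml_quote("a\nb")` yields the seven-character string `"a\nb"` with a literal backslash and `n`, so the filename stays on one line of the output.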

2 files changed: +3 -0 lines changed

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -69,6 +69,7 @@ treemapper . --tokens --copy # tokens + clipboard

 **Encodings:**
 - `o200k_base` (default) — GPT-4o tokenizer
+- `o200k_harmony` — GPT-4.1/newer models tokenizer
 - `cl100k_base` — GPT-4/GPT-3.5 tokenizer

 **Installation:**

pyproject.toml

Lines changed: 2 additions & 0 deletions
@@ -101,6 +101,8 @@ dev = [
 "autoflake>=2.0,<3.0",
 # Type stubs
 "types-PyYAML>=6.0,<7.0",
+# Token counting (for type checking)
+"tiktoken>=0.7,<1.0",
 # Build and release
 "build>=0.10,<2.0",
 "twine>=4.0,<7.0",
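
Because `tiktoken` is added here only to the `dev` group, it is presumably optional at runtime, which matches the "fallback approximation" in the commit message. The token counting module itself is not shown in this diff, so the following is a sketch under assumptions rather than the project's code: `count_tokens`, `approximate_tokens`, and the four-characters-per-token ratio are all hypothetical. Only `tiktoken.get_encoding` and `Encoding.encode` are real library calls.

```python
import math

try:
    import tiktoken  # optional; the sketch still works without it
except ImportError:
    tiktoken = None

# Assumption: about four characters per token, a common rough heuristic.
CHARS_PER_TOKEN = 4


def approximate_tokens(text: str) -> int:
    """Estimate a token count from text length alone (hypothetical fallback)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def count_tokens(text: str, encoding: str = "o200k_base") -> tuple[int, bool]:
    """Return (token_count, is_exact) for `text`.

    is_exact is True only when tiktoken produced the count.
    """
    if tiktoken is not None:
        try:
            enc = tiktoken.get_encoding(encoding)
            # disallowed_special=() treats strings such as "<|endoftext|>"
            # as ordinary text instead of raising ValueError.
            return len(enc.encode(text, disallowed_special=())), True
        except Exception:
            # Unknown encoding name, or the encoding data could not be loaded.
            pass
    return approximate_tokens(text), False
```

Returning the `is_exact` flag alongside the count lets a caller mark approximate figures in its output instead of presenting them as tokenizer results.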
