Commit d42081b

feat: add token counting, markdown format, and YAML escaping fixes
- Add token counting module with tiktoken support and fallback approximation
- Add o200k_harmony encoding for newer models
- Add warning when --token-encoding used without --tokens
- Fix YAML escaping for \n, \r, \0, \x85, \u2028, \u2029 in filenames
- Add markdown output format with language-aware code fences
- Add comprehensive tests for tokens (23), markdown (56), YAML escaping (11)
1 parent f490671 commit d42081b
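
The escaping code itself is not among the two files shown in this diff. As a minimal sketch of the idea only, assuming filenames are written as YAML double-quoted scalars, the characters listed in the commit message could be escaped as follows. The names `yaml_quote` and `_YAML_ESCAPES` are hypothetical and are not taken from treemapper.

```python
# Hypothetical helper: not treemapper's actual implementation.
# Maps characters that would break a double-quoted YAML scalar
# (or be treated as line breaks by a YAML parser) to escape sequences.
_YAML_ESCAPES = {
    "\\": "\\\\",        # backslash must be escaped first-class
    '"': '\\"',          # closing quote
    "\n": "\\n",         # line feed
    "\r": "\\r",         # carriage return
    "\0": "\\0",         # NUL
    "\x85": "\\x85",     # NEL (next line)
    "\u2028": "\\u2028", # line separator
    "\u2029": "\\u2029", # paragraph separator
}


def yaml_quote(name: str) -> str:
    """Return `name` as a YAML double-quoted scalar with risky characters escaped."""
    return '"' + "".join(_YAML_ESCAPES.get(ch, ch) for ch in name) + '"'


if __name__ == "__main__":
    print(yaml_quote("bad\nname.txt"))
```

Escaping character by character through a lookup table keeps the backslash rule from being applied twice, which is a common bug when chaining `str.replace` calls.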

2 files changed: +5 -0 lines changed


CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -69,6 +69,7 @@ treemapper . --tokens --copy # tokens + clipboard
 
 **Encodings:**
 - `o200k_base` (default) — GPT-4o tokenizer
+- `o200k_harmony` — GPT-4.1/newer models tokenizer
 - `cl100k_base` — GPT-4/GPT-3.5 tokenizer
 
 **Installation:**
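
The commit message describes token counting with tiktoken and a fallback approximation, but that module is not part of this diff. The sketch below shows one way such a fallback could work. It assumes a heuristic of roughly four characters per token when tiktoken is unavailable or the encoding cannot be loaded; the function names and the heuristic are illustrative assumptions, not treemapper's code.

```python
from __future__ import annotations

import math


def approximate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough estimate: assumes about 4 characters per token (a heuristic)."""
    if not text:
        return 0
    return max(1, math.ceil(len(text) / chars_per_token))


def count_tokens(text: str, encoding: str = "o200k_base") -> tuple[int, bool]:
    """Return (token_count, is_exact).

    Uses tiktoken when it is installed and the encoding can be loaded;
    otherwise falls back to the character-based estimate above.
    """
    try:
        import tiktoken  # optional dependency

        enc = tiktoken.get_encoding(encoding)
        # disallowed_special=() treats special-token text as ordinary text
        return len(enc.encode(text, disallowed_special=())), True
    except Exception:
        # ImportError, unknown encoding name, or a failed vocabulary download
        return approximate_tokens(text), False


if __name__ == "__main__":
    count, exact = count_tokens("hello world")
    print(count, "exact" if exact else "approximate")
```

Returning an `is_exact` flag lets the caller label approximate counts in the output instead of presenting them as tokenizer results.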

pyproject.toml

Lines changed: 4 additions & 0 deletions
@@ -35,6 +35,10 @@ warn_no_return = true
 warn_unreachable = true
 files = ["src"]
 
+[[tool.mypy.overrides]]
+module = "tiktoken"
+ignore_missing_imports = true
+
 [tool.commitizen]
 name = "cz_conventional_commits"
 version_provider = "pep621"
