Commit d5247ab

feat: add token counting, markdown format, and YAML escaping fixes
- Add token counting module with tiktoken support and fallback approximation
- Add o200k_harmony encoding for newer models
- Add warning when --token-encoding used without --tokens
- Fix YAML escaping for \n, \r, \0, \x85, \u2028, \u2029 in filenames
- Add markdown output format with language-aware code fences
- Add comprehensive tests for tokens (23), markdown (56), YAML escaping (11)
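The YAML escaping fix listed above can be illustrated with a minimal sketch. It is not the code from this commit: the name `yaml_quote` and the escape table are hypothetical, and it assumes filenames are emitted as YAML double-quoted scalars, where each of the listed characters has a valid escape sequence.

```python
# Hypothetical helper, not taken from the commit: escape characters that
# would otherwise break a YAML double-quoted scalar.
_YAML_ESCAPES = {
    "\\": "\\\\",        # backslash must be escaped first-class
    '"': '\\"',          # closing quote character
    "\n": "\\n",         # line feed
    "\r": "\\r",         # carriage return
    "\0": "\\0",         # NUL
    "\x85": "\\x85",     # NEL (next line), a YAML line break
    "\u2028": "\\u2028", # line separator
    "\u2029": "\\u2029", # paragraph separator
}


def yaml_quote(name: str) -> str:
    """Return `name` as a YAML double-quoted scalar with the characters above escaped."""
    return '"' + "".join(_YAML_ESCAPES.get(ch, ch) for ch in name) + '"'
```

Mapping character by character avoids the double-escaping that chained `str.replace` calls can cause when a backslash is introduced by an earlier replacement.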
1 parent f490671 commit d5247ab

2 files changed: 2 additions, 1 deletion

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -69,6 +69,7 @@ treemapper . --tokens --copy # tokens + clipboard
 
 **Encodings:**
 - `o200k_base` (default) — GPT-4o tokenizer
+- `o200k_harmony` — GPT-4.1/newer models tokenizer
 - `cl100k_base` — GPT-4/GPT-3.5 tokenizer
 
 **Installation:**
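The commit message mentions a warning when `--token-encoding` is passed without `--tokens`. One way such a check could look is sketched below; the functions `build_parser` and `encoding_warning` are hypothetical, and only the flag names and encoding names come from the commit. The sketch assumes the encoding option defaults to `None` so that an explicit flag can be told apart from the default.

```python
import argparse
from typing import Optional

ENCODINGS = ("o200k_base", "o200k_harmony", "cl100k_base")
DEFAULT_ENCODING = "o200k_base"


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the two token-related flags."""
    parser = argparse.ArgumentParser(prog="treemapper")
    parser.add_argument("directory", nargs="?", default=".")
    parser.add_argument("--tokens", action="store_true")
    # Default is None so an explicitly passed value is detectable.
    parser.add_argument("--token-encoding", choices=ENCODINGS, default=None)
    return parser


def encoding_warning(args: argparse.Namespace) -> Optional[str]:
    """Return a warning message if an encoding was chosen but counting is off."""
    if args.token_encoding is not None and not args.tokens:
        return "--token-encoding has no effect without --tokens"
    return None


def effective_encoding(args: argparse.Namespace) -> str:
    """Fall back to the documented default when no encoding was given."""
    return args.token_encoding or DEFAULT_ENCODING
```

A caller would print the returned message to stderr and continue, since the flag combination is harmless rather than an error.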

src/treemapper/tokens.py

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ class TokenCountResult:
 @lru_cache(maxsize=4)
 def _get_encoder(encoding: str) -> Optional[Any]:
     try:
-        import tiktoken
+        import tiktoken  # type: ignore[import-not-found]
 
         return tiktoken.get_encoding(encoding)
     except (ImportError, Exception):
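The hunk above shows only the encoder lookup. A rough sketch of how the "fallback approximation" from the commit message might fit around it follows. The names `get_encoder` and `count_tokens`, the `(count, is_exact)` return shape, and the 4-characters-per-token ratio are all assumptions for illustration, not taken from `tokens.py`.

```python
from functools import lru_cache
from typing import Any, Optional, Tuple

CHARS_PER_TOKEN = 4  # assumed heuristic, not a value from the commit


@lru_cache(maxsize=4)
def get_encoder(encoding: str) -> Optional[Any]:
    """Return a tiktoken encoder, or None if tiktoken or the encoding is unavailable."""
    try:
        import tiktoken  # optional dependency

        return tiktoken.get_encoding(encoding)
    except Exception:  # ImportError, unknown encoding name, download failure
        return None


def count_tokens(text: str, encoding: str = "o200k_base") -> Tuple[int, bool]:
    """Return (token_count, is_exact); is_exact is False when approximating."""
    encoder = get_encoder(encoding)
    if encoder is not None:
        # disallowed_special=() treats special-token text as ordinary text
        # instead of raising on it.
        return len(encoder.encode(text, disallowed_special=())), True
    # Fallback: ceiling of len(text) / CHARS_PER_TOKEN.
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN, False
```

Returning the exactness flag alongside the count lets the caller mark approximate totals in the output rather than presenting them as tokenizer results.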
