Commit 84cce0f

jackspirou and claude committed
docs: update documentation to reflect simplified CLI usage
- Show help instead of usage when no args provided
- Update all README examples to show implicit encoding
- Document automatic streaming detection
- Add CLI usage section to main README
- Emphasize the intuitive default behaviors

The documentation now reflects the simplified interface:

    tokenizer llama3 "text"          # encodes
    echo "text" | tokenizer llama3   # streams
    tokenizer llama3                 # shows help

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 6f72418 commit 84cce0f

File tree

3 files changed (+55 −13)


README.md

Lines changed: 30 additions & 4 deletions

````diff
@@ -56,14 +56,17 @@ make install
 Quick usage:
 
 ```bash
-# Encode text
-tokenizer llama3 encode "Hello, world!"
+# Encode text (implicit - recommended)
+tokenizer llama3 "Hello, world!"
 
 # Decode tokens
 tokenizer llama3 decode 128000 9906 11 1917 0 128001
 
-# Stream large files
-cat document.txt | tokenizer llama3 stream
+# Stream large files (automatic pipe detection)
+cat document.txt | tokenizer llama3
+
+# Get tokenizer information
+tokenizer llama3 info
 ```
 
 See [cmd/tokenizer/README.md](cmd/tokenizer/README.md) for full CLI documentation.
@@ -90,6 +93,29 @@ go get github.com/agentstation/tokenizer/llama3
 
 ## Quick Start
 
+### CLI Usage
+
+```bash
+# Install via Homebrew
+brew install agentstation/tap/tokenizer
+
+# Encode text (simple, intuitive)
+tokenizer llama3 "Hello, world!"
+# Output: 128000 9906 11 1917 0 128001
+
+# Decode tokens
+tokenizer llama3 decode 128000 9906 11 1917 0 128001
+# Output: <|begin_of_text|>Hello, world!<|end_of_text|>
+
+# Stream from files (automatic)
+cat document.txt | tokenizer llama3
+
+# Get help
+tokenizer llama3 help
+```
+
+### Library Usage
+
 ```go
 package main
 
````
cmd/tokenizer/README.md

Lines changed: 23 additions & 7 deletions

````diff
@@ -21,7 +21,11 @@ The tokenizer CLI uses a subcommand structure where each tokenizer implementatio
 ### Basic Commands
 
 ```bash
-# Encode text to token IDs
+# Encode text to token IDs (implicit - default action)
+tokenizer llama3 "Hello, world!"
+# Output: 128000 9906 11 1917 0 128001
+
+# Encode text to token IDs (explicit)
 tokenizer llama3 encode "Hello, world!"
 # Output: 128000 9906 11 1917 0 128001
 
@@ -31,6 +35,10 @@ tokenizer llama3 decode 128000 9906 11 1917 0 128001
 
 # Get tokenizer information
 tokenizer llama3 info
+
+# Show help
+tokenizer llama3 help
+# Or just: tokenizer llama3
 ```
 
 ### Encoding Options
@@ -57,16 +65,23 @@ tokenizer llama3 encode -o newline "Hello, world!"
 ### Piping and Streaming
 
 ```bash
-# Pipe text to encode
+# Pipe text to encode (automatic streaming)
+echo "Hello, world!" | tokenizer llama3
+# Output: 128000 9906 11 1917 0 128001
+
+# Pipe text to encode (explicit)
 echo "Hello, world!" | tokenizer llama3 encode
 
 # Pipe tokens to decode
 echo "128000 9906 11 1917 0 128001" | tokenizer llama3 decode
 
 # Round-trip encoding and decoding
-tokenizer llama3 encode "test" | tokenizer llama3 decode
+tokenizer llama3 "test" | tokenizer llama3 decode
+
+# Stream large files (automatic)
+cat large_file.txt | tokenizer llama3
 
-# Stream large files
+# Stream large files (explicit)
 cat large_file.txt | tokenizer llama3 stream
 ```
 
@@ -75,10 +90,11 @@ cat large_file.txt | tokenizer llama3 stream
 For processing large files or real-time input:
 
 ```bash
-# Basic streaming
-tokenizer llama3 stream < input.txt
+# Automatic streaming (detects piped input)
+tokenizer llama3 < input.txt
+cat large_file.txt | tokenizer llama3
 
-# Custom buffer settings
+# Explicit streaming with options
 tokenizer llama3 stream --buffer-size=8192 --max-buffer=2097152 < large_file.txt
 
 # Stream without special tokens
````

llama3/cmd/llama3/command.go

Lines changed: 2 additions & 2 deletions

```diff
@@ -74,8 +74,8 @@ Available commands:
 			return streamCmd.Execute()
 		}
 
-		// No args and no piped input, show usage
-		return cmd.Usage()
+		// No args and no piped input, show help
+		return cmd.Help()
 	},
 }
 
```
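The `cmd.Usage()` → `cmd.Help()` change above is the final branch of the dispatch behavior the commit message describes: explicit subcommands win, bare text is implicitly encoded, piped input with no arguments streams, and no input at all now shows help. A hedged sketch of that decision table (the function and parameter names are illustrative, not the actual `command.go` API):

```go
package main

import "fmt"

// decideAction mirrors the simplified CLI dispatch described in the
// commit message. stdinPiped would come from checking whether stdin is
// a pipe/redirect rather than an interactive terminal.
func decideAction(args []string, stdinPiped bool) string {
	if len(args) > 0 {
		switch args[0] {
		case "encode", "decode", "info", "stream", "help":
			return args[0] // explicit subcommand
		}
		return "encode" // bare text: implicit encode
	}
	if stdinPiped {
		return "stream" // automatic pipe detection
	}
	return "help" // no args, no piped input: show help (was: usage)
}

func main() {
	fmt.Println(decideAction([]string{"Hello, world!"}, false)) // encode
	fmt.Println(decideAction(nil, true))                        // stream
	fmt.Println(decideAction(nil, false))                       // help
}
```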
