6 files changed, +17 -17 lines changed
bench_find.py
@@ -16,9 +16,9 @@
 - STRINGWARS_TOKENS: Tokenization mode ('lines', 'words', 'file')

 Examples:
-    python bench_find.py --dataset README.md --tokens lines
-    python bench_find.py --dataset xlsum.csv --tokens words -k "str.find"
-    STRINGWARS_DATASET=data.txt STRINGWARS_TOKENS=lines python bench_find.py
+    uv run bench_find.py --dataset README.md --tokens lines
+    uv run bench_find.py --dataset xlsum.csv --tokens words -k "str.find"
+    STRINGWARS_DATASET=data.txt STRINGWARS_TOKENS=lines uv run bench_find.py

 Timing via time.monotonic_ns(); throughput in decimal GB/s. Filter with -k/--filter.
 """
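The docstring above notes timing via `time.monotonic_ns` with throughput reported in decimal GB/s. A minimal sketch of such a measurement loop (the function and variable names here are illustrative, not the benchmarks' actual API):

```python
import time


def bench(fn, tokens: list[bytes]) -> float:
    """Time `fn` over all tokens and return throughput in decimal GB/s."""
    total_bytes = sum(len(t) for t in tokens)
    start = time.monotonic_ns()
    for token in tokens:
        fn(token)
    elapsed_ns = time.monotonic_ns() - start
    # bytes / nanoseconds == 10**9 bytes per 10**9 ns == decimal GB/s.
    return total_bytes / elapsed_ns


gbps = bench(len, [b"hello world"] * 10_000)
print(f"{gbps:.3f} GB/s")
```

Note the convenient identity: bytes per nanosecond is numerically equal to decimal gigabytes per second, so no unit-conversion factor is needed.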
bench_fingerprints.py
@@ -18,9 +18,9 @@
 - STRINGWARS_TOKENS: Tokenization mode ('lines', 'words', 'file')

 Examples:
-    python bench_fingerprints.py --dataset README.md --tokens lines
-    python bench_fingerprints.py --dataset xlsum.csv --tokens words -k "datasketch"
-    STRINGWARS_DATASET=data.txt STRINGWARS_TOKENS=lines python bench_fingerprints.py
+    uv run --with stringzillas-cpus bench_fingerprints.py --dataset README.md --tokens lines
+    uv run --with stringzillas-cpus bench_fingerprints.py --dataset xlsum.csv --tokens words -k "datasketch"
+    STRINGWARS_DATASET=data.txt STRINGWARS_TOKENS=lines uv run --with stringzillas-cpus bench_fingerprints.py
 """

 import os
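These scripts accept the dataset path and tokenization mode either as CLI flags or through the `STRINGWARS_DATASET` / `STRINGWARS_TOKENS` environment variables. A minimal sketch of that flag-with-environment-fallback pattern (the benchmarks' real argument parsing may differ):

```python
import argparse
import os


def parse_args(argv=None) -> argparse.Namespace:
    """Parse benchmark options, falling back to STRINGWARS_* env vars."""
    parser = argparse.ArgumentParser(description="Benchmark runner")
    # Explicit CLI flags win; environment variables supply the defaults.
    parser.add_argument("--dataset", default=os.environ.get("STRINGWARS_DATASET"))
    parser.add_argument(
        "--tokens",
        choices=("lines", "words", "file"),
        default=os.environ.get("STRINGWARS_TOKENS", "lines"),
    )
    parser.add_argument("-k", "--filter", default=None,
                        help="Only run benchmarks whose name matches this pattern")
    return parser.parse_args(argv)
```

Because the parser is built inside the function, the environment is consulted at call time, matching the `STRINGWARS_DATASET=… uv run …` invocation style shown in the examples.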
bench_hash.py
@@ -30,9 +30,9 @@
 - STRINGWARS_TOKENS: Tokenization mode ('lines', 'words', 'file')

 Examples:
-    python bench_hash.py --dataset README.md --tokens lines
-    python bench_hash.py --dataset xlsum.csv --tokens words -k "xxhash"
-    STRINGWARS_DATASET=data.txt STRINGWARS_TOKENS=lines python bench_hash.py
+    uv run bench_hash.py --dataset README.md --tokens lines
+    uv run bench_hash.py --dataset xlsum.csv --tokens words -k "xxhash"
+    STRINGWARS_DATASET=data.txt STRINGWARS_TOKENS=lines uv run bench_hash.py
 """

 import argparse
bench_memory.py
@@ -14,8 +14,8 @@
 - Random byte generation: NumPy PCG64, NumPy Philox, and PyCryptodome AES-CTR

 Examples:
-    python bench_memory.py --dataset README.md --tokens lines
-    python bench_memory.py --dataset README.md --tokens words -k "translate|LUT|AES-CTR|PCG64|Philox"
+    uv run bench_memory.py --dataset README.md --tokens lines
+    uv run bench_memory.py --dataset README.md --tokens words -k "translate|LUT|AES-CTR|PCG64|Philox"
 """

 from __future__ import annotations
bench_sequence.py
@@ -17,9 +17,9 @@
 - STRINGWARS_TOKENS: Tokenization mode ('lines', 'words', 'file')

 Examples:
-    python bench_sequence.py --dataset README.md --tokens lines
-    python bench_sequence.py --dataset xlsum.csv --tokens words -k "list.sort"
-    STRINGWARS_DATASET=data.txt STRINGWARS_TOKENS=lines python bench_sequence.py
+    uv run bench_sequence.py --dataset README.md --tokens lines
+    uv run bench_sequence.py --dataset xlsum.csv --tokens words -k "list.sort"
+    STRINGWARS_DATASET=data.txt STRINGWARS_TOKENS=lines uv run bench_sequence.py
 """
 import os

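The `lines`, `words`, and `file` tokenization modes named in these docstrings can be sketched as a simple dispatch (a hypothetical helper for illustration, not the benchmarks' real code):

```python
def tokenize(text: str, mode: str) -> list[str]:
    """Split the dataset into benchmark tokens per the requested mode."""
    if mode == "lines":
        return text.splitlines()
    if mode == "words":
        return text.split()
    if mode == "file":
        return [text]  # the whole dataset as a single token
    raise ValueError(f"unknown tokenization mode: {mode!r}")
```

The mode mainly controls token granularity, and therefore the mix of short-string versus long-string work each benchmark measures.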
bench_similarities.py
@@ -29,9 +29,9 @@
 - STRINGWARS_TOKENS: Tokenization mode ('lines', 'words', 'file')

 Examples:
-    python bench_similarities.py --dataset README.md --max-pairs 1000
-    python bench_similarities.py --dataset xlsum.csv --bio -k "biopython"
-    STRINGWARS_DATASET=data.txt STRINGWARS_TOKENS=lines python bench_similarities.py
+    uv run --with stringzillas-cpus bench_similarities.py --dataset README.md --max-pairs 1000
+    uv run --with stringzillas-cpus bench_similarities.py --dataset xlsum.csv --bio -k "biopython"
+    STRINGWARS_DATASET=data.txt STRINGWARS_TOKENS=lines uv run --with stringzillas-cpus bench_similarities.py
 """

 import os
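bench_similarities.py bounds its pairwise workload with `--max-pairs`, since scoring every pair of tokens grows quadratically. One plausible way to build such a capped pair list (illustrative only; the script's actual pairing strategy may differ):

```python
from itertools import combinations, islice


def token_pairs(tokens: list[str], max_pairs: int) -> list[tuple[str, str]]:
    """Return at most `max_pairs` unique (a, b) token pairs for pairwise scoring."""
    # islice stops consuming combinations lazily once the cap is reached,
    # so no quadratic-sized list is ever materialized.
    return list(islice(combinations(tokens, 2), max_pairs))
```

Capping at the pair level rather than the token level keeps the benchmark's runtime roughly proportional to `max_pairs` regardless of dataset size.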