Release v3.0.0: Hierarchical Break Detection · SCKelemen/unicode

Performance Improvements

Version 3.0.0 implements hierarchical optimization for the single-pass FindAllBreaks() API introduced in v2.0.0.

The optimization leverages the natural subset relationships between break types. This eliminates redundant checks and yields a 1.57x to 2.24x speedup in the benchmarks below.
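The pruning idea can be sketched roughly as follows. This is an illustration only, not the library's implementation. It assumes the hierarchy is sentence breaks ⊆ word breaks ⊆ grapheme breaks, and the names `findAllBreaksSketch`, `isGraphemeBreak`, `isWordBreak`, and `isSentenceBreak` are hypothetical toy stand-ins, not UAX #29 rules or functions from this package.

```go
package main

import (
	"fmt"
	"unicode"
)

// Breaks holds break positions (rune indexes) for each level.
type Breaks struct {
	Graphemes, Words, Sentences []int
}

// Toy rule (assumption): break before every rune except a combining mark.
func isGraphemeBreak(rs []rune, i int) bool {
	return !unicode.Is(unicode.Mn, rs[i])
}

// Toy rule (assumption): break where whitespace meets non-whitespace.
func isWordBreak(rs []rune, i int) bool {
	return unicode.IsSpace(rs[i-1]) != unicode.IsSpace(rs[i])
}

// Toy rule (assumption): break after a terminator followed by one space.
func isSentenceBreak(rs []rune, i int) bool {
	if i < 2 || !unicode.IsSpace(rs[i-1]) {
		return false
	}
	switch rs[i-2] {
	case '.', '!', '?':
		return true
	}
	return false
}

// findAllBreaksSketch tests the cheaper, broader rule first and only
// evaluates the narrower rule at positions that passed the broader one.
func findAllBreaksSketch(text string) Breaks {
	rs := []rune(text)
	var b Breaks
	for i := 1; i < len(rs); i++ {
		if !isGraphemeBreak(rs, i) {
			continue // not a grapheme break, so it cannot be a word or sentence break
		}
		b.Graphemes = append(b.Graphemes, i)
		if !isWordBreak(rs, i) {
			continue // not a word break, so it cannot be a sentence break
		}
		b.Words = append(b.Words, i)
		if isSentenceBreak(rs, i) {
			b.Sentences = append(b.Sentences, i)
		}
	}
	return b
}

func main() {
	b := findAllBreaksSketch("Hi. Go on")
	fmt.Println(b.Graphemes, b.Words, b.Sentences)
	// prints: [1 2 3 4 5 6 7 8] [3 4 6 7] [4]
}
```

In this sketch the sentence rule runs at only 4 of the 8 candidate positions, which is the kind of saving the hierarchy makes possible.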

Benchmarks on an Apple M4 Pro, comparing the v3.0.0 single-pass call against three separate function calls:

Text Length	v2.0.0 Three Passes	v3.0.0 Single Pass	Speedup
Short (33 chars)	3,457 ns/op	2,197 ns/op	1.57x
Medium (86 chars)	16,191 ns/op	9,636 ns/op	1.68x
Long (467 chars)	423,491 ns/op	188,982 ns/op	2.24x
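
The Speedup column appears to be the three-pass time divided by the single-pass time. A small sketch that recomputes it from the timings in the table (the `speedup` helper is a hypothetical name used only for this illustration):

```go
package main

import "fmt"

// speedup returns how many times faster the new timing is than the old one.
func speedup(oldNs, newNs float64) float64 {
	return oldNs / newNs
}

func main() {
	// Timings (ns/op) copied from the benchmark table.
	rows := []struct {
		name         string
		oldNs, newNs float64
	}{
		{"Short", 3457, 2197},
		{"Medium", 16191, 9636},
		{"Long", 423491, 188982},
	}
	for _, r := range rows {
		fmt.Printf("%-6s %.2fx\n", r.name, speedup(r.oldNs, r.newNs))
	}
}
```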

Key benefits:

Speedup increases with text length (hierarchical pruning is more effective on longer text)
Single UTF-8 decode and classification pass
Pre-classified data reused across all three break types
No additional memory allocations compared to v2.0.0
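
The "decode once, classify once, reuse" benefit can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `class` bitmask, the `classify` and `count` helpers, and the three toy properties are hypothetical and much simpler than real Unicode property tables, and this sketch does not try to match the library's allocation behavior.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// class is a hypothetical bitmask of per-rune properties.
type class uint8

const (
	classSpace class = 1 << iota
	classMark
	classTerminator
)

// classify decodes the UTF-8 text exactly once, recording each rune's
// class and its starting byte offset.
func classify(text string) (classes []class, offsets []int) {
	for i := 0; i < len(text); {
		r, size := utf8.DecodeRuneInString(text[i:])
		var c class
		if unicode.IsSpace(r) {
			c |= classSpace
		}
		if unicode.Is(unicode.Mn, r) {
			c |= classMark
		}
		if r == '.' || r == '!' || r == '?' {
			c |= classTerminator
		}
		classes = append(classes, c)
		offsets = append(offsets, i)
		i += size
	}
	return classes, offsets
}

// count reads only the precomputed classes; it never re-decodes the text.
func count(classes []class, want class) int {
	n := 0
	for _, c := range classes {
		if c&want != 0 {
			n++
		}
	}
	return n
}

func main() {
	classes, offsets := classify("e\u0301. x")
	// Three different consumers share the same classification.
	fmt.Println(count(classes, classMark), count(classes, classSpace), count(classes, classTerminator))
	fmt.Println(offsets)
}
```

Each break type would consult the shared `classes` slice instead of decoding and looking up properties again, which is where three separate passes repeat work.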

v3.0.0 maintains 100% conformance on all official Unicode 17.0.0 test suites.

Breaking Changes

None - all existing APIs remain backward compatible.