Releases: davidesantangelo/krep
v2.1.0
What's New in v2.1.0
🔧 New Features
- Stdin pattern input (-f -): Read patterns from stdin for seamless pipeline integration. Example: echo 'pattern' | krep -f - target.txt (#33)
- Gitignore support (--gitignore): Respect .gitignore files when searching recursively with -r. Supports glob patterns, directory-only rules, and negation (!) patterns (#11)
- Algorithm selection (--algo): Override the automatic search algorithm selection. Choose between auto (default), bm (Boyer-Moore-Horspool), or kmp (Knuth-Morris-Pratt). Aho-Corasick is already built-in and auto-selected for multi-pattern searches (#12)
- Automated release binaries: Platform binaries (Linux x86_64, macOS arm64, macOS x86_64) are now automatically built and attached to GitHub releases (#5)
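For readers unfamiliar with the bm option, the following is a minimal sketch of the Boyer-Moore-Horspool technique in C. It is not krep's implementation; the function name bmh_search and its signature are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative Boyer-Moore-Horspool search (hypothetical helper, not krep's code).
 * Returns the offset of the first occurrence of pat in text, or -1 if absent. */
long bmh_search(const char *text, size_t n, const char *pat, size_t m)
{
    size_t shift[256];

    if (m == 0 || m > n)
        return -1;

    /* Default: a byte that does not occur in the pattern allows a full-length skip. */
    for (size_t i = 0; i < 256; i++)
        shift[i] = m;
    /* Bytes that occur in the pattern (excluding its last byte) get a shorter shift. */
    for (size_t i = 0; i + 1 < m; i++)
        shift[(unsigned char)pat[i]] = m - 1 - i;

    for (size_t pos = 0; pos + m <= n; ) {
        if (memcmp(text + pos, pat, m) == 0)
            return (long)pos;
        /* Shift by the table entry of the text byte aligned with the pattern's last byte. */
        pos += shift[(unsigned char)text[pos + m - 1]];
    }
    return -1;
}
```

Calling bmh_search("hello world", 11, "world", 5) returns 6. The shift table is what lets the search skip several bytes per comparison, which is why this family of algorithms tends to suit longer single patterns.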
📦 Platform Binaries
Binaries will be attached to this release once the CI workflow completes:
- krep-linux-x86_64.tar.gz
- krep-macos-arm64.tar.gz
- krep-macos-x86_64.tar.gz
✅ Testing
All 171 tests pass (161 unit tests + 10 directory integration tests).
Full Changelog: v2.0.0...v2.1.0
v2.0.0
krep v2.0.0
Highlights
- Major performance improvements in the search_file threading path (single-thread fast path + lower thread-pool overhead).
- Added reproducible krep vs ripgrep benchmark script: test/benchmark_krep_vs_rg.sh.
- Expanded test coverage with real multithread consistency tests and recursive directory integration tests.
- CI hardened: build + unit tests + directory integration tests on Ubuntu and macOS.
- Recursive skip fix for minified assets (.min.*).
Dataset Benchmark Command
curl -LO 'https://burntsushi.net/stuff/subtitles2016-sample.en.gz'
gzip -dk subtitles2016-sample.en.gz
make bench-rg
v1.5.0
Highlights
- Speed up -c line counting by skipping counted lines in scalar and SIMD search paths.
- Fix case-insensitive single-byte scanning and anchored regex line starts.
- Add regression tests for multi-match line counting and single-char case-insensitive searches.
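The -c speed-up described above can be pictured with a small sketch: once a line is known to contain a match, the scan jumps straight to the next newline instead of looking for further matches on the same line. The helper below is a simplified, scalar, single-byte illustration; count_matching_lines is a hypothetical name and krep's actual scalar and SIMD paths may differ.

```c
#include <stddef.h>
#include <string.h>

/* Count the lines in buf[0..len) that contain the byte `needle` (illustrative only). */
size_t count_matching_lines(const char *buf, size_t len, char needle)
{
    size_t lines = 0;
    const char *cur = buf;
    const char *end = buf + len;

    while (cur < end) {
        const char *hit = memchr(cur, needle, (size_t)(end - cur));
        if (hit == NULL)
            break;
        lines++;
        /* This line is already counted: skip to the byte after its newline. */
        const char *nl = memchr(hit, '\n', (size_t)(end - hit));
        if (nl == NULL)
            break;
        cur = nl + 1;
    }
    return lines;
}
```

For the 10-byte input "aXbX\nccc\nX" and needle 'X', the function returns 2: the second X on the first line is never examined.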
Tests
- make test
v1.4.2
This release brings significant performance improvements, expanded SIMD support, and better cross-platform compatibility.
New Features
AVX-512 SIMD Support
- Added ultra-high-performance AVX-512 search for patterns up to 64 bytes
- Automatic detection and utilization of AVX-512 instructions on supported CPUs
- Graceful fallback to AVX2/SSE4.2 on older hardware
Enhanced Memory Performance
- Added prefetching (__builtin_prefetch) in search functions for better cache utilization
- Reduced MIN_CHUNK_SIZE to 2MB for improved parallelism on multi-core systems
- Added compiler optimization hints (LIKELY/UNLIKELY, HOT_FUNCTION)
Thread Pool Improvements
- Adaptive mutex using PTHREAD_MUTEX_ADAPTIVE_NP where available
- Reduced thread stack size to 256KB for lower memory overhead
- Added batch task submission for improved efficiency
- Smarter thread count selection (cores - 1 for system headroom)
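As a rough illustration of the prefetching and compiler hints listed under Enhanced Memory Performance, the sketch below shows how such macros are commonly defined on GCC and Clang and used in a scan loop. The macro definitions, the PREFETCH_DISTANCE value, and count_byte are assumptions for illustration, not krep's actual definitions.

```c
#include <stddef.h>

/* Common GCC/Clang definitions; krep's own macros may be defined differently. */
#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#define PREFETCH(p) __builtin_prefetch((p), 0, 3) /* read access, keep in cache */
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#define PREFETCH(p) ((void)0)
#endif

#define PREFETCH_DISTANCE 256 /* bytes ahead of the cursor; an assumed tuning value */

/* Count occurrences of needle, hinting that a hit is the uncommon case. */
size_t count_byte(const unsigned char *buf, size_t len, unsigned char needle)
{
    size_t hits = 0;
    for (size_t i = 0; i < len; i++) {
        /* Every 64 bytes, request data that will be needed shortly. */
        if ((i & 63) == 0 && i + PREFETCH_DISTANCE < len)
            PREFETCH(buf + i + PREFETCH_DISTANCE);
        if (UNLIKELY(buf[i] == needle))
            hits++;
    }
    return hits;
}
```

The hints never change the result; they only guide code layout and cache behavior, so the fallback definitions for other compilers are safe no-ops.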
v1.4.1
What's Changed
- Fix heap buffer overflow in memchr_short_search function by @Bleem-Fuzzer in #27
- Fix NULL Pointer Dereference in strcmp by @Bleem-Fuzzer in #29
- Fix heap buffer overflow in regexec by @Bleem-Fuzzer in #31
New Contributors
- @Bleem-Fuzzer made their first contribution in #27
Full Changelog: v1.4.0...v1.4.1
v1.4.0
Added
- NEON SIMD Support: Implemented a fully optimized NEON SIMD search algorithm for ARM64 architectures (e.g., Apple Silicon). This significantly reduces CPU usage and improves search speed for patterns of any length.
- Small File Optimization: Added a specialized path for small files (< 64KB) using read() instead of mmap(), reducing system call overhead and page faults.
Changed
- Thread Optimization: Added padding to thread data structures to prevent false sharing on multicore systems, improving parallel scaling.
- Performance: General CPU usage reduction across all search modes.
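The false-sharing fix under Thread Optimization can be sketched as follows: each thread's hot data is aligned to its own cache line so that writes by one thread do not invalidate a line another thread is using. The type padded_counter_t, the helper add_matches, and the 64-byte line size are assumptions for illustration; krep's thread structures are not shown in these notes.

```c
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64 /* assumed; common on x86-64, some ARM64 parts use 128 */

/* Hypothetical per-thread slot. Aligning the first member forces the whole
 * struct to occupy (and be padded out to) one full cache line. */
typedef struct {
    alignas(CACHE_LINE_SIZE) unsigned long match_count;
} padded_counter_t;

/* Each thread updates only its own slot, so no two threads write the same line. */
void add_matches(padded_counter_t *slots, size_t thread_id, unsigned long n)
{
    slots[thread_id].match_count += n;
}
```

With this layout, adjacent array elements are exactly CACHE_LINE_SIZE bytes apart, at the cost of some unused memory per thread.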
Fixed
- Double Counting Bug: Fixed a logic error in the NEON search optimization that caused some matches to be counted twice when using the -c (count) option.
v1.3.0
Performance Improvements
- Pre-selected search algorithm: Search algorithm is now selected once before thread execution rather than redundantly in each worker thread, reducing overhead in multi-threaded searches
- Sequential file access optimization: Added posix_fadvise with POSIX_FADV_SEQUENTIAL hint to encourage kernel readahead on supported platforms (Linux), improving I/O performance for large file searches
- Conditional sorting optimization: Match results are only sorted when there are 2 or more matches, avoiding unnecessary qsort calls for single-match cases
Memory & Resource Management
- Improved match result merging: Introduced match_result_merge_limited() function to efficiently handle max count limits when merging thread results, replacing the previous item-by-item loop with optimized batch operations
- Stdout buffer initialization fix: Stdout buffer is now initialized only once using a static flag, preventing redundant setvbuf calls
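The batch merging idea can be sketched as a single bounded memcpy in place of an item-by-item loop. The match_t type, the merge_limited name, and its signature below are hypothetical; the real match_result_merge_limited() signature is not given in these notes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical match record; krep's real structure may carry more fields. */
typedef struct {
    size_t start;
    size_t end;
} match_t;

/* Append items from src to dst in one batch, never letting the total exceed
 * max_count or the destination capacity. Returns the number of items copied. */
size_t merge_limited(match_t *dst, size_t *dst_len, size_t dst_cap,
                     const match_t *src, size_t src_len, size_t max_count)
{
    size_t limit = max_count < dst_cap ? max_count : dst_cap;
    if (*dst_len >= limit)
        return 0;

    size_t room = limit - *dst_len;
    size_t n = src_len < room ? src_len : room;
    if (n > 0)
        memcpy(dst + *dst_len, src, n * sizeof *src);
    *dst_len += n;
    return n;
}
```

Merging two three-item thread results with max_count set to 4 copies three items from the first and one from the second, then stops.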
User Experience
- Reduced warning noise: madvise warnings are now emitted only once per execution using an atomic flag, with subsequent warnings suppressed to avoid cluttering output when processing multiple files
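A warn-once guard of this kind can be sketched with C11 atomics. The function name warn_once and the message format are assumptions; the sketch only shows the general pattern of an atomic flag that the first caller sets.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Process-wide flag: clear until the first warning has been printed. */
static atomic_flag warned = ATOMIC_FLAG_INIT;

/* Print msg to stderr the first time only, even if several threads call this
 * concurrently. Returns 1 if the message was printed, 0 if it was suppressed. */
int warn_once(const char *msg)
{
    /* test_and_set returns the previous value: true means someone already warned. */
    if (atomic_flag_test_and_set(&warned))
        return 0;
    fprintf(stderr, "warning: %s\n", msg);
    return 1;
}
```

Because atomic_flag_test_and_set is a single atomic operation, exactly one caller observes the flag as clear, so no mutex is needed.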
Code Quality
- Type definitions reorganization: Moved search_func_t typedef to appear before thread_data_t struct in header file for better logical ordering
- Removed binary artifact: Deleted test_krep binary from version control
Bug Fixes
- Thread result merging refactored: Simplified and more robust logic for merging thread-local match results with proper handling of max count limits and error conditions
v1.2.1
This commit introduces significant improvements to the regex_search function for more robust cursor advancement and corrects command-line argument parsing in main. Tests have been updated accordingly.
krep.c:
- regex_search function:
  - Enhanced Cursor Advancement: Refactored the logic for advancing the search cursor (cur) after a match or non-match. This aims to prevent infinite loops, especially with zero-length matches (e.g., ^$, a*) or when regexec reports unusual offsets (pmatch[0].rm_eo < pmatch[0].rm_so). The new logic ensures cur always progresses by at least one character if a zero-length match occurs or if the current position needs to be skipped (e.g., failed whole-word match).
  - Removed a redundant loop termination condition (if (rem == 0 && cur != text_start) break;).
  - Simplified the max_count check.
  - Improved handling when regexec indicates an invalid match range (eo < so) by advancing past the problematic point.
  - Ensured consistent advancement logic when a whole_word check fails.
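The cursor-advancement rule described for regex_search can be illustrated with a short POSIX regex loop. This is a sketch of the general technique, not krep's code: count_regex_matches is a hypothetical helper, and it omits whole-word checks and max_count handling.

```c
#include <regex.h>
#include <stddef.h>

/* Count matches of a POSIX extended regex in a NUL-terminated text.
 * Returns -1 if the pattern fails to compile. */
long count_regex_matches(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t pm[1];
    long count = 0;
    const char *cur = text;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    for (;;) {
        /* After the first attempt, cur is no longer the true start of the text. */
        int flags = (cur == text) ? 0 : REG_NOTBOL;
        if (regexec(&re, cur, 1, pm, flags) != 0)
            break;

        if (pm[0].rm_eo < pm[0].rm_so) {
            /* Invalid range reported: step past the problematic point. */
            if (*cur == '\0')
                break;
            cur++;
            continue;
        }

        count++;
        const char *match_end = cur + pm[0].rm_eo;
        if (pm[0].rm_eo == pm[0].rm_so) {
            /* Zero-length match: force progress by one character. */
            if (*match_end == '\0')
                break;
            cur = match_end + 1;
        } else {
            cur = match_end;
        }
    }

    regfree(&re);
    return count;
}
```

Without the forced one-character step, a pattern such as a* or ^$ would match the empty string at the same offset forever. With it, count_regex_matches("ab", "abxab") returns 2 and patterns that match the empty string still terminate.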
- main function (Argument Parsing):
  - Corrected -s (String Mode) Handling: Fixed the parsing of arguments for the -s option. The argument immediately following -s is now correctly treated as the PATTERN, and the subsequent non-option argument is taken as the STRING_TO_SEARCH.
  - Improved Pattern and Target Logic: Refined the logic for identifying the primary pattern when not provided by -e, -f, or -s.
  - Robust Target Argument Identification: Enhanced the determination of the target_arg (file, directory, or string to search), especially when dealing with stdin or missing file/directory arguments.
  - Updated error messages for missing patterns or target strings to be more specific.
test/test_regex.c:
- TEST_ASSERT Macro:
  - Modified the output format to ✓ PASS: message or ✗ FAIL: message.
  - Removed the printing of file and line numbers for failed assertions to simplify test output.
- Test Case Cleanup: Removed numerous verbose comments and explanations from individual test functions, aiming for conciseness.
- test_regex_vs_literal_performance:
  - Updated the initialization of bm_params to be more explicit and ensure proper memory management (allocation and cleanup).
  - Slightly adjusted the generation of large_text for the performance benchmark.
  - Switched to using regex_search_compat for the regex part of the performance test.
- Minor cleanups in other test functions, such as removing redundant comments.
These changes enhance the reliability of krep's search functionality and its command-line interface.
v1.2.0
This release focuses on significantly improving the performance and correctness of multi-pattern searching using the Aho-Corasick algorithm, along with other fixes and enhancements.
✨ Features & Performance
- Aho-Corasick Performance Boost: The Aho-Corasick trie is now built only once per file or string search context, instead of repeatedly. This drastically reduces overhead when searching with multiple literal patterns, especially on large files or with multiple threads.
- Optimized -o Output: Improved the efficiency and safety of printing only matching parts (-o) using buffered output and better overflow handling.
🐛 Bug Fixes & Correctness
- Aho-Corasick Correctness:
  - Fixed the failure link calculation logic during trie construction.
  - Corrected the search algorithm to ensure all matches, including overlapping ones (e.g., finding "he", "she", and "hers" in "ushers"), are reported correctly.
  - Improved handling of empty patterns (e.g., krep "" file.txt) to ensure they match empty lines/files as expected.
  - Ensured case-insensitive matching (-i) works correctly with Aho-Corasick.
- Regex Search Robustness:
  - Improved error handling during regex execution.
  - Fixed potential infinite loops related to zero-length matches.
  - Correctly integrated the whole-word (-w) check with regex searches.
- -o Output Formatting: Replaced internal newlines within matched segments with spaces when using the -o flag for cleaner, single-line output per match.
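To make the overlapping-match example concrete ("he", "she", and "hers" in "ushers"), here is a compact Aho-Corasick sketch in which the automaton is built once and can then be reused for any number of searches. It is not krep's implementation: the ac_t type, the ac_* function names, the fixed node limit, and the restriction to lowercase a-z are all simplifying assumptions.

```c
#include <string.h>

#define AC_MAX_NODES 64
#define AC_ALPHA 26 /* lowercase a-z only, to keep the sketch short */

typedef struct {
    int next[AC_MAX_NODES][AC_ALPHA]; /* transitions; -1 means absent until ac_build */
    int fail[AC_MAX_NODES];           /* failure links */
    int out[AC_MAX_NODES];            /* patterns ending here, including via failure links */
    int nodes;
} ac_t;

void ac_init(ac_t *ac)
{
    memset(ac->next, -1, sizeof ac->next);
    memset(ac->fail, 0, sizeof ac->fail);
    memset(ac->out, 0, sizeof ac->out);
    ac->nodes = 1; /* node 0 is the root */
}

/* Insert one pattern into the trie. Returns 0 on success, -1 on bad input or overflow. */
int ac_add(ac_t *ac, const char *pat)
{
    int s = 0;
    for (; *pat; pat++) {
        int c = *pat - 'a';
        if (c < 0 || c >= AC_ALPHA)
            return -1;
        if (ac->next[s][c] < 0) {
            if (ac->nodes == AC_MAX_NODES)
                return -1;
            ac->next[s][c] = ac->nodes++;
        }
        s = ac->next[s][c];
    }
    ac->out[s]++;
    return 0;
}

/* Compute failure links breadth-first and fill in missing transitions. Call once. */
void ac_build(ac_t *ac)
{
    int queue[AC_MAX_NODES], head = 0, tail = 0;

    for (int c = 0; c < AC_ALPHA; c++) {
        int t = ac->next[0][c];
        if (t < 0)
            ac->next[0][c] = 0;
        else
            queue[tail++] = t; /* depth-1 nodes fail to the root */
    }
    while (head < tail) {
        int s = queue[head++];
        /* Inherit matches that end at the longest proper suffix state. */
        ac->out[s] += ac->out[ac->fail[s]];
        for (int c = 0; c < AC_ALPHA; c++) {
            int t = ac->next[s][c];
            if (t < 0) {
                ac->next[s][c] = ac->next[ac->fail[s]][c];
            } else {
                ac->fail[t] = ac->next[ac->fail[s]][c];
                queue[tail++] = t;
            }
        }
    }
}

/* Count all pattern occurrences in text, overlapping ones included. */
long ac_count(const ac_t *ac, const char *text)
{
    long total = 0;
    int s = 0;
    for (; *text; text++) {
        int c = *text - 'a';
        if (c < 0 || c >= AC_ALPHA) {
            s = 0; /* bytes outside the alphabet reset to the root */
            continue;
        }
        s = ac->next[s][c];
        total += ac->out[s];
    }
    return total;
}
```

After adding "he", "she", and "hers" and calling ac_build once, ac_count on "ushers" returns 3: reaching the end of "she" also reports "he" through the failure link, and "hers" is found two characters later. The same built automaton can be passed to every subsequent search, which is the kind of reuse that building the trie once per search context enables.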
⚙️ Internal & API
- Aho-Corasick Refactoring: Major internal refactoring of the Aho-Corasick implementation for clarity, robustness (dynamic queue resizing, overflow checks), and better API structure (using search_params_t to pass the pre-built trie).
- Header Decoupling: Reduced dependencies in aho_corasick.h.
- Made the internal lower_table globally accessible for shared use.
✅ Testing
- Updated Aho-Corasick tests to reflect the new pre-build API.
- Added checks for successful trie building.
- Added a new performance test comparing combined Aho-Corasick vs. individual literal searches.
v1.1.2
Changed
- Improved Output Formatting and Batching: Refactored the print_matching_items function for enhanced performance and reliability in managing line formatting and batch output. Key improvements include:
  - Pre-calculating buffer sizes for more efficient memory allocation.
  - Utilizing direct pointers for writing operations to potentially speed up output.
  - Enhanced batch buffer management, including better handling of lines that might exceed buffer capacity.
  - Added error checking for output operations to improve robustness.
- Cleaned up and standardized header comments within source files for better clarity and consistency.