Commit 58c7e15

Author: TinySemVer
Release: v4.0.0 [skip ci]
### Major

- Break: Output error messages via C API (fc94890)
- Break: New wording for incremental hashers (66898ad)
- Break: Rust namespaces layout (b3338bb)
- Break: Refactor `Strs` ops (009b975)
- Break: Rename again (fb56b60)
- Break: Drop fingerprinting bench (3e04157)
- Break: `sz::edit_distance` -> Levenshtein (d44beb4)
- Break: C++ `lookup` and `fill_random` (1ce830b)
- Break: `charset`/`generate` -> `byteset`/`fill_random` (2ce2b49)
- Break: New calling convention in `similarity.h` (095bc2d)
- Break: Return error-codes in sort functions (944804e)
- Break: `look_up_transform` to `lookup` API (e0055d5)
- Break: `checkum` to `bytesum`, new hash, and PRNG (71f1f4b)
- Break: Pointer-sized N-gram Sorting (0c38bff)
- Break: `sz_sort` now takes allocators (ec81663)
- Break: Deprecate old fingerprinting (38014ee)
- Break: Replace `char_set` constructor with literals (2c49eae)

### Minor

- Add: `Str.count_byteset` for Python (802d699)
- Add: GoLang official support (234b758)
- Add: `HashMap` traits for Rust (e555cc3)
- Add: `try_resize_and_overwrite` (e00a98b)
- Add: Hashers for Swift (be363c9)
- Add: Big-endian SWAR backends (607dd14)
- Add: Zero-copy JS wrapper for buffers (4bf2dd6)
- Add: `sz.fill_random` for Python (3565f9b)
- Add: Allow resetting the dispatch table (68172a0)
- Add: Ephemeral GPU executors if no device is passed (6431900)
- Add: Capabilities getters for SW (86c89b7)
- Add: StringZillas for Rust draft (48a1120)
- Add: Fingerprinting benchmarks (471649e)
- Add: On GPU fingerprints in Python (6f26629)
- Add: `basic_rolling_hashers` CUDA port (a25e3f2)
- Add: Unrolled fingerprinting backends (9a04c95)
- Add: NW and SW scoring classes (958c9e1)
- Add: SW scoring (3e8a3a6)
- Add: NW scoring (079a8c1)
- Add: Capability-constrained Py constructors (d0dfa0e)
- Add: `LevenshteinDistancesUTF8` in Py (4cd6c7d)
- Add: Wrap `DeviceScope` for Python (30b8fd3)
- Add: Make `Strs` from lists, tuples, generators (05a5434)
- Add: StringZillas Python tests (978d13d)
- Add: `Strs` layout conversion tests (4e5df62)
- Add: Exportable `_sz_py_api` capsules (9d5935b)
- Add: Levenshtein kernels in shared lib (42a815f)
- Add: `Strs.from_arrow` conversion (0441b32)
- Add: Draft fingerprinting C binding (3f5e004)
- Add: Fingerprinting baselines (f997dad)
- Add: `SZ_NOINLINE` (4e6e1af)
- Add: Draft fingerprinting on GPUs (11af644)
- Add: Haswell rolling fingerprints (25d3ee6)
- Add: Draft exploration of `float` fingerprints (3955eee)
- Add: `lock_guard` to avoid STL (e6cdb93)
- Add: Parallel fingerprinting (4ccac06)
- Add: `to_span`, `to_view` helpers (4fb9283)
- Add: Min-Hashing `basic_rolling_hashers` (63447d3)
- Add: Fingerprinting benchmarks (00c68d5)
- Add: 64-bit `double` fingerprinting (9ed6006)
- Add: Test rolling hashes (a065aa0)
- Add: Fingerprinting drafts are back :) (9ab4a2b)
- Add: Draft parallel library backend (94147ed)
- Add: Long haystack CUDA kernels for find-many (51c9171)
- Add: `find_many` minimal counter & benchmarks (3c615d9)
- Add: `bench_find_many.cpp` (2d594e4)
- Add: CUDA Aho Corasick placeholder (d087afa)
- Add: Parallel Ice Lake variants (6aaf16c)
- Add: NW and SW for Hopper (e53c8d5)
- Add: Hopper Levenshtein kernels (cb7c48d)
- Add: Affine gaps Levenshtein on Kepler (f09bbf9)
- Add: Ice Lake Affine Levenshtein kernels (51afad5)
- Add: Affine Levenshtein variants on GPU (3e1df93)
- Add: `sz_similarity_gaps_t` enums (5b410b8)
- Add: Concepts & new multithreading scheme (6be0da5)
- Add: Draft StringCuZilla C API (1975351)
- Add: Draft executors for StringCuZilla (e1a37de)
- Add: C++20 concepts (f7b091c)
- Add: Unmasked Ice Lake Levenshtein (2b4fa09)
- Add: Affine Levenshtein-Gotoh variants (25ab3b6)
- Add: Draft global scoring in CUDA (07196d0)
- Add: Affine gap extensions baseline (945613c)
- Add: Draft device-wide similarity kernels (871d7bc)
- Add: Levenshtein on Kepler (938bf6f)
- Add: Parallel multi-needle search with OpenMP (c8dc314)
- Add: Draft parallel substring search (cf53b9b)
- Add: Overflow-risk
error codes (783cfa8)
- Add: Random access ranges (0a26059)
- Add: `safe_vector`, `safe_array` (ac6218a)
- Add: Immutable iterators for Arrow tapes (b70096b)
- Add: `indexed_container_iterator` (a5ca2ac)
- Add: Multi-pattern exact substring search (a83cdb5)
- Add: NW benchmarks on GPU (1879aeb)
- Add: Multi-flag `sz_caps` (f76fa40)
- Add: Repeat similarity benchmarks on CPU (f64fc56)
- Add: `gpu_specs` and `cuda_status_t` (c36d2b8)
- Add: Ice Lake similarity kernels (3c8d181)
- Add: Fetching Nvidia GPU specs (d0bdd17)
- Add: New batch similarity benchmarks (aca3301)
- Add: Warp-shuffle optimizations (4a62715)
- Add: Mem consumption CUDA tests (53d3e0d)
- Add: New NW and SW GPU kernels (ca0fece)
- Add: Blosum62 and NUC.4.4 matrices (8a99373)
- Add: Separate parallel & serial tests (3cb8dd0)
- Add: Parallel SW & NW scoring in OpenMP (85b9675)
- Add: Thrust-like `constant_iterator` (6751e79)
- Add: `arrow_strings_tape::try_append` (6d7f221)
- Add: CUDA scoring benchmarks (7ac0fdd)
- Add: Baseline NW and SW alignment with ~O(NM) space (044c7cc)
- Add: Horizontal scoring in OpenMP (02a2481)
- Add: Local alignment adaptation (109d8de)
- Add: Parallel GPU kernels (2861a85)
- Add: OpenMP `score_diagonally` (427d5b5)
- Add: `lookup` transform in Rust (3090472)
- Add: OpenMP C++ draft (6f8cdb9)
- Add: Stateful hashing in Rust (763538e)
- Add: Expose `find_byteset` in Rust (667ea91)
- Add: Set intersections in Rust (06784fc)
- Add: Sorting in Rust (e467649)
- Add: New memory benchmarks (dfd6ddf)
- Add: Draft `sz_find_sve` (d7ede5d)
- Add: Faster `find_byte` with SVE (56a49c8)
- Add: New string similarity benchmarks (c12e6c4)
- Add: Short string hashing in SVE2 (a007c7c)
- Add: SVE & SVE2 bytesum (d6f87d7)
- Add: SVE2 macros (9fbdc9a)
- Add: Intersection benchmarks (e860af0)
- Add: SVE backend for sorting (148b615)
- Add: All new benchmarking suite (4744406)
- Add: Comparisons in SVE (c31020d)
- Add: Arm NEON hashing (4b3847d)
- Add: Missing SVE placeholder definition
(63f0368)
- Add: C++ `argsort`, `intersect` (5ea0698)
- Add: `status_t` for errors in C++ (63daa5f)
- Add: Feature-extraction placeholder (de62723)
- Add: Intersections on Ice Lake (ea5dc76)
- Add: Serial JOINs (c7b841e)
- Add: Dispatched version API (9a32744)
- Add: Fetching dynamic library version in C (3538e97)
- Add: PRNG for Haswell & serial backend (6659aa0)
- Add: Streaming hashing on Ice Lake & Skylake X (2607d45)
- Add: Streaming hash benchmarks (8ac3a23)
- Add: Hashing on Haswell & Skylake-X (3c345bc)
- Add: Missing `sz_sequence_t` helpers (dc7c109)
- Add: Sorting placeholders & dispatch (cc98389)
- Add: `sz_sequence_argsort_ice` (69d4ecb)
- Add: AES-based hash placeholders (cb18c78)
- Add: Smaller Sorting Networks (cd6859a)
- Add: String sorting tests for different lengths (c670ccd)
- Add: Separate Skylake-X & Ice Lake checksums (554f50d)
- Add: New Levenshtein distance kernels (43471aa)
- Add: Missing Rust interfaces (1765f33)

### Patch

- Docs: Pre-release stats update (c08c6c3)
- Make: Upgrade `setuptools` in CI (492ecc0)
- Improve: Allow `RuntimeError` for engine calls (f9cbb00)
- Docs: Section titles (755f583)
- Improve: PyTest invalid input arguments (f5e46d5)
- Improve: All new similarity-scoring benchmarks (3713e72)
- Make: Option to disable sanitizers for masked IO (f880621)
- Make: Default to Python 3.12 for better `itertools` (f2704d7)
- Fix: Naming scorers in Python like in Rust (d0bc604)
- Fix: Avoid SWAR on big-endian (230e354)
- Improve: More big-endian SWAR tests (7a4b78f)
- Fix: Drop old similarity APIs in benchmarks (230fc13)
- Make: Packaging for Py & Go (4c819d6)
- Improve: Byte-set counting PyTests (bd692d2)
- Docs: What to know about CUDA (15349c6)
- Improve: `Str_like_*` naming convention (5fdd8ee)
- Fix: Merge artifacts & lifetime annotations (c71e67e)
- Docs: Wording & AI dashes (4a06f3f)
- Make: Packaging for NPM (0d95f13)
- Make: Drop long-deprecated `.releaserc` (8b5af6e)
- Make: Enable SIMD in NodeJS builds (c8f6a49)
- Improve: Polish parallel string test names (61cc860)
- Make: Reuse SIMD compilation flags in `build.rs` (d129f95)
- Make: Missing AES definitions for lib builds (e951c4d)
- Docs: Hashing sections for each SDK (cb4fe1b)
- Improve: Expose `.capabilities` to JS (c557a62)
- Improve: Test incremental hashers (7c6d37d)
- Fix: Self-move construction of `basic_string` (5fb0e74)
- Fix: Strict aliasing violation (a5a2421)
- Fix: `__cpp_lib_string_resize_and_overwrite` test guards (cb2d6a8)
- Docs: How to use parallel algorithms (ecad156)
- Fix: Stricter following of `SZ_AVOID_STL` (c3040b7)
- Docs: JS Quick Start (c923ed2)
- Make: Publish to NPM (b2cedc4)
- Make: Drop CodeQL noise (7c9caac)
- Docs: Describe dynamic dispatch & linking (e6f2c02)
- Improve: NodeJS groundwork & corner-case tests (#151) (00d75f5)
- Fix: `_MSC_VER` to `__GNUC__` conditions (83492ac)
- Fix: Unknown pragmas in MSVC (#231) (78bbc11)
- Make: Parallel algorithms CI/CD for PyPI (9b8c466)
- Make: Ignore formatting blames (b69206f)
- Make: JSON & YAML uniform formatting (320bddd)
- Make: Explicit CodeQL coverage in CI (e9fb38d)
- Fix: Match new C-level `DeviceScope` behavior (f217b86)
- Fix: Sorting on big-endian `s390x` (8d2d9c8)
- Fix: Ruff-statically suggested issues (edaff0e)
- Improve: Disable `E722` import exception warning (3aedfb2)
- Fix: Check for immutable Py buffers (e8c437e)
- Improve: Test PRNG in Py and boundary Strs sizes (3d76e8f)
- Fix: `np.random` v1 vs v2 compatibility also in `szs.` (5f8ac03)
- Fix: Prevent PyTest from parsing invalid UTF-8 (30c8935)
- Fix: Don't repeat seed-ed fuzzy tests (efceb0b)
- Fix: `np.random` v1 vs v2 compatibility (aea096b)
- Make: Forwarding `SZ_IS_QEMU_` (d433db0)
- Fix: Minor logical inconsistencies & unused vars (679f7b9)
- Improve: Disable SVE in QEMU runs (072ee2e)
- Fix: Avoid `sys.getrefcount` tests on PyPy (80fa90e)
- Improve: Check rich comparisons before sorting (a21c44d)
- Improve: Session-scope fixture for PyTest env logs (3d84812)
- Improve:
Fuzz PyTests and log environment (6545a04)
- Make: Respect env-vars for `-arch` (6a450f6)
- Make: Upgrade GitHub actions (dec93d0)
- Make: Avoid universal builds defaults for `pip install .` (c462f42)
- Improve: Guard compiler pragmas (42ad14d)
- Make: Bump FU to avoid missing `+wfxt` target (337257b)
- Make: Detect CPU AES support on Arm (ae74d44)
- Make: Reinstall pre-packaged CMake on macOS-14 (24967c7)
- Fix: Fall-back CPU alloc for fingerprints (b582d5d)
- Make: Drop macOS Universal builds (e5cfb08)
- Fix: Sorting difference on 32/64 bit machines (0e17a3d)
- Make: Respect `MACOSX_DEPLOYMENT_TARGET` (95a95fe)
- Make: Avoid `fail-fast` for Python pre-release wheels (a7c3f04)
- Fix: Win32 compilation issues (f792366)
- Improve: Test against Affine Gaps (65d323e)
- Improve: Avoid many unified memory re-allocs (d0db2d4)
- Fix: Intersect scopes HW capabilities (e4245b7)
- Make: Drop Python 3.7, require 3.8+ (c65bf5e)
- Make: Verbose PyTest logging in CI (c273d52)
- Fix: OS-feature-gate AVX checks (cd81b4a)
- Fix: Linking to 64-bit symbols (7da4dac)
- Make: Log HW caps in CI before tests (2c02d5b)
- Fix: Type-casting on MSVC (a347dcd)
- Make: Irrelevant links & comments (d6221e6)
- Make: Target Hopper `90a` in Py & C (c317280)
- Make: Outdated 64-bit detection envs (72d0f4b)
- Docs: Wording typos (cf1623c)
- Make: Missing `affine-gaps` dep (fa52363)
- Docs: No Alpine flow on release CI (f214169)
- Fix: Argument order (10dff6c)
- Make: Can't read `SWIFT_VERSION` for `container` (a4015f1)
- Improve: Differentiate `capabilities_mode` in PyTest (2bee1e4)
- Fix: Dispatch serial code for `bytes_per_cell <= 2` (48b4406)
- Improve: New error codes for CPU/GPU interop (6e597ca)
- Fix: Avoid UB assigning i8x256x256 matrix (7ac3b83)
- Make: Bump FU due to sign conversions warnings (cbd160a)
- Fix: Detect missing GPUs at runtime (0bc0f8a)
- Improve: Formatting Swift (3d6669c)
- Improve: Test against `affine-gaps` (48a538c)
- Improve: Reduce sign-casting issues (a340131)
- Fix:
Check CUDA in `szs_capabilities` (42f043c)
- Improve: Build & test reproducibility (446e14e)
- Make: Override `/std:c++` for MSVC (930b1f0)
- Make: Add CUDA to GitHub CI (2a4552c)
- Make: Workaround CI issues (46410c6)
- Make: No `bare` builds on Windows & macOS (b42a340)
- Fix: MSVC compilation issues (5f60fe4)
- Improve: Simplify setting thread-counts (20bbe22)
- Make: Install `sz` before `szs` in CI (9033649)
- Make: Skip x86 intrinsics in `universal` builds (ce6cab2)
- Fix: Inferring Ice Lake similarity kernels (a65dc99)
- Fix: Disambiguate `szs_` symbols (1fd5db8)
- Make: Override VS Code compiler choice on `osx` (d06d4e4)
- Make: Avoid SVE builds on macOS (abb970b)
- Fix: Wrong `SZ_DYNAMIC_DISPATCH` check (acdab3f)
- Make: Preinstall `wheel` in CI (62f8c98)
- Make: Bump tapes to 4.0 (72fd6be)
- Fix: Avoid forced inlining for HW flags (4cef520)
- Fix: Feature-checking STL (b1418b5)
- Fix: Missing `allocator_traits` include (44b8d72)
- Fix: Workaround for `static_cast` (b6a4cc6)
- Fix: Avoid unaligned XMM loads (0f840bb)
- Fix: `static_cast` to standard for MSVC (43c953f)
- Make: Avoid `uv` in GitHub CI (89ed74d)
- Fix: Unused variable in `group_by` (3761107)
- Fix: Unused `qsort` on MacOS (cd263ae)
- Improve: Unaligned loads in serial hashes (5e9d488)
- Make: Parallel backends CI (352b48d)
- Make: Referencing old tests (7c4fe32)
- Make: Caps introspection flags on Arm (d6c7cf3)
- Fix: Converting to string views (6b00ab9)
- Fix: Unused symbols (e825ed8)
- Fix: Fetching engines `::capability_k` (ddc640b)
- Fix: uninitialized intersection `count` (ef3ca96)
- Make: Disable NUMA by default (b27552d)
- Fix: Guard SVE checks for cross-compilation (dac3941)
- Make: Install Git on Alpine (1ff0c15)
- Make: Log Alpine version (7ea327e)
- Fix: Type-casting seed on Clang (4598f42)
- Make: Bump Fork Union to 2.2.2 (e674413)
- Fix: Deprecate `levenshteinDistance` in Swift (70c4add)
- Fix: Passing StringZillas doctests (529fb76)
- Fix: Allow NULL allocator args (b0c33bd)
- Make: NVCC flags for Rust (6b38ea2)
- Fix: Report requesting 1 CPU core (4ef0464)
- Improve: Passing StringZillas.rs tests (34bb89a)
- Improve: Use StringTape for GPU backends (87a0767)
- Docs: StringZillas C API (34f4137)
- Fix: Memalloc initialization on MSVC (#230) (a6e0a77)
- Docs: Drop OpenMP and old name (7f45118)
- Improve: Infer `capabilities` from `DeviceScope` (a807eba)
- Improve: Introspect `sz_device_scope_t` (d9100a3)
- Make: Consistent `-O2` optimization (5fcde22)
- Improve: Drop unused `info1` (f4d4a76)
- Fix: Rendering byte strings in Python (cf73d79)
- Fix: Forward errors from `sz_rune_parse` (474dec4)
- Improve: PyTest different MinHash dimensions (b5060a8)
- Fix: Expose `value_type` for CUDA fingerprinter (7724886)
- Fix: Fingerprinting memory management (f8dea13)
- Fix: `to_span` compilation (e7fdd98)
- Improve: Wrap high-dim fingerprints (bbf30d2)
- Fix: Handling empty strings in arrays (517757c)
- Improve: More readable PyTest (a6d3ed2)
- Fix: Checking for Ice Lake caps (4f7649a)
- Improve: Simpler & slower Py args parsing (36745f4)
- Improve: Propagate error message to Py (711bd63)
- Improve: Comparing 2 mem-allocators (7870cd3)
- Fix: Skip missing `affine_levenshtein_utf8_ice_t` (1e112c6)
- Make: Custom `CudaBuildExtension` for Python (4abf63c)
- Make: Option to disable CUDA builds (84492e5)
- Fix: Refer to `prong_t` in executor concepts (57d4ec8)
- Fix: MSVC & Clang compilation errors (31fcdb3)
- Make: Avoid OpenMP in builds (183ea96)
- Fix: Announce `LevenshteinDistancesUTF8Type` (cb81c00)
- Improve: Cache hardware capabilities (b9a1109)
- Improve: Export capabilities as a tuple (0223b62)
- Improve: Printing CUDA caps (94d8c36)
- Fix: Match Apache Arrow layout (38c87b4)
- Improve: Type-casting `seed`s in `Strs.sample` (5ac53c3)
- Improve: Constructing `Strs` from PyArrow (7aebef4)
- Fix: Track ownership of `Strs` offsets (65380ca)
- Improve: Expose `sz_capabilities` in non-dynamic builds (6206eb4)
- Fix: Avoid depending SZS -> SZ (c411e37)
- Docs: Mark programming languages correctly (d1fc68c)
- Make: Bump C++ & CUDA to 20 for libs (320f3da)
- Fix: `rebind_alloc` in C++20 (492d726)
- Fix: Tautological compare check (b924d90)
- Improve: Random-access similarity outputs (c5e778d)
- Make: Building parallel Python packages (0f44c25)
- Fix: Clang build warnings (7741272)
- Fix: Using braces for Clang builds (325cedb)
- Make: Pull submodules in CI (5bb90b3)
- Fix: Avoid `_mm256_cvtepi64_epi32` on Haswell (014002e)
- Make: Separate parallel library sources (ff019b1)
- Make: Forward `march` flags through NVCC (496ae84)
- Make: Move CUDA lib into header (d4a66c5)
- Make: FMA flag for Haswell (2e1daa4)
- Fix: Compilation of all C targets (a2b228c)
- Make: Compiling StringZillas shared libs (6ac80e8)
- Improve: `gpu_specs_fetch` & GPU args order (3d7b491)
- Docs: Sync description one-liner (ab9b617)
- Improve: Draft parallel fingerprinting API (e49f570)
- Improve: Runtime variable window widths (031d067)
- Make: Rename `lib.rs` (5d82454)
- Docs: Reuse operators state (e5e2702)
- Improve: Naming multi-input processors (a3c3510)
- Fix: Scramble results between fingerprint benchmarks (fbf7203)
- Make: Format Python to 120 columns (640b7c4)
- Improve: Naming internal symbols (3b48c93)
- Improve: Unroll CUDA fingerprints (37f3d80)
- Docs: Refresh Python benchmarking suite (49cf4ea)
- Fix: Weird compiler bug related to `cuda_status_t` (0a0955e)
- Fix: Fingerprinting in CUDA (52b1d73)
- Fix: Estimating hash counts in fingerprints (058af71)
- Improve: Unroll & parallelize fingerprinting (cda36fd)
- Fix: Inferring the prong type of executors (531b1e9)
- Improve: Align thread-pool within stack-frame (b1077a4)
- Improve: Wording inconsistencies (afaf11b)
- Make: Launchers for Parallel C++ benchmarks (9f3beac)
- Fix: Fingerprinting via Skylake extensions (08c1e86)
- Fix: Passing fingerprinting builds (44a058d)
- Improve: Include hash counts in fingerprints (0d9ba5b)
- Improve: Consistent kernel naming without underscore
prefixes (05725b2)
- Improve: Expose floating-point SIMD states (4a57789)
- Fix: Consistent `barrett_mod` in C++ & Python (1fc1cab)
- Docs: Using `uv` for tests (f86dfaa)
- Fix: Choosing co-primes with `std::gcd` (a1b3001)
- Improve: Using fast calling convention for CPython (97ab23c)
- Docs: Show higher recall with better hashes (255d443)
- Improve: Ensure `seed` affects hashes (78d39f9)
- Improve: Separate StringZillas Python code (883a3cd)
- Fix: Fingerprinting compilation (19f92c4)
- Improve: Explore Min-Hashes (46dd7d0)
- Improve: Test fingerprint equivalence (e4aa3f7)
- Fix: `is_same_type` usage over `std::is_same` (0ab5710)
- Improve: Ignore previous UB commit in blame (7fc7323)
- Fix: Avoid UB with underscore prefixes (74e3b6f)
- Fix: `sz_bitcast` strict aliasing (80b97de)
- Fix: Avoid `std::swap` in device code (c0aea26)
- Fix: C++17 compatibility issues (7ea685d)
- Fix: Guard C++20 concepts use (60763b3)
- Fix: Backport `std::remove_cvref` to C++17 (ffee12b)
- Improve: Move `safe_vector` (78d8c96)
- Fix: Limit `constexpr` use in C++11 (866e2f2)
- Fix: Minor build issues (46a6d63)
- Improve: Extend dummy executors API (498d72a)
- Fix: Wrong Fork Union class name (5def3af)
- Improve: Compile-time-known `span` extents (8355b6e)
- Improve: Move `arrays_equality` (e96d26f)
- Improve: Merge fingerprinting drafts (cf6077e)
- Make: Deprecate Find Many kernels (a6f799f)
- Improve: Extend `find_many` tests (c80ce60)
- Improve: Upgrade Fork Union (fb5f429)
- Docs: Similar wording in "Explore Levenshtein" (766e250)
- Fix: Correct namespaces for scripts (aafcbbd)
- Fix: Replace `+g` with `+m,r` like GB (40bd3ed)
- Fix: Wrong boundary conditions for `count_many_parallel` (0b22dd9)
- Improve: Switch to "StringParaZilla" naming (0f3f928)
- Improve: Cleaner haystack splitting (3e93b75)
- Improve: Prioritize find-many tests (7a433a9)
- Improve: `SZ_DYNAMIC` attributes (2f0334a)
- Fix: Forwarding dataset `nothrow`-copyable views in tests (170b61b)
- Improve: Use CUDA
Atomics to aggregate globally (cf7d9a5)
- Make: Missing `bench_search` target (1133cce)
- Fix: `find_many.cuh` compilation issues (72affc2)
- Fix: AC dictionary `try_assign` with different alloc (4f9d2db)
- Fix: Warning around immutable span conversion (d88842a)
- Docs: Target names (503e470)
- Fix: Match C++ class names (966dd09)
- Docs: Sections & links (f5a94b1)
- Make: Bump `fork_union` (3f5de03)
- Fix: Revert to separate `find_many` algos for different length (0acbeeb)
- Improve: `__reduce_max_sync` in SW on Hopper (c45017d)
- Fix: `unified_alloc` propagation (13e0201)
- Docs: Move segmenting & features drafts (3972cb3)
- Fix: Compiler over-optimizing `bench_find_many` (16f5fa4)
- Fix: Prefix length in parallel counting of short needles (69c7b6a)
- Fix: Including the entire haystack into match (3ee1676)
- Fix: Buffer overflow due to wrong thread count (887c16b)
- Improve: Skip vocabulary duplicates (5302a4d)
- Improve: New multi-pattern search APIs (20c9135)
- Fix: Missing `std::generate` include (853a0fa)
- Improve: Multi-byte characters support (85c5bf8)
- Improve: Propagate allocators in `safe_vector` (0c442d1)
- Improve: Custom validators for nullary benchmarks (0cf994a)
- Improve: Benchmark early exit (aaa6927)
- Make: Rename `bench_search` -> `bench_find` (6f65624)
- Docs: Aho-Corasick CUDA design (496e55a)
- Improve: New caching primitives (df1f7ef)
- Fix: Checking `STRINGWARS_STRESS` env-var (ade74f6)
- Improve: Support executors in multi-pattern search (35fad3d)
- Improve: Divergent branches on `i16` SW on Hopper (817b15c)
- Fix: OOB impact on SW scoring (cd9ff1e)
- Fix: NW & SW on Hopper (4134e44)
- Improve: More noticeable signaling in tests (ece416d)
- Improve: Scheduling speculative kernels (8a6a185)
- Improve: Run multiple warps per block (3b2f263)
- Fix: Comparing Affine benchmark results (64d8f4d)
- Improve: Fuzzy test Ice Lake kernels (d603f7d)
- Improve: Shrink Affine Ice Lake kernels (79e7a2f)
- Fix: Affine top row initialization (b9e4160)
- Fix: Levenshtein w. Affine costs on GPU for zero-length ins (076e58a)
- Improve: Test weird affine gaps (3bbebc8)
- Improve: Match `fork_union` API in executors (1d6f58d)
- Improve: Use `fork_union` pools (81463d4)
- Docs: Inconsistent naming (8b62f2c)
- Make: Add `fork_union` dependency (bd3d341)
- Improve: Duplicate `bytesum` assignment (bfcd10e)
- Make: Bump CUDA version (c709621)
- Fix: Ice Lake calls for empty inputs (f59ba5e)
- Fix: Correcting blends on Kepler (563a73d)
- Improve: Scheduling Levenshtein in CUDA (77b1087)
- Improve: Avoid `views::group_by` with callback (6a61e1b)
- Improve: `bytes_per_cell_t` enum (c6e907a)
- Fix: Inconsistent timing in `bench_unary` (60b99ab)
- Improve: Bounded methods not supported (23a7f58)
- Improve: Use `requires` clause (22c691e)
- Improve: Naming "executor" interfaces (7e778d9)
- Improve: Generalize Levenshtein in CUDA (29da524)
- Fix: Inclusion guard macro names (8e79fb7)
- Docs: More datasets on HuggingFace (0b33ac3)
- Improve: Aligned ZMM diagonal stores (10b5405)
- Improve: Branchless K-mask calculation (3998c1f)
- Improve: Measure gap magnitude (b4eb6a4)
- Fix: Avoid horizontal walker overflow (9b2b4e5)
- Fix: All Gotoh baselines (84397ae)
- Fix: Initializing affine DP matrices (640853f)
- Improve: Differentiate unary & uniform costs (8c15447)
- Fix: Horizontal Affine Walkers (e5d85f3)
- Improve: Alloc type-size check in `safe_vector` (9c5a56c)
- Improve: Fetch warp-size dynamically (83dd5fb)
- Improve: Warp-shuffle reductions in SW (914e98d)
- Fix: Over-estimating number of overlapping matches (b85d3ca)
- Improve: Faster multi-needle tests (0895d28)
- Improve: Splitting jobs in baseline multi-search (d47d849)
- Fix: Slicing corner-cases in OpenMP (7bcc803)
- Improve: Use `std::execution` for baseline tests (f228264)
- Improve: Parallel baseline for substring search (6ebc7b0)
- Fix: `bytes_per_core_optimal` estimate (83bc966)
- Improve: Pointer-constructible spans (a0fd136)
- Fix: Passing multi-needle tests
(faad971)
- Fix: Calculating `find_many_match_t` properties (d0ebee8)
- Fix: Indexing needle IDs (42fa08d)
- Fix: Use smaller types in BFS queue (0c6dd00)
- Fix: Aho Corasick construction (e27f86f)
- Improve: Consistent shuffling behavior in benchmarks (d4d55fa)
- Fix: `sz_copy_skylake` tail handling on large input (#222) (6da5e1e)
- Fix: Propagate substitutions to benchmarks (eabe605)
- Fix: Underutilized 99% of the H100 (7a9a243)
- Fix: Using OpenMP directives (41e1a6e)
- Fix: Forward GPU specs in CUDA tests (9d86d4c)
- Improve: Generalize memory requirements estimates (7bd92c5)
- Fix: Missing STL includes (f7d365a)
- Improve: Use CUDA constant memory (cebd180)
- Improve: Forward-declare substituters (0a04960)
- Improve: Show signed integers in SIMD types (382c05d)
- Fix: `i16` Ice Lake NW/SW alignment (8cc9794)
- Improve: Catch & log exceptions (cb739e2)
- Improve: Better UTF8 tests for similarity (b04e934)
- Make: Parallel test launchers (1ece547)
- Fix: Build issues (9db0287)
- Fix: Included filename (ee38a22)
- Fix: Uniform costs for UTF-32 runes (6edcf7f)
- Improve: New template SFINAE in `similarity.hpp` (10279f9)
- Fix: Initializing horizontal aligner (91df0ce)
- Improve: Use tagged C `enum`s (35ba76c)
- Improve: Allow custom validators in benchmarks (bd2a21d)
- Fix: Compiling `constant_iterator` in CUDA (bc59ee3)
- Fix: `std::allocator::rebind` deprecated (4c87404)
- Fix: Avoid `std::iterator` dependency (b3db596)
- Make: Move similarity benchmarks (90bfcb6)
- Fix: Report error codes in tests (05f17a6)
- Make: Separate Parallel C++ and CUDA tests (b175103)
- Make: Rename test files (2eeab83)
- Fix: Passing CUDA similarity tests (4c75d81)
- Fix: NW/SW test correspondence (fea39ed)
- Improve: Differentiate min/max-imizers (c96c5ed)
- Improve: Annotate throwing exceptions (2ef667c)
- Fix: Missing `sz_i16_t` definition (238c86d)
- Fix: Avoid similarity scoring references (e4b1bbd)
- Improve: Shorter type aliases (3ef1d26)
- Make: Revert to C++ for core tests
(0c6ff1f)
- Docs: StringCuZilla design choices (f42aa85)
- Make: Separate StringCuZilla (c5fd4bc)
- Improve: Move capabilities to `types.h` (1c1582f)
- Make: Compile with OpenMP (1671b0f)
- Improve: Allow datasets in VRAM (b1c9a74)
- Fix: Shuffle datasets with over 4B tokens (247c6ec)
- Fix: Overflow `mean_token_length` calculation (efcadd1)
- Make: NVCC kernel debugging symbols (0c0ff42)
- Fix: Arrow-like string array (fa4b0f4)
- Fix: Overwriting alignment scores (668a386)
- Fix: Synchronizing CUDA kernel launch (3b85a00)
- Fix: Shared memory requirements (27ad8f5)
- Make: NVCC can't handle `fsanitize` (ea7647f)
- Make: Draft CUDA compilation (1b96ef4)
- Fix: Hardening `malloc(0)` behavior (7524882)
- Improve: Share C++ macros and typedefs (e82d045)
- Fix: Accounting for different gap costs (e4e517f)
- Fix: NVCC warning for negative size field (a6c0fa2)
- Fix: Track capacity in fixed buffer alloc (64b40a9)
- Fix: Sign-cast warning in `_mm256_set1_epi64x` (aac2e8f)
- Fix: Overriding `SZ_DEBUG` macro (bca734a)
- Fix: Calling unused helper struct unit tests (d369170)
- Improve: Cleaner API for OpenMP (bc311b3)
- Fix: Shifting Levenshtein diagonals in OpenMP (6b5ef98)
- Fix: Unaligned loads/stores of hash state (37863c9)
- Improve: Expose rune-parsing headers (2727a87)
- Fix: Diagonals depth (ef53f75)
- Docs: Showcase indexing diagonals (b55c696)
- Fix: `sz::lookup` examples in Rs (778d4f0)
- Fix: Compiling SVE on MacOS (42270c8)
- Improve: Inline cheap calls (28282d2)
- Make: List `scripts/` deps for `uv` (1907d2b)
- Docs: Formatting (dd57536)
- Docs: `uv` instructions (9460fd4)
- Fix: Unaligned `sz_hash_state_t` stores (7e65a1e)
- Improve: Align inner hash-states (1b3cdd5)
- Improve: Use `GiB` over `GB` (811fc59)
- Improve: Construct `Byteset::from_bytes` (34660f2)
- Fix: Remove missing Ice-Lake benchs (0c08564)
- Fix: Compiling Py bindings (7e08180)
- Make: CMake formatting (44485fb)
- Docs: Formatting and references (928fd79)
- Fix: No return (4cb096b)
- Docs:
Listing bench details (343b858)
- Make: Patch passing `SZ_USE_SVE` definitions (45b15b0)
- Fix: Expanding feature-detecting macros (75ef77e)
- Improve: Bold benchmark names in CLI (2ab635e)
- Improve: `fill_random` checksums in benchmarks (0bba772)
- Improve: Simpler SVE find nested loop (a1604b7)
- Improve: Unrolled serial hashing (77e482e)
- Fix: `find_sve` mask update on long needles (a493ab8)
- Improve: More simple substring search tests (68449fd)
- Fix: `bench_sequence` CMake target (905749c)
- Fix: Drop double negation in logging (3d77ec6)
- Improve: Use `SQINCP` in SVE for increments (efda23b)
- Fix: Dispatching SVE kernels (06ea5f7)
- Docs: SVE2 intersects TODO (fafb8b0)
- Improve: `do_not_optimize` token-level results (d1a3779)
- Make: Rename `bench_sort` (991a78b)
- Improve: Naming benchmark names (0074ad7)
- Improve: Better sorting benchmarks (7d534fb)
- Fix: Computing improvement percent (a5de795)
- Improve: Faster equality checks on NEON/SVE (20f35c7)
- Improve: New token-level benchmarks (12e1edd)
- Docs: Describe trivial types (af686dd)
- Fix: Naming byteset signature (9676cdb)
- Improve: Naming "vtable" entries (b9794e5)
- Make: Upgrade to C++20 for benchmarks (aa7f275)
- Improve: New-style "container" benchmarks (3f1c723)
- Fix: Reverse order `std::search` offsets (244e605)
- Docs: Ignore formatting CMake (366816e)
- Make: Formatting CMakeLists.txt (467b4b8)
- Fix: Extra comma in `printf` (298d214)
- Docs: Outdated function naming & spelling (92b9a56)
- Improve: Token benchmarks (3b1897e)
- Improve: Logging in container benchmarks (4d955d3)
- Fix: No intersect for Skylake (48d70ea)
- Fix: Revert to XMM on Haswell (4bec1e5)
- Fix: Composing STL collections (f9da4ed)
- Fix: `std::string::data` is mutable only since C++17 (ff23c3d)
- Improve: Discard state in streaming hash (828263f)
- Improve: Discarding buffer in streaming hashes (c4f7a0e)
- Improve: Separate PRNG backends in benchmarks (af54e93)
- Fix: Guard Skylake benchmarks (2965502)
- Fix:
Unused `_sz_capabilities` symbols (8bb90e5)
- Fix: `sz_intersect` signature (f712de3)
- Make: Don't build `stringzilla_bare` on MacOS (a7b35ba)
- Fix: Variable in C++14 `constexpr` (feb415f)
- Fix: Unused Levenshtein tests (90540d3)
- Fix: `find_1byte` signature compatibility (d19e8b8)
- Improve: Fix minor inconsistencies (f656577)
- Docs: Exploring perfect Unicode hashing (197cd87)
- Improve: Test set intersections (1d95601)
- Fix: Randomization benchmarks (8dc4a2c)
- Docs: Formatting (5c02c4e)
- Docs: Details on the Unicode range (b6e4406)
- Docs: Ignore C++ docstring updates blame (407dd2d)
- Docs: New formatting in C++ (0d982a4)
- Fix: Passing `sz_sequence_t::handle` (75fabf1)
- Improve: Remove redundant comments from sz_hash_state functions in Rust (9fe25df)
- Improve: Expose sz_lookup (471b002)
- Improve: Expose sz_hash_state_init, sz_hash_state_stream, and sz_hash_state_fold to Rust (1757e4e)
- Improve: Exposed sz_move, sz_fill, and sz_copy for Rust (b2085cc)
- Improve: Inline most common Rust APIs (a30b5b7)
- Make: `cibuildwheel` env variables (fbf256a)
- Make: Decremental Rust builds (8877c82)
- Fix: Detecting caps in dynamic builds (d52bf63)
- Fix: `fill_random` test condition (8b396c8)
- Fix: Compilation of all bindings (2caefac)
- Make: Drop unused `build.sh` (2bbafa1)
- Improve: Testing hash functions (80688bb)
- Fix: Passing new hashing tests (268af53)
- Improve: `copy`/`move` on Haswell with interleaving (69dfa10)
- Docs: Announce JOINs (2225488)
- Improve: Ordering includes (6e71536)
- Improve: Vectorize `sz_equal_haswell` (7aad4bb)
- Docs: Explaining `compare.h` operations (d7bab8d)
- Improve: Clean `memory.h` header (7698392)
- Improve: Use default allocator, when not provided (5a12c00)
- Docs: Disable sorting includes (8bc161f)
- Fix: Ice Lake partitioning logic (1da0e2b)
- Improve: Expose Insertion-sort helpers (a38867f)
- Fix: Merge-step bug in stable sort (db61d93)
- Improve: Introduce typed `_sz_swap` macro (dcf6c65)
- Improve: Rename
`sz_sort` to `sz_qsort` (6191cc6) - Fix: `sz_sort_serial` passes tests (8bad799) - Fix: `uniform_int_distribution` lower bound (bdee111) - Fix: `sz_sort_serial` passes for same length inputs (0fda5a5) - Improve: Drop hybrid sort code (50d8291) - Fix: Underflow in serial sorting (5970fa4) - Make: Recommend pretty-printing GDB symbols (a818f97) - Fix: `uniform_int_distribution` upper bound (17f28a3) - Fix: In C++11 `constexpr` constructor must be empty (13bace2) - Fix: Sorting benchmarks for new API (66f2ac9) - Improve: Separate fingerprinting benchmarks (187e0bd) - Make: Renamed temp-git-split-file -> scripts/bench_token.cpp (031bedf) - Make: Renamed scripts/bench_token.cpp -> temp-git-split-file (07d2239) - Make: Renamed scripts/bench_token.cpp -> scripts/bench_fingerprint.cpp (a0318eb) - Docs: Signatures and typos (982dd4d) - Improve: Wrap `std::accumulate` for checksums (bce107a) - Improve: Validate checksums in benchmark (abe8d07) - Fix: Tail sum order in `checksum_haswell` (b20d7cd) - Fix: Infer allocators `value_type` (5bbd971) - Fix: Tail handling in `sz_checksum_haswell` (84cb4c8) - Fix: Loops in AVX-512 checksums (4044855) - Fix: Loop in `sz_checksum_haswell` (509b58b) - Improve: Relax many `constexpr`s from C++20 to C++14 (0a3e363) - Make: Move drafts (1de3166) - Make: Renamed temp-git-split-file -> include/stringzilla/hash.h (5a36cb7) - Make: Renamed include/stringzilla/hash.h -> temp-git-split-file (7052266) - Make: Renamed include/stringzilla/hash.h -> include/stringzilla/fingerprint.h (0ef7cf1) - Docs: Spelling `usnigned` (d18a159) - Improve: hybrid bench sort performance (9880f26) - Fix: hybrid bench sorts loading initial stirng bytes incorrectly (455508f) - Fix: stable sort bench tests failing (821d19e) - Fix: Minor dispatch issues (d20e589) - Improve: Faster `levenshtein_baseline` (d9557d3) - Fix: BMI flags for `BZHI` (fa47deb) - Fix: Masks back to using `BZHI` (bd7054e) - Make: Library namespaced aliases (f3811d7) - Fix: `sz_u512_vec_t` members 
visibility (2007d49) - Fix: Bounded Levenshtein returns (749b0d8) - Fix: Skylake dispatch (48e0913) - Fix: Linking `stderr` (084d653) - Docs: Formatting docstring (c99daf3) - Fix: Initializing `basic_charset` (864ee03) - Fix: Correct `basic_charset` operator (#203) (e20d207) - Improve: Ignore 40 commits in blame (064829a) - Fix: Overriding LibC in 32-bit Windows (645539b) - Improve: C++ version macros naming (19c2ae9) - Make: Detect Apple Universal builds (6d61c21) - Make: Rename `stringzillite` to `stringzilla_bare` (364e2ca) - Fix: Symbols names & visibility (406bf0f) - Fix: Haswell compilation flag (00f27f6) - Fix: Filter `compare.h` file (6512f1d) - Make: Split ./include/stringzilla/find.h to ./include/stringzilla/compare.h (fc9e5d6) - Make: Split ./include/stringzilla/find.h to ./include/stringzilla/compare.h (49e8d9d) - Make: Split ./include/stringzilla/find.h to ./include/stringzilla/compare.h (fc408fa) - Fix: Partially filter `stringzilla.h` file (41e5917) - Fix: Minor macro mismatches (5f7ca59) - Fix: Filter `types.h` file (b835051) - Fix: Filter `sort.h` file (1ba7982) - Fix: Filter `small_string.h` file (5b55e19) - Make: Separate builds for Skylake & Ice (4a1f03c) - Improve: Platform-specific equality checks (8b44d6a) - Fix: Filter `hash.h` file (be4c63d) - Fix: Filter `similarity.h` file (8b401bd) - Fix: Filter `memory.h` file (295d49a) - Fix: Filter `find.h` file (2a1fcd1) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/memory.h (2f76521) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/memory.h (45e57ee) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/memory.h (66778d6) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/sort.h (c357c3e) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/sort.h (cbfe5c7) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/sort.h (085d2d3) - Make: Split 
./include/stringzilla/stringzilla.h to ./include/stringzilla/small_string.h (3464cb4) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/small_string.h (89c4681) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/small_string.h (3f9c248) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/similarity.h (e23c35f) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/similarity.h (10d829e) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/similarity.h (d74e5dc) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/hash.h (1f60e6d) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/hash.h (08d0a20) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/hash.h (9e9f256) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/find.h (974ed78) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/find.h (14ba3bf) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/find.h (9e577be) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/types.h (8cb0742) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/types.h (22e3d1e) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/types.h (ecb3775) - Fix: Wrong env. variable names (d0678f8) - Make: Inline ASM for detecting CPU features on ARM (0ee549a) - Fix: Default Levenshtein upper bound (62ca6a0) - Improve: Levenshtein functions for unicode (d3b423a) - Docs: Levenshtein tutorial in Jupyter (715ad10) - Fix: `sz_look_up_transform_avx512` declaration (585f7d5) - Improve: `#pragma region` dashes (fe4449b)
1 parent 7d17353 commit 58c7e15

6 files changed, +8 -8 lines changed

CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@
 cmake_minimum_required(VERSION 3.14 FATAL_ERROR)
 project(
 stringzilla
-VERSION 3.12.6
+VERSION 4.0.0
 LANGUAGES C CXX ASM
 DESCRIPTION "Search, hash, sort, fingerprint, and fuzzy-match strings faster via SWAR, SIMD, and GPGPU"
 HOMEPAGE_URL "https://github.com/ashvardanian/stringzilla"

Cargo.lock

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [package]
 name = "stringzilla"
-version = "3.12.6"
+version = "4.0.0"
 authors = ["Ash Vardanian <1983160+ashvardanian@users.noreply.github.com>"]
 description = "Search, hash, sort, fingerprint, and fuzzy-match strings faster via SWAR, SIMD, and GPGPU"
 edition = "2021"

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-3.12.6
+4.0.0

include/stringzilla/stringzilla.h

Lines changed: 3 additions & 3 deletions
@@ -65,9 +65,9 @@
 #ifndef STRINGZILLA_H_
 #define STRINGZILLA_H_

-#define STRINGZILLA_H_VERSION_MAJOR 3
-#define STRINGZILLA_H_VERSION_MINOR 12
-#define STRINGZILLA_H_VERSION_PATCH 6
+#define STRINGZILLA_H_VERSION_MAJOR 4
+#define STRINGZILLA_H_VERSION_MINOR 0
+#define STRINGZILLA_H_VERSION_PATCH 0

 #include "types.h" // `sz_size_t`, `sz_bool_t`, `sz_ordering_t`
 #include "compare.h" // `sz_equal`, `sz_order`

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
 "name": "stringzilla",
-"version": "3.12.6",
+"version": "4.0.0",
 "description": "Search, hash, sort, fingerprint, and fuzzy-match strings faster via SWAR, SIMD, and GPGPU",
 "author": "Ash Vardanian",
 "license": "Apache-2.0",
