Commit 58c7e15
TinySemVer
Release: v4.0.0 [skip ci]
### Major
- Break: Output error messages via C API (fc94890)
- Break: New wording for incremental hashers (66898ad)
- Break: Rust namespaces layout (b3338bb)
- Break: Refactor `Strs` ops (009b975)
- Break: Rename again (fb56b60)
- Break: Drop fingerprinting bench (3e04157)
- Break: `sz::edit_distance` -> Levenshtein (d44beb4)
- Break: C++ `lookup` and `fill_random` (1ce830b)
- Break: `charset`/`generate` -> `byteset`/`fill_random` (2ce2b49)
- Break: New calling convention in `similarity.h` (095bc2d)
- Break: Return error-codes in sort functions (944804e)
- Break: `look_up_transform` to `lookup` API (e0055d5)
- Break: `checksum` to `bytesum`, new hash, and PRNG (71f1f4b)
- Break: Pointer-sized N-gram Sorting (0c38bff)
- Break: `sz_sort` now takes allocators (ec81663)
- Break: Deprecate old fingerprinting (38014ee)
- Break: Replace `char_set` constructor with literals (2c49eae)
### Minor
- Add: `Str.count_byteset` for Python (802d699)
- Add: GoLang official support (234b758)
- Add: `HashMap` traits for Rust (e555cc3)
- Add: `try_resize_and_overwrite` (e00a98b)
- Add: Hashers for Swift (be363c9)
- Add: Big-endian SWAR backends (607dd14)
- Add: Zero-copy JS wrapper for buffers (4bf2dd6)
- Add: `sz.fill_random` for Python (3565f9b)
- Add: Allow resetting the dispatch table (68172a0)
- Add: Ephemeral GPU executors if no device is passed (6431900)
- Add: Capabilities getters for SW (86c89b7)
- Add: StringZillas for Rust draft (48a1120)
- Add: Fingerprinting benchmarks (471649e)
- Add: On GPU fingerprints in Python (6f26629)
- Add: `basic_rolling_hashers` CUDA port (a25e3f2)
- Add: Unrolled fingerprinting backends (9a04c95)
- Add: NW and SW scoring classes (958c9e1)
- Add: SW scoring (3e8a3a6)
- Add: NW scoring (079a8c1)
- Add: Capability-constrained Py constructors (d0dfa0e)
- Add: `LevenshteinDistancesUTF8` in Py (4cd6c7d)
- Add: Wrap `DeviceScope` for Python (30b8fd3)
- Add: Make `Strs` from lists, tuples, generators (05a5434)
- Add: StringZillas Python tests (978d13d)
- Add: `Strs` layout conversion tests (4e5df62)
- Add: Exportable `_sz_py_api` capsules (9d5935b)
- Add: Levenshtein kernels in shared lib (42a815f)
- Add: `Strs.from_arrow` conversion (0441b32)
- Add: Draft fingerprinting C binding (3f5e004)
- Add: Fingerprinting baselines (f997dad)
- Add: `SZ_NOINLINE` (4e6e1af)
- Add: Draft fingerprinting on GPUs (11af644)
- Add: Haswell rolling fingerprints (25d3ee6)
- Add: Draft exploration of `float` fingerprints (3955eee)
- Add: `lock_guard` to avoid STL (e6cdb93)
- Add: Parallel fingerprinting (4ccac06)
- Add: `to_span`, `to_view` helpers (4fb9283)
- Add: Min-Hashing `basic_rolling_hashers` (63447d3)
- Add: Fingerprinting benchmarks (00c68d5)
- Add: 64-bit `double` fingerprinting (9ed6006)
- Add: Test rolling hashes (a065aa0)
- Add: Fingerprinting drafts are back :) (9ab4a2b)
- Add: Draft parallel library backend (94147ed)
- Add: Long haystack CUDA kernels for find-many (51c9171)
- Add: `find_many` minimal counter & benchmarks (3c615d9)
- Add: `bench_find_many.cpp` (2d594e4)
- Add: CUDA Aho-Corasick placeholder (d087afa)
- Add: Parallel Ice Lake variants (6aaf16c)
- Add: NW and SW for Hopper (e53c8d5)
- Add: Hopper Levenshtein kernels (cb7c48d)
- Add: Affine gaps Levenshtein on Kepler (f09bbf9)
- Add: Ice Lake Affine Levenshtein kernels (51afad5)
- Add: Affine Levenshtein variants on GPU (3e1df93)
- Add: `sz_similarity_gaps_t` enums (5b410b8)
- Add: Concepts & new multithreading scheme (6be0da5)
- Add: Draft StringCuZilla C API (1975351)
- Add: Draft executors for StringCuZilla (e1a37de)
- Add: C++20 concepts (f7b091c)
- Add: Unmasked Ice Lake Levenshtein (2b4fa09)
- Add: Affine Levenshtein-Gotoh variants (25ab3b6)
- Add: Draft global scoring in CUDA (07196d0)
- Add: Affine gap extensions baseline (945613c)
- Add: Draft device-wide similarity kernels (871d7bc)
- Add: Levenshtein on Kepler (938bf6f)
- Add: Parallel multi-needle search with OpenMP (c8dc314)
- Add: Draft parallel substring search (cf53b9b)
- Add: Overflow-risk error codes (783cfa8)
- Add: Random access ranges (0a26059)
- Add: `safe_vector`, `safe_array` (ac6218a)
- Add: Immutable iterators for Arrow tapes (b70096b)
- Add: `indexed_container_iterator` (a5ca2ac)
- Add: Multi-pattern exact substring search (a83cdb5)
- Add: NW benchmarks on GPU (1879aeb)
- Add: Multi-flag `sz_caps` (f76fa40)
- Add: Repeat similarity benchmarks on CPU (f64fc56)
- Add: `gpu_specs` and `cuda_status_t` (c36d2b8)
- Add: Ice Lake similarity kernels (3c8d181)
- Add: Fetching Nvidia GPU specs (d0bdd17)
- Add: New batch similarity benchmarks (aca3301)
- Add: Warp-shuffle optimizations (4a62715)
- Add: Mem consumption CUDA tests (53d3e0d)
- Add: New NW and SW GPU kernels (ca0fece)
- Add: Blosum62 and NUC.4.4 matrices (8a99373)
- Add: Separate parallel & serial tests (3cb8dd0)
- Add: Parallel SW & NW scoring in OpenMP (85b9675)
- Add: Thrust-like `constant_iterator` (6751e79)
- Add: `arrow_strings_tape::try_append` (6d7f221)
- Add: CUDA scoring benchmarks (7ac0fdd)
- Add: Baseline NW and SW alignment with ~O(NM) space (044c7cc)
- Add: Horizontal scoring in OpenMP (02a2481)
- Add: Local alignment adaptation (109d8de)
- Add: Parallel GPU kernels (2861a85)
- Add: OpenMP `score_diagonally` (427d5b5)
- Add: `lookup` transform in Rust (3090472)
- Add: OpenMP C++ draft (6f8cdb9)
- Add: Stateful hashing in Rust (763538e)
- Add: Expose `find_byteset` in Rust (667ea91)
- Add: Set intersections in Rust (06784fc)
- Add: Sorting in Rust (e467649)
- Add: New memory benchmarks (dfd6ddf)
- Add: Draft `sz_find_sve` (d7ede5d)
- Add: Faster `find_byte` with SVE (56a49c8)
- Add: New string similarity benchmarks (c12e6c4)
- Add: Short string hashing in SVE2 (a007c7c)
- Add: SVE & SVE2 bytesum (d6f87d7)
- Add: SVE2 macros (9fbdc9a)
- Add: Intersection benchmarks (e860af0)
- Add: SVE backend for sorting (148b615)
- Add: All new benchmarking suite (4744406)
- Add: Comparisons in SVE (c31020d)
- Add: Arm NEON hashing (4b3847d)
- Add: Missing SVE placeholder definition (63f0368)
- Add: C++ `argsort`, `intersect` (5ea0698)
- Add: `status_t` for errors in C++ (63daa5f)
- Add: Feature-extraction placeholder (de62723)
- Add: Intersections on Ice Lake (ea5dc76)
- Add: Serial JOINs (c7b841e)
- Add: Dispatched version API (9a32744)
- Add: Fetching dynamic library version in C (3538e97)
- Add: PRNG for Haswell & serial backend (6659aa0)
- Add: Streaming hashing on Ice Lake & Skylake X (2607d45)
- Add: Streaming hash benchmarks (8ac3a23)
- Add: Hashing on Haswell & Skylake-X (3c345bc)
- Add: Missing `sz_sequence_t` helpers (dc7c109)
- Add: Sorting placeholders & dispatch (cc98389)
- Add: `sz_sequence_argsort_ice` (69d4ecb)
- Add: AES-based hash placeholders (cb18c78)
- Add: Smaller Sorting Networks (cd6859a)
- Add: String sorting tests for different lengths (c670ccd)
- Add: Separate Skylake-X & Ice Lake checksums (554f50d)
- Add: New Levenshtein distance kernels (43471aa)
- Add: Missing Rust interfaces (1765f33)
### Patch
- Docs: Pre-release stats update (c08c6c3)
- Make: Upgrade `setuptools` in CI (492ecc0)
- Improve: Allow `RuntimeError` for engine calls (f9cbb00)
- Docs: Section titles (755f583)
- Improve: PyTest invalid input arguments (f5e46d5)
- Improve: All new similarity-scoring benchmarks (3713e72)
- Make: Option to disable sanitizers for masked IO (f880621)
- Make: Default to Python 3.12 for better `itertools` (f2704d7)
- Fix: Naming scorers in Python like in Rust (d0bc604)
- Fix: Avoid SWAR on big-endian (230e354)
- Improve: More big-endian SWAR tests (7a4b78f)
- Fix: Drop old similarity APIs in benchmarks (230fc13)
- Make: Packaging for Py & Go (4c819d6)
- Improve: Byte-set counting PyTests (bd692d2)
- Docs: What to know about CUDA (15349c6)
- Improve: `Str_like_*` naming convention (5fdd8ee)
- Fix: Merge artifacts & lifetime annotations (c71e67e)
- Docs: Wording & AI dashes (4a06f3f)
- Make: Packaging for NPM (0d95f13)
- Make: Drop long-deprecated `.releaserc` (8b5af6e)
- Make: Enable SIMD in NodeJS builds (c8f6a49)
- Improve: Polish parallel string test names (61cc860)
- Make: Reuse SIMD compilation flags in `build.rs` (d129f95)
- Make: Missing AES definitions for lib builds (e951c4d)
- Docs: Hashing sections for each SDK (cb4fe1b)
- Improve: Expose `.capabilities` to JS (c557a62)
- Improve: Test incremental hashers (7c6d37d)
- Fix: Self-move construction of `basic_string` (5fb0e74)
- Fix: Strict aliasing violation (a5a2421)
- Fix: `__cpp_lib_string_resize_and_overwrite` test guards (cb2d6a8)
- Docs: How to use parallel algorithms (ecad156)
- Fix: Stricter following of `SZ_AVOID_STL` (c3040b7)
- Docs: JS Quick Start (c923ed2)
- Make: Publish to NPM (b2cedc4)
- Make: Drop CodeQL noise (7c9caac)
- Docs: Describe dynamic dispatch & linking (e6f2c02)
- Improve: NodeJS groundwork & corner-case tests (#151) (00d75f5)
- Fix: `_MSC_VER` to `__GNUC__` conditions (83492ac)
- Fix: Unknown pragmas in MSVC (#231) (78bbc11)
- Make: Parallel algorithms CI/CD for PyPI (9b8c466)
- Make: Ignore formatting blames (b69206f)
- Make: JSON & YAML uniform formatting (320bddd)
- Make: Explicit CodeQL coverage in CI (e9fb38d)
- Fix: Match new C-level `DeviceScope` behavior (f217b86)
- Fix: Sorting on big-endian `s390x` (8d2d9c8)
- Fix: Ruff-statically suggested issues (edaff0e)
- Improve: Disable `E722` import exception warning (3aedfb2)
- Fix: Check for immutable Py buffers (e8c437e)
- Improve: Test PRNG in Py and boundary Strs sizes (3d76e8f)
- Fix: `np.random` v1 vs v2 compatibility also in `szs.` (5f8ac03)
- Fix: Prevent PyTest from parsing invalid UTF-8 (30c8935)
- Fix: Don't repeat seeded fuzzy tests (efceb0b)
- Fix: `np.random` v1 vs v2 compatibility (aea096b)
- Make: Forwarding `SZ_IS_QEMU_` (d433db0)
- Fix: Minor logical inconsistencies & unused vars (679f7b9)
- Improve: Disable SVE in QEMU runs (072ee2e)
- Fix: Avoid `sys.getrefcount` tests on PyPy (80fa90e)
- Improve: Check rich comparisons before sorting (a21c44d)
- Improve: Session-scope fixture for PyTest env logs (3d84812)
- Improve: Fuzz PyTests and log environment (6545a04)
- Make: Respect env-vars for `-arch` (6a450f6)
- Make: Upgrade GitHub actions (dec93d0)
- Make: Avoid universal builds defaults for `pip install .` (c462f42)
- Improve: Guard compiler pragmas (42ad14d)
- Make: Bump FU to avoid missing `+wfxt` target (337257b)
- Make: Detect CPU AES support on Arm (ae74d44)
- Make: Reinstall pre-packaged CMake on macOS-14 (24967c7)
- Fix: Fall-back CPU alloc for fingerprints (b582d5d)
- Make: Drop macOS Universal builds (e5cfb08)
- Fix: Sorting difference on 32/64 bit machines (0e17a3d)
- Make: Respect `MACOSX_DEPLOYMENT_TARGET` (95a95fe)
- Make: Avoid `fail-fast` for Python pre-release wheels (a7c3f04)
- Fix: Win32 compilation issues (f792366)
- Improve: Test against Affine Gaps (65d323e)
- Improve: Avoid many unified memory re-allocs (d0db2d4)
- Fix: Intersect scopes HW capabilities (e4245b7)
- Make: Drop Python 3.7, require 3.8+ (c65bf5e)
- Make: Verbose PyTest logging in CI (c273d52)
- Fix: OS-feature-gate AVX checks (cd81b4a)
- Fix: Linking to 64-bit symbols (7da4dac)
- Make: Log HW caps in CI before tests (2c02d5b)
- Fix: Type-casting on MSVC (a347dcd)
- Make: Irrelevant links & comments (d6221e6)
- Make: Target Hopper `90a` in Py & C (c317280)
- Make: Outdated 64-bit detection envs (72d0f4b)
- Docs: Wording typos (cf1623c)
- Make: Missing `affine-gaps` dep (fa52363)
- Docs: No Alpine flow on release CI (f214169)
- Fix: Argument order (10dff6c)
- Make: Can't read `SWIFT_VERSION` for `container` (a4015f1)
- Improve: Differentiate `capabilities_mode` in PyTest (2bee1e4)
- Fix: Dispatch serial code for `bytes_per_cell <= 2` (48b4406)
- Improve: New error codes for CPU/GPU interop (6e597ca)
- Fix: Avoid UB assigning i8x256x256 matrix (7ac3b83)
- Make: Bump FU due to sign conversions warnings (cbd160a)
- Fix: Detect missing GPUs at runtime (0bc0f8a)
- Improve: Formatting Swift (3d6669c)
- Improve: Test against `affine-gaps` (48a538c)
- Improve: Reduce sign-casting issues (a340131)
- Fix: Check CUDA in `szs_capabilities` (42f043c)
- Improve: Build & test reproducibility (446e14e)
- Make: Override `/std:c++` for MSVC (930b1f0)
- Make: Add CUDA to GitHub CI (2a4552c)
- Make: Workaround CI issues (46410c6)
- Make: No `bare` builds on Windows & macOS (b42a340)
- Fix: MSVC compilation issues (5f60fe4)
- Improve: Simplify setting thread-counts (20bbe22)
- Make: Install `sz` before `szs` in CI (9033649)
- Make: Skip x86 intrinsics in `universal` builds (ce6cab2)
- Fix: Inferring Ice Lake similarity kernels (a65dc99)
- Fix: Disambiguate `szs_` symbols (1fd5db8)
- Make: Override VS Code compiler choice on `osx` (d06d4e4)
- Make: Avoid SVE builds on macOS (abb970b)
- Fix: Wrong `SZ_DYNAMIC_DISPATCH` check (acdab3f)
- Make: Preinstall `wheel` in CI (62f8c98)
- Make: Bump tapes to 4.0 (72fd6be)
- Fix: Avoid forced inlining for HW flags (4cef520)
- Fix: Feature-checking STL (b1418b5)
- Fix: Missing `allocator_traits` include (44b8d72)
- Fix: Workaround for `static_cast` (b6a4cc6)
- Fix: Avoid unaligned XMM loads (0f840bb)
- Fix: `static_cast` to standard for MSVC (43c953f)
- Make: Avoid `uv` in GitHub CI (89ed74d)
- Fix: Unused variable in `group_by` (3761107)
- Fix: Unused `qsort` on macOS (cd263ae)
- Improve: Unaligned loads in serial hashes (5e9d488)
- Make: Parallel backends CI (352b48d)
- Make: Referencing old tests (7c4fe32)
- Make: Caps introspection flags on Arm (d6c7cf3)
- Fix: Converting to string views (6b00ab9)
- Fix: Unused symbols (e825ed8)
- Fix: Fetching engines `::capability_k` (ddc640b)
- Fix: Uninitialized intersection `count` (ef3ca96)
- Make: Disable NUMA by default (b27552d)
- Fix: Guard SVE checks for cross-compilation (dac3941)
- Make: Install Git on Alpine (1ff0c15)
- Make: Log Alpine version (7ea327e)
- Fix: Type-casting seed on Clang (4598f42)
- Make: Bump Fork Union to 2.2.2 (e674413)
- Fix: Deprecate `levenshteinDistance` in Swift (70c4add)
- Fix: Passing StringZillas doctests (529fb76)
- Fix: Allow NULL allocator args (b0c33bd)
- Make: NVCC flags for Rust (6b38ea2)
- Fix: Report requesting 1 CPU core (4ef0464)
- Improve: Passing StringZillas.rs tests (34bb89a)
- Improve: Use StringTape for GPU backends (87a0767)
- Docs: StringZillas C API (34f4137)
- Fix: Memalloc initialization on MSVC (#230) (a6e0a77)
- Docs: Drop OpenMP and old name (7f45118)
- Improve: Infer `capabilities` from `DeviceScope` (a807eba)
- Improve: Introspect `sz_device_scope_t` (d9100a3)
- Make: Consistent `-O2` optimization (5fcde22)
- Improve: Drop unused `info1` (f4d4a76)
- Fix: Rendering byte strings in Python (cf73d79)
- Fix: Forward errors from `sz_rune_parse` (474dec4)
- Improve: PyTest different MinHash dimensions (b5060a8)
- Fix: Expose `value_type` for CUDA fingerprinter (7724886)
- Fix: Fingerprinting memory management (f8dea13)
- Fix: `to_span` compilation (e7fdd98)
- Improve: Wrap high-dim fingerprints (bbf30d2)
- Fix: Handling empty strings in arrays (517757c)
- Improve: More readable PyTest (a6d3ed2)
- Fix: Checking for Ice Lake caps (4f7649a)
- Improve: Simpler & slower Py args parsing (36745f4)
- Improve: Propagate error message to Py (711bd63)
- Improve: Comparing 2 mem-allocators (7870cd3)
- Fix: Skip missing `affine_levenshtein_utf8_ice_t` (1e112c6)
- Make: Custom `CudaBuildExtension` for Python (4abf63c)
- Make: Option to disable CUDA builds (84492e5)
- Fix: Refer to `prong_t` in executor concepts (57d4ec8)
- Fix: MSVC & Clang compilation errors (31fcdb3)
- Make: Avoid OpenMP in builds (183ea96)
- Fix: Announce `LevenshteinDistancesUTF8Type` (cb81c00)
- Improve: Cache hardware capabilities (b9a1109)
- Improve: Export capabilities as a tuple (0223b62)
- Improve: Printing CUDA caps (94d8c36)
- Fix: Match Apache Arrow layout (38c87b4)
- Improve: Type-casting `seed`s in `Strs.sample` (5ac53c3)
- Improve: Constructing `Strs` from PyArrow (7aebef4)
- Fix: Track ownership of `Strs` offsets (65380ca)
- Improve: Expose `sz_capabilities` in non-dynamic builds (6206eb4)
- Fix: Avoid depending SZS -> SZ (c411e37)
- Docs: Mark programming languages correctly (d1fc68c)
- Make: Bump C++ & CUDA to 20 for libs (320f3da)
- Fix: `rebind_alloc` in C++20 (492d726)
- Fix: Tautological compare check (b924d90)
- Improve: Random-access similarity outputs (c5e778d)
- Make: Building parallel Python packages (0f44c25)
- Fix: Clang build warnings (7741272)
- Fix: Using braces for Clang builds (325cedb)
- Make: Pull submodules in CI (5bb90b3)
- Fix: Avoid `_mm256_cvtepi64_epi32` on Haswell (014002e)
- Make: Separate parallel library sources (ff019b1)
- Make: Forward `march` flags through NVCC (496ae84)
- Make: Move CUDA lib into header (d4a66c5)
- Make: FMA flag for Haswell (2e1daa4)
- Fix: Compilation of all C targets (a2b228c)
- Make: Compiling StringZillas shared libs (6ac80e8)
- Improve: `gpu_specs_fetch` & GPU args order (3d7b491)
- Docs: Sync description one-liner (ab9b617)
- Improve: Draft parallel fingerprinting API (e49f570)
- Improve: Runtime variable window widths (031d067)
- Make: Rename `lib.rs` (5d82454)
- Docs: Reuse operators state (e5e2702)
- Improve: Naming multi-input processors (a3c3510)
- Fix: Scramble results between fingerprint benchmarks (fbf7203)
- Make: Format Python to 120 columns (640b7c4)
- Improve: Naming internal symbols (3b48c93)
- Improve: Unroll CUDA fingerprints (37f3d80)
- Docs: Refresh Python benchmarking suite (49cf4ea)
- Fix: Weird compiler bug related to `cuda_status_t` (0a0955e)
- Fix: Fingerprinting in CUDA (52b1d73)
- Fix: Estimating hash counts in fingerprints (058af71)
- Improve: Unroll & parallelize fingerprinting (cda36fd)
- Fix: Inferring the prong type of executors (531b1e9)
- Improve: Align thread-pool within stack-frame (b1077a4)
- Improve: Wording inconsistencies (afaf11b)
- Make: Launchers for Parallel C++ benchmarks (9f3beac)
- Fix: Fingerprinting via Skylake extensions (08c1e86)
- Fix: Passing fingerprinting builds (44a058d)
- Improve: Include hash counts in fingerprints (0d9ba5b)
- Improve: Consistent kernel naming without underscore prefixes (05725b2)
- Improve: Expose floating-point SIMD states (4a57789)
- Fix: Consistent `barrett_mod` in C++ & Python (1fc1cab)
- Docs: Using `uv` for tests (f86dfaa)
- Fix: Choosing co-primes with `std::gcd` (a1b3001)
- Improve: Using fast calling convention for CPython (97ab23c)
- Docs: Show higher recall with better hashes (255d443)
- Improve: Ensure `seed` affects hashes (78d39f9)
- Improve: Separate StringZillas Python code (883a3cd)
- Fix: Fingerprinting compilation (19f92c4)
- Improve: Explore Min-Hashes (46dd7d0)
- Improve: Test fingerprint equivalence (e4aa3f7)
- Fix: `is_same_type` usage over `std::is_same` (0ab5710)
- Improve: Ignore previous UB commit in blame (7fc7323)
- Fix: Avoid UB with underscore prefixes (74e3b6f)
- Fix: `sz_bitcast` strict aliasing (80b97de)
- Fix: Avoid `std::swap` in device code (c0aea26)
- Fix: C++17 compatibility issues (7ea685d)
- Fix: Guard C++20 concepts use (60763b3)
- Fix: Backport `std::remove_cvref` to C++17 (ffee12b)
- Improve: Move `safe_vector` (78d8c96)
- Fix: Limit `constexpr` use in C++11 (866e2f2)
- Fix: Minor build issues (46a6d63)
- Improve: Extend dummy executors API (498d72a)
- Fix: Wrong Fork Union class name (5def3af)
- Improve: Compile-time-known `span` extents (8355b6e)
- Improve: Move `arrays_equality` (e96d26f)
- Improve: Merge fingerprinting drafts (cf6077e)
- Make: Deprecate Find Many kernels (a6f799f)
- Improve: Extend `find_many` tests (c80ce60)
- Improve: Upgrade Fork Union (fb5f429)
- Docs: Similar wording in "Explore Levenshtein" (766e250)
- Fix: Correct namespaces for scripts (aafcbbd)
- Fix: Replace `+g` with `+m,r` like GB (40bd3ed)
- Fix: Wrong boundary conditions for `count_many_parallel` (0b22dd9)
- Improve: Switch to "StringParaZilla" naming (0f3f928)
- Improve: Cleaner haystack splitting (3e93b75)
- Improve: Prioritize find-many tests (7a433a9)
- Improve: `SZ_DYNAMIC` attributes (2f0334a)
- Fix: Forwarding dataset `nothrow`-copyable views in tests (170b61b)
- Improve: Use CUDA Atomics to aggregate globally (cf7d9a5)
- Make: Missing `bench_search` target (1133cce)
- Fix: `find_many.cuh` compilation issues (72affc2)
- Fix: AC dictionary `try_assign` with different alloc (4f9d2db)
- Fix: Warning around immutable span conversion (d88842a)
- Docs: Target names (503e470)
- Fix: Match C++ class names (966dd09)
- Docs: Sections & links (f5a94b1)
- Make: Bump `fork_union` (3f5de03)
- Fix: Revert to separate `find_many` algos for different length (0acbeeb)
- Improve: `__reduce_max_sync` in SW on Hopper (c45017d)
- Fix: `unified_alloc` propagation (13e0201)
- Docs: Move segmenting & features drafts (3972cb3)
- Fix: Compiler over-optimizing `bench_find_many` (16f5fa4)
- Fix: Prefix length in parallel counting of short needles (69c7b6a)
- Fix: Including the entire haystack into match (3ee1676)
- Fix: Buffer overflow due to wrong thread count (887c16b)
- Improve: Skip vocabulary duplicates (5302a4d)
- Improve: New multi-pattern search APIs (20c9135)
- Fix: Missing `std::generate` include (853a0fa)
- Improve: Multi-byte characters support (85c5bf8)
- Improve: Propagate allocators in `safe_vector` (0c442d1)
- Improve: Custom validators for nullary benchmarks (0cf994a)
- Improve: Benchmark early exit (aaa6927)
- Make: Rename `bench_search` -> `bench_find` (6f65624)
- Docs: Aho-Corasick CUDA design (496e55a)
- Improve: New caching primitives (df1f7ef)
- Fix: Checking `STRINGWARS_STRESS` env-var (ade74f6)
- Improve: Support executors in multi-pattern search (35fad3d)
- Improve: Divergent branches on `i16` SW on Hopper (817b15c)
- Fix: OOB impact on SW scoring (cd9ff1e)
- Fix: NW & SW on Hopper (4134e44)
- Improve: More noticeable signaling in tests (ece416d)
- Improve: Scheduling speculative kernels (8a6a185)
- Improve: Run multiple warps per block (3b2f263)
- Fix: Comparing Affine benchmark results (64d8f4d)
- Improve: Fuzzy test Ice Lake kernels (d603f7d)
- Improve: Shrink Affine Ice Lake kernels (79e7a2f)
- Fix: Affine top row initialization (b9e4160)
- Fix: Levenshtein w. Affine costs on GPU for zero-length ins (076e58a)
- Improve: Test weird affine gaps (3bbebc8)
- Improve: Match `fork_union` API in executors (1d6f58d)
- Improve: Use `fork_union` pools (81463d4)
- Docs: Inconsistent naming (8b62f2c)
- Make: Add `fork_union` dependency (bd3d341)
- Improve: Duplicate `bytesum` assignment (bfcd10e)
- Make: Bump CUDA version (c709621)
- Fix: Ice Lake calls for empty inputs (f59ba5e)
- Fix: Correcting blends on Kepler (563a73d)
- Improve: Scheduling Levenshtein in CUDA (77b1087)
- Improve: Avoid `views::group_by` with callback (6a61e1b)
- Improve: `bytes_per_cell_t` enum (c6e907a)
- Fix: Inconsistent timing in `bench_unary` (60b99ab)
- Improve: Bounded methods not supported (23a7f58)
- Improve: Use `requires` clause (22c691e)
- Improve: Naming "executor" interfaces (7e778d9)
- Improve: Generalize Levenshtein in CUDA (29da524)
- Fix: Inclusion guard macro names (8e79fb7)
- Docs: More datasets on HuggingFace (0b33ac3)
- Improve: Aligned ZMM diagonal stores (10b5405)
- Improve: Branchless K-mask calculation (3998c1f)
- Improve: Measure gap magnitude (b4eb6a4)
- Fix: Avoid horizontal walker overflow (9b2b4e5)
- Fix: All Gotoh baselines (84397ae)
- Fix: Initializing affine DP matrices (640853f)
- Improve: Differentiate unary & uniform costs (8c15447)
- Fix: Horizontal Affine Walkers (e5d85f3)
- Improve: Alloc type-size check in `safe_vector` (9c5a56c)
- Improve: Fetch warp-size dynamically (83dd5fb)
- Improve: Warp-shuffle reductions in SW (914e98d)
- Fix: Over-estimating number of overlapping matches (b85d3ca)
- Improve: Faster multi-needle tests (0895d28)
- Improve: Splitting jobs in baseline multi-search (d47d849)
- Fix: Slicing corner-cases in OpenMP (7bcc803)
- Improve: Use `std::execution` for baseline tests (f228264)
- Improve: Parallel baseline for substring search (6ebc7b0)
- Fix: `bytes_per_core_optimal` estimate (83bc966)
- Improve: Pointer-constructible spans (a0fd136)
- Fix: Passing multi-needle tests (faad971)
- Fix: Calculating `find_many_match_t` properties (d0ebee8)
- Fix: Indexing needle IDs (42fa08d)
- Fix: Use smaller types in BFS queue (0c6dd00)
- Fix: Aho-Corasick construction (e27f86f)
- Improve: Consistent shuffling behavior in benchmarks (d4d55fa)
- Fix: `sz_copy_skylake` tail handling on large input (#222) (6da5e1e)
- Fix: Propagate substitutions to benchmarks (eabe605)
- Fix: Underutilized 99% of the H100 (7a9a243)
- Fix: Using OpenMP directives (41e1a6e)
- Fix: Forward GPU specs in CUDA tests (9d86d4c)
- Improve: Generalize memory requirements estimates (7bd92c5)
- Fix: Missing STL includes (f7d365a)
- Improve: Use CUDA constant memory (cebd180)
- Improve: Forward-declare substituters (0a04960)
- Improve: Show signed integers in SIMD types (382c05d)
- Fix: `i16` Ice Lake NW/SW alignment (8cc9794)
- Improve: Catch & log exceptions (cb739e2)
- Improve: Better UTF8 tests for similarity (b04e934)
- Make: Parallel test launchers (1ece547)
- Fix: Build issues (9db0287)
- Fix: Included filename (ee38a22)
- Fix: Uniform costs for UTF-32 runes (6edcf7f)
- Improve: New template SFINAE in `similarity.hpp` (10279f9)
- Fix: Initializing horizontal aligner (91df0ce)
- Improve: Use tagged C `enum`s (35ba76c)
- Improve: Allow custom validators in benchmarks (bd2a21d)
- Fix: Compiling `constant_iterator` in CUDA (bc59ee3)
- Fix: `std::allocator::rebind` deprecated (4c87404)
- Fix: Avoid `std::iterator` dependency (b3db596)
- Make: Move similarity benchmarks (90bfcb6)
- Fix: Report error codes in tests (05f17a6)
- Make: Separate Parallel C++ and CUDA tests (b175103)
- Make: Rename test files (2eeab83)
- Fix: Passing CUDA similarity tests (4c75d81)
- Fix: NW/SW test correspondence (fea39ed)
- Improve: Differentiate min/max-imizers (c96c5ed)
- Improve: Annotate throwing exceptions (2ef667c)
- Fix: Missing `sz_i16_t` definition (238c86d)
- Fix: Avoid similarity scoring references (e4b1bbd)
- Improve: Shorter type aliases (3ef1d26)
- Make: Revert to C++ for core tests (0c6ff1f)
- Docs: StringCuZilla design choices (f42aa85)
- Make: Separate StringCuZilla (c5fd4bc)
- Improve: Move capabilities to `types.h` (1c1582f)
- Make: Compile with OpenMP (1671b0f)
- Improve: Allow datasets in VRAM (b1c9a74)
- Fix: Shuffle datasets with over 4B tokens (247c6ec)
- Fix: Overflow `mean_token_length` calculation (efcadd1)
- Make: NVCC kernel debugging symbols (0c0ff42)
- Fix: Arrow-like string array (fa4b0f4)
- Fix: Overwriting alignment scores (668a386)
- Fix: Synchronizing CUDA kernel launch (3b85a00)
- Fix: Shared memory requirements (27ad8f5)
- Make: NVCC can't handle `fsanitize` (ea7647f)
- Make: Draft CUDA compilation (1b96ef4)
- Fix: Hardening `malloc(0)` behavior (7524882)
- Improve: Share C++ macros and typedefs (e82d045)
- Fix: Accounting for different gap costs (e4e517f)
- Fix: NVCC warning for negative size field (a6c0fa2)
- Fix: Track capacity in fixed buffer alloc (64b40a9)
- Fix: Sign-cast warning in `_mm256_set1_epi64x` (aac2e8f)
- Fix: Overriding `SZ_DEBUG` macro (bca734a)
- Fix: Calling unused helper struct unit tests (d369170)
- Improve: Cleaner API for OpenMP (bc311b3)
- Fix: Shifting Levenshtein diagonals in OpenMP (6b5ef98)
- Fix: Unaligned loads/stores of hash state (37863c9)
- Improve: Expose rune-parsing headers (2727a87)
- Fix: Diagonals depth (ef53f75)
- Docs: Showcase indexing diagonals (b55c696)
- Fix: `sz::lookup` examples in Rs (778d4f0)
- Fix: Compiling SVE on macOS (42270c8)
- Improve: Inline cheap calls (28282d2)
- Make: List `scripts/` deps for `uv` (1907d2b)
- Docs: Formatting (dd57536)
- Docs: `uv` instructions (9460fd4)
- Fix: Unaligned `sz_hash_state_t` stores (7e65a1e)
- Improve: Align inner hash-states (1b3cdd5)
- Improve: Use `GiB` over `GB` (811fc59)
- Improve: Construct `Byteset::from_bytes` (34660f2)
- Fix: Remove missing Ice Lake benchmarks (0c08564)
- Fix: Compiling Py bindings (7e08180)
- Make: CMake formatting (44485fb)
- Docs: Formatting and references (928fd79)
- Fix: No return (4cb096b)
- Docs: Listing bench details (343b858)
- Make: Patch passing `SZ_USE_SVE` definitions (45b15b0)
- Fix: Expanding feature-detecting macros (75ef77e)
- Improve: Bold benchmark names in CLI (2ab635e)
- Improve: `fill_random` checksums in benchmarks (0bba772)
- Improve: Simpler SVE find nested loop (a1604b7)
- Improve: Unrolled serial hashing (77e482e)
- Fix: `find_sve` mask update on long needles (a493ab8)
- Improve: More simple substring search tests (68449fd)
- Fix: `bench_sequence` CMake target (905749c)
- Fix: Drop double negation in logging (3d77ec6)
- Improve: Use `SQINCP` in SVE for increments (efda23b)
- Fix: Dispatching SVE kernels (06ea5f7)
- Docs: SVE2 intersects TODO (fafb8b0)
- Improve: `do_not_optimize` token-level results (d1a3779)
- Make: Rename `bench_sort` (991a78b)
- Improve: Naming benchmark names (0074ad7)
- Improve: Better sorting benchmarks (7d534fb)
- Fix: Computing improvement percent (a5de795)
- Improve: Faster equality checks on NEON/SVE (20f35c7)
- Improve: New token-level benchmarks (12e1edd)
- Docs: Describe trivial types (af686dd)
- Fix: Naming byteset signature (9676cdb)
- Improve: Naming "vtable" entries (b9794e5)
- Make: Upgrade to C++20 for benchmarks (aa7f275)
- Improve: New-style "container" benchmarks (3f1c723)
- Fix: Reverse order `std::search` offsets (244e605)
- Docs: Ignore formatting CMake (366816e)
- Make: Formatting CMakeLists.txt (467b4b8)
- Fix: Extra comma in `printf` (298d214)
- Docs: Outdated function naming & spelling (92b9a56)
- Improve: Token benchmarks (3b1897e)
- Improve: Logging in container benchmarks (4d955d3)
- Fix: No intersect for Skylake (48d70ea)
- Fix: Revert to XMM on Haswell (4bec1e5)
- Fix: Composing STL collections (f9da4ed)
- Fix: `std::string::data` is mutable only since C++17 (ff23c3d)
- Improve: Discard state in streaming hash (828263f)
- Improve: Discarding buffer in streaming hashes (c4f7a0e)
- Improve: Separate PRNG backends in benchmarks (af54e93)
- Fix: Guard Skylake benchmarks (2965502)
- Fix: Unused `_sz_capabilities` symbols (8bb90e5)
- Fix: `sz_intersect` signature (f712de3)
- Make: Don't build `stringzilla_bare` on macOS (a7b35ba)
- Fix: Variable in C++14 `constexpr` (feb415f)
- Fix: Unused Levenshtein tests (90540d3)
- Fix: `find_1byte` signature compatibility (d19e8b8)
- Improve: Fix minor inconsistencies (f656577)
- Docs: Exploring perfect Unicode hashing (197cd87)
- Improve: Test set intersections (1d95601)
- Fix: Randomization benchmarks (8dc4a2c)
- Docs: Formatting (5c02c4e)
- Docs: Details on the Unicode range (b6e4406)
- Docs: Ignore C++ docstring updates blame (407dd2d)
- Docs: New formatting in C++ (0d982a4)
- Fix: Passing `sz_sequence_t::handle` (75fabf1)
- Improve: Remove redundant comments from `sz_hash_state` functions in Rust (9fe25df)
- Improve: Expose `sz_lookup` (471b002)
- Improve: Expose `sz_hash_state_init`, `sz_hash_state_stream`, and `sz_hash_state_fold` to Rust (1757e4e)
- Improve: Expose `sz_move`, `sz_fill`, and `sz_copy` for Rust (b2085cc)
- Improve: Inline most common Rust APIs (a30b5b7)
- Make: `cibuildwheel` env variables (fbf256a)
- Make: Decremental Rust builds (8877c82)
- Fix: Detecting caps in dynamic builds (d52bf63)
- Fix: `fill_random` test condition (8b396c8)
- Fix: Compilation of all bindings (2caefac)
- Make: Drop unused `build.sh` (2bbafa1)
- Improve: Testing hash functions (80688bb)
- Fix: Passing new hashing tests (268af53)
- Improve: `copy`/`move` on Haswell with interleaving (69dfa10)
- Docs: Announce JOINs (2225488)
- Improve: Ordering includes (6e71536)
- Improve: Vectorize `sz_equal_haswell` (7aad4bb)
- Docs: Explaining `compare.h` operations (d7bab8d)
- Improve: Clean `memory.h` header (7698392)
- Improve: Use default allocator, when not provided (5a12c00)
- Docs: Disable sorting includes (8bc161f)
- Fix: Ice Lake partitioning logic (1da0e2b)
- Improve: Expose Insertion-sort helpers (a38867f)
- Fix: Merge-step bug in stable sort (db61d93)
- Improve: Introduce typed `_sz_swap` macro (dcf6c65)
- Improve: Rename `sz_sort` to `sz_qsort` (6191cc6)
- Fix: `sz_sort_serial` passes tests (8bad799)
- Fix: `uniform_int_distribution` lower bound (bdee111)
- Fix: `sz_sort_serial` passes for same length inputs (0fda5a5)
- Improve: Drop hybrid sort code (50d8291)
- Fix: Underflow in serial sorting (5970fa4)
- Make: Recommend pretty-printing GDB symbols (a818f97)
- Fix: `uniform_int_distribution` upper bound (17f28a3)
- Fix: In C++11 `constexpr` constructor must be empty (13bace2)
- Fix: Sorting benchmarks for new API (66f2ac9)
- Improve: Separate fingerprinting benchmarks (187e0bd)
- Make: Renamed temp-git-split-file -> scripts/bench_token.cpp (031bedf)
- Make: Renamed scripts/bench_token.cpp -> temp-git-split-file (07d2239)
- Make: Renamed scripts/bench_token.cpp -> scripts/bench_fingerprint.cpp (a0318eb)
- Docs: Signatures and typos (982dd4d)
- Improve: Wrap `std::accumulate` for checksums (bce107a)
- Improve: Validate checksums in benchmark (abe8d07)
- Fix: Tail sum order in `checksum_haswell` (b20d7cd)
- Fix: Infer allocators `value_type` (5bbd971)
- Fix: Tail handling in `sz_checksum_haswell` (84cb4c8)
- Fix: Loops in AVX-512 checksums (4044855)
- Fix: Loop in `sz_checksum_haswell` (509b58b)
- Improve: Relax many `constexpr`s from C++20 to C++14 (0a3e363)
- Make: Move drafts (1de3166)
- Make: Renamed temp-git-split-file -> include/stringzilla/hash.h (5a36cb7)
- Make: Renamed include/stringzilla/hash.h -> temp-git-split-file (7052266)
- Make: Renamed include/stringzilla/hash.h -> include/stringzilla/fingerprint.h (0ef7cf1)
- Docs: Spelling `usnigned` (d18a159)
- Improve: hybrid bench sort performance (9880f26)
- Fix: hybrid bench sorts loading initial string bytes incorrectly (455508f)
- Fix: stable sort bench tests failing (821d19e)
- Fix: Minor dispatch issues (d20e589)
- Improve: Faster `levenshtein_baseline` (d9557d3)
- Fix: BMI flags for `BZHI` (fa47deb)
- Fix: Masks back to using `BZHI` (bd7054e)
- Make: Library namespaced aliases (f3811d7)
- Fix: `sz_u512_vec_t` members visibility (2007d49)
- Fix: Bounded Levenshtein returns (749b0d8)
- Fix: Skylake dispatch (48e0913)
- Fix: Linking `stderr` (084d653)
- Docs: Formatting docstring (c99daf3)
- Fix: Initializing `basic_charset` (864ee03)
- Fix: Correct `basic_charset` operator (#203) (e20d207)
- Improve: Ignore 40 commits in blame (064829a)
- Fix: Overriding LibC in 32-bit Windows (645539b)
- Improve: C++ version macros naming (19c2ae9)
- Make: Detect Apple Universal builds (6d61c21)
- Make: Rename `stringzillite` to `stringzilla_bare` (364e2ca)
- Fix: Symbols names & visibility (406bf0f)
- Fix: Haswell compilation flag (00f27f6)
- Fix: Filter `compare.h` file (6512f1d)
- Make: Split ./include/stringzilla/find.h to ./include/stringzilla/compare.h (fc9e5d6)
- Make: Split ./include/stringzilla/find.h to ./include/stringzilla/compare.h (49e8d9d)
- Make: Split ./include/stringzilla/find.h to ./include/stringzilla/compare.h (fc408fa)
- Fix: Partially filter `stringzilla.h` file (41e5917)
- Fix: Minor macro mismatches (5f7ca59)
- Fix: Filter `types.h` file (b835051)
- Fix: Filter `sort.h` file (1ba7982)
- Fix: Filter `small_string.h` file (5b55e19)
- Make: Separate builds for Skylake & Ice (4a1f03c)
- Improve: Platform-specific equality checks (8b44d6a)
- Fix: Filter `hash.h` file (be4c63d)
- Fix: Filter `similarity.h` file (8b401bd)
- Fix: Filter `memory.h` file (295d49a)
- Fix: Filter `find.h` file (2a1fcd1)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/memory.h (2f76521)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/memory.h (45e57ee)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/memory.h (66778d6)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/sort.h (c357c3e)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/sort.h (cbfe5c7)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/sort.h (085d2d3)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/small_string.h (3464cb4)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/small_string.h (89c4681)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/small_string.h (3f9c248)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/similarity.h (e23c35f)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/similarity.h (10d829e)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/similarity.h (d74e5dc)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/hash.h (1f60e6d)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/hash.h (08d0a20)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/hash.h (9e9f256)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/find.h (974ed78)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/find.h (14ba3bf)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/find.h (9e577be)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/types.h (8cb0742)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/types.h (22e3d1e)
- Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/types.h (ecb3775)
- Fix: Wrong env. variable names (d0678f8)
- Make: Inline ASM for detecting CPU features on ARM (0ee549a)
- Fix: Default Levenshtein upper bound (62ca6a0)
- Improve: Levenshtein functions for unicode (d3b423a)
- Docs: Levenshtein tutorial in Jupyter (715ad10)
- Fix: `sz_look_up_transform_avx512` declaration (585f7d5)
- Improve: `#pragma region` dashes (fe4449b)

1 parent 7d17353 commit 58c7e15
File tree: include/stringzilla
6 files changed, +8 -8 lines changed