Initial NodeJS string wrapper #151
javascript/lib.c
Outdated
    sz_wrapper_t *this;
    sz_wrapper_t *arg;

    napi_get_cb_info(env, info, &argc, args, &jsthis, NULL);
Missing error handling for the `napi_get_cb_info` call. It returns a `napi_status` that should be checked: if the call fails, the function proceeds with invalid `args`/`jsthis` values. Check the return value and handle the error case.
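The check-and-bail pattern this comment asks for can be sketched as follows. This is only an illustration, not the addon's code: `demo_status_t`, `demo_get_cb_info`, and `demo_method` are hypothetical stand-ins so that the snippet builds without the Node headers. In the real addon the call is `napi_get_cb_info` and the success value is `napi_ok`.

```c
#include <stddef.h>

/* Hypothetical stand-ins so the sketch builds without Node headers.
   In the addon these roles are played by napi_status, napi_ok and
   napi_get_cb_info. */
typedef enum { demo_ok = 0, demo_invalid_arg = 1 } demo_status_t;

/* Pretends to fetch call arguments; fails when any pointer is NULL. */
demo_status_t demo_get_cb_info(const void *info, size_t *argc, int *args) {
    if (info == NULL || argc == NULL || args == NULL) return demo_invalid_arg;
    args[0] = 42;
    *argc = 1;
    return demo_ok;
}

/* Returns 0 on success and -1 on failure.
   `args` is never read after a failed call. */
int demo_method(const void *info, int *out) {
    size_t argc = 1;
    int args[1];
    demo_status_t status = demo_get_cb_info(info, &argc, args);
    if (status != demo_ok) return -1; /* bail out before touching args */
    if (argc < 1) return -1;          /* also reject a missing argument */
    *out = args[0];
    return 0;
}
```

In an N-API callback the failure branch would typically raise a JavaScript exception (for example with `napi_throw_error`) and return `NULL` instead of returning `-1`.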
React with 👍 to tell me that this comment was useful, or 👎 if not (and I'll stop posting more comments like this in the future)
javascript/lib.c
Outdated
    size_t argc = 1;
    napi_value args[1];
    napi_value jsthis;
    sz_wrapper_t *obj = malloc(sizeof(sz_wrapper_t));
No NULL check after `malloc`. If the allocation fails, the function continues with an invalid pointer, which could cause a crash. Check whether `obj` is NULL and handle the error case appropriately.
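A minimal sketch of the requested check, under stated assumptions: `demo_wrapper_t` and `demo_wrapper_new` are hypothetical names, and the struct fields are invented for illustration because the real `sz_wrapper_t` layout is not shown in this excerpt.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the wrapper struct; the fields are assumed. */
typedef struct {
    const char *start;
    size_t length;
} demo_wrapper_t;

/* Allocates and initializes a wrapper.
   Returns NULL when the allocation fails, so the caller can report it. */
demo_wrapper_t *demo_wrapper_new(const char *text) {
    demo_wrapper_t *obj = malloc(sizeof(demo_wrapper_t));
    if (obj == NULL) return NULL; /* never dereference a failed allocation */
    obj->start = text;
    obj->length = text ? strlen(text) : 0;
    return obj;
}
```

In the addon's constructor, the `NULL` branch would be the place to raise an error to JavaScript (for example with `napi_throw_error`) before returning.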
scripts/test.js
Outdated
     import assert from 'node:assert';

    -const stringzilla = bindings('stringzilla');
    +const stringzilla = bindings('../../build/Release/stringzilla');
Using a relative path ('../../build/Release/stringzilla') with the 'bindings' module is incorrect and could cause module resolution failures. The 'bindings' module is designed to automatically locate native addons and should be used with just the module name. The original code using bindings('stringzilla') was correct. The change to a relative path breaks the standard module resolution mechanism and could fail when the package is installed in different locations.
Thanks for laying all the groundwork, @MarkReedZ! Even though we can't access string contents without a copy, we still have a zero-copy NAPI for
### Major

- Break: Output error messages via C API (fc94890)
- Break: New wording for incremental hashers (66898ad)
- Break: Rust namespaces layout (b3338bb)
- Break: Refactor `Strs` ops (009b975)
- Break: Rename again (fb56b60)
- Break: Drop fingerprinting bench (3e04157)
- Break: `sz::edit_distance` -> Levenshtein (d44beb4)
- Break: C++ `lookup` and `fill_random` (1ce830b)
- Break: `charset`/`generate` -> `byteset`/`fill_random` (2ce2b49)
- Break: New calling convention in `similarity.h` (095bc2d)
- Break: Return error-codes in sort functions (944804e)
- Break: `look_up_transform` to `lookup` API (e0055d5)
- Break: `checkum` to `bytesum`, new hash, and PRNG (71f1f4b)
- Break: Pointer-sized N-gram Sorting (0c38bff)
- Break: `sz_sort` now takes allocators (ec81663)
- Break: Deprecate old fingerprinting (38014ee)
- Break: Replace `char_set` constructor with literals (2c49eae)

### Minor

- Add: `Str.count_byteset` for Python (802d699)
- Add: GoLang official support (234b758)
- Add: `HashMap` traits for Rust (e555cc3)
- Add: `try_resize_and_overwrite` (e00a98b)
- Add: Hashers for Swift (be363c9)
- Add: Big-endian SWAR backends (607dd14)
- Add: Zero-copy JS wrapper for buffers (4bf2dd6)
- Add: `sz.fill_random` for Python (3565f9b)
- Add: Allow resetting the dispatch table (68172a0)
- Add: Ephemeral GPU executors if no device is passed (6431900)
- Add: Capabilities getters for SW (86c89b7)
- Add: StringZillas for Rust draft (48a1120)
- Add: Fingerprinting benchmarks (471649e)
- Add: On GPU fingerprints in Python (6f26629)
- Add: `basic_rolling_hashers` CUDA port (a25e3f2)
- Add: Unrolled fingerprinting backends (9a04c95)
- Add: NW and SW scoring classes (958c9e1)
- Add: SW scoring (3e8a3a6)
- Add: NW scoring (079a8c1)
- Add: Capability-constrained Py constructors (d0dfa0e)
- Add: `LevenshteinDistancesUTF8` in Py (4cd6c7d)
- Add: Wrap `DeviceScope` for Python (30b8fd3)
- Add: Make `Strs` from lists, tuples, generators (05a5434)
- Add: StringZillas Python tests (978d13d)
- Add: `Strs` layout conversion tests (4e5df62)
- Add: Exportable `_sz_py_api` capsules (9d5935b)
- Add: Levenshtein kernels in shared lib (42a815f)
- Add: `Strs.from_arrow` conversion (0441b32)
- Add: Draft fingerprinting C binding (3f5e004)
- Add: Fingerprinting baselines (f997dad)
- Add: `SZ_NOINLINE` (4e6e1af)
- Add: Draft fingerprinting on GPUs (11af644)
- Add: Haswell rolling fingerprints (25d3ee6)
- Add: Draft exploration of `float` fingerprints (3955eee)
- Add: `lock_guard` to avoid STL (e6cdb93)
- Add: Parallel fingerprinting (4ccac06)
- Add: `to_span`, `to_view` helpers (4fb9283)
- Add: Min-Hashing `basic_rolling_hashers` (63447d3)
- Add: Fingerprinting benchmarks (00c68d5)
- Add: 64-bit `double` fingerprinting (9ed6006)
- Add: Test rolling hashes (a065aa0)
- Add: Fingerprinting drafts are back :) (9ab4a2b)
- Add: Draft parallel library backend (94147ed)
- Add: Long haystack CUDA kernels for find-many (51c9171)
- Add: `find_many` minimal counter & benchmarks (3c615d9)
- Add: `bench_find_many.cpp` (2d594e4)
- Add: CUDA Aho Corasick placeholder (d087afa)
- Add: Parallel Ice Lake variants (6aaf16c)
- Add: NW and SW for Hopper (e53c8d5)
- Add: Hopper Levenshtein kernels (cb7c48d)
- Add: Affine gaps Levenshtein on Kepler (f09bbf9)
- Add: Ice Lake Affine Levenshtein kernels (51afad5)
- Add: Affine Levenshtein variants on GPU (3e1df93)
- Add: `sz_similarity_gaps_t` enums (5b410b8)
- Add: Concepts & new multithreading scheme (6be0da5)
- Add: Draft StringCuZilla C API (1975351)
- Add: Draft executors for StringCuZilla (e1a37de)
- Add: C++20 concepts (f7b091c)
- Add: Unmasked Ice Lake Levenshtein (2b4fa09)
- Add: Affine Levenshtein-Gotoh variants (25ab3b6)
- Add: Draft global scoring in CUDA (07196d0)
- Add: Affine gap extensions baseline (945613c)
- Add: Draft device-wide similarity kernels (871d7bc)
- Add: Levenshtein on Kepler (938bf6f)
- Add: Parallel multi-needle search with OpenMP (c8dc314)
- Add: Draft parallel substring search (cf53b9b)
- Add: Overflow-risk error codes (783cfa8)
- Add: Random access ranges (0a26059)
- Add: `safe_vector`, `safe_array` (ac6218a)
- Add: Immutable iterators for Arrow tapes (b70096b)
- Add: `indexed_container_iterator` (a5ca2ac)
- Add: Multi-pattern exact substring search (a83cdb5)
- Add: NW benchmarks on GPU (1879aeb)
- Add: Multi-flag `sz_caps` (f76fa40)
- Add: Repeat similarity benchmarks on CPU (f64fc56)
- Add: `gpu_specs` and `cuda_status_t` (c36d2b8)
- Add: Ice Lake similarity kernels (3c8d181)
- Add: Fetching Nvidia GPU specs (d0bdd17)
- Add: New batch similarity benchmarks (aca3301)
- Add: Warp-shuffle optimizations (4a62715)
- Add: Mem consumption CUDA tests (53d3e0d)
- Add: New NW and SW GPU kernels (ca0fece)
- Add: Blosum62 and NUC.4.4 matrices (8a99373)
- Add: Separate parallel & serial tests (3cb8dd0)
- Add: Parallel SW & NW scoring in OpenMP (85b9675)
- Add: Thrust-like `constant_iterator` (6751e79)
- Add: `arrow_strings_tape::try_append` (6d7f221)
- Add: CUDA scoring benchmarks (7ac0fdd)
- Add: Baseline NW and SW alignment with ~O(NM) space (044c7cc)
- Add: Horizontal scoring in OpenMP (02a2481)
- Add: Local alignment adaptation (109d8de)
- Add: Parallel GPU kernels (2861a85)
- Add: OpenMP `score_diagonally` (427d5b5)
- Add: `lookup` transform in Rust (3090472)
- Add: OpenMP C++ draft (6f8cdb9)
- Add: Stateful hashing in Rust (763538e)
- Add: Expose `find_byteset` in Rust (667ea91)
- Add: Set intersections in Rust (06784fc)
- Add: Sorting in Rust (e467649)
- Add: New memory benchmarks (dfd6ddf)
- Add: Draft `sz_find_sve` (d7ede5d)
- Add: Faster `find_byte` with SVE (56a49c8)
- Add: New string similarity benchmarks (c12e6c4)
- Add: Short string hashing in SVE2 (a007c7c)
- Add: SVE & SVE2 bytesum (d6f87d7)
- Add: SVE2 macros (9fbdc9a)
- Add: Intersection benchmarks (e860af0)
- Add: SVE backend for sorting (148b615)
- Add: All new benchmarking suite (4744406)
- Add: Comparisons in SVE (c31020d)
- Add: Arm NEON hashing (4b3847d)
- Add: Missing SVE placeholder definition (63f0368)
- Add: C++ `argsort`, `intersect` (5ea0698)
- Add: `status_t` for errors in C++ (63daa5f)
- Add: Feature-extraction placeholder (de62723)
- Add: Intersections on Ice Lake (ea5dc76)
- Add: Serial JOINs (c7b841e)
- Add: Dispatched version API (9a32744)
- Add: Fetching dynamic library version in C (3538e97)
- Add: PRNG for Haswell & serial backend (6659aa0)
- Add: Streaming hashing on Ice Lake & Skylake X (2607d45)
- Add: Streaming hash benchmarks (8ac3a23)
- Add: Hashing on Haswell & Skylake-X (3c345bc)
- Add: Missing `sz_sequence_t` helpers (dc7c109)
- Add: Sorting placeholders & dispatch (cc98389)
- Add: `sz_sequence_argsort_ice` (69d4ecb)
- Add: AES-based hash placeholders (cb18c78)
- Add: Smaller Sorting Networks (cd6859a)
- Add: String sorting tests for different lengths (c670ccd)
- Add: Separate Skylake-X & Ice Lake checksums (554f50d)
- Add: New Levenshtein distance kernels (43471aa)
- Add: Missing Rust interfaces (1765f33)

### Patch

- Docs: Pre-release stats update (c08c6c3)
- Make: Upgrade `setuptools` in CI (492ecc0)
- Improve: Allow `RuntimeError` for engine calls (f9cbb00)
- Docs: Section titles (755f583)
- Improve: PyTest invalid input arguments (f5e46d5)
- Improve: All new similarity-scoring benchmarks (3713e72)
- Make: Option to disable sanitizers for masked IO (f880621)
- Make: Default to Python 3.12 for better `itertools` (f2704d7)
- Fix: Naming scorers in Python like in Rust (d0bc604)
- Fix: Avoid SWAR on big-endian (230e354)
- Improve: More big-endian SWAR tests (7a4b78f)
- Fix: Drop old similarity APIs in benchmarks (230fc13)
- Make: Packaging for Py & Go (4c819d6)
- Improve: Byte-set counting PyTests (bd692d2)
- Docs: What to know about CUDA (15349c6)
- Improve: `Str_like_*` naming convention (5fdd8ee)
- Fix: Merge artifacts & lifetime annotations (c71e67e)
- Docs: Wording & AI dashes (4a06f3f)
- Make: Packaging for NPM (0d95f13)
- Make: Drop long-deprecated `.releaserc` (8b5af6e)
- Make: Enable SIMD in NodeJS builds (c8f6a49)
- Improve: Polish parallel string test names (61cc860)
- Make: Reuse SIMD compilation flags in `build.rs` (d129f95)
- Make: Missing AES definitions for lib builds (e951c4d)
- Docs: Hashing sections for each SDK (cb4fe1b)
- Improve: Expose `.capabilities` to JS (c557a62)
- Improve: Test incremental hashers (7c6d37d)
- Fix: Self-move construction of `basic_string` (5fb0e74)
- Fix: Strict aliasing violation (a5a2421)
- Fix: `__cpp_lib_string_resize_and_overwrite` test guards (cb2d6a8)
- Docs: How to use parallel algorithms (ecad156)
- Fix: Stricter following of `SZ_AVOID_STL` (c3040b7)
- Docs: JS Quick Start (c923ed2)
- Make: Publish to NPM (b2cedc4)
- Make: Drop CodeQL noise (7c9caac)
- Docs: Describe dynamic dispatch & linking (e6f2c02)
- Improve: NodeJS groundwork & corner-case tests (#151) (00d75f5)
- Fix: `_MSC_VER` to `__GNUC__` conditions (83492ac)
- Fix: Unknown pragmas in MSVC (#231) (78bbc11)
- Make: Parallel algorithms CI/CD for PyPI (9b8c466)
- Make: Ignore formatting blames (b69206f)
- Make: JSON & YAML uniform formatting (320bddd)
- Make: Explicit CodeQL coverage in CI (e9fb38d)
- Fix: Match new C-level `DeviceScope` behavior (f217b86)
- Fix: Sorting on big-endian `s390x` (8d2d9c8)
- Fix: Ruff-statically suggested issues (edaff0e)
- Improve: Disable `E722` import exception warning (3aedfb2)
- Fix: Check for immutable Py buffers (e8c437e)
- Improve: Test PRNG in Py and boundary Strs sizes (3d76e8f)
- Fix: `np.random` v1 vs v2 compatibility also in `szs.` (5f8ac03)
- Fix: Prevent PyTest from parsing invalid UTF-8 (30c8935)
- Fix: Don't repeat seed-ed fuzzy tests (efceb0b)
- Fix: `np.random` v1 vs v2 compatibility (aea096b)
- Make: Forwarding `SZ_IS_QEMU_` (d433db0)
- Fix: Minor logical inconsistencies & unused vars (679f7b9)
- Improve: Disable SVE in QEMU runs (072ee2e)
- Fix: Avoid `sys.getrefcount` tests on PyPy (80fa90e)
- Improve: Check rich comparisons before sorting (a21c44d)
- Improve: Session-scope fixture for PyTest env logs (3d84812)
- Improve: Fuzz PyTests and log environment (6545a04)
- Make: Respect env-vars for `-arch` (6a450f6)
- Make: Upgrade GitHub actions (dec93d0)
- Make: Avoid universal builds defaults for `pip install .` (c462f42)
- Improve: Guard compiler pragmas (42ad14d)
- Make: Bump FU to avoid missing `+wfxt` target (337257b)
- Make: Detect CPU AES support on Arm (ae74d44)
- Make: Reinstall pre-packaged CMake on macOS-14 (24967c7)
- Fix: Fall-back CPU alloc for fingerprints (b582d5d)
- Make: Drop macOS Universal builds (e5cfb08)
- Fix: Sorting difference on 32/64 bit machines (0e17a3d)
- Make: Respect `MACOSX_DEPLOYMENT_TARGET` (95a95fe)
- Make: Avoid `fail-fast` for Python pre-release wheels (a7c3f04)
- Fix: Win32 compilation issues (f792366)
- Improve: Test against Affine Gaps (65d323e)
- Improve: Avoid many unified memory re-allocs (d0db2d4)
- Fix: Intersect scopes HW capabilities (e4245b7)
- Make: Drop Python 3.7, require 3.8+ (c65bf5e)
- Make: Verbose PyTest logging in CI (c273d52)
- Fix: OS-feature-gate AVX checks (cd81b4a)
- Fix: Linking to 64-bit symbols (7da4dac)
- Make: Log HW caps in CI before tests (2c02d5b)
- Fix: Type-casting on MSVC (a347dcd)
- Make: Irrelevant links & comments (d6221e6)
- Make: Target Hopper `90a` in Py & C (c317280)
- Make: Outdated 64-bit detection envs (72d0f4b)
- Docs: Wording typos (cf1623c)
- Make: Missing `affine-gaps` dep (fa52363)
- Docs: No Alpine flow on release CI (f214169)
- Fix: Argument order (10dff6c)
- Make: Can't read `SWIFT_VERSION` for `container` (a4015f1)
- Improve: Differentiate `capabilities_mode` in PyTest (2bee1e4)
- Fix: Dispatch serial code for `bytes_per_cell <= 2` (48b4406)
- Improve: New error codes for CPU/GPU interop (6e597ca)
- Fix: Avoid UB assigning i8x256x256 matrix (7ac3b83)
- Make: Bump FU due to sign conversions warnings (cbd160a)
- Fix: Detect missing GPUs at runtime (0bc0f8a)
- Improve: Formatting Swift (3d6669c)
- Improve: Test against `affine-gaps` (48a538c)
- Improve: Reduce sign-casting issues (a340131)
- Fix: Check CUDA in `szs_capabilities` (42f043c)
- Improve: Build & test reproducibility (446e14e)
- Make: Override `/std:c++` for MSVC (930b1f0)
- Make: Add CUDA to GitHub CI (2a4552c)
- Make: Workaround CI issues (46410c6)
- Make: No `bare` builds on Windows & macOS (b42a340)
- Fix: MSVC compilation issues (5f60fe4)
- Improve: Simplify setting thread-counts (20bbe22)
- Make: Install `sz` before `szs` in CI (9033649)
- Make: Skip x86 intrinsics in `universal` builds (ce6cab2)
- Fix: Inferring Ice Lake similarity kernels (a65dc99)
- Fix: Disambiguate `szs_` symbols (1fd5db8)
- Make: Override VS Code compiler choice on `osx` (d06d4e4)
- Make: Avoid SVE builds on macOS (abb970b)
- Fix: Wrong `SZ_DYNAMIC_DISPATCH` check (acdab3f)
- Make: Preinstall `wheel` in CI (62f8c98)
- Make: Bump tapes to 4.0 (72fd6be)
- Fix: Avoid forced inlining for HW flags (4cef520)
- Fix: Feature-checking STL (b1418b5)
- Fix: Missing `allocator_traits` include (44b8d72)
- Fix: Workaround for `static_cast` (b6a4cc6)
- Fix: Avoid unaligned XMM loads (0f840bb)
- Fix: `static_cast` to standard for MSVC (43c953f)
- Make: Avoid `uv` in GitHub CI (89ed74d)
- Fix: Unused variable in `group_by` (3761107)
- Fix: Unused `qsort` on MacOS (cd263ae)
- Improve: Unaligned loads in serial hashes (5e9d488)
- Make: Parallel backends CI (352b48d)
- Make: Referencing old tests (7c4fe32)
- Make: Caps introspection flags on Arm (d6c7cf3)
- Fix: Converting to string views (6b00ab9)
- Fix: Unused symbols (e825ed8)
- Fix: Fetching engines `::capability_k` (ddc640b)
- Fix: uninitialized intersection `count` (ef3ca96)
- Make: Disable NUMA by default (b27552d)
- Fix: Guard SVE checks for cross-compilation (dac3941)
- Make: Install Git on Alpine (1ff0c15)
- Make: Log Alpine version (7ea327e)
- Fix: Type-casting seed on Clang (4598f42)
- Make: Bump Fork Union to 2.2.2 (e674413)
- Fix: Deprecate `levenshteinDistance` in Swift (70c4add)
- Fix: Passing StringZillas doctests (529fb76)
- Fix: Allow NULL allocator args (b0c33bd)
- Make: NVCC flags for Rust (6b38ea2)
- Fix: Report requesting 1 CPU core (4ef0464)
- Improve: Passing StringZillas.rs tests (34bb89a)
- Improve: Use StringTape for GPU backends (87a0767)
- Docs: StringZillas C API (34f4137)
- Fix: Memalloc initialization on MSVC (#230) (a6e0a77)
- Docs: Drop OpenMP and old name (7f45118)
- Improve: Infer `capabilities` from `DeviceScope` (a807eba)
- Improve: Introspect `sz_device_scope_t` (d9100a3)
- Make: Consistent `-O2` optimization (5fcde22)
- Improve: Drop unused `info1` (f4d4a76)
- Fix: Rendering byte strings in Python (cf73d79)
- Fix: Forward errors from `sz_rune_parse` (474dec4)
- Improve: PyTest different MinHash dimensions (b5060a8)
- Fix: Expose `value_type` for CUDA fingerprinter (7724886)
- Fix: Fingerprinting memory management (f8dea13)
- Fix: `to_span` compilation (e7fdd98)
- Improve: Wrap high-dim fingerprints (bbf30d2)
- Fix: Handling empty strings in arrays (517757c)
- Improve: More readable PyTest (a6d3ed2)
- Fix: Checking for Ice Lake caps (4f7649a)
- Improve: Simpler & slower Py args parsing (36745f4)
- Improve: Propagate error message to Py (711bd63)
- Improve: Comparing 2 mem-allocators (7870cd3)
- Fix: Skip missing `affine_levenshtein_utf8_ice_t` (1e112c6)
- Make: Custom `CudaBuildExtension` for Python (4abf63c)
- Make: Option to disable CUDA builds (84492e5)
- Fix: Refer to `prong_t` in executor concepts (57d4ec8)
- Fix: MSVC & Clang compilation errors (31fcdb3)
- Make: Avoid OpenMP in builds (183ea96)
- Fix: Announce `LevenshteinDistancesUTF8Type` (cb81c00)
- Improve: Cache hardware capabilities (b9a1109)
- Improve: Export capabilities as a tuple (0223b62)
- Improve: Printing CUDA caps (94d8c36)
- Fix: Match Apache Arrow layout (38c87b4)
- Improve: Type-casting `seed`s in `Strs.sample` (5ac53c3)
- Improve: Constructing `Strs` from PyArrow (7aebef4)
- Fix: Track ownership of `Strs` offsets (65380ca)
- Improve: Expose `sz_capabilities` in non-dynamic builds (6206eb4)
- Fix: Avoid depending SZS -> SZ (c411e37)
- Docs: Mark programming languages correctly (d1fc68c)
- Make: Bump C++ & CUDA to 20 for libs (320f3da)
- Fix: `rebind_alloc` in C++20 (492d726)
- Fix: Tautological compare check (b924d90)
- Improve: Random-access similarity outputs (c5e778d)
- Make: Building parallel Python packages (0f44c25)
- Fix: Clang build warnings (7741272)
- Fix: Using braces for Clang builds (325cedb)
- Make: Pull submodules in CI (5bb90b3)
- Fix: Avoid `_mm256_cvtepi64_epi32` on Haswell (014002e)
- Make: Separate parallel library sources (ff019b1)
- Make: Forward `march` flags through NVCC (496ae84)
- Make: Move CUDA lib into header (d4a66c5)
- Make: FMA flag for Haswell (2e1daa4)
- Fix: Compilation of all C targets (a2b228c)
- Make: Compiling StringZillas shared libs (6ac80e8)
- Improve: `gpu_specs_fetch` & GPU args order (3d7b491)
- Docs: Sync description one-liner (ab9b617)
- Improve: Draft parallel fingerprinting API (e49f570)
- Improve: Runtime variable window widths (031d067)
- Make: Rename `lib.rs` (5d82454)
- Docs: Reuse operators state (e5e2702)
- Improve: Naming multi-input processors (a3c3510)
- Fix: Scramble results between fingerprint benchmarks (fbf7203)
- Make: Format Python to 120 columns (640b7c4)
- Improve: Naming internal symbols (3b48c93)
- Improve: Unroll CUDA fingerprints (37f3d80)
- Docs: Refresh Python benchmarking suite (49cf4ea)
- Fix: Weird compiler bug related to `cuda_status_t` (0a0955e)
- Fix: Fingerprinting in CUDA (52b1d73)
- Fix: Estimating hash counts in fingerprints (058af71)
- Improve: Unroll & parallelize fingerprinting (cda36fd)
- Fix: Inferring the prong type of executors (531b1e9)
- Improve: Align thread-pool within stack-frame (b1077a4)
- Improve: Wording inconsistencies (afaf11b)
- Make: Launchers for Parallel C++ benchmarks (9f3beac)
- Fix: Fingerprinting via Skylake extensions (08c1e86)
- Fix: Passing fingerprinting builds (44a058d)
- Improve: Include hash counts in fingerprints (0d9ba5b)
- Improve: Consistent kernel naming without underscore prefixes (05725b2)
- Improve: Expose floating-point SIMD states (4a57789)
- Fix: Consistent `barrett_mod` in C++ & Python (1fc1cab)
- Docs: Using `uv` for tests (f86dfaa)
- Fix: Choosing co-primes with `std::gcd` (a1b3001)
- Improve: Using fast calling convention for CPython (97ab23c)
- Docs: Show higher recall with better hashes (255d443)
- Improve: Ensure `seed` affects hashes (78d39f9)
- Improve: Separate StringZillas Python code (883a3cd)
- Fix: Fingerprinting compilation (19f92c4)
- Improve: Explore Min-Hashes (46dd7d0)
- Improve: Test fingerprint equivalence (e4aa3f7)
- Fix: `is_same_type` usage over `std::is_same` (0ab5710)
- Improve: Ignore previous UB commit in blame (7fc7323)
- Fix: Avoid UB with underscore prefixes (74e3b6f)
- Fix: `sz_bitcast` strict aliasing (80b97de)
- Fix: Avoid `std::swap` in device code (c0aea26)
- Fix: C++17 compatibility issues (7ea685d)
- Fix: Guard C++20 concepts use (60763b3)
- Fix: Backport `std::remove_cvref` to C++17 (ffee12b)
- Improve: Move `safe_vector` (78d8c96)
- Fix: Limit `constexpr` use in C++11 (866e2f2)
- Fix: Minor build issues (46a6d63)
- Improve: Extend dummy executors API (498d72a)
- Fix: Wrong Fork Union class name (5def3af)
- Improve: Compile-time-known `span` extents (8355b6e)
- Improve: Move `arrays_equality` (e96d26f)
- Improve: Merge fingerprinting drafts (cf6077e)
- Make: Deprecate Find Many kernels (a6f799f)
- Improve: Extend `find_many` tests (c80ce60)
- Improve: Upgrade Fork Union (fb5f429)
- Docs: Similar wording in "Explore Levenshtein" (766e250)
- Fix: Correct namespaces for scripts (aafcbbd)
- Fix: Replace `+g` with `+m,r` like GB (40bd3ed)
- Fix: Wrong boundary conditions for `count_many_parallel` (0b22dd9)
- Improve: Switch to "StringParaZilla" naming (0f3f928)
- Improve: Cleaner haystack splitting (3e93b75)
- Improve: Prioritize find-many tests (7a433a9)
- Improve: `SZ_DYNAMIC` attributes (2f0334a)
- Fix: Forwarding dataset `nothrow`-copyable views in tests (170b61b)
- Improve: Use CUDA Atomics to aggregate globally (cf7d9a5)
- Make: Missing `bench_search` target (1133cce)
- Fix: `find_many.cuh` compilation issues (72affc2)
- Fix: AC dictionary `try_assign` with different alloc (4f9d2db)
- Fix: Warning around immutable span conversion (d88842a)
- Docs: Target names (503e470)
- Fix: Match C++ class names (966dd09)
- Docs: Sections & links (f5a94b1)
- Make: Bump `fork_union` (3f5de03)
- Fix: Revert to separate `find_many` algos for different length (0acbeeb)
- Improve: `__reduce_max_sync` in SW on Hopper (c45017d)
- Fix: `unified_alloc` propagation (13e0201)
- Docs: Move segmenting & features drafts (3972cb3)
- Fix: Compiler over-optimizing `bench_find_many` (16f5fa4)
- Fix: Prefix length in parallel counting of short needles (69c7b6a)
- Fix: Including the entire haystack into match (3ee1676)
- Fix: Buffer overflow due to wrong thread count (887c16b)
- Improve: Skip vocabulary duplicates (5302a4d)
- Improve: New multi-pattern search APIs (20c9135)
- Fix: Missing `std::generate` include (853a0fa)
- Improve: Multi-byte characters support (85c5bf8)
- Improve: Propagate allocators in `safe_vector` (0c442d1)
- Improve: Custom validators for nullary benchmarks (0cf994a)
- Improve: Benchmark early exit (aaa6927)
- Make: Rename `bench_search` -> `bench_find` (6f65624)
- Docs: Aho-Corasick CUDA design (496e55a)
- Improve: New caching primitives (df1f7ef)
- Fix: Checking `STRINGWARS_STRESS` env-var (ade74f6)
- Improve: Support executors in multi-pattern search (35fad3d)
- Improve: Divergent branches on `i16` SW on Hopper (817b15c)
- Fix: OOB impact on SW scoring (cd9ff1e)
- Fix: NW & SW on Hopper (4134e44)
- Improve: More noticeable signaling in tests (ece416d)
- Improve: Scheduling speculative kernels (8a6a185)
- Improve: Run multiple warps per block (3b2f263)
- Fix: Comparing Affine benchmark results (64d8f4d)
- Improve: Fuzzy test Ice Lake kernels (d603f7d)
- Improve: Shrink Affine Ice Lake kernels (79e7a2f)
- Fix: Affine top row initialization (b9e4160)
- Fix: Levenshtein w. Affine costs on GPU for zero-length ins (076e58a)
- Improve: Test weird affine gaps (3bbebc8)
- Improve: Match `fork_union` API in executors (1d6f58d)
- Improve: Use `fork_union` pools (81463d4)
- Docs: Inconsistent naming (8b62f2c)
- Make: Add `fork_union` dependency (bd3d341)
- Improve: Duplicate `bytesum` assignment (bfcd10e)
- Make: Bump CUDA version (c709621)
- Fix: Ice Lake calls for empty inputs (f59ba5e)
- Fix: Correcting blends on Kepler (563a73d)
- Improve: Scheduling Levenshtein in CUDA (77b1087)
- Improve: Avoid `views::group_by` with callback (6a61e1b)
- Improve: `bytes_per_cell_t` enum (c6e907a)
- Fix: Inconsistent timing in `bench_unary` (60b99ab)
- Improve: Bounded methods not supported (23a7f58)
- Improve: Use `requires` clause (22c691e)
- Improve: Naming "executor" interfaces (7e778d9)
- Improve: Generalize Levenshtein in CUDA (29da524)
- Fix: Inclusion guard macro names (8e79fb7)
- Docs: More datasets on HuggingFace (0b33ac3)
- Improve: Aligned ZMM diagonal stores (10b5405)
- Improve: Branchless K-mask calculation (3998c1f)
- Improve: Measure gap magnitude (b4eb6a4)
- Fix: Avoid horizontal walker overflow (9b2b4e5)
- Fix: All Gotoh baselines (84397ae)
- Fix: Initializing affine DP matrices (640853f)
- Improve: Differentiate unary & uniform costs (8c15447)
- Fix: Horizontal Affine Walkers (e5d85f3)
- Improve: Alloc type-size check in `safe_vector` (9c5a56c)
- Improve: Fetch warp-size dynamically (83dd5fb)
- Improve: Warp-shuffle reductions in SW (914e98d)
- Fix: Over-estimating number of overlapping matches (b85d3ca)
- Improve: Faster multi-needle tests (0895d28)
- Improve: Splitting jobs in baseline multi-search (d47d849)
- Fix: Slicing corner-cases in OpenMP (7bcc803)
- Improve: Use `std::execution` for baseline tests (f228264)
- Improve: Parallel baseline for substring search (6ebc7b0)
- Fix: `bytes_per_core_optimal` estimate (83bc966)
- Improve: Pointer-constructible spans (a0fd136)
- Fix: Passing multi-needle tests (faad971)
- Fix: Calculating `find_many_match_t` properties (d0ebee8)
- Fix: Indexing needle IDs (42fa08d)
- Fix: Use smaller types in BFS queue (0c6dd00)
- Fix: Aho Corasick construction (e27f86f)
- Improve: Consistent shuffling behavior in benchmarks (d4d55fa)
- Fix: `sz_copy_skylake` tail handling on large input (#222) (6da5e1e)
- Fix: Propagate substitutions to benchmarks (eabe605)
- Fix: Underutilized 99% of the H100 (7a9a243)
- Fix: Using OpenMP directives (41e1a6e)
- Fix: Forward GPU specs in CUDA tests (9d86d4c)
- Improve: Generalize memory requirements estimates (7bd92c5)
- Fix: Missing STL includes (f7d365a)
- Improve: Use CUDA constant memory (cebd180)
- Improve: Forward-declare substituters (0a04960)
- Improve: Show signed integers in SIMD types (382c05d)
- Fix: `i16` Ice Lake NW/SW alignment (8cc9794)
- Improve: Catch & log exceptions (cb739e2)
- Improve: Better UTF8 tests for similarity (b04e934)
- Make: Parallel test launchers (1ece547)
- Fix: Build issues (9db0287)
- Fix: Included filename (ee38a22)
- Fix: Uniform costs for UTF-32 runes (6edcf7f)
- Improve: New template SFINAE in `similarity.hpp` (10279f9)
- Fix: Initializing horizontal aligner (91df0ce)
- Improve: Use tagged C `enum`s (35ba76c)
- Improve: Allow custom validators in benchmarks (bd2a21d)
- Fix: Compiling `constant_iterator` in CUDA (bc59ee3)
- Fix: `std::allocator::rebind` deprecated (4c87404)
- Fix: Avoid `std::iterator` dependency (b3db596)
- Make: Move similarity benchmarks (90bfcb6)
- Fix: Report error codes in tests (05f17a6)
- Make: Separate Parallel C++ and CUDA tests (b175103)
- Make: Rename test files (2eeab83)
- Fix: Passing CUDA similarity tests (4c75d81)
- Fix: NW/SW test correspondence (fea39ed)
- Improve: Differentiate min/max-imizers (c96c5ed)
- Improve: Annotate throwing exceptions (2ef667c)
- Fix: Missing `sz_i16_t` definition (238c86d)
- Fix: Avoid similarity scoring references (e4b1bbd)
- Improve: Shorter type aliases (3ef1d26)
- Make: Revert to C++ for core tests (0c6ff1f)
- Docs: StringCuZilla design choices (f42aa85)
- Make: Separate StringCuZilla (c5fd4bc)
- Improve: Move capabilities to `types.h` (1c1582f)
- Make: Compile with OpenMP (1671b0f)
- Improve: Allow datasets in VRAM (b1c9a74)
- Fix: Shuffle datasets with over 4B tokens (247c6ec)
- Fix: Overflow `mean_token_length` calculation (efcadd1)
- Make: NVCC kernel debugging symbols (0c0ff42)
- Fix: Arrow-like string array (fa4b0f4)
- Fix: Overwriting alignment scores (668a386)
- Fix: Synchronizing CUDA kernel launch (3b85a00)
- Fix: Shared memory requirements (27ad8f5)
- Make: NVCC can't handle `fsanitize` (ea7647f)
- Make: Draft CUDA compilation (1b96ef4)
- Fix: Hardening `malloc(0)` behavior (7524882)
- Improve: Share C++ macros and typedefs (e82d045)
- Fix: Accounting for different gap costs (e4e517f)
- Fix: NVCC warning for negative size field (a6c0fa2)
- Fix: Track capacity in fixed buffer alloc (64b40a9)
- Fix: Sign-cast warning in `_mm256_set1_epi64x` (aac2e8f)
- Fix: Overriding `SZ_DEBUG` macro (bca734a)
- Fix: Calling unused helper struct unit tests (d369170)
- Improve: Cleaner API for OpenMP (bc311b3)
- Fix: Shifting Levenshtein diagonals in OpenMP (6b5ef98)
- Fix: Unaligned loads/stores of hash state (37863c9)
- Improve: Expose rune-parsing headers (2727a87)
- Fix: Diagonals depth (ef53f75)
- Docs: Showcase indexing diagonals (b55c696)
- Fix: `sz::lookup` examples in Rs (778d4f0)
- Fix: Compiling SVE on MacOS (42270c8)
- Improve: Inline cheap calls (28282d2)
- Make: List `scripts/` deps for `uv` (1907d2b)
- Docs: Formatting (dd57536)
- Docs: `uv` instructions (9460fd4)
- Fix: Unaligned `sz_hash_state_t` stores (7e65a1e)
- Improve: Align inner hash-states (1b3cdd5)
- Improve: Use `GiB` over `GB` (811fc59)
- Improve: Construct `Byteset::from_bytes` (34660f2)
- Fix: Remove missing Ice-Lake benchs (0c08564)
- Fix: Compiling Py bindings (7e08180)
- Make: CMake formatting (44485fb)
- Docs: Formatting and references (928fd79)
- Fix: No return (4cb096b)
- Docs: Listing bench details (343b858)
- Make: Patch passing `SZ_USE_SVE` definitions (45b15b0)
- Fix: Expanding feature-detecting macros (75ef77e)
- Improve: Bold benchmark names in CLI (2ab635e)
- Improve: `fill_random` checksums in benchmarks (0bba772)
- Improve: Simpler SVE find nested loop (a1604b7)
- Improve: Unrolled serial hashing (77e482e)
- Fix: `find_sve` mask update on long needles (a493ab8)
- Improve: More simple substring search tests (68449fd)
- Fix: `bench_sequence` CMake target (905749c)
- Fix: Drop double negation in logging (3d77ec6)
- Improve: Use `SQINCP` in SVE for increments (efda23b)
- Fix: Dispatching SVE kernels (06ea5f7)
- Docs: SVE2 intersects TODO (fafb8b0)
- Improve: `do_not_optimize` token-level results (d1a3779)
- Make: Rename `bench_sort` (991a78b)
- Improve: Naming benchmark names (0074ad7)
- Improve: Better sorting benchmarks (7d534fb)
- Fix: Computing improvement percent (a5de795)
- Improve: Faster equality checks on NEON/SVE (20f35c7)
- Improve: New token-level benchmarks (12e1edd)
- Docs: Describe trivial types (af686dd)
- Fix: Naming byteset signature (9676cdb)
- Improve: Naming "vtable" entries (b9794e5)
- Make: Upgrade to C++20 for benchmarks (aa7f275)
- Improve: New-style "container" benchmarks (3f1c723)
- Fix: Reverse order `std::search` offsets (244e605)
- Docs: Ignore formatting CMake (366816e)
- Make: Formatting CMakeLists.txt (467b4b8)
- Fix: Extra comma in `printf` (298d214)
- Docs: Outdated function naming & spelling (92b9a56)
- Improve: Token benchmarks (3b1897e)
- Improve: Logging in container benchmarks (4d955d3)
- Fix: No intersect for Skylake (48d70ea)
- Fix: Revert to XMM on Haswell (4bec1e5)
- Fix: Composing STL collections (f9da4ed)
- Fix: `std::string::data` is mutable only since C++17 (ff23c3d)
- Improve: Discard state in streaming hash (828263f)
- Improve: Discarding buffer in streaming hashes (c4f7a0e)
- Improve: Separate PRNG backends in benchmarks (af54e93)
- Fix: Guard Skylake benchmarks (2965502)
- Fix: Unused `_sz_capabilities` symbols (8bb90e5)
- Fix: `sz_intersect` signature (f712de3)
- Make: Don't build `stringzilla_bare` on MacOS (a7b35ba)
- Fix: Variable in C++14 `constexpr` (feb415f)
- Fix: Unused Levenshtein tests (90540d3)
- Fix: `find_1byte` signature compatibility (d19e8b8)
- Improve: Fix minor inconsistencies (f656577)
- Docs: Exploring perfect Unicode hashing (197cd87)
- Improve: Test set intersections (1d95601)
- Fix: Randomization benchmarks (8dc4a2c)
- Docs: Formatting (5c02c4e)
- Docs: Details on the Unicode range (b6e4406)
- Docs: Ignore C++ docstring updates blame (407dd2d)
- Docs: New formatting in C++ (0d982a4)
- Fix: Passing `sz_sequence_t::handle` (75fabf1)
- Improve: Remove redundant comments from sz_hash_state functions in Rust (9fe25df)
- Improve: Expose sz_lookup (471b002)
- Improve: Expose sz_hash_state_init, sz_hash_state_stream, and sz_hash_state_fold to Rust (1757e4e)
- Improve: Exposed sz_move, sz_fill, and sz_copy for Rust (b2085cc)
- Improve: Inline most common Rust APIs (a30b5b7)
- Make: `cibuildwheel` env variables (fbf256a)
- Make: Decremental Rust builds (8877c82)
- Fix: Detecting caps in dynamic builds (d52bf63)
- Fix: `fill_random` test condition (8b396c8)
- Fix: Compilation of all bindings (2caefac)
- Make: Drop unused `build.sh` (2bbafa1)
- Improve: Testing hash functions (80688bb)
- Fix: Passing new hashing tests (268af53)
- Improve: `copy`/`move` on Haswell with interleaving (69dfa10)
- Docs: Announce JOINs (2225488)
- Improve: Ordering includes (6e71536)
- Improve: Vectorize `sz_equal_haswell` (7aad4bb)
- Docs: Explaining `compare.h` operations (d7bab8d)
- Improve: Clean `memory.h` header (7698392)
- Improve: Use default allocator, when not provided (5a12c00)
- Docs: Disable sorting includes (8bc161f)
- Fix: Ice Lake partitioning logic (1da0e2b)
- Improve: Expose Insertion-sort helpers (a38867f)
- Fix: Merge-step bug in stable sort (db61d93)
- Improve: Introduce typed `_sz_swap` macro (dcf6c65)
- Improve: Rename `sz_sort` to `sz_qsort` (6191cc6)
- Fix: `sz_sort_serial` passes tests (8bad799)
- Fix: `uniform_int_distribution` lower bound (bdee111)
- Fix: `sz_sort_serial` passes for same length inputs (0fda5a5)
- Improve: Drop hybrid sort code (50d8291)
- Fix: Underflow in serial sorting (5970fa4)
- Make: Recommend pretty-printing GDB symbols (a818f97)
- Fix: `uniform_int_distribution` upper bound (17f28a3)
- Fix: In C++11 `constexpr` constructor must be empty (13bace2)
- Fix: Sorting benchmarks for new API (66f2ac9)
- Improve: Separate fingerprinting benchmarks (187e0bd)
- Make: Renamed temp-git-split-file -> scripts/bench_token.cpp (031bedf)
- Make: Renamed scripts/bench_token.cpp -> temp-git-split-file (07d2239)
- Make: Renamed scripts/bench_token.cpp -> scripts/bench_fingerprint.cpp (a0318eb)
- Docs: Signatures and typos (982dd4d)
- Improve: Wrap `std::accumulate` for checksums (bce107a)
- Improve: Validate checksums in benchmark (abe8d07)
- Fix: Tail sum order in `checksum_haswell` (b20d7cd)
- Fix: Infer allocators `value_type` (5bbd971)
- Fix: Tail handling in `sz_checksum_haswell` (84cb4c8)
- Fix: Loops in AVX-512 checksums (4044855)
- Fix: Loop in `sz_checksum_haswell` (509b58b)
- Improve: Relax many `constexpr`s from C++20 to C++14 (0a3e363)
- Make: Move drafts (1de3166)
- Make: Renamed temp-git-split-file -> include/stringzilla/hash.h (5a36cb7)
- Make: Renamed include/stringzilla/hash.h -> temp-git-split-file (7052266)
- Make: Renamed include/stringzilla/hash.h -> include/stringzilla/fingerprint.h (0ef7cf1)
- Docs: Spelling `usnigned` (d18a159)
- Improve: hybrid bench sort performance (9880f26)
- Fix: hybrid bench sorts loading initial stirng bytes incorrectly (455508f)
- Fix: stable sort bench tests failing (821d19e)
- Fix: Minor dispatch issues (d20e589)
- Improve: Faster `levenshtein_baseline` (d9557d3)
- Fix: BMI flags for `BZHI` (fa47deb)
- Fix: Masks back to using `BZHI` (bd7054e)
- Make: Library namespaced aliases (f3811d7)
- Fix: `sz_u512_vec_t` members
visibility (2007d49) - Fix: Bounded Levenshtein returns (749b0d8) - Fix: Skylake dispatch (48e0913) - Fix: Linking `stderr` (084d653) - Docs: Formatting docstring (c99daf3) - Fix: Initializing `basic_charset` (864ee03) - Fix: Correct `basic_charset` operator (#203) (e20d207) - Improve: Ignore 40 commits in blame (064829a) - Fix: Overriding LibC in 32-bit Windows (645539b) - Improve: C++ version macros naming (19c2ae9) - Make: Detect Apple Universal builds (6d61c21) - Make: Rename `stringzillite` to `stringzilla_bare` (364e2ca) - Fix: Symbols names & visibility (406bf0f) - Fix: Haswell compilation flag (00f27f6) - Fix: Filter `compare.h` file (6512f1d) - Make: Split ./include/stringzilla/find.h to ./include/stringzilla/compare.h (fc9e5d6) - Make: Split ./include/stringzilla/find.h to ./include/stringzilla/compare.h (49e8d9d) - Make: Split ./include/stringzilla/find.h to ./include/stringzilla/compare.h (fc408fa) - Fix: Partially filter `stringzilla.h` file (41e5917) - Fix: Minor macro mismatches (5f7ca59) - Fix: Filter `types.h` file (b835051) - Fix: Filter `sort.h` file (1ba7982) - Fix: Filter `small_string.h` file (5b55e19) - Make: Separate builds for Skylake & Ice (4a1f03c) - Improve: Platform-specific equality checks (8b44d6a) - Fix: Filter `hash.h` file (be4c63d) - Fix: Filter `similarity.h` file (8b401bd) - Fix: Filter `memory.h` file (295d49a) - Fix: Filter `find.h` file (2a1fcd1) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/memory.h (2f76521) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/memory.h (45e57ee) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/memory.h (66778d6) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/sort.h (c357c3e) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/sort.h (cbfe5c7) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/sort.h (085d2d3) - Make: Split 
./include/stringzilla/stringzilla.h to ./include/stringzilla/small_string.h (3464cb4) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/small_string.h (89c4681) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/small_string.h (3f9c248) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/similarity.h (e23c35f) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/similarity.h (10d829e) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/similarity.h (d74e5dc) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/hash.h (1f60e6d) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/hash.h (08d0a20) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/hash.h (9e9f256) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/find.h (974ed78) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/find.h (14ba3bf) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/find.h (9e577be) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/types.h (8cb0742) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/types.h (22e3d1e) - Make: Split ./include/stringzilla/stringzilla.h to ./include/stringzilla/types.h (ecb3775) - Fix: Wrong env. variable names (d0678f8) - Make: Inline ASM for detecting CPU features on ARM (0ee549a) - Fix: Default Levenshtein upper bound (62ca6a0) - Improve: Levenshtein functions for unicode (d3b423a) - Docs: Levenshtein tutorial in Jupyter (715ad10) - Fix: `sz_look_up_transform_avx512` declaration (585f7d5) - Improve: `#pragma region` dashes (fe4449b)
I've created a `Str` object wrapper for Node.js and implemented a few functions on it, with tests.
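To make the idea of a wrapper object concrete, here is a minimal sketch of the shape such a class could take. It is a pure-JavaScript stand-in, not the native binding from this PR: only the name `Str` comes from the description above, while the `find` and `count` methods and their semantics are assumptions chosen for illustration. A `Buffer` plays the role of the pointer-and-length pair that a native wrapper would hold.

```javascript
// Pure-JavaScript stand-in for a native string wrapper; NOT the StringZilla API.
// Only the class name `Str` comes from the PR text; `find` and `count` are hypothetical.
class Str {
    constructor(text) {
        // A native wrapper would keep a pointer and a length; a Buffer stands in for both.
        this.bytes = Buffer.from(String(text), 'utf8');
    }

    // Byte offset of the first occurrence of `needle`, or -1 if it is absent.
    find(needle) {
        return this.bytes.indexOf(Buffer.from(String(needle), 'utf8'));
    }

    // Number of non-overlapping occurrences of `needle`; an empty needle counts as 0.
    count(needle) {
        const pattern = Buffer.from(String(needle), 'utf8');
        if (pattern.length === 0) return 0;
        let total = 0;
        let from = this.bytes.indexOf(pattern);
        while (from !== -1) {
            total += 1;
            // Resume the search right after the previous match.
            from = this.bytes.indexOf(pattern, from + pattern.length);
        }
        return total;
    }
}

const haystack = new Str('hello world, hello node');
console.log(haystack.find('world'));   // 6
console.log(haystack.count('hello'));  // 2
console.log(haystack.find('missing')); // -1
```

Offsets here are byte offsets into the UTF-8 encoding, which may differ from JavaScript string indices for non-ASCII input; whether the real wrapper reports bytes or code units is not stated in this PR.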