Thank you for coming here! It's always nice to have third-party contributors 🤗
To keep the quality of the code high, we have a set of guidelines common to all Unum projects.
Before building the first time, please pull git submodules.
That's how we bring in SimSIMD and other optional dependencies to test all of the available functionality.
git submodule update --init --recursive
Our primary C++ implementation uses CMake for builds. If this is your first experience with CMake, use the following commands to get started:
sudo apt-get update && sudo apt-get install cmake build-essential libjemalloc-dev g++-12 gcc-12 # Ubuntu
brew install libomp llvm # MacOS
Using modern syntax, this is how you build and run the test suite:
cmake -D USEARCH_BUILD_TEST_CPP=1 -D CMAKE_BUILD_TYPE=Debug -B build_debug
cmake --build build_debug --config Debug
build_debug/test_cpp
If the build mode is not specified, the default is Release.
cmake -D USEARCH_BUILD_TEST_CPP=1 -B build_release
cmake --build build_release --config Release
build_release/test_cpp
For development purposes, you may want to include debug symbols in the build:
cmake -D USEARCH_BUILD_TEST_CPP=1 -D CMAKE_BUILD_TYPE=RelWithDebInfo -B build_relwithdebinfo
cmake --build build_relwithdebinfo --config RelWithDebInfo
build_relwithdebinfo/test_cpp
The CMakeLists.txt file has a number of options you can pass:
- What to build:
  - USEARCH_BUILD_TEST_CPP - build the C++ test suite
  - USEARCH_BUILD_BENCH_CPP - build the C++ benchmark suite
  - USEARCH_BUILD_LIB_C - build the C library
  - USEARCH_BUILD_TEST_C - build the C test suite
  - USEARCH_BUILD_SQLITE - build the SQLite extension (no Windows)
- Which dependencies to use:
  - USEARCH_USE_OPENMP - use OpenMP for parallelism
  - USEARCH_USE_SIMSIMD - use SimSIMD for vectorization
  - USEARCH_USE_JEMALLOC - use Jemalloc for memory management
  - USEARCH_USE_FP16LIB - use software emulation for half-precision floating point
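Typing a long run of -D flags by hand invites typos. As a minimal sketch, assuming a POSIX shell, the helper below assembles the flags from short option names. The function name usearch_flags is hypothetical and not part of the repository, and the script only prints the configure command rather than running it.

```sh
# Hypothetical helper (not part of USearch): print `-D USEARCH_<NAME>=1`
# for every option name passed as an argument.
usearch_flags() {
    flags=""
    for name in "$@"; do
        # Append with a separating space only when something is already collected.
        flags="${flags:+$flags }-D USEARCH_${name}=1"
    done
    printf '%s\n' "$flags"
}

# Print the configure command instead of running it, so it can be reviewed first.
echo "cmake $(usearch_flags BUILD_TEST_CPP USE_SIMSIMD) -B build_release"
# prints: cmake -D USEARCH_BUILD_TEST_CPP=1 -D USEARCH_USE_SIMSIMD=1 -B build_release
```

The helper only covers options that are switched on. Disabling one, such as USEARCH_BUILD_SQLITE=0, still has to be written out explicitly.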
Putting all of this together, compiling all targets on most platforms should work with the following snippet:
cmake -D CMAKE_BUILD_TYPE=Release -D USEARCH_USE_FP16LIB=1 -D USEARCH_USE_OPENMP=1 -D USEARCH_USE_SIMSIMD=1 -D USEARCH_USE_JEMALLOC=1 -D USEARCH_BUILD_TEST_CPP=1 -D USEARCH_BUILD_BENCH_CPP=1 -D USEARCH_BUILD_LIB_C=1 -D USEARCH_BUILD_TEST_C=1 -D USEARCH_BUILD_SQLITE=0 -B build_release
cmake --build build_release --config Release
build_release/test_cpp
build_release/test_c
Similarly, to use the most recent Clang compiler version from HomeBrew on MacOS:
brew install llvm libomp cmake # the llvm formula provides clang and clang++
cmake \
-D CMAKE_BUILD_TYPE=Release \
-D CMAKE_C_COMPILER="$(brew --prefix llvm)/bin/clang" \
-D CMAKE_CXX_COMPILER="$(brew --prefix llvm)/bin/clang++" \
-D USEARCH_USE_FP16LIB=1 \
-D USEARCH_USE_OPENMP=1 \
-D USEARCH_USE_SIMSIMD=1 \
-D USEARCH_USE_JEMALLOC=1 \
-D USEARCH_BUILD_TEST_CPP=1 \
-D USEARCH_BUILD_BENCH_CPP=1 \
-D USEARCH_BUILD_LIB_C=1 \
-D USEARCH_BUILD_TEST_C=1 \
-B build_release
cmake --build build_release --config Release
build_release/test_cpp
build_release/test_c
Linting:
cppcheck --enable=all --force --suppress=cstyleCast --suppress=unusedFunction \
include/usearch/index.hpp \
include/usearch/index_dense.hpp \
include/usearch/index_plugins.hpp
When debugging the code in GDB, I'd recommend setting the following breakpoints:
- __asan::ReportGenericError - to detect illegal memory accesses.
- __ubsan::ScopedReport::~ScopedReport - to catch undefined behavior.
- __GI_exit - to stop at exit points, the end of running any executable.
- __builtin_unreachable - to catch all the places where the code is expected to be unreachable.
- __usearch_raise_runtime_error - for USearch-specific assertions.
Unlike GCC, LLVM handles cross compilation very easily.
You just need to pass the right TARGET_ARCH and BUILD_ARCH to CMake.
The Debian and Ubuntu cross-toolchain packages you can install for the target include:
- crossbuild-essential-amd64 for 64-bit x86
- crossbuild-essential-arm64 for 64-bit Arm
- crossbuild-essential-armhf for 32-bit ARM hard-float
- crossbuild-essential-armel for 32-bit ARM soft-float (emulates float)
- crossbuild-essential-riscv64 for RISC-V
- crossbuild-essential-powerpc for PowerPC
- crossbuild-essential-s390x for IBM Z
- crossbuild-essential-mips for MIPS
- crossbuild-essential-ppc64el for PowerPC 64-bit little-endian
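Each package suffix is a Debian architecture name, while the compiler target is a GNU triple. The sketch below maps one to the other for a subset of the packages listed here. The function name debian_arch_to_triple is hypothetical, and the triples are the standard Debian multiarch names rather than anything defined by USearch.

```sh
# Hypothetical helper: map a Debian architecture name (the suffix of a
# crossbuild-essential-* package) to the matching GNU target triple.
debian_arch_to_triple() {
    case "$1" in
        amd64)   echo "x86_64-linux-gnu" ;;
        arm64)   echo "aarch64-linux-gnu" ;;
        armhf)   echo "arm-linux-gnueabihf" ;;
        armel)   echo "arm-linux-gnueabi" ;;
        riscv64) echo "riscv64-linux-gnu" ;;
        ppc64el) echo "powerpc64le-linux-gnu" ;;
        s390x)   echo "s390x-linux-gnu" ;;
        *)       echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}
```

For example, `export TARGET_ARCH="$(debian_arch_to_triple arm64)"` sets TARGET_ARCH to aarch64-linux-gnu.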
Here is an example for cross-compiling for Arm64 on an x86_64 machine:
sudo apt-get update
sudo apt-get install -y clang lld make crossbuild-essential-arm64 crossbuild-essential-armhf
export CC="clang"
export CXX="clang++"
export AR="llvm-ar"
export NM="llvm-nm"
export RANLIB="llvm-ranlib"
export TARGET_ARCH="aarch64-linux-gnu" # Or "x86_64-linux-gnu"
export BUILD_ARCH="arm64" # Or "amd64"
cmake -D CMAKE_BUILD_TYPE=Release \
-D CMAKE_C_COMPILER_TARGET=${TARGET_ARCH} \
-D CMAKE_CXX_COMPILER_TARGET=${TARGET_ARCH} \
-D CMAKE_SYSTEM_NAME=Linux \
-D CMAKE_SYSTEM_PROCESSOR=${BUILD_ARCH} \
-B build_artifacts
cmake --build build_artifacts --config Release
Python bindings are built using PyBind11 and are available on PyPI.
The compilation settings are controlled by the setup.py and are independent from CMake used for C/C++ builds.
To install USearch locally:
pip install -e .
For testing, USearch uses PyTest, which is pre-configured in pyproject.toml.
The following options are enabled:
- The -s option disables capturing of logs.
- The -x option exits after the first failure to simplify debugging.
- The -p no:warnings option disables the warnings plugin, so warnings are not reported.
pip install pytest pytest-repeat # for repeated fuzzy tests
pytest # if you trust the default settings
pytest python/scripts/ -s -x -p no:warnings # to override the default settings
Linting:
pip install ruff
ruff --format=github --select=E9,F63,F7,F82 --target-version=py37 python
Before merging your changes, you may want to test them against the entire matrix of Python versions USearch supports.
For that you need cibuildwheel, which is tricky to use on MacOS and Windows, as it would target just the local environment.
Still, if you have Docker running on any desktop OS, you can use it to build and test the Python bindings for all Python versions for Linux:
pip install cibuildwheel
cibuildwheel
cibuildwheel --platform linux # works on any OS and builds all Linux backends
cibuildwheel --platform linux --archs x86_64 # 64-bit x86, the most common on desktop and servers
cibuildwheel --platform linux --archs aarch64 # 64-bit Arm for mobile devices, Apple M-series, and AWS Graviton
cibuildwheel --platform macos # works only on MacOS
cibuildwheel --platform windows # works only on Windows
You may need root privileges for multi-architecture builds:
sudo $(which cibuildwheel) --platform linux
On Windows and MacOS, to avoid frequent path resolution issues, you may want to use:
python -m cibuildwheel --platform windows
USearch provides NAPI bindings for NodeJS available on NPM.
The compilation settings are controlled by the binding.gyp and are independent from CMake used for C/C++ builds.
If you don't have NPM installed, first install the Node Version Manager:
wget -qO- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
nvm install 20
Testing:
npm install -g typescript
npm install
npm run build-js
npm test
To compile for AWS Lambda, you'd need to recompile the binding. You can test the setup locally, overriding some of the compilation variables in a Docker image:
FROM public.ecr.aws/lambda/nodejs:18-x86_64
RUN npm init -y
RUN yum install tar git python3 cmake gcc-c++ -y && yum groupinstall "Development Tools" -y
# Assuming AWS Linux 2 uses old compilers:
ENV USEARCH_USE_FP16LIB 1
ENV USEARCH_USE_SIMSIMD 1
ENV SIMSIMD_TARGET_HASWELL 1
ENV SIMSIMD_TARGET_SKYLAKE 0
ENV SIMSIMD_TARGET_ICE 0
ENV SIMSIMD_TARGET_SAPPHIRE 0
ENV SIMSIMD_TARGET_NEON 1
ENV SIMSIMD_TARGET_SVE 0
# For specific PR:
# RUN npm install --build-from-source unum-cloud/usearch#pull/302/head
# For specific version:
# RUN npm install --build-from-source usearch@2.8.8
RUN npm install --build-from-source usearch
To compile to WebAssembly, make sure you have emscripten installed and run the following script:
emcmake cmake -B build -DCMAKE_CXX_FLAGS="${CMAKE_CXX_FLAGS} -s TOTAL_MEMORY=64MB" && emmake make -C build
node build/usearch.test.js
If you don't yet have emcmake installed:
git clone https://github.com/emscripten-core/emsdk.git && ./emsdk/emsdk install latest && ./emsdk/emsdk activate latest && source ./emsdk/emsdk_env.sh
USearch provides Rust bindings available on Crates.io.
The compilation settings are controlled by the build.rs and are independent from CMake used for C/C++ builds.
cargo test -p usearch -- --nocapture --test-threads=1
Publishing the crate is a bit more complicated than usual. If you simply pull the repository with submodules and run the following command, it will list fewer files than expected:
cargo package --list --allow-dirty
The reason for that is the heuristic that Cargo uses to determine the files to include in the package.
Regardless of whether exclude or include is specified, the following files are always excluded: Any sub-packages will be skipped (any subdirectory that contains a Cargo.toml file).
Assuming both SimSIMD and StringZilla contain their own Cargo.toml files, we need to temporarily exclude them from the package.
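If a step fails halfway through, the manual rename-and-restore steps in this section can leave the .bak files behind. As a more defensive sketch, the wrapper below hides the nested manifests, runs one command, and restores them whether that command succeeded or failed. The names with_hidden_manifests and MANIFEST_DIRS are hypothetical, and the directory list assumes the simsimd and stringzilla layout described here.

```sh
# Directories whose nested Cargo.toml should be hidden (space-separated,
# paths without spaces); override if your checkout is laid out differently.
MANIFEST_DIRS="${MANIFEST_DIRS:-simsimd stringzilla}"

# Hypothetical wrapper: rename the nested manifests, run the given command,
# then rename them back and return the command's exit status.
with_hidden_manifests() {
    for d in $MANIFEST_DIRS; do
        if [ -f "$d/Cargo.toml" ]; then mv "$d/Cargo.toml" "$d/Cargo.toml.bak"; fi
    done
    status=0
    "$@" || status=$?
    for d in $MANIFEST_DIRS; do
        if [ -f "$d/Cargo.toml.bak" ]; then mv "$d/Cargo.toml.bak" "$d/Cargo.toml"; fi
    done
    return "$status"
}
```

A possible use is `with_hidden_manifests cargo package --list --allow-dirty`, followed by `with_hidden_manifests cargo publish`. The restore step does not run if the shell itself is killed, so check for leftover .bak files after an interrupted run.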
mv simsimd/Cargo.toml simsimd/Cargo.toml.bak
mv stringzilla/Cargo.toml stringzilla/Cargo.toml.bak
cargo package --list --allow-dirty
cargo publish
# Revert back
mv simsimd/Cargo.toml.bak simsimd/Cargo.toml
mv stringzilla/Cargo.toml.bak stringzilla/Cargo.toml
USearch provides both Objective-C and Swift bindings through the Swift Package Manager.
The compilation settings are controlled by the Package.swift and are independent from CMake used for C/C++ builds.
swift build && swift test -v
These bindings depend on Apple's Foundation library and can only run on Apple devices.
Swift formatting is enforced with Apple's swift-format utility.
To install and run it on all the files in the project, use the following command:
brew install swift-format
swift-format . -i -r
The style is controlled by the .swift-format JSON file in the root of the repository.
As there is no standard for Swift formatting, even Apple's own swift-format tool and Xcode differ in their formatting rules, and available settings.
USearch provides GoLang bindings, which depend on the C library that must be installed beforehand. So one should first compile the C library, link it with GoLang, and only then run tests.
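The commands in this section copy libusearch_c.so, or the .dylib variant on MacOS. As a small sketch, the file name can be picked from the kernel name that `uname -s` reports. The function name usearch_c_lib_name is hypothetical, and only the two platforms mentioned here are covered.

```sh
# Hypothetical helper: choose the C library file name for a kernel name.
# With no argument it falls back to the current machine's `uname -s`.
usearch_c_lib_name() {
    os="${1:-$(uname -s)}"
    case "$os" in
        Linux)  echo "libusearch_c.so" ;;
        Darwin) echo "libusearch_c.dylib" ;;
        *)      echo "unsupported platform: $os" >&2; return 1 ;;
    esac
}
```

With it, the copy step could be written as `cp "build_release/$(usearch_c_lib_name)" golang/`.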
cmake -B build_release -D USEARCH_BUILD_LIB_C=1 -D USEARCH_BUILD_TEST_C=1 -D USEARCH_USE_OPENMP=1 -D USEARCH_USE_SIMSIMD=1
cmake --build build_release --config Release -j
cp build_release/libusearch_c.so golang/ # or .dylib to install the library on MacOS
cp c/usearch.h golang/ # to make the header available to GoLang
cd golang && LD_LIBRARY_PATH=. go test -v ; cd ..
USearch provides Java bindings available from the GitHub Maven registry and the Sonatype Maven Central Repository.
The compilation settings are controlled by the build.gradle and are independent from CMake used for C/C++ builds.
To set up the Gradle environment:
sudo apt-get install zip
curl -s "https://get.sdkman.io" | bash
sdk install java
sdk install gradle
Afterwards, in a new terminal:
gradle clean build
gradle test
Alternatively, to run the Index.main:
java -cp "$(pwd)/build/classes/java/main" -Djava.library.path="$(pwd)/build/libs/usearch/shared" java/cloud/unum/usearch/Index.java
Or step by step:
cd java/cloud/unum/usearch
javac -h . Index.java NativeUtils.java
# Ensure JAVA_HOME system environment variable has been set
# e.g. export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
# Ubuntu:
g++ -c -fPIC -I${JAVA_HOME}/include -I${JAVA_HOME}/include/linux -I../../../../include cloud_unum_usearch_Index.cpp -o cloud_unum_usearch_Index.o
g++ -shared -fPIC -o libusearch.so cloud_unum_usearch_Index.o -lc
# Windows
g++ -c -I%JAVA_HOME%\include -I%JAVA_HOME%\include\win32 cloud_unum_usearch_Index.cpp -I..\..\..\..\include -o cloud_unum_usearch_Index.o
g++ -shared -o USearchJNI.dll cloud_unum_usearch_Index.o -Wl,--add-stdcall-alias
# MacOS
g++ -std=c++11 -c -fPIC \
-I../../../../include \
-I../../../../fp16/include \
-I../../../../simsimd/include \
-I${JAVA_HOME}/include -I${JAVA_HOME}/include/darwin cloud_unum_usearch_Index.cpp -o cloud_unum_usearch_Index.o
g++ -dynamiclib -o libusearch.dylib cloud_unum_usearch_Index.o -lc
# Run linking to that directory
cd ../../../..
cp cloud/unum/usearch/libusearch.* .
java -cp . -Djava.library.path="$(pwd)" cloud.unum.usearch.Index
Set up the .NET environment:
dotnet nuget add source https://api.nuget.org/v3/index.json -n nuget.org
USearch provides CSharp bindings, which depend on the C library that must be installed beforehand. So one should first compile the C library, link it with CSharp, and only then run tests.
cmake -B build_artifacts -D USEARCH_BUILD_LIB_C=1 -D USEARCH_BUILD_TEST_C=1 -D USEARCH_USE_OPENMP=1 -D USEARCH_USE_SIMSIMD=1
cmake --build build_artifacts --config Release -j
Then, on Windows, copy the library to the CSharp project and run the tests:
mkdir -p ".\csharp\lib\runtimes\win-x64\native"
cp ".\build_artifacts\libusearch_c.dll" ".\csharp\lib\runtimes\win-x64\native"
cd csharp
dotnet test -c Debug --logger "console;verbosity=detailed"
dotnet test -c Release
On Linux, the process is similar:
mkdir -p "csharp/lib/runtimes/linux-x64/native" # for x86
cp "build_artifacts/libusearch_c.so" "csharp/lib/runtimes/linux-x64/native" # for x86
mkdir -p "csharp/lib/runtimes/linux-arm64/native" # for ARM
cp "build_artifacts/libusearch_c.so" "csharp/lib/runtimes/linux-arm64/native" # for ARM
cd csharp
dotnet test -c Debug --logger "console;verbosity=detailed"
dotnet test -c Release
On macOS with Arm-based chips:
mkdir -p "csharp/lib/runtimes/osx-arm64/native"
cp "build_artifacts/libusearch_c.dylib" "csharp/lib/runtimes/osx-arm64/native"
cd csharp
dotnet test -c Debug --logger "console;verbosity=detailed"
dotnet test -c Release
brew install --cask wolfram-engine
docker build -t unum/usearch . && docker run unum/usearch
For multi-architecture builds and publications:
version=$(cat VERSION)
docker buildx create --use &&
docker login &&
docker buildx build \
--platform "linux/amd64,linux/arm64" \
--build-arg version=$version \
--file Dockerfile \
--tag unum/usearch:$version \
--tag unum/usearch:latest \
--push .
export WASI_VERSION=21
export WASI_VERSION_FULL=${WASI_VERSION}.0
wget https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-${WASI_VERSION}/wasi-sdk-${WASI_VERSION_FULL}-linux.tar.gz
tar xvf wasi-sdk-${WASI_VERSION_FULL}-linux.tar.gz
After the installation, we can pass the WASI SDK to CMake as a new toolchain:
cmake -DCMAKE_TOOLCHAIN_FILE=${WASI_SDK_PATH}/share/cmake/wasi-sdk.cmake .
Extending metrics in SimSIMD:
git push --set-upstream https://github.com/ashvardanian/simsimd.git HEAD:main