This repository builds an Arm KleidiAI-enabled LLM wrapper library with a thin, backend-agnostic C++ API and optional JNI bindings. Supported backends are selected at CMake configure time: llama.cpp, onnxruntime-genai, mediapipe, and mnn.
- README.md: build options, supported frameworks/models, basic usage.
- skills/: project-local skills for repeatable workflows.
- CMakePresets.json: supported build presets.
- scripts/cmake/: CMake modules, toolchains, downloads, framework glue.
- scripts/dev/: doctor checks, version reports, debug bundle helpers.
- scripts/py/download_resources.py and scripts/py/requirements.json: deterministic download definitions.
- model_configuration_files/: model config JSONs consumed by tests and examples.
- src/cpp/Llm.cpp and src/cpp/LlmJni.cpp: top-level native and JNI entry points; common init, teardown, and benchmark integration live here.
- src/cpp/interface/: public C++ API.
- src/cpp/config/: config parsing and schema handling.
- src/cpp/log/: shared logging macros, formatting helpers, and generated build metadata reporting.
- src/cpp/chat/: prompt templating, query building, and conversation formatting helpers.
- src/cpp/frameworks/: backend integrations.
- src/cpp/benchmark/: benchmarking pipeline; LlmBench adapts the LLM API and BenchRunner owns iteration/report formatting.
- src/cpp/profiling/: optional Arm Streamline integration and timeline annotations.
- src/java/: Java/JNI surface.
- test/cpp/: Catch2 coverage for config, logging, benchmark runner, and core wrapper behavior.
- test/java/: JNI-facing tests for the Java surface.
- TROUBLESHOOTING.md: platform-specific issues and limitations.
Project-specific skills live under skills/. Use them when the task clearly matches a documented workflow. Keep AGENTS.md for durable repo rules; keep detailed procedures in the skills themselves.
When a change affects build, test, runtime behavior, CMake, scripts, or public API/configuration, run a local build and CTest before considering the change done:
cmake --preset=native -B build
cmake --build ./build
ctest --test-dir ./build --output-on-failure

If JNI is not relevant to the change, it is reasonable to iterate with -DBUILD_JNI_LIB=OFF.
Update README.md when a change affects something users or reviewers need to know how to build, configure, run, or use. Update TROUBLESHOOTING.md for platform-specific issues or new limitations.
- Use the shared logging macros in src/cpp/log/Logger.hpp: LOG_ERROR, LOG_WARN, LOG_INF, LOG_DEBUG, and LOG_BUILD_INFO.
- Prefer THROW_ERROR and THROW_INVALID_ARGUMENT for error paths that should log and throw together.
- Do not add new raw printf, fprintf, std::cout, or std::cerr calls in native code unless there is a clear file-local reason.
- Keep log messages concise and structured. Include framework/backend names and relevant paths or config values when diagnosing initialization or runtime failures.
- Route new native/JNI diagnostics through Logger.hpp so they appear consistently in CLI output and Android logcat.
- Put benchmark report and output formatting changes in src/cpp/benchmark/BenchRunner.*.
- Put benchmark adapter/runtime interaction changes in src/cpp/benchmark/LlmBench.*.
- Keep JNI-visible behavior in src/cpp/LlmJni.cpp aligned with the Java surface in src/java/com/arm/Llm.java.
- Route build/version metadata changes through src/cpp/log/BuildInfo.* and src/cpp/log/LlmBuildInfoConfig.hpp.in instead of hardcoding values in multiple places.
Configure may trigger resource downloads through scripts/cmake/download-resources.cmake and scripts/py/download_resources.py. Some models are gated on Hugging Face.
Preferred token sources:
- HF_TOKEN
- ~/.netrc entry for huggingface.co
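
The lookup described above can be sketched in a few lines of Python. This is an illustration only, not the logic in scripts/py/download_resources.py: the function name resolve_hf_token is hypothetical, and it assumes the listed order is also the priority order (HF_TOKEN first, then the netrc entry).

```python
import netrc
import os
from pathlib import Path
from typing import Optional

HF_HOST = "huggingface.co"


def resolve_hf_token(netrc_path: Optional[Path] = None) -> Optional[str]:
    """Return a Hugging Face token, or None if neither source provides one.

    Hypothetical helper: assumes HF_TOKEN takes priority over ~/.netrc.
    """
    # 1. Environment variable.
    token = os.environ.get("HF_TOKEN", "").strip()
    if token:
        return token

    # 2. Password field of the huggingface.co entry in the netrc file.
    path = netrc_path or Path.home() / ".netrc"
    try:
        hosts = netrc.netrc(str(path)).hosts
    except (FileNotFoundError, netrc.NetrcParseError):
        return None
    entry = hosts.get(HF_HOST)  # (login, account, password) or None
    if entry is None:
        return None
    password = entry[2]
    return password or None
```

A None result means a gated model download would have to proceed unauthenticated, so a caller may want to fail early with a message naming both sources.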
Do not commit anything from resources_downloaded/. Avoid introducing changes that force large re-downloads unless explicitly needed.
- Do not commit build*/, resources_downloaded/, or download.log.
- Avoid checking in large binaries or model artifacts; prefer stable URLs and sha256sum updates in scripts/py/requirements.json.
- Treat SPDX maintenance as part of the same change, not follow-up work.
- Add the standard repo SPDX header to new source/doc/script files that support comments.
- When editing an existing source/doc/script file that supports comments, ensure the SPDX header is present and the year/range is current. Edit the year only; do not change holder or license text.
- Do not inject SPDX comments into formats that would break consumers, such as JSON. Use a repo-approved sidecar or documentation approach instead.
- If python3 is unavailable, use python or py -3.
- Useful debug helpers:
  - python3 scripts/dev/llm_doctor.py --build-dir build
  - python3 scripts/dev/collect_debug_bundle.py build .
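
For the sha256sum guidance above, a local digest check can confirm that an artifact matches its pinned value before a new digest is recorded or a re-download is forced. The sketch below is a minimal example under stated assumptions: sha256_of and needs_download are hypothetical names, and the expected digest is passed in by the caller because the schema of scripts/py/requirements.json is not described here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Chunked reads keep memory use flat for multi-gigabyte model files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(path: Path, expected_sha256: str) -> bool:
    """Hypothetical helper: True if the file is missing or its digest differs."""
    if not path.is_file():
        return True
    return sha256_of(path) != expected_sha256.strip().lower()
```

Comparing digests rather than file sizes or timestamps avoids both silent corruption and unnecessary large re-downloads when a file is already correct.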