
Added Lora + Fixed Build-All-Work-Flow #389

Merged
shubhammalhotra28 merged 5 commits into RunanywhereAI:dev from Siddhesh2377:main
Feb 21, 2026

Conversation

Collaborator

@Siddhesh2377 Siddhesh2377 commented Feb 19, 2026

Description

Brief description of the changes made.

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Refactoring

Testing

  • Lint passes locally
  • Added/updated tests for changes

Platform-Specific Testing (check all that apply)

Swift SDK / iOS Sample:

  • Tested on iPhone (Simulator or Device)
  • Tested on iPad / Tablet
  • Tested on Mac (macOS target)

Kotlin SDK / Android Sample:

  • Tested on Android Phone (Emulator or Device)
  • Tested on Android Tablet

Flutter SDK / Flutter Sample:

  • Tested on iOS
  • Tested on Android

React Native SDK / React Native Sample:

  • Tested on iOS
  • Tested on Android

Web SDK / Web Sample:

  • Tested in Chrome (Desktop)
  • Tested in Firefox
  • Tested in Safari
  • WASM backends load (LlamaCpp + ONNX)
  • OPFS storage persistence verified (survives page refresh)
  • Settings persistence verified (localStorage)

Labels

Please add the appropriate label(s): New feature (LoRA in Kotlin SDK)

SDKs:

  • Swift SDK - Changes to Swift SDK (sdk/runanywhere-swift)
  • Kotlin SDK - Changes to Kotlin SDK (sdk/runanywhere-kotlin)
  • Flutter SDK - Changes to Flutter SDK (sdk/runanywhere-flutter)
  • React Native SDK - Changes to React Native SDK (sdk/runanywhere-react-native)
  • Web SDK - Changes to Web SDK (sdk/runanywhere-web)
  • Commons - Changes to shared native code (sdk/runanywhere-commons)

Sample Apps:

  • iOS Sample - Changes to iOS example app (examples/ios)
  • Android Sample - Changes to Android example app (examples/android)
  • Flutter Sample - Changes to Flutter example app (examples/flutter)
  • React Native Sample - Changes to React Native example app (examples/react-native)
  • Web Sample - Changes to Web example app (examples/web)

Checklist

  • Code follows project style guidelines
  • Self-review completed
  • Documentation updated (if needed)

Screenshots

Attach relevant UI screenshots for changes (if applicable):

  • Mobile (Phone)
  • Tablet / iPad
  • Desktop / Mac

Important

Added LoRA adapter support in Kotlin SDK and updated build workflow for comprehensive platform verification.

  • LoRA Adapter Support:
    • Added LoRAAdapterConfig and LoRAAdapterInfo data classes in LLMTypes.kt.
    • Introduced loadLoraAdapter, removeLoraAdapter, clearLoraAdapters, and getLoadedLoraAdapters functions in RunAnywhere+LoRA.kt and RunAnywhere+LoRA.jvmAndroid.kt.
    • Implemented JNI functions in RunAnywhereBridge.kt for LoRA operations.
  • Build Workflow:
    • Updated build-all-test.yml to include platform-specific build options and artifact uploads.
    • Added new jobs for C++ Android/iOS backends, Kotlin, Swift, Web, Flutter, React Native SDKs, and Android example apps.
  • Example App:
    • Added RunAnyWhereLora Android example app demonstrating LoRA adapter usage.
    • Includes UI components for loading models and adapters, and displaying results.
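The load/remove/clear/query operations above ultimately manage a stack of adapter entries in the C++ backend. As a rough, self-contained illustration (`LoraAdapterEntry` is named in the review below, but `LoraRegistry` and these exact signatures are hypothetical stand-ins, not the SDK's real API), mutex-protected adapter bookkeeping might look like:

```cpp
#include <algorithm>
#include <mutex>
#include <string>
#include <vector>

// Illustrative sketch only: LoraAdapterEntry appears in the review, but
// LoraRegistry and these signatures are invented stand-ins for the
// backend's adapter bookkeeping (load / remove / clear / query).
struct LoraAdapterEntry {
    std::string path;
    float scale;
};

class LoraRegistry {
public:
    bool load(const std::string& path, float scale) {
        std::lock_guard<std::mutex> lock(mutex_);  // thread-safe, per the review
        if (scale <= 0.0f) return false;
        adapters_.push_back({path, scale});        // adapters stack in load order
        return true;
    }
    bool remove(const std::string& path) {
        std::lock_guard<std::mutex> lock(mutex_);
        const auto before = adapters_.size();
        adapters_.erase(
            std::remove_if(adapters_.begin(), adapters_.end(),
                           [&](const LoraAdapterEntry& e) { return e.path == path; }),
            adapters_.end());
        return adapters_.size() != before;         // true if anything was removed
    }
    void clear() {
        std::lock_guard<std::mutex> lock(mutex_);
        adapters_.clear();
    }
    std::vector<LoraAdapterEntry> list() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return adapters_;                          // copy out under the lock
    }
private:
    mutable std::mutex mutex_;
    std::vector<LoraAdapterEntry> adapters_;
};
```

The real SDK layers this behind the C API and JNI bridge rather than exposing a class directly.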

This description was created by Ellipsis for 638da00. You can customize this summary. It will automatically update as commits are pushed.

Summary by CodeRabbit

  • New Features

    • LoRA adapter support across C++, Kotlin, Swift with runtime load/remove/stacking and inspectable adapter info.
    • RunAnyWhere-Lora Android example app with UI for loading models/LoRA adapters, asking questions, and viewing metrics.
    • Multi-target build workflow enabling selective builds for native SDKs, mobile apps, and plugins.
  • Documentation

    • Comprehensive LoRA adapter integration guide added.

Greptile Summary

This PR adds comprehensive LoRA (Low-Rank Adaptation) adapter support to the Kotlin SDK and updates the build workflow for better platform verification.

LoRA Adapter Implementation:

  • Added complete LoRA adapter management in C++ commons layer (llamacpp_backend.cpp) with load, remove, clear, and info operations
  • Implemented proper context recreation when adapters are loaded/removed, with automatic KV cache clearing
  • Thread-safe implementation using mutex locks throughout the stack
  • Exposed LoRA API through C headers (rac_llm_component.h), JNI bridge (runanywhere_commons_jni.cpp), and Kotlin SDK (RunAnywhere+LoRA.kt)
  • Added LoRAAdapterConfig and LoRAAdapterInfo data classes in Kotlin with proper validation
  • Created RunAnyWhereLora Android example app demonstrating LoRA usage with model loading, adapter management, and streaming generation
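The "context recreation" behavior called out above can be modeled as a small sequence: free the old context, create a new one, re-apply every stacked adapter, then clear the KV cache. The sketch below is purely illustrative; the counters stand in for the llama.cpp calls (`llama_free`, `llama_init_from_model`, `llama_set_adapter_lora`, `llama_memory_clear`) that the real backend makes:

```cpp
#include <vector>

// Illustrative model of the adapter-change flow described in the review.
// Counters stand in for real llama.cpp calls; no actual inference here.
struct FakeBackend {
    int contexts_created = 0;
    int contexts_freed = 0;
    int adapters_applied = 0;
    int kv_cache_clears = 0;
    std::vector<float> adapter_scales;  // stand-in for loaded adapter handles

    void recreate_context() {
        if (contexts_created > contexts_freed) contexts_freed++;  // llama_free
        contexts_created++;                       // llama_init_from_model
        for (float s : adapter_scales) {
            (void)s;
            adapters_applied++;                   // llama_set_adapter_lora
        }
        kv_cache_clears++;                        // llama_memory_clear
    }

    void load_lora(float scale) {
        adapter_scales.push_back(scale);
        recreate_context();                       // every load rebuilds the context
    }

    void clear_lora() {
        adapter_scales.clear();
        recreate_context();                       // rebuild with no adapters applied
    }
};
```

Re-applying the full adapter stack on every change keeps the context consistent with the registry, at the cost of one context rebuild per load/remove.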

Build Workflow Improvements:

  • Enhanced build-all-test.yml with per-platform checkboxes for selective building
  • Added matrix builds for C++ Android (arm64-v8a, armeabi-v7a, x86_64)
  • Included jobs for iOS backends, all SDKs (Kotlin, Swift, Web, Flutter, React Native), and example apps
  • Added artifact uploads for build verification

Architecture:

  • Follows the established pattern: C++ commons → C API → JNI → Kotlin bridge → Public Kotlin API
  • Proper error handling and state validation at each layer
  • Matches the project's architectural principles from CLAUDE.md

Confidence Score: 5/5

  • This PR is safe to merge with no critical issues found
  • The implementation is well-architected following established patterns, with proper thread safety, error handling, memory management, and comprehensive testing via the example app. The code is clean, follows SOLID principles, and integrates seamlessly with existing architecture.
  • No files require special attention

Important Files Changed

Filename Overview
sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp Implemented LoRA adapter management with proper context recreation, adapter loading/removal, and memory cleanup. Thread-safe with mutex protection.
sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.h Added LoraAdapterEntry struct and LoRA management methods to LlamaCppTextGeneration class. Clean API design matching C++ patterns.
sdk/runanywhere-commons/include/rac/features/llm/rac_llm_component.h Added four LoRA adapter API functions to LLM component header: load_lora, remove_lora, clear_lora, and get_lora_info. Well-documented C API.
sdk/runanywhere-commons/src/features/llm/llm_component.cpp Implemented LoRA adapter API functions with proper backend dispatch through vtable. Thread-safe with mutex locks. Returns appropriate error codes.
sdk/runanywhere-kotlin/src/commonMain/kotlin/com/runanywhere/sdk/public/extensions/LLM/LLMTypes.kt Added LoRAAdapterConfig and LoRAAdapterInfo data classes with proper validation and serialization support.
sdk/runanywhere-kotlin/src/jvmAndroidMain/kotlin/com/runanywhere/sdk/public/extensions/RunAnywhere+LoRA.jvmAndroid.kt Implemented actual functions for JVM/Android with proper error handling, JSON parsing, and SDK state validation.
examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraViewModel.kt Implemented ViewModel with model loading, LoRA adapter management, and text generation. Proper coroutine usage and state management.
.github/workflows/build-all-test.yml Comprehensive workflow with per-platform checkboxes for selective builds. Includes C++ backends, SDKs, and example apps with artifact uploads.

Sequence Diagram

sequenceDiagram
    participant App as Android App
    participant KotlinAPI as RunAnywhere+LoRA.kt
    participant Bridge as CppBridgeLLM
    participant JNI as runanywhere_commons_jni
    participant Component as rac_llm_component
    participant Backend as LlamaCppTextGeneration
    participant LlamaCpp as llama.cpp

    App->>KotlinAPI: loadLoraAdapter(config)
    KotlinAPI->>Bridge: loadLoraAdapter(path, scale)
    Bridge->>Bridge: Validate state == READY
    Bridge->>JNI: racLlmComponentLoadLora(handle, path, scale)
    JNI->>Component: rac_llm_component_load_lora(handle, path, scale)
    Component->>Component: Get service via lifecycle_get_service
    Component->>Backend: load_lora (via vtable)
    Backend->>Backend: Check model_loaded, lock mutex
    Backend->>LlamaCpp: llama_adapter_lora_init(model, path)
    LlamaCpp-->>Backend: adapter handle
    Backend->>Backend: Store in lora_adapters_ vector
    Backend->>Backend: recreate_context()
    Backend->>LlamaCpp: llama_free(old_context)
    Backend->>LlamaCpp: llama_init_from_model(model)
    LlamaCpp-->>Backend: new_context
    Backend->>Backend: apply_lora_adapters()
    Backend->>LlamaCpp: llama_set_adapter_lora(context, adapter, scale)
    Backend->>LlamaCpp: llama_memory_clear(context)
    Backend-->>Component: RAC_SUCCESS
    Component-->>JNI: RAC_SUCCESS
    JNI-->>Bridge: 0
    Bridge-->>KotlinAPI: success
    KotlinAPI-->>App: adapter loaded
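The return path in the diagram collapses a C status code into a JNI integer and then a Kotlin-side success or exception. A minimal illustration of that mapping (only `RAC_SUCCESS` appears by name in the diagram; the other codes and function names here are invented for the sketch):

```cpp
#include <stdexcept>
#include <string>

// Invented status codes standing in for the SDK's real constants.
enum rac_result { RAC_SUCCESS = 0, RAC_ERROR_NOT_LOADED = -1, RAC_ERROR_INVALID_ARG = -2 };

// Component/backend check mirroring the diagram's "Check model_loaded" step.
rac_result load_lora(bool model_loaded, const std::string& path, float scale) {
    if (!model_loaded) return RAC_ERROR_NOT_LOADED;
    if (path.empty() || scale <= 0.0f) return RAC_ERROR_INVALID_ARG;
    return RAC_SUCCESS;  // propagated as 0 across the JNI boundary
}

// Bridge-layer behavior: a non-zero result becomes an exception, matching
// the "all LoRA functions throw on failure" contract noted in the docs.
bool bridge_load_lora(bool model_loaded, const std::string& path, float scale) {
    const int rc = load_lora(model_loaded, path, scale);
    if (rc != RAC_SUCCESS)
        throw std::runtime_error("loadLoraAdapter failed: " + std::to_string(rc));
    return true;
}
```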

Last reviewed commit: ef98bcd


  Implement LoRA (Low-Rank Adaptation) adapter hot-swapping for llama.cpp
  backend across all 6 SDK layers (C++ -> C API -> Component -> JNI ->
  Kotlin Bridge -> Kotlin Public API).

  - Add load/remove/clear/query LoRA adapter operations
  - Use vtable dispatch in component layer to decouple librac_commons
    from librac_backend_llamacpp (fixes linker errors)
  - Add LoRA vtable entries to rac_llm_service_ops_t
  - Fix AttachCurrentThread cast for Android NDK C++ JNI build
  - Add RunAnyWhereLora Android demo app with Material 3 Q&A UI
  - Add comprehensive implementation docs with C/C++ API reference
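The vtable decoupling described above keeps librac_commons free of any direct symbol dependency on the backend: the component layer only ever calls through function pointers. A compact illustration of the pattern (struct and function names are simplified stand-ins for `rac_llm_service_ops_t` and the real registration path):

```cpp
// Simplified stand-in for rac_llm_service_ops_t: the component layer sees
// only function pointers, never the backend's concrete symbols, so
// librac_commons links without librac_backend_llamacpp.
struct llm_service_ops {
    int (*load_lora)(void* service, const char* path, float scale);
    int (*clear_lora)(void* service);
};

// Hypothetical backend state and its implementations of the ops.
struct backend_state {
    int loaded_adapters = 0;
};

static int backend_load_lora(void* service, const char* path, float scale) {
    auto* s = static_cast<backend_state*>(service);
    if (!path || scale <= 0.0f) return -1;
    s->loaded_adapters++;
    return 0;  // RAC_SUCCESS-style code
}

static int backend_clear_lora(void* service) {
    static_cast<backend_state*>(service)->loaded_adapters = 0;
    return 0;
}

// Component-layer dispatch: resolves the operation through the vtable only,
// and fails cleanly when a backend didn't register the op.
int component_load_lora(const llm_service_ops* ops, void* service,
                        const char* path, float scale) {
    if (!ops || !ops->load_lora) return -1;
    return ops->load_lora(service, path, scale);
}
```

The backend fills in an `llm_service_ops` at registration time; an older backend that never sets the LoRA pointers simply gets an error code instead of a linker failure.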
…ift concurrency errors

  Rewrite build-all-test.yml with 9 boolean checkbox inputs so each build
  target can be toggled independently from the GitHub Actions UI:
  - C++ Android Backends (arm64-v8a, armeabi-v7a, x86_64 matrix)
  - C++ iOS Backends (XCFramework)
  - Kotlin SDK (JVM + Android)
  - Swift SDK (iOS/macOS)
  - Web SDK (TypeScript)
  - Flutter SDK (Dart analyze via Melos)
  - React Native SDK (TypeScript via Lerna)
  - Android Example Apps (RunAnywhereAI + RunAnyWhereLora)
  - IntelliJ Plugin

  Fix two Swift strict-concurrency errors that fail the Swift SDK build:
  - LiveTranscriptionSession: add @unchecked Sendable (safe because class
is @MainActor, all access serialized)
  - RunAnywhere+VisionLanguage: add Sendable conformance to rac_vlm_image_t
    so the C struct can cross the Task boundary in the streaming builder;
    simplify StreamingCollector to start timing at init
…sion and VLM streaming

  LiveTranscriptionSession.swift:
  - Replace [weak self] captures with strong `let session = self` before
    closures to avoid captured var in @Sendable/@task contexts (class is
@MainActor @unchecked Sendable so strong ref is safe, bounded by
    stream lifecycle)
  - Wrap deprecated startStreamingTranscription call in @available helper
    to silence deprecation warning until migration to transcribeStream API

  RunAnywhere+VisionLanguage.swift:
  - Add `let capturedCImage = cImage` before AsyncThrowingStream closure
    so the Task captures an immutable let instead of a mutable var
  - Add `extension rac_vlm_image_t: @unchecked Sendable {}` for the C
    struct to cross Task concurrency boundaries safely
  - Simplify StreamingCollector to initialize startTime at init instead
    of requiring a separate async start() call
@coderabbitai

coderabbitai bot commented Feb 19, 2026

📝 Walkthrough


This PR adds LoRA adapter support across the SDK (C/C++ backend, JNI, Kotlin multiplatform, and Swift), implements LoRA lifecycle in the LlamaCPP backend, extends JNI and Kotlin APIs, provides JVM/Android implementations, and adds a full RunAnyWhereLora Android example. It also restructures CI with a multi-target build workflow.

Changes

Cohort / File(s) Summary
Build & CI/CD
​.github/workflows/build-all-test.yml
Replaced single workflow with multi-target, input-gated build jobs (cpp-android, cpp-ios, kotlin-sdk, swift-sdk, web-sdk, flutter-sdk, react-native-sdk, android-apps, intellij-plugin), added COMMONS_DIR env, and a consolidated Build Summary job.
IDE / Project config
.idea/vcs.xml, settings.gradle.kts
Added VCS mappings and included the RunAnyWhereLora composite build in Gradle settings for IDE support.
Docs
docs/impl/lora_adapter_support.md
New detailed LoRA integration guide: APIs, examples (C/Kotlin/Swift), vtable notes, build/verification, and error handling.
LoRA C/C++ Headers
sdk/runanywhere-commons/include/rac/...
Added LoRA API declarations (load/remove/clear/get_info) to backend/component/service headers; some declarations duplicated and should be de-duplicated.
LlamaCPP backend implementation
sdk/runanywhere-commons/src/backends/llamacpp/...
Implemented LoRA adapter lifecycle: load, apply (stacking), remove, clear, context recreation, KV-cache clearing, and JSON metadata exposure.
Component glue
sdk/runanywhere-commons/src/features/llm/llm_component.cpp
Added component-level dispatch functions for load/remove/clear/get_info that forward to backend vtable with synchronization and error handling (duplication present).
JNI bridge
sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp
Added JNI functions for LoRA operations and fixed AttachCurrentThread usage in callbacks; exposes load/remove/clear/get_info to JVM.
Kotlin multiplatform API
sdk/runanywhere-kotlin/src/commonMain/...
Added LoRAAdapterConfig and LoRAAdapterInfo types and expect suspend functions to load/remove/clear/get loaded adapters (some duplicate declarations present).
JVM/Android Kotlin
sdk/runanywhere-kotlin/src/jvmAndroidMain/...
Added RunAnywhereBridge JNI bindings, CppBridgeLLM wrappers, and actual suspend implementations (JSON parsing, validation, error handling) for LoRA operations.
Swift changes
sdk/runanywhere-swift/...
Minor concurrency adjustments: Sendable conformance and timing/streaming tweaks for Vision/Transcription code paths.
RunAnyWhereLora Android app
examples/android/RunAnyWhereLora/**
New complete example app: Gradle config, manifest, Application with SDK init, Compose UI (LoraScreen), ViewModel, MainActivity, theme, resources, tests, and Gradle wrapper and settings files. Large addition of Kotlin/Gradle assets.
Misc resources
various .gitignore, .idea/*, drawables, colors, strings
IDE files, .gitignore entries, launcher drawables, color/strings/themes and backup/data rules for the new example app.

Sequence Diagram(s)

sequenceDiagram
    participant UI as "App UI (Compose)"
    participant VM as "LoraViewModel"
    participant SDK as "RunAnywhere Kotlin API"
    participant JNI as "RunAnywhereBridge (JNI)"
    participant CPP as "LlamaCPP Backend"

    UI->>VM: user selects LoRA file + scale
    VM->>SDK: call loadLoraAdapter(config)
    SDK->>JNI: racLlmComponentLoadLora(handle, path, scale)
    JNI->>CPP: load_lora(adapter_path, scale)
    CPP->>CPP: load adapter, recreate_context(), apply_lora_adapters(), clear_kv_cache
    CPP-->>JNI: result code / info
    JNI-->>SDK: result/out_json
    SDK-->>VM: success / updated adapter list
    VM-->>UI: update UI state (adapter applied)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Suggested labels

android-sample

Suggested reviewers

  • shubhammalhotra28
  • sanchitmonga22

Poem

🐰 Hopping through code with LoRA in tow,

Adapters stack up, the models now grow.
From Kotlin to C++ I dance and I cheer,
An app and a bridge — progress is here! 🎉

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.93% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Added Lora + Fixed Build-All-Work-Flow' clearly summarizes the main changes: LoRA adapter support and build workflow improvements.
Description check ✅ Passed PR description is comprehensive with detailed sections covering LoRA implementation, build workflow changes, example app, and testing status. However, the brief description placeholder text ('Brief description of the changes made.') was not replaced with an actual summary.



@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed everything up to 638da00 in 23 seconds. Click for details.
  • Reviewed 4667 lines of code in 55 files
  • Skipped 10 files when reviewing.
  • Skipped posting 0 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

Workflow ID: wflow_Dk65vG7BOH704JFM

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Collaborator Author

@Siddhesh2377 Siddhesh2377 left a comment


Good

@Siddhesh2377
Collaborator Author

@shubhammalhotra28 once you're free, please merge this PR.
This PR contains:

  • Support for LoRA in the C++/C layer and the Kotlin SDK
  • Docs for implementing it in the other SDKs

@Siddhesh2377 Siddhesh2377 added the documentation, enhancement, and kotlin-sdk labels on Feb 19, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

🟡 Minor comments (14)
examples/android/RunAnyWhereLora/.idea/misc.xml-4-4 (1)

4-4: ⚠️ Potential issue | 🟡 Minor

Update misc.xml to match the project's Java 17 configuration.

The Gradle build (lines 58–61 of app/build.gradle.kts) explicitly sets sourceCompatibility = JavaVersion.VERSION_17 and targetCompatibility = JavaVersion.VERSION_17, but misc.xml declares languageLevel="JDK_1_7". This mismatch will cause Android Studio's IDE analyzer to flag valid Java 8+ syntax (lambdas, streams, method references) as errors until a Gradle sync overwrites the stale metadata. Update misc.xml line 4 to languageLevel="JDK_17" to reflect the actual build configuration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/android/RunAnyWhereLora/.idea/misc.xml` at line 4, misc.xml
currently sets the IDE language level to JDK_1_7 which mismatches the project's
Gradle config; update the ProjectRootManager component's languageLevel attribute
(the element containing ProjectRootManager version="2") from "JDK_1_7" to
"JDK_17" so the IDE reflects sourceCompatibility = JavaVersion.VERSION_17 and
stops flagging Java 8+/17 syntax as errors.
examples/android/RunAnyWhereLora/app/proguard-rules.pro-6-6 (1)

6-6: ⚠️ Potential issue | 🟡 Minor

Stale HTTP documentation URL.

http://developer.android.com/guide/developing/tools/proguard.html redirects to a deprecated page. The current ProGuard/R8 shrinking guide is at https://developer.android.com/build/shrink-code.

📝 Proposed fix
-#   http://developer.android.com/guide/developing/tools/proguard.html
+#   https://developer.android.com/build/shrink-code
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/android/RunAnyWhereLora/app/proguard-rules.pro` at line 6, Replace
the stale ProGuard documentation URL string
"http://developer.android.com/guide/developing/tools/proguard.html" with the
current shrinker guide URL "https://developer.android.com/build/shrink-code" so
the comment in proguard-rules.pro points to the up-to-date Android ProGuard/R8
documentation.
.idea/vcs.xml-5-8 (1)

5-8: ⚠️ Potential issue | 🟡 Minor

Avoid committing build-directory paths as VCS roots.

All five new mappings (lines 5–9) point into build/…/_deps/ subdirectories — ephemeral CMake FetchContent clones whose exact paths depend on which build targets the local developer has run. Committing these will cause every other contributor who has not built those specific targets (e.g. android/unified/arm64-v8a, dev-core) to receive spurious "VCS root not registered" warnings in IntelliJ IDEA. The root mapping on line 4 (directory="") already covers the whole repository; no additional entries are needed.

Recommended fix: remove lines 5–9 and ensure build/ remains listed in .gitignore so these transient directories are never picked up again.

🗑️ Proposed fix
 <component name="VcsDirectoryMappings">
   <mapping directory="" vcs="Git" />
-  <mapping directory="$PROJECT_DIR$/sdk/runanywhere-commons/build/android/unified/arm64-v8a/_deps/llamacpp-src" vcs="Git" />
-  <mapping directory="$PROJECT_DIR$/sdk/runanywhere-commons/build/android/unified/arm64-v8a/_deps/nlohmann_json-src" vcs="Git" />
-  <mapping directory="$PROJECT_DIR$/sdk/runanywhere-commons/build/dev-core/_deps/nlohmann_json-src" vcs="Git" />
-  <mapping directory="$PROJECT_DIR$/sdk/runanywhere-commons/build/dev/_deps/llamacpp-src" vcs="Git" />
-  <mapping directory="$PROJECT_DIR$/sdk/runanywhere-commons/build/dev/_deps/nlohmann_json-src" vcs="Git" />
 </component>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.idea/vcs.xml around lines 5 - 8, Remove the transient VCS root mappings
that point inside build/_deps (the <mapping directory=".../_deps/llamacpp-src"
vcs="Git" />, <mapping directory=".../_deps/nlohmann_json-src" vcs="Git" />
entries) from the .idea/vcs.xml so IntelliJ doesn't warn about unregistered VCS
roots; keep only the existing repository root mapping and ensure build/ is
listed in .gitignore so those FetchContent-generated directories are never
committed again.
examples/android/RunAnyWhereLora/app/src/main/res/xml/backup_rules.xml-5-6 (1)

5-6: ⚠️ Potential issue | 🟡 Minor

Misleading comment: "older than" should read "running API 31 or higher".

The note on line 5 says "This file is ignored for devices older than API 31", but the opposite is true. <full-backup-content> controls files backed up on devices running Android 11 (API level 30) or lower; these rules are also used for Android 12+ devices if the app targets Android 11 or lower. The dataExtractionRules attribute applies to Android 12 and above APIs, whereas allowBackup and fullBackupContent attributes are for Android versions prior to API 31.

As written, the comment implies pre-API 31 devices ignore this file, which could lead a future maintainer to skip configuring backup exclusions for those devices entirely.

✏️ Suggested fix
-   Note: This file is ignored for devices older than API 31
+   Note: This file is ignored for devices running API 31 or higher.
+   For API 31+ devices, use data_extraction_rules.xml instead.
    See https://developer.android.com/about/versions/12/backup-restore
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/android/RunAnyWhereLora/app/src/main/res/xml/backup_rules.xml`
around lines 5 - 6, Update the misleading comment in backup_rules.xml: replace
"This file is ignored for devices older than API 31" with a correct note such as
"This file is ignored on devices running API 31 or higher" and add a brief
clarification that <full-backup-content> and allowBackup apply to Android 11
(API 30) and below while dataExtractionRules applies to Android 12+ (API 31+),
so maintainers know which attributes control backups on which API levels.
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Sessions/LiveTranscriptionSession.swift-130-130 (1)

130-130: ⚠️ Potential issue | 🟡 Minor

Deprecation wrapper does not silence the warning at the call site in start().

Calling a @available(*, deprecated) method from a non-deprecated context emits a deprecation warning. The wrapper startLegacyStreaming at line 130 will produce:

'startLegacyStreaming(options:onPartialResult:onFinalResult:onError:)' is deprecated: Migrate to transcribeStream API

The deprecation annotation on the wrapper silences warnings inside the wrapper but surfaces the warning at the call site instead. To fully suppress the warning until migration, remove the deprecation marker and add a TODO comment:

Proposed fix
-    // Wrapper to silence deprecation warning until migration to transcribeStream
-    @available(*, deprecated, message: "Migrate to transcribeStream API")
-    private static func startLegacyStreaming(
+    // TODO: Migrate to transcribeStream API and remove this wrapper
+    private static func startLegacyStreaming(
         options: STTOptions,
         onPartialResult: @escaping (STTTranscriptionResult) -> Void,
         onFinalResult: @escaping (STTOutput) -> Void,
         onError: @escaping (Error) -> Void
     ) async throws {
+        // swiftlint:disable:next deprecation_warning
         try await RunAnywhere.startStreamingTranscription(
             options: options,
             onPartialResult: onPartialResult,
             onFinalResult: onFinalResult,
             onError: onError
         )
     }

Note: // swiftlint:disable:next only suppresses SwiftLint linting rules, not compiler warnings. For compiler-level suppression, the pattern above (removing the wrapper's deprecation and adding the comment) is the standard workaround in Swift 6 when line-level pragma support is unavailable.

Also applies to: 216-230

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Sessions/LiveTranscriptionSession.swift`
at line 130, The call site in start() is still getting a compiler deprecation
warning because the wrapper
startLegacyStreaming(options:onPartialResult:onFinalResult:onError:) is
annotated `@available`(deprecated); remove the deprecation attribute from that
wrapper (and the analogous deprecated wrapper around lines 216-230) so calls
from start() do not emit warnings, and add a TODO comment above each wrapper
indicating it should be migrated to transcribeStream in the future (e.g., "//
TODO: migrate to transcribeStream; temporary non-deprecated shim to avoid
compiler warnings"). Ensure signatures (startLegacyStreaming(...)) remain
unchanged so callers like start() still call the same method.
docs/impl/lora_adapter_support.md-467-491 (1)

467-491: ⚠️ Potential issue | 🟡 Minor

Add a language specifier to the fenced code block (MD040).

The layer-diagram fence has no language tag; markdownlint-cli2 reports MD040. Use ```text (or ```plain) to satisfy the linter.

📝 Proposed fix
-```
+```text
 Kotlin Public API (RunAnywhere.loadLoraAdapter)
 ...
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/impl/lora_adapter_support.md` around lines 467 - 491, The fenced diagram
block starting with "Kotlin Public API (RunAnywhere.loadLoraAdapter)" is missing
a language tag; update the opening fence from ``` to a plain text specifier such
as ```text (or ```plain) so the block becomes a labeled plain-text fence,
keeping the diagram contents unchanged and ensuring the linter MD040 is
satisfied.
docs/impl/lora_adapter_support.md-208-213 (1)

208-213: ⚠️ Potential issue | 🟡 Minor

clearAdapters() inconsistent with the stated error-handling contract.

Line 162 explicitly states "All LoRA functions throw SDKError on failure," but the clearAdapters() ViewModel example has no try-catch. While rac_llm_component_clear_lora always returns RAC_SUCCESS at the C level, the Kotlin wrapper can still throw SDKError.notInitialized if the SDK is not ready. The example should be consistent.

📝 Proposed fix
     fun clearAdapters() {
         viewModelScope.launch {
-            RunAnywhere.clearLoraAdapters()
-            refreshAdapterList()
+            try {
+                RunAnywhere.clearLoraAdapters()
+                refreshAdapterList()
+            } catch (e: SDKError) {
+                _state.update { it.copy(error = e.message) }
+            }
         }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/impl/lora_adapter_support.md` around lines 208 - 213, The
clearAdapters() example violates the documented error contract by not handling
SDKError; wrap the RunAnywhere.clearLoraAdapters() call inside a try-catch in
clearAdapters() (the ViewModel method) to catch SDKError (e.g.,
SDKError.notInitialized) and handle/log it before/after calling
refreshAdapterList(), ensuring the Kotlin wrapper's possible exceptions are
caught and dealt with according to the project’s error-handling pattern.
docs/impl/lora_adapter_support.md-653-653 (1)

653-653: ⚠️ Potential issue | 🟡 Minor

Sentence fragment flagged by LanguageTool (MISSING_IT_THERE).

"Could be done by calling…" is missing a subject.

📝 Proposed fix
-Could be done by calling `llama_set_adapter_lora(ctx, adapter, new_scale)`
+This could be done by calling `llama_set_adapter_lora(ctx, adapter, new_scale)`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/impl/lora_adapter_support.md` at line 653, The sentence fragment "Could
be done by calling…" lacks a subject; update the sentence to include an explicit
subject (e.g., "This could be done by calling…" or "It could be done by
calling…") or rephrase the line to a complete sentence; locate the exact phrase
"Could be done by calling…" in docs/impl/lora_adapter_support.md and replace it
with the chosen complete sentence.
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VLM/RunAnywhere+VisionLanguage.swift-12-14 (1)

12-14: ⚠️ Potential issue | 🟡 Minor

Correct the documented safety invariant for @unchecked Sendable — not all pointer fields are backed by rgbData.

The @unchecked Sendable conformance is safe, but the comment's claim is inaccurate. rac_vlm_image_t has three pointer members: file_path, pixel_data, and base64_data. Only pixel_data is backed by the captured rgbData Data object. The file_path and base64_data pointers come from path.withCString and encoded.withCString respectively.

The actual safety mechanism is scoped pointer lifetime: all pointers are only assigned and used within the withCPointers closure where the underlying data (strings, Data buffers) remains alive. The C function calls happen immediately within that closure scope, so pointers never escape their valid lifetime.

Update the comment to reflect this:

// C struct with raw pointers — safe to send across concurrency boundaries
// because all pointer fields (file_path, base64_data, pixel_data) are only
// assigned and dereferenced within withCPointers, keeping them alive.
extension rac_vlm_image_t: @unchecked Sendable {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VLM/RunAnywhere`+VisionLanguage.swift
around lines 12 - 14, Update the comment on the `@unchecked` Sendable conformance
for rac_vlm_image_t to accurately describe the safety invariant: state that all
pointer fields (file_path, base64_data, pixel_data) are only assigned and
dereferenced within withCPointers so their underlying storage (strings/Data
including rgbData) remains alive for the C call; mention scoped pointer lifetime
rather than claiming all pointers are backed by rgbData. Reference
rac_vlm_image_t, withCPointers, and the pointer fields (file_path, base64_data,
pixel_data) in the comment.
examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraScreen.kt-262-272 (1)

262-272: ⚠️ Potential issue | 🟡 Minor

Dual weight(1f) produces a 50/50 split when generating with an empty answer.

When isGenerating is true and answer is empty, both the SelectionContainer (line 222, showing empty text) and the spinner Box (line 264) receive weight(1f), splitting the available space equally. The empty SelectionContainer takes half the card for no visible content.

Consider wrapping the spinner in a full-size Box without competing weight, or gating the SelectionContainer on answer.isNotEmpty().
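One way to express that gating is to compute which slot should occupy the card before composing anything. A minimal plain-Kotlin sketch of the decision logic (the state fields isGenerating and answer come from the review; the Compose wiring itself is omitted so the logic stays testable):

```kotlin
// Sketch of the gating the review suggests, kept as plain Kotlin so the
// decision is testable outside Compose. Field names mirror the reviewed
// state but are assumptions here.
data class AnswerCardSlots(val showAnswer: Boolean, val showSpinner: Boolean)

fun answerCardSlots(isGenerating: Boolean, answer: String): AnswerCardSlots =
    AnswerCardSlots(
        // Only give the SelectionContainer space when there is text to show.
        showAnswer = answer.isNotEmpty(),
        // The spinner gets the card to itself while the answer is still empty.
        showSpinner = isGenerating && answer.isEmpty()
    )
```

With showSpinner true and showAnswer false, only one child is emitted, so no second weight(1f) competes for the card's space.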

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraScreen.kt`
around lines 262 - 272, The UI shows a 50/50 split because both the
SelectionContainer (the empty text block inside LoraScreen) and the spinner Box
(shown when state.isGenerating && state.answer.isEmpty()) have weight(1f); fix
by preventing competing weights: either conditionally render the
SelectionContainer only when state.answer.isNotEmpty() or remove the .weight(1f)
from the spinner Box and make it fill available space (e.g., use
fillMaxSize()/fillMaxWidth()) so the spinner occupies the full card without
splitting, updating the logic inside LoraScreen where state.isGenerating and
state.answer are checked and the CircularProgressIndicator Box is created.
examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/MainActivity.kt-80-82 (1)

80-82: ⚠️ Potential issue | 🟡 Minor

Retry button is a no-op.

The onClick handler is empty. Either wire it up to actually retry SDK initialization (e.g., pass a lambda from the LoraApplication), or remove the button to avoid misleading users.

Would you like me to suggest an implementation that wires the retry action to LoraApplication?
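A sketch of that wiring, with LoraApplication stood in by a plain class and the SDK call by an injected stub (retryInitialization, initSdk, and initialized are hypothetical names for illustration):

```kotlin
// Hypothetical sketch: the Application object owns the init logic and the
// Composable only receives a () -> Unit callback.
class LoraApplicationSketch(private val initSdk: () -> Boolean) {
    var initialized = false
        private set

    fun retryInitialization() {
        // Re-attempt SDK init and record the result for the UI to observe.
        initialized = initSdk()
    }
}

// In MainActivity the button would then be wired roughly as:
//   TextButton(onClick = retryAction) { Text("Retry") }
fun buildRetryAction(app: LoraApplicationSketch): () -> Unit =
    { app.retryInitialization() }
```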

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/MainActivity.kt`
around lines 80 - 82, The Retry TextButton currently has an empty onClick and
should either be removed or wired to actually retry SDK initialization;
implement a retry lambda on the activity/application side (expose a
retryInitialization or initializeSdk method on LoraApplication) and call that
from the TextButton's onClick in MainActivity.kt (or pass the lambda down into
the Composable), ensuring the onClick invokes
LoraApplication.retryInitialization() (or the chosen method) to re-attempt SDK
init and update UI state accordingly.
sdk/runanywhere-commons/include/rac/backends/rac_llm_llamacpp.h-195-209 (1)

195-209: ⚠️ Potential issue | 🟡 Minor

Documentation says scale is "0.0-1.0" but the actual range is unbounded.

Line 204 documents the scale parameter as (0.0-1.0, default 1.0), but:

  • The Android UI (LoraScreen.kt line 477) allows 0f..2f.
  • The C++ implementation (load_lora_adapter) does not clamp the value.

Either update the documentation to reflect the actual valid range (e.g., 0.0–2.0 or just state the default without an upper bound), or add validation in the C layer to reject out-of-range values.

Proposed doc fix
- * `@param` scale Adapter scale factor (0.0-1.0, default 1.0)
+ * `@param` scale Adapter scale factor (typically 0.0-2.0, default 1.0)

Based on learnings: "Public C API headers in include/rac/ must document vtable operations, error codes, and lifecycle requirements"
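If validation rather than a doc change is the chosen route, the guard could mirror the UI's 0f..2f slider range. A hedged Kotlin sketch of that check (the bounds and the Result-based error shape are assumptions, not the header's current contract; the real fix would live in load_lora_adapter on the C side):

```kotlin
// Hypothetical guard mirroring LoraScreen's 0f..2f slider range.
val LORA_SCALE_MIN = 0.0f
val LORA_SCALE_MAX = 2.0f

fun validateLoraScale(scale: Float): Result<Float> =
    if (scale in LORA_SCALE_MIN..LORA_SCALE_MAX) {
        Result.success(scale)
    } else {
        // In the C layer this would map to a clear error code instead.
        Result.failure(IllegalArgumentException(
            "scale $scale outside $LORA_SCALE_MIN..$LORA_SCALE_MAX"))
    }
```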

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/include/rac/backends/rac_llm_llamacpp.h` around lines
195 - 209, The documentation for rac_llm_llamacpp_load_lora incorrectly
constrains the scale to "0.0-1.0"; update either the header docs or the C shim
to match actual behavior: either change the parameter description in
rac_llm_llamacpp_load_lora to state the true range (e.g., "default 1.0, accepts
values >=0.0 (UI allows up to 2.0)") and note no clamping, or add validation in
the C implementation (load_lora_adapter) to reject/clamp values outside the
desired range and return a clear error code; reference
rac_llm_llamacpp_load_lora and load_lora_adapter so the maintainers can find and
update the declaration, docs, and/or validation to stay consistent with
LoraScreen.kt behavior.
examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraApplication.kt-45-48 (1)

45-48: ⚠️ Potential issue | 🟡 Minor

onTerminate() is never called on production Android devices.

Per Android documentation, Application.onTerminate() is only invoked in emulated environments. Relying on it for scope cleanup means the applicationScope is never cancelled on real devices. For an example app this is low-risk, but worth noting — consider using ProcessLifecycleOwner or accepting the scope lives for the process lifetime.
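The ProcessLifecycleOwner route boils down to cancelling the scope from a lifecycle callback instead of onTerminate(). A minimal stand-in sketch (the hook interface replaces androidx's DefaultLifecycleObserver and all names here are illustrative, since the real observer needs androidx.lifecycle):

```kotlin
// Stand-in for a process-level lifecycle callback; in the real app this
// would be a DefaultLifecycleObserver registered on
// ProcessLifecycleOwner.get().lifecycle.
interface ProcessLifecycleHook { fun onStop() }

// Stand-in for the CoroutineScope held by the Application.
class ScopeHolder {
    var cancelled = false
        private set
    fun cancel() { cancelled = true }
}

class LoraAppLifecycle(private val scope: ScopeHolder) : ProcessLifecycleHook {
    // Cancel the application scope at a terminal lifecycle event rather
    // than relying on onTerminate(), which never fires on real devices.
    override fun onStop() { scope.cancel() }
}
```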

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraApplication.kt`
around lines 45 - 48, onTerminate() is only called in emulators so cancelling
applicationScope there won't run on real devices; replace or supplement that
cleanup by observing process lifecycle (use
ProcessLifecycleOwner.get().lifecycle) to cancel applicationScope in
onStop()/onDestroy or else document/accept that applicationScope intentionally
lives for the process lifetime. Locate the Application subclass (methods
onTerminate and variable applicationScope) and either register a
LifecycleObserver/DefaultLifecycleObserver with ProcessLifecycleOwner to call
applicationScope.cancel() at appropriate terminal lifecycle event, or remove the
onTerminate-based cancel and add a comment explaining the scope is purposely
process-scoped.
sdk/runanywhere-commons/src/backends/llamacpp/rac_llm_llamacpp.cpp-381-396 (1)

381-396: ⚠️ Potential issue | 🟡 Minor

strdup return value unchecked — potential null dereference on OOM.

strdup at line 393 can return NULL if memory allocation fails. While unlikely for a short JSON string, the pattern should handle it for robustness. The same applies to the pre-existing strdup at line 320 (get_model_info).

Proposed fix
     auto info = h->text_gen->get_lora_info();
     std::string json_str = info.dump();
     *out_json = strdup(json_str.c_str());
+    if (!*out_json) {
+        return RAC_ERROR_OUT_OF_MEMORY;
+    }
 
     return RAC_SUCCESS;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/src/backends/llamacpp/rac_llm_llamacpp.cpp` around
lines 381 - 396, The strdup call in rac_llm_llamacpp_get_lora_info can return
NULL on allocation failure; update the function to check the return of
strdup(json_str.c_str()) and handle OOM by returning an appropriate error (e.g.,
RAC_ERROR_OUT_OF_MEMORY) and leaving *out_json in a defined state (set to
nullptr on failure). Apply the same defensive check/pattern used for
get_model_info so both rac_llm_llamacpp_get_lora_info and get_model_info
validate strdup, avoid null-deref, and return a clear error when allocation
fails.
🧹 Nitpick comments (20)
examples/android/RunAnyWhereLora/.gitignore (1)

3-3: Redundant root-anchored local.properties entry.

Line 3 (/local.properties) is a strict subset of the unanchored local.properties pattern on Line 15, which already matches the file at any directory depth including the root. Line 3 can be removed.

🧹 Proposed cleanup
 .gradle
-/local.properties
 /.idea/caches
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/android/RunAnyWhereLora/.gitignore` at line 3, Remove the redundant
root-anchored ignore entry "/local.properties" from the .gitignore; the
unanchored "local.properties" pattern already covers files at any directory
depth (including the repo root), so keep the unanchored "local.properties" rule
and delete the "/local.properties" line to avoid duplication.
examples/android/RunAnyWhereLora/app/src/androidTest/java/com/runanywhere/run_anywhere_lora/ExampleInstrumentedTest.kt (1)

17-23: LGTM — standard boilerplate; consider adding feature-level tests.

This is the default Android Studio–generated instrumented test and is correct as-is. The package name assertion at Line 22 matches the declared package at Line 1.

Since this PR introduces a new Android demo app with LoRA model loading and management, the only optional improvement is expanding test coverage beyond the boilerplate to cover the actual feature flows (e.g., loading a model, attaching/detaching a LoRA adapter, verifying inference output).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/android/RunAnyWhereLora/app/src/androidTest/java/com/runanywhere/run_anywhere_lora/ExampleInstrumentedTest.kt`
around lines 17 - 23, The existing instrumented test
ExampleInstrumentedTest::useAppContext is fine; add new feature-level
instrumented tests that exercise the LoRA demo flows — create a new test class
(e.g., LoRaInstrumentedTest) or add tests alongside ExampleInstrumentedTest that
call the app logic to load a model, attach and detach a LoRA adapter, and run a
sample inference asserting expected non-error results or known outputs; target
the same instrumentation context
(InstrumentationRegistry.getInstrumentation().targetContext) to obtain app
resources and use the app's model-loading and adapter-management APIs to perform
the assertions.
examples/android/RunAnyWhereLora/app/src/test/java/com/runanywhere/run_anywhere_lora/ExampleUnitTest.kt (1)

12-16: Boilerplate placeholder — consider adding LoRA-specific unit tests.

addition_isCorrect is the auto-generated Android Studio scaffold test and doesn't cover any of the new LoRA functionality introduced in this PR (adapter loading, hot-swap, remove, clear, query). Leaving only this placeholder can give a false sense of unit-test coverage for the new feature.

Consider replacing or supplementing it with tests that exercise the public LoRA API surface — e.g., verifying that loading a LoRA adapter updates the expected state, that removing one clears it correctly, and that invalid inputs are handled gracefully.
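To make the shape of such tests concrete, here is a sketch against an invented in-memory stand-in (LoraAdapterManager and its methods are hypothetical; real tests would exercise the SDK's public LoRA API instead of this class):

```kotlin
// Invented stand-in for the SDK's adapter state, so the test shapes below
// run without the native layer.
class LoraAdapterManager {
    private val adapters = mutableMapOf<String, Float>()

    fun load(path: String, scale: Float) {
        require(path.isNotBlank()) { "adapter path must not be blank" }
        adapters[path] = scale // re-loading the same path hot-swaps the scale
    }

    fun remove(path: String) { adapters.remove(path) }
    fun clear() = adapters.clear()
    fun loaded(): Map<String, Float> = adapters.toMap()
}
```

The assertions below cover load, hot-swap, remove, clear, and invalid-input handling, matching the flows the review asks to be tested.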

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/android/RunAnyWhereLora/app/src/test/java/com/runanywhere/run_anywhere_lora/ExampleUnitTest.kt`
around lines 12 - 16, Replace the placeholder test in
ExampleUnitTest.addition_isCorrect with LoRA-focused unit tests that exercise
the public LoRA API surface: add tests that call loadAdapter (or the project’s
adapter-loading method) and assert the adapter appears in the manager/state,
test hotSwapAdapter to ensure a replacement updates state and behavior, test
removeAdapter and clear (or clearAdapters) to ensure adapters are
removed/cleared and that subsequent queries fail or return empty, and test
queryAdapter (or the query method) for expected responses and graceful handling
of invalid inputs; keep tests small, use the ExampleUnitTest class to group
them, and assert both success and failure cases for each API call.
examples/android/RunAnyWhereLora/app/src/main/res/xml/data_extraction_rules.xml (1)

7-19: Consider excluding large model-file directories from backup and device-transfer.

Both <cloud-backup> (line 7) and <device-transfer> (lines 13–18) have no active rules, so the OS will include all app data in cloud backups and device migrations by default. Google Drive autobackup is capped at 25 MB per app; if the quota is reached, the system stops backing up, and when a new backup is made the previous one is deleted. An AI inference app that caches model weights locally will almost certainly blow this cap, causing silent backup failures.

If no <device-transfer> rules are set, all application data will be transferred during a D2D migration. Large model files transferred during device setup will noticeably slow down the experience.

When you're ready to customize, exclude model/cache directories in both sections:

🔧 Suggested skeleton for model-file exclusions
 <cloud-backup>
-    <!-- TODO: Use <include> and <exclude> to control what is backed up.
-    <include .../>
-    <exclude .../>
-    -->
+    <!-- Exclude downloaded model weights and caches from cloud backup -->
+    <exclude domain="file" path="models/"/>
+    <exclude domain="file" path="cache/"/>
 </cloud-backup>
-<!--
 <device-transfer>
-    <include .../>
-    <exclude .../>
+    <exclude domain="file" path="models/"/>
+    <exclude domain="file" path="cache/"/>
 </device-transfer>
--->
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/android/RunAnyWhereLora/app/src/main/res/xml/data_extraction_rules.xml`
around lines 7 - 19, The cloud-backup and device-transfer sections currently
have no rules causing all app data to be backed up/transferred; update the
data_extraction_rules (data_extraction_rules.xml) by adding explicit <exclude>
entries inside the <cloud-backup> and inside the <device-transfer> blocks to
omit large model/cache directories (e.g., model/, models/, cache/, weights/,
.tflite, .pt, or your app's model storage folder) and any temp or download
folders used for inference; use <include> only for small essential files if
needed and ensure the same exclude patterns appear in both <cloud-backup> and
<device-transfer> so large model files are neither backed up to cloud nor
migrated during device-to-device transfers.
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Sessions/LiveTranscriptionSession.swift (2)

71-83: Strong self capture replaces [weak self] — verify retention semantics are intentional.

The previous code presumably used [weak self] in the stream closures. Now let session = self captures the session strongly. This means the LiveTranscriptionSession will stay alive as long as the AsyncStream returned by transcriptions is held (specifically, the onTermination closure at line 78 retains session).

This is likely the correct behavior — consumers iterating the stream should keep the session alive — but it's a subtle behavioral change. If any call site previously relied on the session being weakly held and deallocated while a stream consumer was still active, that code path will now retain the session longer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Sessions/LiveTranscriptionSession.swift`
around lines 71 - 83, The current code captures self strongly via let session =
self inside transcriptions' AsyncStream, which changes retention semantics; if
you want to restore previous weak capture behavior, remove let session = self
and capture [weak self] in the Task closures inside the AsyncStream, then
guard-let self (or finish the continuation) before calling
self.onPartialCallback and before clearing it in continuation.onTermination;
otherwise, if the strong-keep-alive behavior is intended, add a short comment in
LiveTranscriptionSession/transcriptions documenting that the returned
AsyncStream intentionally retains the session until the stream terminates.

39-39: Consider alternatives to @unchecked Sendable for full Swift 6 strict concurrency compliance.

The class is already @MainActor-isolated, which provides actor-based thread safety. The current implementation correctly wraps all property access in Task { @MainActor in ... } blocks, even within non-isolated callback contexts (lines 133–175). However, @unchecked Sendable suppresses compiler diagnostics, so future modifications that bypass this wrapping pattern wouldn't be caught.

While @unchecked Sendable is a pragmatic pattern used throughout the SDK for bridging C callbacks and working with non-Sendable types, Swift 6 best practices favor explicit Sendable conformance. If feasible, consider whether making stored properties Sendable or restructuring callback handling could eliminate the need for the unchecked escape hatch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Sessions/LiveTranscriptionSession.swift`
at line 39, The LiveTranscriptionSession currently uses `@unchecked` Sendable;
instead remove `@unchecked` Sendable and make the class strictly actor-isolated
and Sendable-safe by either (A) relying solely on `@MainActor` isolation (remove
the Sendable conformance and ensure all external callback code dispatches into
Task { `@MainActor` in ... } as already done in the callback handlers), or (B) if
you must keep cross-thread callbacks, extract mutable state into a small
`@MainActor-isolated` actor or a separate `@MainActor-bound` reference type (e.g.,
LiveTranscriptionSessionState) and keep LiveTranscriptionSession free of
non-Sendable stored properties so the compiler can verify Sendable conformance;
update any stored properties referenced from outside the main actor to be
Sendable or accessed only via the `@MainActor-bound` actor and remove `@unchecked`
Sendable from the class declaration (symbol: LiveTranscriptionSession, Task {
`@MainActor` in ... }, and any stored properties you extract into the actor).
examples/android/RunAnyWhereLora/app/src/main/res/values/themes.xml (1)

4-4: Consider using a Material3 theme parent.

android:Theme.Material.Light.NoActionBar is the legacy Material 1 framework theme. Since this app uses Jetpack Compose (with a RunAnyWhereLoraTheme composable per the AI summary), the XML bridge theme should ideally descend from Theme.Material3.DayNight.NoActionBar for consistent Material3 system UI integration (window decorations, edge-to-edge, etc.).

♻️ Proposed refactor
-    <style name="Theme.RunAnyWhereLora" parent="android:Theme.Material.Light.NoActionBar" />
+    <style name="Theme.RunAnyWhereLora" parent="Theme.Material3.DayNight.NoActionBar" />

Ensure com.google.android.material:material is in the app's dependencies for Theme.Material3.* to resolve.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/android/RunAnyWhereLora/app/src/main/res/values/themes.xml` at line
4, The app theme Theme.RunAnyWhereLora currently inherits
android:Theme.Material.Light.NoActionBar (legacy Material1); change its parent
to a Material3 parent such as Theme.Material3.DayNight.NoActionBar to align with
the Compose RunAnyWhereLoraTheme and system UI behavior, and ensure the
Material3 dependency (com.google.android.material:material) is added to the app
dependencies so the Theme.Material3.* parents resolve.
docs/impl/lora_adapter_support.md (1)

697-699: Changelog author field credits the AI model ("Claude") rather than the human contributor.

This is an editorial nit, but recorded authorship in project documentation should identify the human or team responsible, not the AI assistant used to draft the code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/impl/lora_adapter_support.md` around lines 697 - 699, The changelog
entries currently list the AI model name "Claude" as the author for the two
2026-02-19 entries; replace "Claude" with the appropriate human author or team
name (e.g., the contributor's full name or "Team XYZ") for both entries so
project documentation credits the human contributor; update the author field in
the two entries that start with "2026-02-19" and keep the rest of the entry text
unchanged.
examples/android/RunAnyWhereLora/app/build.gradle.kts (2)

52-55: Overly broad pickFirsts glob may silently hide real .so version/ABI conflicts.

"lib/**/*.so" resolves every duplicate shared-library conflict across all ABI directories by taking the first encountered copy. If runanywhere-kotlin and runanywhere-core-llamacpp ever ship different, incompatible builds of the same .so, this rule will silently pick one and produce a crash at runtime rather than a build-time error.

Consider scoping the pattern to the specific libraries known to be duplicated (e.g., "lib/*/libc++_shared.so") so that unexpected new conflicts are still caught during the build.

♻️ Suggested narrowing
         jniLibs {
             useLegacyPackaging = true
-            pickFirsts += listOf("lib/**/*.so")
+            // Only suppress the known duplicate C++ runtime; all other conflicts surface as build errors.
+            pickFirsts += listOf("lib/*/libc++_shared.so")
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/android/RunAnyWhereLora/app/build.gradle.kts` around lines 52 - 55,
The jniLibs block currently uses an overly broad pickFirsts pattern
("lib/**/*.so") which can silently mask ABI/version conflicts; replace that
broad glob in the jniLibs configuration (where useLegacyPackaging and pickFirsts
are defined) with a whitelist of the specific duplicate .so filenames you expect
(e.g., "lib/*/libc++_shared.so" or other explicit names) or remove the
pickFirsts entry so unexpected duplicates surface as build errors; update the
pickFirsts list to only include known-safe library basenames rather than a
recursive wildcard.

25-33: isMinifyEnabled = false in the release build type will produce an un-shrunk, un-obfuscated APK.

Acceptable for an example app, but note that the final APK will be significantly larger and the SDK internals will be fully visible in tools like apktool. If this example is distributed as a reference, enabling R8 shrinking with the existing proguard-android-optimize.txt ruleset would align it with production practices.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/android/RunAnyWhereLora/app/build.gradle.kts` around lines 25 - 33,
The release build currently sets isMinifyEnabled = false which produces an
unshrunken, unobfuscated APK; update the release block (buildTypes -> release)
to enable R8 shrinking/obfuscation by setting isMinifyEnabled = true, keep the
existing proguardFiles(getDefaultProguardFile("proguard-android-optimize.txt"),
"proguard-rules.pro") and optionally add shrinkResources = true to remove unused
resources for a production-like example.
examples/android/RunAnyWhereLora/.idea/runConfigurations.xml (1)

1-17: Consider whether .idea/runConfigurations.xml should be version-controlled.

This file disables all JUnit and Kotlin JUnit run-configuration producers for the module. Committing it imposes this setting on every contributor who opens the project in Android Studio, preventing "Run test" gutter icons from appearing for any test class. Since the build.gradle.kts includes test dependencies (testImplementation, androidTestImplementation), contributors will find tests undiscoverable from the IDE.

If the intent is just to suppress noisy "create JUnit configuration" prompts during development, consider adding .idea/ to .gitignore instead (or at least limit this file to not being tracked).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/android/RunAnyWhereLora/.idea/runConfigurations.xml` around lines 1
- 17, This runConfigurations.xml disables many JUnit/Kotlin test configuration
producers (e.g., AbstractAllInDirectoryConfigurationProducer,
AllInPackageConfigurationProducer, PatternConfigurationProducer,
TestInClassConfigurationProducer, UniqueIdConfigurationProducer,
JUnitTestDiscoveryConfigurationProducer, KotlinJUnitRunConfigurationProducer,
KotlinPatternConfigurationProducer) and should not be enforced in the repo;
remove this file from version control (or revert it to default) so IDE test
discovery works for contributors, and instead add .idea/ (or at least this
runConfigurations.xml) to .gitignore or stop tracking it so the producers are
not globally disabled for all developers.
examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/ui/theme/Theme.kt (1)

3-3: Unused import: android.app.Activity.

This import is not referenced in the file — likely a leftover from the project template.

🧹 Remove unused import
-import android.app.Activity
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/ui/theme/Theme.kt`
at line 3, Remove the unused import statement "import android.app.Activity" from
the file; locate the import line and delete it (or run Optimize Imports /
auto-import cleanup) so only referenced Android types remain in Theme.kt.
sdk/runanywhere-commons/include/rac/features/llm/rac_llm_service.h (1)

52-63: Vtable entries lack error-code, lifecycle, and memory-ownership documentation.

Per coding guidelines, public C API headers must document vtable operations, error codes, and lifecycle requirements. The new LoRA entries should document:

  1. get_lora_info — who owns *out_json? If the caller must free() it, state that explicitly.
  2. Preconditions — what rac_result_t is returned when called with no model loaded, or with a NULL/empty adapter_path?
  3. load_lora scale — any valid range constraints (e.g., must be > 0)?

The existing vtable entries above (e.g., generate, cleanup) set a precedent of minimal docs too, but since this is new API surface, it's a good opportunity to raise the bar.

📝 Suggested documentation enhancement
-    /** Load a LoRA adapter (optional, NULL if not supported) */
-    rac_result_t (*load_lora)(void* impl, const char* adapter_path, float scale);
+    /**
+     * Load a LoRA adapter (optional, NULL if not supported).
+     * `@param` impl Backend implementation handle
+     * `@param` adapter_path Path to LoRA GGUF file (must not be NULL)
+     * `@param` scale Adapter strength (typically 0.0–2.0, default 1.0)
+     * `@return` RAC_SUCCESS, RAC_ERROR_INVALID_ARGUMENT, or RAC_ERROR_NOT_INITIALIZED
+     */
+    rac_result_t (*load_lora)(void* impl, const char* adapter_path, float scale);

-    /** Remove a LoRA adapter by path (optional, NULL if not supported) */
-    rac_result_t (*remove_lora)(void* impl, const char* adapter_path);
+    /**
+     * Remove a LoRA adapter by path (optional, NULL if not supported).
+     * `@param` impl Backend implementation handle
+     * `@param` adapter_path Path used when loading (must not be NULL)
+     * `@return` RAC_SUCCESS or RAC_ERROR_NOT_FOUND
+     */
+    rac_result_t (*remove_lora)(void* impl, const char* adapter_path);

-    /** Clear all LoRA adapters (optional, NULL if not supported) */
-    rac_result_t (*clear_lora)(void* impl);
+    /**
+     * Clear all LoRA adapters (optional, NULL if not supported).
+     * `@param` impl Backend implementation handle
+     * `@return` RAC_SUCCESS or error code
+     */
+    rac_result_t (*clear_lora)(void* impl);

-    /** Get loaded LoRA adapters info as JSON (optional, NULL if not supported) */
-    rac_result_t (*get_lora_info)(void* impl, char** out_json);
+    /**
+     * Get loaded LoRA adapters info as JSON (optional, NULL if not supported).
+     * `@param` impl Backend implementation handle
+     * `@param` out_json Output: caller-owned JSON string; free with free()
+     * `@return` RAC_SUCCESS or error code
+     */
+    rac_result_t (*get_lora_info)(void* impl, char** out_json);

As per coding guidelines: "Public C API headers in include/rac/ must document vtable operations, error codes, and lifecycle requirements."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/include/rac/features/llm/rac_llm_service.h` around
lines 52 - 63, The new LoRA vtable entries (load_lora, remove_lora, clear_lora,
get_lora_info) lack required public-API documentation: update the header
comments for each function to state exact preconditions and returned
rac_result_t values (e.g., returned error when no model is loaded, when
adapter_path is NULL/empty, or when adapter not found), document
lifecycle/ownership rules (explicitly state whether *out_json is heap-allocated
and must be freed by the caller, or remains owned by the implementation and must
not be freed), and specify valid range/constraints for the scale parameter of
load_lora (e.g., >0 and recommended bounds). Reference the vtable symbols
(load_lora, remove_lora, clear_lora, get_lora_info) and rac_result_t in the
comments so callers know error semantics and memory ownership.
sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.h (1)

101-118: Duplicate section header: "TEXT GENERATION IMPLEMENTATION" appears twice.

Lines 101–103 (pre-existing) and lines 116–118 both carry the same "TEXT GENERATION IMPLEMENTATION" section banner. The first one should be removed (or retitled "LORA ADAPTER ENTRY"), since the content it introduces is the LoRA adapter entry rather than text generation.

Proposed fix
-// =============================================================================
-// TEXT GENERATION IMPLEMENTATION
-// =============================================================================
-
 // =============================================================================
 // LORA ADAPTER ENTRY
 // =============================================================================

(Remove the first redundant header at line 101–103, keeping the new LoRA section header at 105–107.)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.h` around
lines 101 - 118, The file contains a duplicate section banner "TEXT GENERATION
IMPLEMENTATION" around the LoraAdapterEntry declaration; remove or rename the
first banner so the LoraAdapterEntry block is correctly labeled (e.g., change
the earlier "TEXT GENERATION IMPLEMENTATION" header to "LORA ADAPTER ENTRY" or
delete it) to avoid duplicate section headers while keeping the existing LORA
ADAPTER ENTRY header that surrounds struct LoraAdapterEntry and its members
(llama_adapter_lora*, path, scale, applied).
.github/workflows/build-all-test.yml (2)

108-147: ~80 lines of development config generation duplicated across cpp-android and cpp-ios.

The "Setup Development Config" steps at lines 108–147 and 195–234 are identical. If the config format or validation logic changes, both must be updated in lockstep.

Consider extracting this into a composite action or a shared script (e.g., scripts/ci/generate-dev-config.sh) referenced by both jobs.

Also applies to: 195-234

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/build-all-test.yml around lines 108 - 147, The duplicated
"Setup Development Config" block should be extracted into a reusable script and
invoked from both cpp-android and cpp-ios jobs: create a script (e.g.,
scripts/ci/generate-dev-config.sh) that accepts SUPABASE_URL, SUPABASE_ANON_KEY
and BUILD_TOKEN, performs the CLEAN_* trimming, writes the same C++ template to
CONFIG_FILE (preserving symbols like CONFIG_FILE, SUPABASE_URL,
SUPABASE_ANON_KEY, BUILD_TOKEN, SENTRY_DSN and functions
rac_dev_config_is_available / rac_dev_config_get_*), and update both workflow
steps to call that script with the three env vars instead of duplicating the
heredoc.

395-398: Redundant failure suppression on Flutter test step.

melos run test || true already swallows failures, making continue-on-error: true redundant. Pick one — typically continue-on-error: true is preferred in workflows because it preserves the step status as "failed" in the UI while not blocking downstream jobs.

Proposed fix
       - name: Run Tests
         working-directory: sdk/runanywhere-flutter
-        run: melos run test || true
-        continue-on-error: true
+        run: melos run test
+        continue-on-error: true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/build-all-test.yml around lines 395 - 398, The "Run Tests"
step currently suppresses failures twice: the run command uses "melos run test
|| true" while the step also sets continue-on-error: true; remove the redundant
"|| true" from the run command in the "Run Tests" step (working-directory:
sdk/runanywhere-flutter) so the command is simply "melos run test" and keep
continue-on-error: true to preserve UI visibility while not blocking downstream
jobs.
examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraViewModel.kt (1)

68-116: unloadLLMModel() may block the main thread.

Lines 73-75 call RunAnywhere.unloadLLMModel() outside the withContext(Dispatchers.IO) block. If this is a suspend function that performs native/JNI calls (likely, given it mirrors loadLLMModel), it could block on the Main dispatcher. Consider moving the unload into the IO context alongside the load.

Proposed fix
             try {
-                // Unload existing model if loaded
-                if (RunAnywhere.isLLMModelLoaded()) {
-                    RunAnywhere.unloadLLMModel()
-                }
-
                 // Generate a model ID from filename
                 val filename = path.substringAfterLast('/')
                 val modelId = filename.removeSuffix(".gguf")
@@ // ... register model ...

-                // Load the model
                 withContext(Dispatchers.IO) {
+                    // Unload existing model if loaded
+                    if (RunAnywhere.isLLMModelLoaded()) {
+                        RunAnywhere.unloadLLMModel()
+                    }
+                    // Load the model
                     RunAnywhere.loadLLMModel(modelId)
                 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraViewModel.kt`
around lines 68 - 116, The unload call can block the main thread because
RunAnywhere.unloadLLMModel() likely does native/JNI work; move the unload into
an IO context so it runs off the Main dispatcher. In loadModel(), perform
RunAnywhere.unloadLLMModel() inside the same withContext(Dispatchers.IO) block
you use for RunAnywhere.loadLLMModel(), ensuring both unloadLLMModel() and
loadLLMModel(modelId) execute on Dispatchers.IO to avoid blocking the UI thread.
sdk/runanywhere-commons/include/rac/features/llm/rac_llm_component.h (1)

217-264: LoRA API declarations look well-structured and properly documented.

The four new functions follow the rac_ prefix convention, include error code documentation, and correctly document memory ownership for get_lora_info. One minor inconsistency:

Line 229 documents scale as 0.0-1.0, but the Kotlin-side documentation (CppBridgeLLM.kt line 834) describes it as 0.0 to 1.0+. If scales above 1.0 are valid (which is common for LoRA), the C header doc should reflect that.

📝 Suggested doc fix
- * `@param` scale Adapter scale factor (0.0-1.0, default 1.0)
+ * `@param` scale Adapter scale factor (typically 0.0-1.0, default 1.0; values >1.0 amplify the adapter effect)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/include/rac/features/llm/rac_llm_component.h` around
lines 217 - 264, Update the documentation for rac_llm_component_load_lora to
match the Kotlin-side CppBridgeLLM.kt behavior: change the documented valid
range for the scale parameter from "0.0-1.0" to indicate that values greater
than 1.0 are allowed (e.g., "0.0 to 1.0+" or ">= 0.0"), and ensure any mention
of scale range in related comments (e.g., in rac_llm_component_load_lora's
docblock) is consistent with CppBridgeLLM.kt's description.
sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp (2)

1159-1165: Use rac_free() instead of free() for the backend-allocated LoRA info JSON string.

The rest of this file consistently uses rac_free() for memory allocated by the C library (Lines 2249, 2258, 3372, 3393, 3416, etc.). The json pointer in racLlmComponentGetLoraInfo is allocated by rac_llm_component_get_lora_info and should be released through the same allocator-aware free.

♻️ Proposed fix
     jstring jresult = env->NewStringUTF(json);
-    free(json);
+    rac_free(json);
     return jresult;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp` around lines
1159 - 1165, The code in racLlmComponentGetLoraInfo returns a jstring created
from json but uses free(json); which is inconsistent with the rest of the file
and unsafe because json is allocated by rac_llm_component_get_lora_info; replace
the plain free(json) with rac_free(json) so the backend-allocated LoRA info JSON
is released with the matching allocator (keep env->NewStringUTF(json) and
variable names intact).

1101-1134: LoRA path marshaling diverges from the established getCString helper — also missing result log in RemoveLora.

racLlmComponentLoadLora and racLlmComponentRemoveLora call GetStringUTFChars/ReleaseStringUTFChars directly, while every other function in this file that needs a C-string from a jstring uses getCString(). Additionally, racLlmComponentRemoveLora logs nothing, while racLlmComponentLoadLora logs the result at Line 1114.

♻️ Proposed refactor
-    const char* path = env->GetStringUTFChars(adapterPath, nullptr);
-    if (!path) {
-        return RAC_ERROR_INVALID_ARGUMENT;
-    }
-
-    LOGi("racLlmComponentLoadLora: handle=%lld, path=%s, scale=%.2f",
-         (long long)handle, path, (float)scale);
-
-    rac_result_t result = rac_llm_component_load_lora(
-        reinterpret_cast<rac_handle_t>(handle), path, static_cast<float>(scale));
-
-    env->ReleaseStringUTFChars(adapterPath, path);
+    std::string pathStr = getCString(env, adapterPath);
+    LOGi("racLlmComponentLoadLora: handle=%lld, path=%s, scale=%.2f",
+         (long long)handle, pathStr.c_str(), (float)scale);
+
+    rac_result_t result = rac_llm_component_load_lora(
+        reinterpret_cast<rac_handle_t>(handle), pathStr.c_str(), static_cast<float>(scale));
-    const char* path = env->GetStringUTFChars(adapterPath, nullptr);
-    if (!path) {
-        return RAC_ERROR_INVALID_ARGUMENT;
-    }
-
-    rac_result_t result = rac_llm_component_remove_lora(
-        reinterpret_cast<rac_handle_t>(handle), path);
-
-    env->ReleaseStringUTFChars(adapterPath, path);
-    return static_cast<jint>(result);
+    std::string pathStr = getCString(env, adapterPath);
+    rac_result_t result = rac_llm_component_remove_lora(
+        reinterpret_cast<rac_handle_t>(handle), pathStr.c_str());
+
+    LOGi("racLlmComponentRemoveLora result=%d", result);
+    return static_cast<jint>(result);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp` around lines
1101 - 1134, Both functions bypass the file's getCString helper and
racLlmComponentRemoveLora lacks a result log; replace direct
env->GetStringUTFChars/ReleaseStringUTFChars usage with the existing
getCString(env, adapterPath) helper in both racLlmComponentLoadLora and
racLlmComponentRemoveLora (use the returned C-string per the helper's contract
and do not call ReleaseStringUTFChars manually), and add a LOGi call in
racLlmComponentRemoveLora to log the result (similar to racLlmComponentLoadLora)
after rac_llm_component_remove_lora returns while keeping existing parameter
validation (handle == 0 / adapterPath == nullptr) and return the
static_cast<jint>(result).

Comment on lines +827 to +879
bool LlamaCppTextGeneration::load_lora_adapter(const std::string& adapter_path, float scale) {
    std::lock_guard<std::mutex> lock(mutex_);

    if (!model_loaded_ || !model_) {
        LOGE("Cannot load LoRA adapter: model not loaded");
        return false;
    }

    // Check if adapter already loaded
    for (const auto& entry : lora_adapters_) {
        if (entry.path == adapter_path) {
            LOGE("LoRA adapter already loaded: %s", adapter_path.c_str());
            return false;
        }
    }

    LOGI("Loading LoRA adapter: %s (scale=%.2f)", adapter_path.c_str(), scale);

    // Load adapter against model
    llama_adapter_lora* adapter = llama_adapter_lora_init(model_, adapter_path.c_str());
    if (!adapter) {
        LOGE("Failed to load LoRA adapter from: %s", adapter_path.c_str());
        return false;
    }

    // Store adapter entry
    LoraAdapterEntry entry;
    entry.adapter = adapter;
    entry.path = adapter_path;
    entry.scale = scale;
    entry.applied = false;
    lora_adapters_.push_back(std::move(entry));

    // Recreate context so the new adapter is visible
    if (!recreate_context()) {
        // Remove the adapter entry we just added on failure
        lora_adapters_.pop_back();
        return false;
    }

    // Apply all loaded adapters to the new context
    if (!apply_lora_adapters()) {
        lora_adapters_.pop_back();
        return false;
    }

    // Clear KV cache after adapter changes
    llama_memory_clear(llama_get_memory(context_), true);

    LOGI("LoRA adapter loaded and applied: %s (%zu total adapters)",
         adapter_path.c_str(), lora_adapters_.size());
    return true;
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the actual code around the flagged lines
fd "llamacpp_backend.cpp" --type f | head -1 | xargs -I {} sh -c 'wc -l "{}"'

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 144


🏁 Script executed:

# Get the specific lines from the file to see the actual implementation
fd "llamacpp_backend.cpp" --type f | head -1 | xargs -I {} sh -c 'sed -n "827,929p" "{}"'

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 3240


🏁 Script executed:

# Also check lines 367-372 for unload_model_internal context
fd "llamacpp_backend.cpp" --type f | head -1 | xargs -I {} sh -c 'sed -n "360,380p" "{}"'

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 598


🏁 Script executed:

# Check if there's a header file with adapter-related declarations
fd "llamacpp.*\.h" --type f | xargs grep -l "LoraAdapter\|lora_adapter" 2>/dev/null | head -5

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 138


🌐 Web query:

llama.cpp llama_adapter_lora_init llama_adapter_lora_free memory management API

💡 Result:

In llama.cpp’s C API, a LoRA adapter is an opaque heap object (struct llama_adapter_lora *) with explicit create/free:

  • Create / load (allocates):
    struct llama_adapter_lora * llama_adapter_lora_init(struct llama_model * model, const char * path_lora);
    Returns NULL on failure. [1]

  • Free (deallocates):
    void llama_adapter_lora_free(struct llama_adapter_lora * adapter); [1]

  • Attach/detach to a context (does not transfer ownership):

    • int32_t llama_set_adapter_lora(struct llama_context * ctx, struct llama_adapter_lora * adapter, float scale); (adds to the context; “will not modify model’s weight”) [1]
    • int32_t llama_rm_adapter_lora(struct llama_context * ctx, struct llama_adapter_lora * adapter); (removes that adapter; returns -1 if not present) [1]
    • void llama_clear_adapter_lora(struct llama_context * ctx); (removes all adapters) [1]

Practical memory-management rule

Keep the adapter alive as long as it may be used by any llama_context, and typically:

  1. llama_adapter_lora_init()
  2. llama_set_adapter_lora()
  3. when done: llama_rm_adapter_lora() / llama_clear_adapter_lora()
  4. finally: llama_adapter_lora_free()

This is also how higher-level bindings model it (initialize → use → register a finalizer that calls llama_adapter_lora_free). [1][2]

Sources:
[1] llama-cpp-python API reference (low-level bindings + docstrings for llama_adapter_lora_init/free, llama_set_adapter_lora, llama_rm_adapter_lora, llama_clear_adapter_lora) (llama-cpp-python.readthedocs.io)
[2] llama-cpp-hs Llama.Adapter docs (wraps llama_adapter_lora_init and returns a managed adapter object) (hackage-content.haskell.org)

Citations:


🏁 Script executed:

# Check if llama_adapter_lora_free is called anywhere in the codebase
fd "llamacpp" --type f -e cpp -e h | xargs rg "llama_adapter_lora_free" 2>/dev/null

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 58


🏁 Script executed:

# Check the header file to see LoraAdapterEntry definition
fd "llamacpp_backend.h" --type f | head -1 | xargs -I {} sh -c 'grep -A 10 "struct LoraAdapterEntry\|class LoraAdapterEntry" "{}"'

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 415


LoRA adapter memory leak across multiple code paths — requires immediate fix.

The llama.cpp C API requires explicit llama_adapter_lora_free() to deallocate adapters created by llama_adapter_lora_init(). Your code never calls this function, creating definite resource leaks:

  1. Error paths in load_lora_adapter() (lines 862–864, 868–869): pop_back() discards the adapter pointer without freeing it when recreate_context() or apply_lora_adapters() fails.
  2. Normal operation in remove_lora_adapter() (line 905): erase(it) removes the entry from the vector without freeing the adapter — every adapter removal leaks.
  3. clear_lora_adapters() (line 925): clear() discards all adapter pointers without freeing them.
  4. unload_model_internal() (line 372–373): clear() discards adapters without freeing.

The comment at line 904 stating "adapter memory freed with the model" is incorrect per the llama.cpp API — adapters are not automatically freed with the model and must be explicitly deallocated via llama_adapter_lora_free().

Each leaked adapter remains allocated in memory for the lifetime of the model, causing cumulative memory exhaustion as adapters are loaded and not properly freed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp` around
lines 827 - 879, The code leaks llama_adapter_lora objects because
llama_adapter_lora_free() is never called; update load_lora_adapter(),
remove_lora_adapter(), clear_lora_adapters(), and unload_model_internal() to
explicitly free adapters: when you pop_back() an entry after recreate_context()
or apply_lora_adapters() failure in LlamaCppTextGeneration::load_lora_adapter(),
call llama_adapter_lora_free(entry.adapter) before removing it; in
remove_lora_adapter() call llama_adapter_lora_free(it->adapter) before erasing
the vector element; in clear_lora_adapters() iterate entries and call
llama_adapter_lora_free(entry.adapter) for each before clearing the vector; and
in unload_model_internal() ensure any remaining entries are freed the same way
(and null or clear fields afterward) so no adapter pointers are discarded
without deallocation.

  - Replace raw -1 returns with RAC_ERROR_INVALID_HANDLE/RAC_ERROR_INVALID_ARGUMENT
    to match codebase error handling conventions
  - Use getCString() helper instead of raw GetStringUTFChars/ReleaseStringUTFChars
  - Add missing result logging to racLlmComponentRemoveLora and racLlmComponentClearLora
  - Use rac_free() instead of free() in racLlmComponentGetLoraInfo for consistency
  - Clarify LoRA adapter memory ownership comments (adapters freed automatically
    with model per llama.cpp b8011 API — llama_adapter_lora_free is deprecated)

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp (1)

1090-1160: All four prior review concerns are resolved in this iteration — LoRA functions look correct.

  • Error guards use RAC_ERROR_INVALID_HANDLE / RAC_ERROR_INVALID_ARGUMENT throughout. ✓
  • getCString() helper used consistently instead of raw GetStringUTFChars / ReleaseStringUTFChars. ✓
  • racLlmComponentRemoveLora now logs its result (Line 1127). ✓
  • racLlmComponentGetLoraInfo calls rac_free(json) (Line 1158), matching the project allocator convention. ✓

One minor optional: racLlmComponentRemoveLora and racLlmComponentClearLora have no function-entry LOGi (unlike racLlmComponentLoadLora at Line 1104). Consider adding entry logs for operational symmetry.

🪵 Optional entry-level logging for Remove/Clear
 JNIEXPORT jint JNICALL
 Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentRemoveLora(
     JNIEnv* env, jclass clazz, jlong handle, jstring adapterPath) {
     if (handle == 0)
         return RAC_ERROR_INVALID_HANDLE;
     if (adapterPath == nullptr)
         return RAC_ERROR_INVALID_ARGUMENT;

     std::string path = getCString(env, adapterPath);
+    LOGi("racLlmComponentRemoveLora: handle=%lld, path=%s", (long long)handle, path.c_str());

     rac_result_t result = rac_llm_component_remove_lora(
 JNIEXPORT jint JNICALL
 Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentClearLora(
     JNIEnv* env, jclass clazz, jlong handle) {
     if (handle == 0)
         return RAC_ERROR_INVALID_HANDLE;
+    LOGi("racLlmComponentClearLora: handle=%lld", (long long)handle);

     rac_result_t result = rac_llm_component_clear_lora(reinterpret_cast<rac_handle_t>(handle));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp` around lines
1090 - 1160, Add entry-level LOGi calls at the start of
Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentRemoveLora
and
Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentClearLora
to match racLlmComponentLoadLora; specifically, in racLlmComponentRemoveLora log
the handle and adapterPath string (use the existing getCString(env, adapterPath)
value or log path.c_str()) before calling rac_llm_component_remove_lora, and in
racLlmComponentClearLora log the handle value at the top before calling
rac_llm_component_clear_lora so both functions have symmetrical entry logging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp`:
- Around line 813-825: apply_lora_adapters() currently leaves lora_adapters_ and
the context inconsistent on a partial failure; update
LlamaCppTextGeneration::apply_lora_adapters to perform a rollback when
llama_set_adapter_lora returns non-zero: call llama_clear_adapter_lora for each
adapter that was successfully applied, reset those entries' applied flags to
false, and ensure the failing adapter's entry state is consistent before
returning false (also keep load_lora_adapter/pop_back semantics intact or adjust
to remove the correct entry so lora_adapters_ and context remain in sync).
- Around line 861-871: recreate_context() currently frees the old
context_/sampler_ before creating the new one, leaving context_=nullptr and
model_loaded_=true if llama_init_from_model fails; modify recreate_context() to
allocate/create the new context and sampler first (or save copies of the old
context_/sampler_), and only free the old ones after the new context is
confirmed valid; on any failure (llama_init_from_model or
apply_lora_adapters()), call unload_model_internal() to set model_loaded_ =
false and restore the old context_/sampler_ (or keep the original intact), and
ensure any adapter created by llama_adapter_lora_init is properly freed
(llama_adapter_lora_free or equivalent) before popping from lora_adapters_ to
avoid leaks.

---

Duplicate comments:
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp`:
- Around line 367-372: The code currently assumes lora adapters are freed with
the model but we must verify and, if not, explicitly free them: inspect the
vendored llama.cpp headers for a declaration of llama_adapter_lora_free and any
build/version macros (search for "llama_adapter_lora_free" and
"LLAMA_BUILD_NUMBER"/"build_number"/"LLAMA_VERSION_PATCH"); if the function
exists and is not deprecated in this vendored tree, update all cleanup paths
(unload_model_internal, remove_lora_adapter, clear_lora_adapters, and failure
paths in load_lora_adapter) to call llama_adapter_lora_free(adapter) for each
adapter before clearing lora_adapters_ and ensure no double-free when model
teardown also frees them (add guards or null out pointers after free).

In `@sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp`:
- Line 131: Replace any uses of AttachCurrentThread that pass a void** cast with
the C++ NDK form that accepts a JNIEnv**: update calls such as
g_jvm->AttachCurrentThread(&env, nullptr) in getJNIEnv,
llm_stream_callback_token, model_assignment_http_get_callback, and
vlm_stream_callback_token so they pass &env (JNIEnv**) directly and remove
reinterpret_cast<void**> or equivalent casts; ensure the function signatures and
local env variable types are JNIEnv* so the call matches the C++
AttachCurrentThread(&env, nullptr) prototype consistently across all four call
sites.

---

Nitpick comments:
In `@sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp`:
- Around line 1090-1160: Add entry-level LOGi calls at the start of
Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentRemoveLora
and
Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentClearLora
to match racLlmComponentLoadLora; specifically, in racLlmComponentRemoveLora log
the handle and adapterPath string (use the existing getCString(env, adapterPath)
value or log path.c_str()) before calling rac_llm_component_remove_lora, and in
racLlmComponentClearLora log the handle value at the top before calling
rac_llm_component_clear_lora so both functions have symmetrical entry logging.

Comment on lines +813 to +825
bool LlamaCppTextGeneration::apply_lora_adapters() {
    for (auto& entry : lora_adapters_) {
        int32_t result = llama_set_adapter_lora(context_, entry.adapter, entry.scale);
        if (result != 0) {
            LOGE("Failed to apply LoRA adapter: %s (error=%d)", entry.path.c_str(), result);
            entry.applied = false;
            return false;
        }
        entry.applied = true;
        LOGI("Applied LoRA adapter: %s (scale=%.2f)", entry.path.c_str(), entry.scale);
    }
    return true;
}

⚠️ Potential issue | 🟠 Major

Missing rollback in apply_lora_adapters() leaves context and tracking in inconsistent state on partial failure.

If llama_set_adapter_lora fails for adapter at index K (where K is not the last entry):

  1. Adapters [0..K-1] are applied to the new context (applied = true).
  2. Adapter K remains unapplied (applied = false) and stays in lora_adapters_.
  3. load_lora_adapter() then calls lora_adapters_.pop_back(), which removes only the newly-added last entry, not the failing entry K.
  4. The context now has adapters [0..K-1] applied, but lora_adapters_ has entries [0..K] with mixed applied flags — a persistent inconsistency.

At a minimum, on failure, the already-applied adapters should be rolled back with llama_clear_adapter_lora before returning false, and a consistent error state should be established.

🛡️ Proposed defensive approach
 bool LlamaCppTextGeneration::apply_lora_adapters() {
     for (auto& entry : lora_adapters_) {
         int32_t result = llama_set_adapter_lora(context_, entry.adapter, entry.scale);
         if (result != 0) {
             LOGE("Failed to apply LoRA adapter: %s (error=%d)", entry.path.c_str(), result);
             entry.applied = false;
+            // Roll back all adapters applied so far to restore a consistent state
+            llama_clear_adapter_lora(context_);
+            for (auto& e : lora_adapters_) {
+                e.applied = false;
+            }
             return false;
         }
         entry.applied = true;
         LOGI("Applied LoRA adapter: %s (scale=%.2f)", entry.path.c_str(), entry.scale);
     }
     return true;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp` around
lines 813 - 825, apply_lora_adapters() currently leaves lora_adapters_ and the
context inconsistent on a partial failure; update
LlamaCppTextGeneration::apply_lora_adapters to perform a rollback when
llama_set_adapter_lora returns non-zero: call llama_clear_adapter_lora for each
adapter that was successfully applied, reset those entries' applied flags to
false, and ensure the failing adapter's entry state is consistent before
returning false (also keep load_lora_adapter/pop_back semantics intact or adjust
to remove the correct entry so lora_adapters_ and context remain in sync).

Comment on lines +861 to +871
    if (!recreate_context()) {
        // Remove the adapter entry we just added on failure
        lora_adapters_.pop_back();
        return false;
    }

    // Apply all loaded adapters to the new context
    if (!apply_lora_adapters()) {
        lora_adapters_.pop_back();
        return false;
    }

⚠️ Potential issue | 🟠 Major

recreate_context() failure leaves the instance in a permanently broken state (context_ = nullptr, model_loaded_ = true).

Inside recreate_context(), the old context_ and sampler_ are unconditionally freed before the new context is created (Lines 783-786). If llama_init_from_model fails:

  • context_ and sampler_ are nullptr.
  • model_loaded_ remains true.
  • is_ready() returns false permanently.
  • The public generate_* APIs silently fail; callers receive no indication that recovery requires a full model reload.

Additionally, in the apply_lora_adapters() failure path at Line 868-871, the adapter object allocated by llama_adapter_lora_init has been pushed and then popped from the vector — if the deprecation of llama_adapter_lora_free cannot be confirmed (see previous comment), this is an additional leak.

Recommended fixes:

  1. Save and restore the old context/sampler (or attempt the new context before freeing the old one).
  2. If recreation truly fails, call unload_model_internal() to set model_loaded_ = false so callers can detect and handle the failure.
🛡️ Minimal mitigation for the broken-state issue
-    // Recreate context so the new adapter is visible
-    if (!recreate_context()) {
-        // Remove the adapter entry we just added on failure
-        lora_adapters_.pop_back();
-        return false;
-    }
+    // Recreate context so the new adapter is visible
+    if (!recreate_context()) {
+        lora_adapters_.pop_back();
+        // context_ is now null; force a clean unload so callers can detect the failure
+        LOGE("Context recreation failed; forcing model unload to maintain consistent state");
+        unload_model_internal();
+        return false;
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp` around
lines 861 - 871, recreate_context() currently frees the old context_/sampler_
before creating the new one, leaving context_=nullptr and model_loaded_=true if
llama_init_from_model fails; modify recreate_context() to allocate/create the
new context and sampler first (or save copies of the old context_/sampler_), and
only free the old ones after the new context is confirmed valid; on any failure
(llama_init_from_model or apply_lora_adapters()), call unload_model_internal()
to set model_loaded_ = false and restore the old context_/sampler_ (or keep the
original intact), and ensure any adapter created by llama_adapter_lora_init is
properly freed (llama_adapter_lora_free or equivalent) before popping from
lora_adapters_ to avoid leaks.

@sanchitmonga22
Contributor

@coderabbitai please review

@sanchitmonga22
Contributor

@greptile please review

@sanchitmonga22
Contributor

@ellipsis please review

@coderabbitai

coderabbitai bot commented Feb 19, 2026

@sanchitmonga22, I'll review this PR.

PR Description Image: The image shows a mobile phone screenshot of the "RunAnywhere LoRA" Android app demonstrating the LoRA adapter functionality. The UI displays:

  • Loaded model: msf:1000137816
  • Loaded LoRA adapter: msf:1000137817 x1.0
  • Generated text about diabetes (Type 1 and Type 2 descriptions)
  • Performance metrics: 38.7 tok/s, 376 tokens, 12.0s generation time
  • Control buttons: "Model", "+ LoRA", "Clear"
  • Text input: "what is diabetes"

This is a substantial PR adding comprehensive LoRA (Low-Rank Adaptation) adapter support across the entire SDK stack:

Scope Overview:

  • C++ Backend (~350 lines): New LoRA lifecycle APIs in llamacpp backend with context recreation and KV cache clearing
  • JNI Layer (~90 lines): Four new JNI methods with updated thread attachment patterns
  • Kotlin SDK (~260 lines): Public APIs for LoRA management with state validation and error handling
  • Build Workflow (~400 lines): Major expansion with per-target inputs and matrix strategy
  • Android Example (~1200 lines): Complete demo app with ViewModel, Compose UI, and file picker integration
  • Documentation (700 lines): Comprehensive implementation guide
  • Swift: Sendable conformance fixes for Swift 6

I'll conduct a thorough review focusing on API design, memory safety, threading, error handling, and integration patterns.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@shubhammalhotra28
Contributor

@Siddhesh2377 merging this to dev branch to take out these changes and work on it for Swift :)
Will open new PR and feel free to have a look then.

@shubhammalhotra28 shubhammalhotra28 changed the base branch from main to dev February 21, 2026 23:06
@shubhammalhotra28 shubhammalhotra28 merged commit 4cfd510 into RunanywhereAI:dev Feb 21, 2026
11 of 16 checks passed
Siddhesh2377 added a commit that referenced this pull request Feb 22, 2026
* feat(lora): add LoRA adapter support across SDK + demo app

  Implement LoRA (Low-Rank Adaptation) adapter hot-swapping for llama.cpp
  backend across all 6 SDK layers (C++ -> C API -> Component -> JNI ->
  Kotlin Bridge -> Kotlin Public API).

  - Add load/remove/clear/query LoRA adapter operations
  - Use vtable dispatch in component layer to decouple librac_commons
    from librac_backend_llamacpp (fixes linker errors)
  - Add LoRA vtable entries to rac_llm_service_ops_t
  - Fix AttachCurrentThread cast for Android NDK C++ JNI build
  - Add RunAnyWhereLora Android demo app with Material 3 Q&A UI
  - Add comprehensive implementation docs with C/C++ API reference

* feat(ci): add selectable build targets to Build All workflow + fix Swift concurrency errors

  Rewrite build-all-test.yml with 9 boolean checkbox inputs so each build
  target can be toggled independently from the GitHub Actions UI:
  - C++ Android Backends (arm64-v8a, armeabi-v7a, x86_64 matrix)
  - C++ iOS Backends (XCFramework)
  - Kotlin SDK (JVM + Android)
  - Swift SDK (iOS/macOS)
  - Web SDK (TypeScript)
  - Flutter SDK (Dart analyze via Melos)
  - React Native SDK (TypeScript via Lerna)
  - Android Example Apps (RunAnywhereAI + RunAnyWhereLora)
  - IntelliJ Plugin
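
  As a rough illustration of the checkbox-input pattern (input names here are
  made up for the sketch, not taken from build-all-test.yml):

  ```yaml
  # Each build target gets a boolean workflow_dispatch input, and each job
  # gates on its input, so targets can be toggled from the Actions UI.
  on:
    workflow_dispatch:
      inputs:
        build_kotlin_sdk:
          description: "Kotlin SDK (JVM + Android)"
          type: boolean
          default: true
        build_swift_sdk:
          description: "Swift SDK (iOS/macOS)"
          type: boolean
          default: true

  jobs:
    kotlin-sdk:
      if: ${{ inputs.build_kotlin_sdk }}
      runs-on: ubuntu-latest
      steps:
        - uses: actions/checkout@v4
        - run: ./gradlew build
  ```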

  Fix two Swift strict-concurrency errors that fail the Swift SDK build:
  - LiveTranscriptionSession: add @unchecked Sendable (safe because the class
    is @MainActor, so all access is serialized)
  - RunAnywhere+VisionLanguage: add Sendable conformance to rac_vlm_image_t
    so the C struct can cross the Task boundary in the streaming builder;
    simplify StreamingCollector to start timing at init

* fix(swift): resolve strict concurrency errors in LiveTranscriptionSession and VLM streaming

  LiveTranscriptionSession.swift:
  - Replace [weak self] captures with a strong `let session = self` before
    closures to avoid a captured var in @Sendable/Task contexts (the class is
    @MainActor @unchecked Sendable, so a strong ref is safe, bounded by the
    stream lifecycle)
  - Wrap deprecated startStreamingTranscription call in @available helper
    to silence deprecation warning until migration to transcribeStream API

  RunAnywhere+VisionLanguage.swift:
  - Add `let capturedCImage = cImage` before AsyncThrowingStream closure
    so the Task captures an immutable let instead of a mutable var
  - Add `extension rac_vlm_image_t: @unchecked Sendable {}` for the C
    struct to cross Task concurrency boundaries safely
  - Simplify StreamingCollector to initialize startTime at init instead
    of requiring a separate async start() call

* fix(jni): address CodeRabbit review findings in LoRA JNI functions

  - Replace raw -1 returns with RAC_ERROR_INVALID_HANDLE/RAC_ERROR_INVALID_ARGUMENT
    to match codebase error handling conventions
  - Use getCString() helper instead of raw GetStringUTFChars/ReleaseStringUTFChars
  - Add missing result logging to racLlmComponentRemoveLora and racLlmComponentClearLora
  - Use rac_free() instead of free() in racLlmComponentGetLoraInfo for consistency
  - Clarify LoRA adapter memory ownership comments (adapters freed automatically
    with model per llama.cpp b8011 API — llama_adapter_lora_free is deprecated)
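
  The named-error-code pattern from that fix can be sketched like this; the
  function name and constant values are illustrative, not copied from the SDK:

  ```cpp
  #include <cassert>
  #include <cstdint>

  // Illustrative constants; the real values live in the SDK's error header.
  constexpr int64_t RAC_ERROR_INVALID_HANDLE = -2;
  constexpr int64_t RAC_ERROR_INVALID_ARGUMENT = -4;

  // Validate inputs and return named error codes instead of a bare -1,
  // so callers can distinguish a bad handle from a bad argument.
  int64_t remove_lora(void* handle, const char* adapter_path) {
      if (handle == nullptr) return RAC_ERROR_INVALID_HANDLE;
      if (adapter_path == nullptr || adapter_path[0] == '\0')
          return RAC_ERROR_INVALID_ARGUMENT;
      // ... forward to the native service here ...
      return 0;  // success
  }

  int main() {
      int dummy = 0;
      assert(remove_lora(nullptr, "a.gguf") == RAC_ERROR_INVALID_HANDLE);
      assert(remove_lora(&dummy, "") == RAC_ERROR_INVALID_ARGUMENT);
      assert(remove_lora(&dummy, "a.gguf") == 0);
      return 0;
  }
  ```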
ManthanNimodiya pushed a commit to ManthanNimodiya/runanywhere-sdks that referenced this pull request Feb 23, 2026
VyasGuru pushed a commit to VyasGuru/runanywhere-sdks that referenced this pull request Feb 26, 2026
shubhammalhotra28 added a commit that referenced this pull request Mar 5, 2026
* Added Lora + Fixed Build-All-Work-Flow (#389)


* Add lora ios (#407)

* ios initial changes

* minimal sample needed to test lora

* updating docs

* addressed the comments

* Prototype for Optimised RAG

First version of Optimised RAG. Not polished yet. Once tested, I'll micro-optimise, benchmark, and finish.

* Aligning / upstream update for dev (#442)

* chore: add AGENTS.md with Cursor Cloud-specific instructions

* chore: update AGENTS.md with Linux backend build and voice assistant instructions

* minor fixes

* fix: Android app UI improvements, SDK concurrency bug fixes, and LoRA download support

Android App:
- Redesign intro screen with minimal layout and linear progress bar
- Improve VLM screen: use shared ModelRequiredOverlay, theme-consistent colors,
  fix button clipping (replace IconButton with clickable Column)
- Fix keyboard handling: hide bottom bar when keyboard open, apply imePadding correctly
- Add scrollable auto-scroll prompt suggestions in ChatScreen
- Add shimmer typing indicator with "Thinking..." label
- Fix 9 app-level bugs: think tag leak, CancellationException handling,
  VoiceAssistant lifecycle, ConversationStore ANR, TTS sample rate parsing,
  LoRA download mutex deadlock

KMP SDK (10 bug fixes):
- Fix cancel() deadlock: move JNI calls outside synchronized(lock) in CppBridgeLLM
- Fix orphaned CoroutineScope leak in generateStream using callbackFlow
- Fix initializeServices() holding lock across network I/O
- Fix loraDownloadDir lazy val caching wrong path before pathProvider set
- Fix setBaseDirCallback TOCTOU race condition
- Add @Volatile to DownloadTask mutable fields for thread visibility
- Fix unescapeJson() replacement order (process \\\\ before \\n)
- Add downloadLock for atomic cancel/pause/resume operations
- Fix checkNativeLibrary() to actually call native method
- Add ensureServicesReady() to generateStream
- Add LoRA adapter download/delete/path SDK functions

Known issue: Tool-calling may show unexpected behavior when a LoRA adapter is
applied — the model detects the tool call but responds with "I can assist with
this" instead of executing it. Tested with Qwen 2.5 0.5B. This only occurs
when the model has a LoRA adapter loaded.

* fix(tts): scan WAV data chunk instead of hardcoding 44-byte header offset

WAV files with extra chunks (LIST, fact, bext) had metadata bytes fed
into AudioTrack as PCM, causing distorted playback. Now walks the chunk
structure to find the actual "data" chunk start.

* fix: Android app UI bug fixes, responsive dimensions, LoRA example prompts, and darker dark mode

- Fix nested verticalScroll inside LazyColumn (ThinkingToggle) causing broken scroll
- Fix weight(1f) + verticalScroll overflow in VLMScreen DescriptionPanel
- Add verticalScroll to MoreHubScreen to prevent clipping on small screens
- Add imePadding to ConversationListSheet so keyboard doesn't cover search
- Fix auto-scroll wrap logic in EmptyStateView using canScrollForward
- Replace collectAsState with collectAsStateWithLifecycle in 3 screens
- Replace deprecated STTMode.values() with .entries
- Replace hardcoded Color.Gray with AppColors.statusGray for dark mode contrast
- Remove redundant Color.White inside buttons with contentColor set
- Replace hardcoded 300.dp bubble width with responsive Dimensions.messageBubbleMaxWidth
- Add accessibility semantics role to VLMScreen clickable Column
- Disable Image Generation card (placeholder feature)
- Add responsive rDp/rSp utilities and convert Dimensions/AppSpacing to use them
- Add LoRA example prompts with copy button to adapter picker and manager screens
- Darken dark mode background colors

* fix: Android app bug fixes - race conditions, ANR, pixel corruption, scroll, and memory safety

- VoiceAssistantViewModel: replace runBlocking with GlobalScope.launch in onCleared to prevent ANR
- VoiceAssistantViewModel: add synchronized audioBufferLock for thread-safe ByteArrayOutputStream access
- VoiceAssistantViewModel: scan WAV data chunk instead of hardcoding 44-byte header offset
- ConversationStore: use MutableStateFlow.update {} for atomic compare-and-set on all mutations
- ToolSettingsViewModel: clear static singleton in onCleared to prevent stale references
- VLMViewModel: advance rgbIdx by 3 in else branch to prevent pixel corruption on out-of-bounds skip
- ChatViewModel: use CopyOnWriteArrayList for tokensPerSecondHistory thread safety
- VoiceAssistantParticleView: remove wasted transparent drawPoints call
- RunAnywhereApplication: capture volatile initializationError to local val before null check
- VLMScreen: add verticalScroll to description panel for long text overflow
- ResponsiveUtils: add designWidth <= 0 guard to prevent division by zero in rDp/rSp

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Sanchit Monga <sanchitmonga22@gmail.com>
Co-authored-by: Sanchit Monga <sm3468@g.rit.edu>
Co-authored-by: Siddhesh2377 <siddheshsonar2377@gmail.com>
Co-authored-by: RunAnywhere <>

* RAG rewrite

* Refactor RAG terminology to "pipeline" across scripts and source files for consistency. Update comments and logging messages to reflect the change from "backend" to "pipeline". Remove unused React Native package files related to RAG.

* Complete RAG Flutter implementation (full state) (#419)

RAG Flutter SDK.

-> there are a bunch of UI/UX issues with this, like the button not loading and the circular download indicator not rendering properly, but the RAG pipeline works.

-> also, ONNX and RAG gave duplicate-symbol errors when built together, since RAG requires ONNX, so a future task that should be done soon is to include ONNX in core as well, and perhaps add a conditional?

* Optimised RAG + implement a hybrid search

* fixed tnc block error.

* Changed batching parameters and similarity threshold, and optimised embedding memory usage and speed

* fix(rag): close anonymous namespace in rag_chunker.cpp to fix compilation

The anonymous namespace wrapping perform_recursive_chunking was never
closed, causing DocumentChunker member definitions to be inside the
anonymous namespace — resulting in "cannot define or redeclare" errors.

Made-with: Cursor

* fix: remove stale runanywhere-core-rag module references from Android app

RAG was moved into the core SDK but the Android example app still
referenced the deleted module, breaking the build.

* fixing iOS/Swift - removing RAG backend - refactor

* fixing the platform TTS in the voice agent

* LoRA fixes - to match up with Kotlin

* refactor: fold RAG backend into rac_commons, remove separate RAG binary

- Changed rac_backend_rag from SHARED/STATIC to OBJECT library (CMake)
- RAG objects folded into rac_commons at compile time
- Moved ONNX embedding provider to rac_backend_onnx to break shared-lib cycle
- ONNX backend now registers embeddings provider during rac_backend_onnx_register()
- Removed RAG as separate backend from all build scripts and SDK configs
- Updated Android, Kotlin, Flutter, React Native build/distribution pipelines
- RAG JNI bridge (librac_backend_rag_jni.so) remains as thin wrapper linking rac_commons

* fixing RN for RAG + some permissions for VLM + npm dependencies + improved archive logic

* refactor for React Native - TTS is causing trouble, refactoring that now; will follow with Flutter once done

---------

Co-authored-by: Siddhesh <DARKWILDHACKER@gmail.com>
Co-authored-by: Sanchit Monga <sanchitmonga22@gmail.com>
Co-authored-by: VyasGuru <71374747+VyasGuru@users.noreply.github.com>
Co-authored-by: Sanchit Monga <sm3468@g.rit.edu>
Co-authored-by: Siddhesh2377 <siddheshsonar2377@gmail.com>

Labels

documentation Improvements or additions to documentation enhancement New feature or request kotlin-sdk

3 participants