Added Lora + Fixed Build-All-Work-Flow #389
shubhammalhotra28 merged 5 commits into RunanywhereAI:dev from
Conversation
Implement LoRA (Low-Rank Adaptation) adapter hot-swapping for llama.cpp
backend across all 6 SDK layers (C++ -> C API -> Component -> JNI ->
Kotlin Bridge -> Kotlin Public API).
- Add load/remove/clear/query LoRA adapter operations
- Use vtable dispatch in component layer to decouple librac_commons
from librac_backend_llamacpp (fixes linker errors)
- Add LoRA vtable entries to rac_llm_service_ops_t
- Fix AttachCurrentThread cast for Android NDK C++ JNI build
- Add RunAnyWhereLora Android demo app with Material 3 Q&A UI
- Add comprehensive implementation docs with C/C++ API reference
…ift concurrency errors

Rewrite build-all-test.yml with 9 boolean checkbox inputs so each build target can be toggled independently from the GitHub Actions UI:
- C++ Android Backends (arm64-v8a, armeabi-v7a, x86_64 matrix)
- C++ iOS Backends (XCFramework)
- Kotlin SDK (JVM + Android)
- Swift SDK (iOS/macOS)
- Web SDK (TypeScript)
- Flutter SDK (Dart analyze via Melos)
- React Native SDK (TypeScript via Lerna)
- Android Example Apps (RunAnywhereAI + RunAnyWhereLora)
- IntelliJ Plugin

Fix two Swift strict-concurrency errors that fail the Swift SDK build:
- LiveTranscriptionSession: add @unchecked Sendable (safe because the class is @MainActor, so all access is serialized)
- RunAnywhere+VisionLanguage: add Sendable conformance to rac_vlm_image_t so the C struct can cross the Task boundary in the streaming builder; simplify StreamingCollector to start timing at init
…sion and VLM streaming
LiveTranscriptionSession.swift:
- Replace [weak self] captures with strong `let session = self` before
closures to avoid a captured var in @Sendable/Task contexts (class is
@MainActor @unchecked Sendable, so the strong ref is safe, bounded by
stream lifecycle)
- Wrap deprecated startStreamingTranscription call in @available helper
to silence deprecation warning until migration to transcribeStream API
RunAnywhere+VisionLanguage.swift:
- Add `let capturedCImage = cImage` before AsyncThrowingStream closure
so the Task captures an immutable let instead of a mutable var
- Add `extension rac_vlm_image_t: @unchecked Sendable {}` for the C
struct to cross Task concurrency boundaries safely
- Simplify StreamingCollector to initialize startTime at init instead
of requiring a separate async start() call
📝 Walkthrough

This PR adds LoRA adapter support across the SDK (C/C++ backend, JNI, Kotlin multiplatform, and Swift), implements the LoRA lifecycle in the LlamaCPP backend, extends the JNI and Kotlin APIs, provides JVM/Android implementations, and adds a full RunAnyWhereLora Android example. It also restructures CI with a multi-target build workflow.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant UI as "App UI (Compose)"
    participant VM as "LoraViewModel"
    participant SDK as "RunAnywhere Kotlin API"
    participant JNI as "RunAnywhereBridge (JNI)"
    participant CPP as "LlamaCPP Backend"
    UI->>VM: user selects LoRA file + scale
    VM->>SDK: call loadLoraAdapter(config)
    SDK->>JNI: racLlmComponentLoadLora(handle, path, scale)
    JNI->>CPP: load_lora(adapter_path, scale)
    CPP->>CPP: load adapter, recreate_context(), apply_lora_adapters(), clear_kv_cache
    CPP-->>JNI: result code / info
    JNI-->>SDK: result/out_json
    SDK-->>VM: success / updated adapter list
    VM-->>UI: update UI state (adapter applied)
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Important
Looks good to me! 👍
Reviewed everything up to 638da00 in 23 seconds.
- Reviewed 4667 lines of code in 55 files
- Skipped 10 files when reviewing
- Skipped posting 0 draft comments

Workflow ID: wflow_Dk65vG7BOH704JFM
@shubhammalhotra28 once you're free, merge this PR
Actionable comments posted: 9
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🟡 Minor comments (14)
examples/android/RunAnyWhereLora/.idea/misc.xml-4-4 (1)
4-4: ⚠️ Potential issue | 🟡 Minor
Update `misc.xml` to match the project's Java 17 configuration.

The Gradle build (lines 58–61 of `app/build.gradle.kts`) explicitly sets `sourceCompatibility = JavaVersion.VERSION_17` and `targetCompatibility = JavaVersion.VERSION_17`, but `misc.xml` declares `languageLevel="JDK_1_7"`. This mismatch will cause Android Studio's IDE analyzer to flag valid Java 8+ syntax (lambdas, streams, method references) as errors until a Gradle sync overwrites the stale metadata. Update `misc.xml` line 4 to `languageLevel="JDK_17"` to reflect the actual build configuration.
examples/android/RunAnyWhereLora/app/proguard-rules.pro-6-6 (1)
6-6: ⚠️ Potential issue | 🟡 Minor
Stale HTTP documentation URL.

`http://developer.android.com/guide/developing/tools/proguard.html` redirects to a deprecated page. The current ProGuard/R8 shrinking guide is at `https://developer.android.com/build/shrink-code`.

📝 Proposed fix

```diff
-# http://developer.android.com/guide/developing/tools/proguard.html
+# https://developer.android.com/build/shrink-code
```
.idea/vcs.xml-5-8 (1)
5-8: ⚠️ Potential issue | 🟡 Minor
Avoid committing build-directory paths as VCS roots.

All five new mappings (lines 5–9) point into `build/…/_deps/` subdirectories — ephemeral CMake FetchContent clones whose exact paths depend on which build targets the local developer has run. Committing these will cause every other contributor who has not built those specific targets (e.g. `android/unified/arm64-v8a`, `dev-core`) to receive spurious "VCS root not registered" warnings in IntelliJ IDEA. The root mapping on line 4 (`directory=""`) already covers the whole repository; no additional entries are needed.

Recommended fix: remove lines 5–9 and ensure `build/` remains listed in `.gitignore` so these transient directories are never picked up again.

🗑️ Proposed fix

```diff
 <component name="VcsDirectoryMappings">
   <mapping directory="" vcs="Git" />
-  <mapping directory="$PROJECT_DIR$/sdk/runanywhere-commons/build/android/unified/arm64-v8a/_deps/llamacpp-src" vcs="Git" />
-  <mapping directory="$PROJECT_DIR$/sdk/runanywhere-commons/build/android/unified/arm64-v8a/_deps/nlohmann_json-src" vcs="Git" />
-  <mapping directory="$PROJECT_DIR$/sdk/runanywhere-commons/build/dev-core/_deps/nlohmann_json-src" vcs="Git" />
-  <mapping directory="$PROJECT_DIR$/sdk/runanywhere-commons/build/dev/_deps/llamacpp-src" vcs="Git" />
-  <mapping directory="$PROJECT_DIR$/sdk/runanywhere-commons/build/dev/_deps/nlohmann_json-src" vcs="Git" />
 </component>
```
examples/android/RunAnyWhereLora/app/src/main/res/xml/backup_rules.xml-5-6 (1)
5-6: ⚠️ Potential issue | 🟡 Minor
Misleading comment: "older than" should be "running API 31 or higher".

The note on line 5 says "This file is ignored for devices older than API 31", but the opposite is true. `<full-backup-content>` controls files backed up on devices running Android 11 (API level 30) or lower; these rules are also used for Android 12+ devices if the app targets Android 11 or lower. The `dataExtractionRules` attribute applies to Android 12 and above, whereas the `allowBackup` and `fullBackupContent` attributes are for Android versions prior to API 31.

As written, the comment implies pre-API 31 devices ignore this file, which could lead a future maintainer to skip configuring backup exclusions for those devices entirely.

✏️ Suggested fix

```diff
-     Note: This file is ignored for devices older than API 31
+     Note: This file is ignored for devices running API 31 or higher.
+     For API 31+ devices, use data_extraction_rules.xml instead. See
+     https://developer.android.com/about/versions/12/backup-restore
```
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Sessions/LiveTranscriptionSession.swift-130-130 (1)
130-130: ⚠️ Potential issue | 🟡 Minor
Deprecation wrapper does not silence the warning at the call site in `start()`.

Calling an `@available(*, deprecated)` method from a non-deprecated context emits a deprecation warning. The wrapper `startLegacyStreaming` at line 130 will produce:

'startLegacyStreaming(options:onPartialResult:onFinalResult:onError:)' is deprecated: Migrate to transcribeStream API

The deprecation annotation on the wrapper silences warnings inside the wrapper but surfaces the warning at the call site instead. To fully suppress the warning until migration, remove the deprecation marker and add a TODO comment:

Proposed fix

```diff
-    // Wrapper to silence deprecation warning until migration to transcribeStream
-    @available(*, deprecated, message: "Migrate to transcribeStream API")
-    private static func startLegacyStreaming(
+    // TODO: Migrate to transcribeStream API and remove this wrapper
+    private static func startLegacyStreaming(
         options: STTOptions,
         onPartialResult: @escaping (STTTranscriptionResult) -> Void,
         onFinalResult: @escaping (STTOutput) -> Void,
         onError: @escaping (Error) -> Void
     ) async throws {
+        // swiftlint:disable:next deprecation_warning
         try await RunAnywhere.startStreamingTranscription(
             options: options,
             onPartialResult: onPartialResult,
             onFinalResult: onFinalResult,
             onError: onError
         )
     }
```

Note: `// swiftlint:disable:next` only suppresses SwiftLint linting rules, not compiler warnings. For compiler-level suppression, the pattern above (removing the wrapper's deprecation marker and adding the comment) is the standard workaround in Swift 6 when line-level pragma support is unavailable.

Also applies to: 216-230
docs/impl/lora_adapter_support.md-467-491 (1)
467-491: ⚠️ Potential issue | 🟡 Minor
Add a language specifier to the fenced code block (MD040).

The layer-diagram fence has no language tag; markdownlint-cli2 reports MD040. Use a "```text" (or "```plain") opening fence to satisfy the linter.

📝 Proposed fix: change the diagram block's opening fence from "```" to "```text" (the block beginning `Kotlin Public API (RunAnywhere.loadLoraAdapter)` ...).
docs/impl/lora_adapter_support.md-208-213 (1)
208-213: ⚠️ Potential issue | 🟡 Minor
`clearAdapters()` inconsistent with the stated error-handling contract.

Line 162 explicitly states "All LoRA functions throw `SDKError` on failure," but the `clearAdapters()` ViewModel example has no try-catch. While `rac_llm_component_clear_lora` always returns `RAC_SUCCESS` at the C level, the Kotlin wrapper can still throw `SDKError.notInitialized` if the SDK is not ready. The example should be consistent.

📝 Proposed fix

```diff
 fun clearAdapters() {
     viewModelScope.launch {
-        RunAnywhere.clearLoraAdapters()
-        refreshAdapterList()
+        try {
+            RunAnywhere.clearLoraAdapters()
+            refreshAdapterList()
+        } catch (e: SDKError) {
+            _state.update { it.copy(error = e.message) }
+        }
     }
 }
```
docs/impl/lora_adapter_support.md-653-653 (1)
653-653: ⚠️ Potential issue | 🟡 Minor
Sentence fragment flagged by LanguageTool (MISSING_IT_THERE).

"Could be done by calling…" is missing a subject.

📝 Proposed fix

```diff
-Could be done by calling `llama_set_adapter_lora(ctx, adapter, new_scale)`
+This could be done by calling `llama_set_adapter_lora(ctx, adapter, new_scale)`
```
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VLM/RunAnywhere+VisionLanguage.swift-12-14 (1)
12-14: ⚠️ Potential issue | 🟡 Minor
Correct the documented safety invariant for `@unchecked Sendable`: not all pointer fields are backed by `rgbData`.

The `@unchecked Sendable` conformance is safe, but the comment's claim is inaccurate. `rac_vlm_image_t` has three pointer members: `file_path`, `pixel_data`, and `base64_data`. Only `pixel_data` is backed by the captured `rgbData` Data object. The `file_path` and `base64_data` pointers come from `path.withCString` and `encoded.withCString` respectively.

The actual safety mechanism is scoped pointer lifetime: all pointers are only assigned and used within the `withCPointers` closure, where the underlying data (strings, Data buffers) remains alive. The C function calls happen immediately within that closure scope, so pointers never escape their valid lifetime.

Update the comment to reflect this:

```swift
// C struct with raw pointers — safe to send across concurrency boundaries
// because all pointer fields (file_path, base64_data, pixel_data) are only
// assigned and dereferenced within withCPointers, keeping them alive.
extension rac_vlm_image_t: @unchecked Sendable {}
```
examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraScreen.kt-262-272 (1)
262-272: ⚠️ Potential issue | 🟡 Minor
Dual `weight(1f)` produces a 50/50 split when generating with an empty answer.

When `isGenerating` is true and `answer` is empty, both the `SelectionContainer` (line 222, showing empty text) and the spinner `Box` (line 264) receive `weight(1f)`, splitting the available space equally. The empty `SelectionContainer` takes half the card for no visible content.

Consider wrapping the spinner in a full-size `Box` without competing weight, or gating the `SelectionContainer` on `answer.isNotEmpty()`.
examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/MainActivity.kt-80-82 (1)
80-82: ⚠️ Potential issue | 🟡 Minor
Retry button is a no-op.

The `onClick` handler is empty. Either wire it up to actually retry SDK initialization (e.g., pass a lambda from the `LoraApplication`), or remove the button to avoid misleading users.

Would you like me to suggest an implementation that wires the retry action to `LoraApplication`?
sdk/runanywhere-commons/include/rac/backends/rac_llm_llamacpp.h-195-209 (1)
195-209: ⚠️ Potential issue | 🟡 Minor
Documentation says scale is "0.0-1.0" but the actual range is unbounded.

Line 204 documents the scale parameter as `(0.0-1.0, default 1.0)`, but:

- The Android UI (`LoraScreen.kt` line 477) allows `0f..2f`.
- The C++ implementation (`load_lora_adapter`) does not clamp the value.

Either update the documentation to reflect the actual valid range (e.g., 0.0–2.0, or just state the default without an upper bound), or add validation in the C layer to reject out-of-range values.

Proposed doc fix

```diff
- * @param scale Adapter scale factor (0.0-1.0, default 1.0)
+ * @param scale Adapter scale factor (typically 0.0-2.0, default 1.0)
```

Based on learnings: "Public C API headers in include/rac/ must document vtable operations, error codes, and lifecycle requirements"
examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraApplication.kt-45-48 (1)
45-48: ⚠️ Potential issue | 🟡 Minor
`onTerminate()` is never called on production Android devices.

Per Android documentation, `Application.onTerminate()` is only invoked in emulated environments. Relying on it for scope cleanup means the `applicationScope` is never cancelled on real devices. For an example app this is low-risk, but worth noting: consider using `ProcessLifecycleOwner`, or accept that the scope lives for the process lifetime.
sdk/runanywhere-commons/src/backends/llamacpp/rac_llm_llamacpp.cpp-381-396 (1)
381-396: ⚠️ Potential issue | 🟡 Minor
`strdup` return value unchecked: potential null dereference on OOM.

`strdup` at line 393 can return `NULL` if memory allocation fails. While unlikely for a short JSON string, the pattern should handle it for robustness. The same applies to the pre-existing `strdup` at line 320 (`get_model_info`).

Proposed fix

```diff
 auto info = h->text_gen->get_lora_info();
 std::string json_str = info.dump();
-*out_json = strdup(json_str.c_str());
+*out_json = strdup(json_str.c_str());
+if (!*out_json) {
+    return RAC_ERROR_OUT_OF_MEMORY;
+}
 return RAC_SUCCESS;
```
🧹 Nitpick comments (20)
examples/android/RunAnyWhereLora/.gitignore (1)
3-3: Redundant root-anchored `local.properties` entry.

Line 3 (`/local.properties`) is a strict subset of the unanchored `local.properties` pattern on line 15, which already matches the file at any directory depth including the root. Line 3 can be removed.

🧹 Proposed cleanup

```diff
 .gradle
-/local.properties
 /.idea/caches
```
examples/android/RunAnyWhereLora/app/src/androidTest/java/com/runanywhere/run_anywhere_lora/ExampleInstrumentedTest.kt (1)
17-23: LGTM — standard boilerplate; consider adding feature-level tests.

This is the default Android Studio–generated instrumented test and is correct as-is. The package name assertion at line 22 matches the declared package at line 1.
Since this PR introduces a new Android demo app with LoRA model loading and management, the only optional improvement is expanding test coverage beyond the boilerplate to cover the actual feature flows (e.g., loading a model, attaching/detaching a LoRA adapter, verifying inference output).
examples/android/RunAnyWhereLora/app/src/test/java/com/runanywhere/run_anywhere_lora/ExampleUnitTest.kt (1)
12-16: Boilerplate placeholder — consider adding LoRA-specific unit tests.
`addition_isCorrect` is the auto-generated Android Studio scaffold test and doesn't cover any of the new LoRA functionality introduced in this PR (adapter loading, hot-swap, remove, clear, query). Leaving only this placeholder can give a false sense of unit-test coverage for the new feature.

Consider replacing or supplementing it with tests that exercise the public LoRA API surface — e.g., verifying that loading a LoRA adapter updates the expected state, that removing one clears it correctly, and that invalid inputs are handled gracefully.
examples/android/RunAnyWhereLora/app/src/main/res/xml/data_extraction_rules.xml (1)
7-19: Consider excluding large model-file directories from backup and device-transfer.

Both `<cloud-backup>` (line 7) and `<device-transfer>` (lines 13–18) have no active rules, so the OS will include all app data in cloud backups and device migrations by default. Google Drive auto-backup is capped at 25 MB per app; if the quota is reached, the system stops backing up, and when a new backup is made the previous one is deleted. An AI inference app that caches model weights locally will almost certainly blow this cap, causing silent backup failures.

If the `<device-transfer>` element is not configured, all application data will be transferred during a device-to-device migration. Large model files transferred during device setup will noticeably slow down the experience.

When you're ready to customize, exclude model/cache directories in both sections:

🔧 Suggested skeleton for model-file exclusions

```diff
 <cloud-backup>
-    <!-- TODO: Use <include> and <exclude> to control what is backed up.
-    <include .../>
-    <exclude .../>
-    -->
+    <!-- Exclude downloaded model weights and caches from cloud backup -->
+    <exclude domain="file" path="models/"/>
+    <exclude domain="file" path="cache/"/>
 </cloud-backup>
-<!--
 <device-transfer>
-    <include .../>
-    <exclude .../>
+    <exclude domain="file" path="models/"/>
+    <exclude domain="file" path="cache/"/>
 </device-transfer>
--->
```
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Sessions/LiveTranscriptionSession.swift (2)
71-83: Strong `self` capture replaces `[weak self]` — verify retention semantics are intentional.

The previous code presumably used `[weak self]` in the stream closures. Now `let session = self` captures the session strongly. This means the `LiveTranscriptionSession` will stay alive as long as the `AsyncStream` returned by `transcriptions` is held (specifically, the `onTermination` closure at line 78 retains `session`).

This is likely the correct behavior — consumers iterating the stream should keep the session alive — but it's a subtle behavioral change. If any call site previously relied on the session being weakly held and deallocated while a stream consumer was still active, that code path will now retain the session longer.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Sessions/LiveTranscriptionSession.swift` around lines 71 - 83, The current code captures self strongly via let session = self inside transcriptions' AsyncStream, which changes retention semantics; if you want to restore previous weak capture behavior, remove let session = self and capture [weak self] in the Task closures inside the AsyncStream, then guard-let self (or finish the continuation) before calling self.onPartialCallback and before clearing it in continuation.onTermination; otherwise, if the strong-keep-alive behavior is intended, add a short comment in LiveTranscriptionSession/transcriptions documenting that the returned AsyncStream intentionally retains the session until the stream terminates.
39-39: Consider alternatives to `@unchecked Sendable` for full Swift 6 strict concurrency compliance.

The class is already `@MainActor`-isolated, which provides actor-based thread safety. The current implementation correctly wraps all property access in `Task { @MainActor in ... }` blocks, even within non-isolated callback contexts (lines 133–175). However, `@unchecked Sendable` suppresses compiler diagnostics, so future modifications that bypass this wrapping pattern wouldn't be caught.

While `@unchecked Sendable` is a pragmatic pattern used throughout the SDK for bridging C callbacks and working with non-`Sendable` types, Swift 6 best practices favor explicit `Sendable` conformance. If feasible, consider whether making stored properties `Sendable` or restructuring callback handling could eliminate the need for the unchecked escape hatch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Sessions/LiveTranscriptionSession.swift` at line 39, The LiveTranscriptionSession currently uses `@unchecked` Sendable; instead remove `@unchecked` Sendable and make the class strictly actor-isolated and Sendable-safe by either (A) relying solely on `@MainActor` isolation (remove the Sendable conformance and ensure all external callback code dispatches into Task { `@MainActor` in ... } as already done in the callback handlers), or (B) if you must keep cross-thread callbacks, extract mutable state into a small `@MainActor-isolated` actor or a separate `@MainActor-bound` reference type (e.g., LiveTranscriptionSessionState) and keep LiveTranscriptionSession free of non-Sendable stored properties so the compiler can verify Sendable conformance; update any stored properties referenced from outside the main actor to be Sendable or accessed only via the `@MainActor-bound` actor and remove `@unchecked` Sendable from the class declaration (symbol: LiveTranscriptionSession, Task { `@MainActor` in ... }, and any stored properties you extract into the actor).

examples/android/RunAnyWhereLora/app/src/main/res/values/themes.xml (1)
4-4: Consider using a Material3 theme parent.
`android:Theme.Material.Light.NoActionBar` is the legacy Material 1 framework theme. Since this app uses Jetpack Compose (with a `RunAnyWhereLoraTheme` composable per the AI summary), the XML bridge theme should ideally descend from `Theme.Material3.DayNight.NoActionBar` for consistent Material3 system UI integration (window decorations, edge-to-edge, etc.).

♻️ Proposed refactor
```diff
-    <style name="Theme.RunAnyWhereLora" parent="android:Theme.Material.Light.NoActionBar" />
+    <style name="Theme.RunAnyWhereLora" parent="Theme.Material3.DayNight.NoActionBar" />
```

Ensure `com.google.android.material:material` is in the app's dependencies for `Theme.Material3.*` to resolve.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/android/RunAnyWhereLora/app/src/main/res/values/themes.xml` at line 4, The app theme Theme.RunAnyWhereLora currently inherits android:Theme.Material.Light.NoActionBar (legacy Material1); change its parent to a Material3 parent such as Theme.Material3.DayNight.NoActionBar to align with the Compose RunAnyWhereLoraTheme and system UI behavior, and ensure the Material3 dependency (com.google.android.material:material) is added to the app dependencies so the Theme.Material3.* parents resolve.

docs/impl/lora_adapter_support.md (1)
697-699: Changelog author field credits the AI model ("Claude") rather than the human contributor.

This is an editorial nit, but recorded authorship in project documentation should identify the human or team responsible, not the AI assistant used to draft the code.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/impl/lora_adapter_support.md` around lines 697 - 699, The changelog entries currently list the AI model name "Claude" as the author for the two 2026-02-19 entries; replace "Claude" with the appropriate human author or team name (e.g., the contributor's full name or "Team XYZ") for both entries so project documentation credits the human contributor; update the author field in the two entries that start with "2026-02-19" and keep the rest of the entry text unchanged.

examples/android/RunAnyWhereLora/app/build.gradle.kts (2)
52-55: Overly broad `pickFirsts` glob may silently hide real `.so` version/ABI conflicts.

`"lib/**/*.so"` resolves every duplicate shared-library conflict across all ABI directories by taking the first encountered copy. If `runanywhere-kotlin` and `runanywhere-core-llamacpp` ever ship different, incompatible builds of the same `.so`, this rule will silently pick one and produce a crash at runtime rather than a build-time error.

Consider scoping the pattern to the specific libraries known to be duplicated (e.g., `"lib/*/libc++_shared.so"`) so that unexpected new conflicts are still caught during the build.

♻️ Suggested narrowing
```diff
 jniLibs {
     useLegacyPackaging = true
-    pickFirsts += listOf("lib/**/*.so")
+    // Only suppress the known duplicate C++ runtime; all other conflicts surface as build errors.
+    pickFirsts += listOf("lib/*/libc++_shared.so")
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/android/RunAnyWhereLora/app/build.gradle.kts` around lines 52 - 55, The jniLibs block currently uses an overly broad pickFirsts pattern ("lib/**/*.so") which can silently mask ABI/version conflicts; replace that broad glob in the jniLibs configuration (where useLegacyPackaging and pickFirsts are defined) with a whitelist of the specific duplicate .so filenames you expect (e.g., "lib/*/libc++_shared.so" or other explicit names) or remove the pickFirsts entry so unexpected duplicates surface as build errors; update the pickFirsts list to only include known-safe library basenames rather than a recursive wildcard.
25-33: `isMinifyEnabled = false` in the release build type will produce an un-shrunk, un-obfuscated APK.

Acceptable for an example app, but note that the final APK will be significantly larger and the SDK internals will be fully visible in tools like `apktool`. If this example is distributed as a reference, enabling R8 shrinking with the existing `proguard-android-optimize.txt` ruleset would align it with production practices.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/android/RunAnyWhereLora/app/build.gradle.kts` around lines 25 - 33, The release build currently sets isMinifyEnabled = false which produces an unshrunken, unobfuscated APK; update the release block (buildTypes -> release) to enable R8 shrinking/obfuscation by setting isMinifyEnabled = true, keep the existing proguardFiles(getDefaultProguardFile("proguard-android-optimize.txt"), "proguard-rules.pro") and optionally add shrinkResources = true to remove unused resources for a production-like example.

examples/android/RunAnyWhereLora/.idea/runConfigurations.xml (1)
1-17: Consider whether `.idea/runConfigurations.xml` should be version-controlled.

This file disables all JUnit and Kotlin JUnit run-configuration producers for the module. Committing it imposes this setting on every contributor who opens the project in Android Studio, preventing "Run test" gutter icons from appearing for any test class. Since the `build.gradle.kts` includes test dependencies (`testImplementation`, `androidTestImplementation`), contributors will find tests undiscoverable from the IDE.

If the intent is just to suppress noisy "create JUnit configuration" prompts during development, consider adding `.idea/` to `.gitignore` instead (or at least limit this file to not being tracked).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/android/RunAnyWhereLora/.idea/runConfigurations.xml` around lines 1 - 17, This runConfigurations.xml disables many JUnit/Kotlin test configuration producers (e.g., AbstractAllInDirectoryConfigurationProducer, AllInPackageConfigurationProducer, PatternConfigurationProducer, TestInClassConfigurationProducer, UniqueIdConfigurationProducer, JUnitTestDiscoveryConfigurationProducer, KotlinJUnitRunConfigurationProducer, KotlinPatternConfigurationProducer) and should not be enforced in the repo; remove this file from version control (or revert it to default) so IDE test discovery works for contributors, and instead add .idea/ (or at least this runConfigurations.xml) to .gitignore or stop tracking it so the producers are not globally disabled for all developers.

examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/ui/theme/Theme.kt (1)
3-3: Unused import: `android.app.Activity`.

This import is not referenced in the file — likely a leftover from the project template.
🧹 Remove unused import
```diff
-import android.app.Activity
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/ui/theme/Theme.kt` at line 3, Remove the unused import statement "import android.app.Activity" from the file; locate the import line and delete it (or run Optimize Imports / auto-import cleanup) so only referenced Android types remain in Theme.kt.

sdk/runanywhere-commons/include/rac/features/llm/rac_llm_service.h (1)
52-63: Vtable entries lack error-code, lifecycle, and memory-ownership documentation.

Per coding guidelines, public C API headers must document vtable operations, error codes, and lifecycle requirements. The new LoRA entries should document:

- `get_lora_info` — who owns `*out_json`? If the caller must `free()` it, state that explicitly.
- Preconditions — what `rac_result_t` is returned when called with no model loaded, or with a NULL/empty `adapter_path`?
- `load_lora` `scale` — any valid range constraints (e.g., must be > 0)?

The existing vtable entries above (e.g., `generate`, `cleanup`) set a precedent of minimal docs too, but since this is new API surface, it's a good opportunity to raise the bar.

📝 Suggested documentation enhancement
```diff
-    /** Load a LoRA adapter (optional, NULL if not supported) */
-    rac_result_t (*load_lora)(void* impl, const char* adapter_path, float scale);
+    /**
+     * Load a LoRA adapter (optional, NULL if not supported).
+     * @param impl Backend implementation handle
+     * @param adapter_path Path to LoRA GGUF file (must not be NULL)
+     * @param scale Adapter strength (typically 0.0–1.0+)
+     * @return RAC_SUCCESS, RAC_ERROR_INVALID_ARGUMENT, or RAC_ERROR_NOT_INITIALIZED
+     */
+    rac_result_t (*load_lora)(void* impl, const char* adapter_path, float scale);

-    /** Remove a LoRA adapter by path (optional, NULL if not supported) */
-    rac_result_t (*remove_lora)(void* impl, const char* adapter_path);
+    /**
+     * Remove a LoRA adapter by path (optional, NULL if not supported).
+     * @param impl Backend implementation handle
+     * @param adapter_path Path used when loading (must not be NULL)
+     * @return RAC_SUCCESS or RAC_ERROR_NOT_FOUND
+     */
+    rac_result_t (*remove_lora)(void* impl, const char* adapter_path);

-    /** Clear all LoRA adapters (optional, NULL if not supported) */
-    rac_result_t (*clear_lora)(void* impl);
+    /**
+     * Clear all LoRA adapters (optional, NULL if not supported).
+     * @param impl Backend implementation handle
+     * @return RAC_SUCCESS or error code
+     */
+    rac_result_t (*clear_lora)(void* impl);

-    /** Get loaded LoRA adapters info as JSON (optional, NULL if not supported) */
-    rac_result_t (*get_lora_info)(void* impl, char** out_json);
+    /**
+     * Get loaded LoRA adapters info as JSON (optional, NULL if not supported).
+     * @param impl Backend implementation handle
+     * @param out_json Output: caller-owned JSON string; free with free()
+     * @return RAC_SUCCESS or error code
+     */
+    rac_result_t (*get_lora_info)(void* impl, char** out_json);
```

As per coding guidelines: "Public C API headers in include/rac/ must document vtable operations, error codes, and lifecycle requirements."
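The "optional, NULL if not supported" convention these vtable entries use can be illustrated with a small Python sketch: the component layer checks each slot before dispatching, so backends that omit LoRA support fail cleanly rather than crashing. Error-code names and values below are illustrative stand-ins, not the SDK's actual constants:

```python
# Illustrative error codes (the real rac_result_t values may differ).
RAC_SUCCESS = 0
RAC_ERROR_INVALID_ARGUMENT = -2
RAC_ERROR_NOT_SUPPORTED = -3

def make_backend_ops(supports_lora):
    # Optional entries are None when the backend lacks the capability,
    # mirroring NULL vtable slots in rac_llm_service_ops_t.
    def load_lora(impl, adapter_path, scale):
        if not adapter_path:
            return RAC_ERROR_INVALID_ARGUMENT
        impl.setdefault("adapters", []).append((adapter_path, scale))
        return RAC_SUCCESS
    return {"load_lora": load_lora if supports_lora else None}

def component_load_lora(ops, impl, path, scale):
    # The component layer guards the slot before dispatching, so callers
    # get a defined error on unsupported backends instead of a crash.
    op = ops.get("load_lora")
    if op is None:
        return RAC_ERROR_NOT_SUPPORTED
    return op(impl, path, scale)

impl = {}
print(component_load_lora(make_backend_ops(True), impl, "adapter.gguf", 1.0))   # 0
print(component_load_lora(make_backend_ops(False), impl, "adapter.gguf", 1.0))  # -3
```

This NULL-check-then-dispatch pattern is also why the header docs matter: callers can only handle "not supported" versus "invalid argument" distinctly if the contract spells out which code each precondition produces.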
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/include/rac/features/llm/rac_llm_service.h` around lines 52 - 63, The new LoRA vtable entries (load_lora, remove_lora, clear_lora, get_lora_info) lack required public-API documentation: update the header comments for each function to state exact preconditions and returned rac_result_t values (e.g., returned error when no model is loaded, when adapter_path is NULL/empty, or when adapter not found), document lifecycle/ownership rules (explicitly state whether *out_json is heap-allocated and must be freed by the caller, or remains owned by the implementation and must not be freed), and specify valid range/constraints for the scale parameter of load_lora (e.g., >0 and recommended bounds). Reference the vtable symbols (load_lora, remove_lora, clear_lora, get_lora_info) and rac_result_t in the comments so callers know error semantics and memory ownership.

sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.h (1)
101-118: Duplicate section header: "TEXT GENERATION IMPLEMENTATION" appears twice.

Lines 101–103 (pre-existing) and lines 116–118 both carry the same "TEXT GENERATION IMPLEMENTATION" section banner. The first one should be renamed to something like "LORA ADAPTER ENTRY" to match the content it actually introduces.
Proposed fix
```diff
-// =============================================================================
-// TEXT GENERATION IMPLEMENTATION
-// =============================================================================
-
 // =============================================================================
 // LORA ADAPTER ENTRY
 // =============================================================================
```

(Remove the first redundant header at line 101–103, keeping the new LoRA section header at 105–107.)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.h` around lines 101 - 118, The file contains a duplicate section banner "TEXT GENERATION IMPLEMENTATION" around the LoraAdapterEntry declaration; remove or rename the first banner so the LoraAdapterEntry block is correctly labeled (e.g., change the earlier "TEXT GENERATION IMPLEMENTATION" header to "LORA ADAPTER ENTRY" or delete it) to avoid duplicate section headers while keeping the existing LORA ADAPTER ENTRY header that surrounds struct LoraAdapterEntry and its members (llama_adapter_lora*, path, scale, applied).

.github/workflows/build-all-test.yml (2)
108-147: ~120 lines of duplicated development config generation across `cpp-android` and `cpp-ios`.

The "Setup Development Config" steps at lines 108–147 and 195–234 are identical. If the config format or validation logic changes, both must be updated in lockstep.

Consider extracting this into a composite action or a shared script (e.g., `scripts/ci/generate-dev-config.sh`) referenced by both jobs.

Also applies to: 195-234
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/build-all-test.yml around lines 108 - 147, The duplicated "Setup Development Config" block should be extracted into a reusable script and invoked from both cpp-android and cpp-ios jobs: create a script (e.g., scripts/ci/generate-dev-config.sh) that accepts SUPABASE_URL, SUPABASE_ANON_KEY and BUILD_TOKEN, performs the CLEAN_* trimming, writes the same C++ template to CONFIG_FILE (preserving symbols like CONFIG_FILE, SUPABASE_URL, SUPABASE_ANON_KEY, BUILD_TOKEN, SENTRY_DSN and functions rac_dev_config_is_available / rac_dev_config_get_*), and update both workflow steps to call that script with the three env vars instead of duplicating the heredoc.
395-398: Redundant failure suppression on Flutter test step.

`melos run test || true` already swallows failures, making `continue-on-error: true` redundant. Pick one — typically `continue-on-error: true` is preferred in workflows because it preserves the step status as "failed" in the UI while not blocking downstream jobs.

Proposed fix
```diff
       - name: Run Tests
         working-directory: sdk/runanywhere-flutter
-        run: melos run test || true
-        continue-on-error: true
+        run: melos run test
+        continue-on-error: true
```
Verify each finding against the current code and only fix it if needed. In @.github/workflows/build-all-test.yml around lines 395 - 398, The "Run Tests" step currently suppresses failures twice: the run command uses "melos run test || true" while the step also sets continue-on-error: true; remove the redundant "|| true" from the run command in the "Run Tests" step (working-directory: sdk/runanywhere-flutter) so the command is simply "melos run test" and keep continue-on-error: true to preserve UI visibility while not blocking downstream jobs.

examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraViewModel.kt (1)
68-116: `unloadLLMModel()` may block the main thread.

Lines 73-75 call `RunAnywhere.unloadLLMModel()` outside the `withContext(Dispatchers.IO)` block. If this is a suspend function that performs native/JNI calls (likely, given it mirrors `loadLLMModel`), it could block on the Main dispatcher. Consider moving the unload into the IO context alongside the load.

Proposed fix
```diff
 try {
-    // Unload existing model if loaded
-    if (RunAnywhere.isLLMModelLoaded()) {
-        RunAnywhere.unloadLLMModel()
-    }
-
     // Generate a model ID from filename
     val filename = path.substringAfterLast('/')
     val modelId = filename.removeSuffix(".gguf")
@@
     // ... register model ...

-    // Load the model
     withContext(Dispatchers.IO) {
+        // Unload existing model if loaded
+        if (RunAnywhere.isLLMModelLoaded()) {
+            RunAnywhere.unloadLLMModel()
+        }
+        // Load the model
         RunAnywhere.loadLLMModel(modelId)
     }
```
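The dispatcher concern can be sketched in Python's asyncio, where the event loop plays the role of Android's Main dispatcher and `run_in_executor` plays the role of `withContext(Dispatchers.IO)`. The function name mirrors the SDK API but the body is a stand-in for a blocking native call:

```python
import asyncio
import time

def unload_llm_model():
    # Stand-in for a blocking native/JNI call (name is illustrative).
    time.sleep(0.05)
    return "unloaded"

async def swap_model_off_main():
    # The fix the review suggests: run the blocking unload (and load) on a
    # worker thread — the asyncio analog of withContext(Dispatchers.IO) —
    # so the event loop (the "Main dispatcher") stays responsive.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, unload_llm_model)

result = asyncio.run(swap_model_off_main())
print(result)  # unloaded
```

Calling `unload_llm_model()` directly inside the coroutine would stall the loop for the full duration of the sleep — the same class of jank the review flags on the Main dispatcher.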
Verify each finding against the current code and only fix it if needed. In `@examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraViewModel.kt` around lines 68 - 116, The unload call can block the main thread because RunAnywhere.unloadLLMModel() likely does native/JNI work; move the unload into an IO context so it runs off the Main dispatcher. In loadModel(), perform RunAnywhere.unloadLLMModel() inside the same withContext(Dispatchers.IO) block you use for RunAnywhere.loadLLMModel(), ensuring both unloadLLMModel() and loadLLMModel(modelId) execute on Dispatchers.IO to avoid blocking the UI thread.

sdk/runanywhere-commons/include/rac/features/llm/rac_llm_component.h (1)
217-264: LoRA API declarations look well-structured and properly documented.

The four new functions follow the `rac_` prefix convention, include error code documentation, and correctly document memory ownership for `get_lora_info`. One minor inconsistency: line 229 documents `scale` as `0.0-1.0`, but the Kotlin-side documentation (`CppBridgeLLM.kt` line 834) describes it as `0.0 to 1.0+`. If scales above 1.0 are valid (which is common for LoRA), the C header doc should reflect that.

📝 Suggested doc fix
```diff
- * @param scale Adapter scale factor (0.0-1.0, default 1.0)
+ * @param scale Adapter scale factor (typically 0.0-1.0, default 1.0; values >1.0 amplify the adapter effect)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/include/rac/features/llm/rac_llm_component.h` around lines 217 - 264, Update the documentation for rac_llm_component_load_lora to match the Kotlin-side CppBridgeLLM.kt behavior: change the documented valid range for the scale parameter from "0.0-1.0" to indicate that values greater than 1.0 are allowed (e.g., "0.0 to 1.0+" or ">= 0.0"), and ensure any mention of scale range in related comments (e.g., in rac_llm_component_load_lora's docblock) is consistent with CppBridgeLLM.kt's description.

sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp (2)
1159-1165: Use `rac_free()` instead of `free()` for the backend-allocated LoRA info JSON string.

The rest of this file consistently uses `rac_free()` for memory allocated by the C library (lines 2249, 2258, 3372, 3393, 3416, etc.). The `json` pointer in `racLlmComponentGetLoraInfo` is allocated by `rac_llm_component_get_lora_info` and should be released through the same allocator-aware free.

♻️ Proposed fix
```diff
 jstring jresult = env->NewStringUTF(json);
-free(json);
+rac_free(json);
 return jresult;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp` around lines 1159 - 1165, The code in racLlmComponentGetLoraInfo returns a jstring created from json but uses free(json); which is inconsistent with the rest of the file and unsafe because json is allocated by rac_llm_component_get_lora_info; replace the plain free(json) with rac_free(json) so the backend-allocated LoRA info JSON is released with the matching allocator (keep env->NewStringUTF(json) and variable names intact).
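The underlying rule — memory must go back to the allocator that produced it — can be modeled with a toy Python allocator pair. `Allocator`, `rac`, and `libc` are illustrative stand-ins for `rac_malloc`/`rac_free` versus libc `malloc`/`free`; the real mismatch in C is undefined behavior rather than a raised exception:

```python
class Allocator:
    """Toy allocator that tracks which handles it owns."""
    def __init__(self, name):
        self.name = name
        self.live = set()

    def alloc(self, payload):
        handle = object()  # unique identity, like a heap address
        self.live.add(handle)
        return handle, payload

    def free(self, handle):
        # In C, freeing through the wrong allocator is undefined behavior;
        # here we make the mismatch observable by raising.
        if handle not in self.live:
            raise RuntimeError(f"{self.name}: freeing memory it did not allocate")
        self.live.remove(handle)

rac = Allocator("rac_free")
libc = Allocator("free")

# Correct pairing: the library allocates, the caller releases via rac_free().
handle, json_str = rac.alloc('{"adapters": []}')
rac.free(handle)

# Mismatched pairing, as in the original JNI code (free() on rac memory).
handle2, _ = rac.alloc("{}")
try:
    libc.free(handle2)
    mismatch_caught = False
except RuntimeError:
    mismatch_caught = True
print(mismatch_caught)  # True
```

In the real codebase the two heaps can differ (e.g., when the C library is built against a different runtime than the JNI layer), which is why the review insists on `rac_free()` for `rac_`-allocated strings.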
1101-1134: LoRA path marshaling diverges from the established `getCString` helper — also missing result log in `RemoveLora`.

`racLlmComponentLoadLora` and `racLlmComponentRemoveLora` call `GetStringUTFChars`/`ReleaseStringUTFChars` directly, while every other function in this file that needs a C-string from a `jstring` uses `getCString()`. Additionally, `racLlmComponentRemoveLora` logs nothing, while `racLlmComponentLoadLora` logs the result at line 1114.

♻️ Proposed refactor
```diff
-    const char* path = env->GetStringUTFChars(adapterPath, nullptr);
-    if (!path) {
-        return RAC_ERROR_INVALID_ARGUMENT;
-    }
-
-    LOGi("racLlmComponentLoadLora: handle=%lld, path=%s, scale=%.2f",
-         (long long)handle, path, (float)scale);
-
-    rac_result_t result = rac_llm_component_load_lora(
-        reinterpret_cast<rac_handle_t>(handle), path, static_cast<float>(scale));
-
-    env->ReleaseStringUTFChars(adapterPath, path);
+    std::string pathStr = getCString(env, adapterPath);
+    LOGi("racLlmComponentLoadLora: handle=%lld, path=%s, scale=%.2f",
+         (long long)handle, pathStr.c_str(), (float)scale);
+
+    rac_result_t result = rac_llm_component_load_lora(
+        reinterpret_cast<rac_handle_t>(handle), pathStr.c_str(), static_cast<float>(scale));
```

```diff
-    const char* path = env->GetStringUTFChars(adapterPath, nullptr);
-    if (!path) {
-        return RAC_ERROR_INVALID_ARGUMENT;
-    }
-
-    rac_result_t result = rac_llm_component_remove_lora(
-        reinterpret_cast<rac_handle_t>(handle), path);
-
-    env->ReleaseStringUTFChars(adapterPath, path);
-    return static_cast<jint>(result);
+    std::string pathStr = getCString(env, adapterPath);
+    rac_result_t result = rac_llm_component_remove_lora(
+        reinterpret_cast<rac_handle_t>(handle), pathStr.c_str());
+
+    LOGi("racLlmComponentRemoveLora result=%d", result);
+    return static_cast<jint>(result);
```
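The value of a `getCString`-style helper is that the acquire/release pair happens in exactly one place, so no return path can leak the JNI buffer. A Python sketch of that discipline, with a fake environment (names `FakeJNIEnv` and `get_c_string` are illustrative) that counts outstanding acquisitions:

```python
class FakeJNIEnv:
    """Minimal stand-in tracking GetStringUTFChars/ReleaseStringUTFChars balance."""
    def __init__(self):
        self.outstanding = 0

    def get_string_utf_chars(self, jstr):
        self.outstanding += 1
        return jstr

    def release_string_utf_chars(self, jstr, chars):
        self.outstanding -= 1

def get_c_string(env, jstr):
    # Mirrors the getCString() helper's contract: acquire, copy into an
    # owned string, release immediately — so callers never pair
    # Get/Release manually on every early-return path.
    chars = env.get_string_utf_chars(jstr)
    try:
        return str(chars)  # the copy the caller keeps
    finally:
        env.release_string_utf_chars(jstr, chars)

env = FakeJNIEnv()
path = get_c_string(env, "/models/adapter.gguf")
print(path)             # /models/adapter.gguf
print(env.outstanding)  # 0 — no leaked Get/Release pair
```

With the raw `GetStringUTFChars` pattern, every `return` between acquire and release is a potential leak; centralizing the copy is what makes the refactor above safe as well as shorter.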
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp` around lines 1101 - 1134, Both functions bypass the file's getCString helper and racLlmComponentRemoveLora lacks a result log; replace direct env->GetStringUTFChars/ReleaseStringUTFChars usage with the existing getCString(env, adapterPath) helper in both racLlmComponentLoadLora and racLlmComponentRemoveLora (use the returned C-string per the helper's contract and do not call ReleaseStringUTFChars manually), and add a LOGi call in racLlmComponentRemoveLora to log the result (similar to racLlmComponentLoadLora) after rac_llm_component_remove_lora returns while keeping existing parameter validation (handle == 0 / adapterPath == nullptr) and return the static_cast<jint>(result).
...droid/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraApplication.kt
...es/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraScreen.kt
examples/android/RunAnyWhereLora/gradle/wrapper/gradle-wrapper.properties
```cpp
bool LlamaCppTextGeneration::load_lora_adapter(const std::string& adapter_path, float scale) {
    std::lock_guard<std::mutex> lock(mutex_);

    if (!model_loaded_ || !model_) {
        LOGE("Cannot load LoRA adapter: model not loaded");
        return false;
    }

    // Check if adapter already loaded
    for (const auto& entry : lora_adapters_) {
        if (entry.path == adapter_path) {
            LOGE("LoRA adapter already loaded: %s", adapter_path.c_str());
            return false;
        }
    }

    LOGI("Loading LoRA adapter: %s (scale=%.2f)", adapter_path.c_str(), scale);

    // Load adapter against model
    llama_adapter_lora* adapter = llama_adapter_lora_init(model_, adapter_path.c_str());
    if (!adapter) {
        LOGE("Failed to load LoRA adapter from: %s", adapter_path.c_str());
        return false;
    }

    // Store adapter entry
    LoraAdapterEntry entry;
    entry.adapter = adapter;
    entry.path = adapter_path;
    entry.scale = scale;
    entry.applied = false;
    lora_adapters_.push_back(std::move(entry));

    // Recreate context so the new adapter is visible
    if (!recreate_context()) {
        // Remove the adapter entry we just added on failure
        lora_adapters_.pop_back();
        return false;
    }

    // Apply all loaded adapters to the new context
    if (!apply_lora_adapters()) {
        lora_adapters_.pop_back();
        return false;
    }

    // Clear KV cache after adapter changes
    llama_memory_clear(llama_get_memory(context_), true);

    LOGI("LoRA adapter loaded and applied: %s (%zu total adapters)",
         adapter_path.c_str(), lora_adapters_.size());
    return true;
}
```
🧩 Analysis chain
🏁 Script executed:
```shell
# First, let's examine the actual code around the flagged lines
fd "llamacpp_backend.cpp" --type f | head -1 | xargs -I {} sh -c 'wc -l "{}"'
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 144
🏁 Script executed:
```shell
# Get the specific lines from the file to see the actual implementation
fd "llamacpp_backend.cpp" --type f | head -1 | xargs -I {} sh -c 'sed -n "827,929p" "{}"'
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 3240
🏁 Script executed:
```shell
# Also check lines 367-372 for unload_model_internal context
fd "llamacpp_backend.cpp" --type f | head -1 | xargs -I {} sh -c 'sed -n "360,380p" "{}"'
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 598
🏁 Script executed:
```shell
# Check if there's a header file with adapter-related declarations
fd "llamacpp.*\.h" --type f | xargs grep -l "LoraAdapter\|lora_adapter" 2>/dev/null | head -5
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 138
🌐 Web query:
llama.cpp llama_adapter_lora_init llama_adapter_lora_free memory management API
💡 Result:
In llama.cpp’s C API, a LoRA adapter is an opaque heap object (struct llama_adapter_lora *) with explicit create/free:
- Create / load (allocates):
  `struct llama_adapter_lora * llama_adapter_lora_init(struct llama_model * model, const char * path_lora);`
  Returns NULL on failure. [1]
- Free (deallocates):
  `void llama_adapter_lora_free(struct llama_adapter_lora * adapter);` [1]
- Attach/detach to a context (does not transfer ownership):
  `int32_t llama_set_adapter_lora(struct llama_context * ctx, struct llama_adapter_lora * adapter, float scale);` (adds to the context; "will not modify model's weight") [1]
  `int32_t llama_rm_adapter_lora(struct llama_context * ctx, struct llama_adapter_lora * adapter);` (removes that adapter; returns -1 if not present) [1]
  `void llama_clear_adapter_lora(struct llama_context * ctx);` (removes all adapters) [1]
Practical memory-management rule
Keep the adapter alive as long as it may be used by any llama_context, and typically:
- `llama_adapter_lora_init()`
- `llama_set_adapter_lora()`
- when done: `llama_rm_adapter_lora()` / `llama_clear_adapter_lora()`
- finally: `llama_adapter_lora_free()`
This is also how higher-level bindings model it (initialize → use → register a finalizer that calls llama_adapter_lora_free). [1][2]
Sources:
[1] llama-cpp-python API reference (low-level bindings + docstrings for llama_adapter_lora_init/free, llama_set_adapter_lora, llama_rm_adapter_lora, llama_clear_adapter_lora) (llama-cpp-python.readthedocs.io)
[2] llama-cpp-hs Llama.Adapter docs (wraps llama_adapter_lora_init and returns a managed adapter object) (hackage-content.haskell.org)
Citations:
- 1: https://llama-cpp-python.readthedocs.io/en/latest/api-reference/
- 2: https://hackage-content.haskell.org/package/llama-cpp-hs-0.1.0.0/docs/Llama-Adapter.html?utm_source=openai
🏁 Script executed:
```shell
# Check if llama_adapter_lora_free is called anywhere in the codebase
fd "llamacpp" --type f -e cpp -e h | xargs rg "llama_adapter_lora_free" 2>/dev/null
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 58
🏁 Script executed:
```shell
# Check the header file to see LoraAdapterEntry definition
fd "llamacpp_backend.h" --type f | head -1 | xargs -I {} sh -c 'grep -A 10 "struct LoraAdapterEntry\|class LoraAdapterEntry" "{}"'
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 415
LoRA adapter memory leak across multiple code paths — requires immediate fix.
The llama.cpp C API requires explicit llama_adapter_lora_free() to deallocate adapters created by llama_adapter_lora_init(). Your code never calls this function, creating definite resource leaks:
- Error paths in `load_lora_adapter()` (lines 862–864, 868–869): `pop_back()` discards the adapter pointer without freeing it when `recreate_context()` or `apply_lora_adapters()` fails.
- Normal operation in `remove_lora_adapter()` (line 905): `erase(it)` removes the entry from the vector without freeing the adapter — every adapter removal leaks.
- `clear_lora_adapters()` (line 925): `clear()` discards all adapter pointers without freeing them.
- `unload_model_internal()` (lines 372–373): `clear()` discards adapters without freeing.
The comment at line 904 stating "adapter memory freed with the model" is incorrect per the llama.cpp API — adapters are not automatically freed with the model and must be explicitly deallocated via llama_adapter_lora_free().
Each leaked adapter remains allocated in memory for the lifetime of the model, causing cumulative memory exhaustion as adapters are loaded and not properly freed.
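The ownership discipline the review asks for — every `llama_adapter_lora_init` balanced by a `llama_adapter_lora_free`, including on rollback — can be sketched in Python with a heap tracker standing in for the native allocator. `NativeHeap` and `AdapterRegistry` are illustrative names, and the sketch assumes (as the reviewer argues) that adapters require an explicit free:

```python
class NativeHeap:
    """Tracks live allocations, standing in for
    llama_adapter_lora_init / llama_adapter_lora_free."""
    def __init__(self):
        self.live = 0

    def adapter_init(self, path):
        self.live += 1
        return {"path": path}

    def adapter_free(self, adapter):
        self.live -= 1

class AdapterRegistry:
    def __init__(self, heap):
        self.heap = heap
        self.entries = []

    def load(self, path, scale, apply_ok=True):
        adapter = self.heap.adapter_init(path)
        self.entries.append({"adapter": adapter, "path": path, "scale": scale})
        if not apply_ok:
            # Rollback path: free before discarding, not a bare pop_back().
            entry = self.entries.pop()
            self.heap.adapter_free(entry["adapter"])
            return False
        return True

    def remove(self, path):
        for i, e in enumerate(self.entries):
            if e["path"] == path:
                self.heap.adapter_free(e["adapter"])  # free before erase
                del self.entries[i]
                return True
        return False

    def clear(self):
        # Free every adapter before dropping the entries (the fix for
        # clear_lora_adapters() and unload_model_internal()).
        for e in self.entries:
            self.heap.adapter_free(e["adapter"])
        self.entries.clear()

heap = NativeHeap()
reg = AdapterRegistry(heap)
reg.load("a.gguf", 1.0)
reg.load("b.gguf", 0.5, apply_ok=False)  # failed apply rolls back cleanly
reg.load("c.gguf", 0.8)
reg.remove("a.gguf")
reg.clear()
print(heap.live)  # 0 — every init matched by a free
```

The invariant to verify in the C++ code is the same: after any sequence of load/remove/clear/unload, the count of `init` calls equals the count of `free` calls.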
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp` around
lines 827 - 879, The code leaks llama_adapter_lora objects because
llama_adapter_lora_free() is never called; update load_lora_adapter(),
remove_lora_adapter(), clear_lora_adapters(), and unload_model_internal() to
explicitly free adapters: when you pop_back() an entry after recreate_context()
or apply_lora_adapters() failure in LlamaCppTextGeneration::load_lora_adapter(),
call llama_adapter_lora_free(entry.adapter) before removing it; in
remove_lora_adapter() call llama_adapter_lora_free(it->adapter) before erasing
the vector element; in clear_lora_adapters() iterate entries and call
llama_adapter_lora_free(entry.adapter) for each before clearing the vector; and
in unload_model_internal() ensure any remaining entries are freed the same way
(and null or clear fields afterward) so no adapter pointers are discarded
without deallocation.
...where-kotlin/src/commonMain/kotlin/com/runanywhere/sdk/public/extensions/RunAnywhere+LoRA.kt
- Replace raw -1 returns with RAC_ERROR_INVALID_HANDLE/RAC_ERROR_INVALID_ARGUMENT
to match codebase error handling conventions
- Use getCString() helper instead of raw GetStringUTFChars/ReleaseStringUTFChars
- Add missing result logging to racLlmComponentRemoveLora and racLlmComponentClearLora
- Use rac_free() instead of free() in racLlmComponentGetLoraInfo for consistency
- Clarify LoRA adapter memory ownership comments (adapters freed automatically
with model per llama.cpp b8011 API — llama_adapter_lora_free is deprecated)
Actionable comments posted: 2
🧹 Nitpick comments (1)
sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp (1)
1090-1160: All four prior review concerns are resolved in this iteration — LoRA functions look correct.
- Error guards use `RAC_ERROR_INVALID_HANDLE`/`RAC_ERROR_INVALID_ARGUMENT` throughout. ✓
- `getCString()` helper used consistently instead of raw `GetStringUTFChars`/`ReleaseStringUTFChars`. ✓
- `racLlmComponentRemoveLora` now logs its result (line 1127). ✓
- `racLlmComponentGetLoraInfo` calls `rac_free(json)` (line 1158), matching the project allocator convention. ✓

One minor optional: `racLlmComponentRemoveLora` and `racLlmComponentClearLora` have no function-entry `LOGi` (unlike `racLlmComponentLoadLora` at line 1104). Consider adding entry logs for operational symmetry.

🪵 Optional entry-level logging for Remove/Clear
JNIEXPORT jint JNICALL Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentRemoveLora(
    JNIEnv* env, jclass clazz, jlong handle, jstring adapterPath) {
    if (handle == 0) return RAC_ERROR_INVALID_HANDLE;
    if (adapterPath == nullptr) return RAC_ERROR_INVALID_ARGUMENT;
    std::string path = getCString(env, adapterPath);
+   LOGi("racLlmComponentRemoveLora: handle=%lld, path=%s", (long long)handle, path.c_str());
    rac_result_t result = rac_llm_component_remove_lora(

JNIEXPORT jint JNICALL Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentClearLora(
    JNIEnv* env, jclass clazz, jlong handle) {
    if (handle == 0) return RAC_ERROR_INVALID_HANDLE;
+   LOGi("racLlmComponentClearLora: handle=%lld", (long long)handle);
    rac_result_t result = rac_llm_component_clear_lora(reinterpret_cast<rac_handle_t>(handle));

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp` around lines 1090 - 1160, Add entry-level LOGi calls at the start of Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentRemoveLora and Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentClearLora to match racLlmComponentLoadLora; specifically, in racLlmComponentRemoveLora log the handle and adapterPath string (use the existing getCString(env, adapterPath) value or log path.c_str()) before calling rac_llm_component_remove_lora, and in racLlmComponentClearLora log the handle value at the top before calling rac_llm_component_clear_lora so both functions have symmetrical entry logging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp`:
- Around line 813-825: apply_lora_adapters() currently leaves lora_adapters_ and
the context inconsistent on a partial failure; update
LlamaCppTextGeneration::apply_lora_adapters to perform a rollback when
llama_set_adapter_lora returns non-zero: call llama_clear_adapter_lora for each
adapter that was successfully applied, reset those entries' applied flags to
false, and ensure the failing adapter's entry state is consistent before
returning false (also keep load_lora_adapter/pop_back semantics intact or adjust
to remove the correct entry so lora_adapters_ and context remain in sync).
- Around line 861-871: recreate_context() currently frees the old
context_/sampler_ before creating the new one, leaving context_=nullptr and
model_loaded_=true if llama_init_from_model fails; modify recreate_context() to
allocate/create the new context and sampler first (or save copies of the old
context_/sampler_), and only free the old ones after the new context is
confirmed valid; on any failure (llama_init_from_model or
apply_lora_adapters()), call unload_model_internal() to set model_loaded_ =
false and restore the old context_/sampler_ (or keep the original intact), and
ensure any adapter created by llama_adapter_lora_init is properly freed
(llama_adapter_lora_free or equivalent) before popping from lora_adapters_ to
avoid leaks.
---
Duplicate comments:
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp`:
- Around line 367-372: The code currently assumes lora adapters are freed with
the model but we must verify and, if not, explicitly free them: inspect the
vendored llama.cpp headers for a declaration of llama_adapter_lora_free and any
build/version macros (search for "llama_adapter_lora_free" and
"LLAMA_BUILD_NUMBER"/"build_number"/"LLAMA_VERSION_PATCH"); if the function
exists and is not deprecated in this vendored tree, update all cleanup paths
(unload_model_internal, remove_lora_adapter, clear_lora_adapters, and failure
paths in load_lora_adapter) to call llama_adapter_lora_free(adapter) for each
adapter before clearing lora_adapters_ and ensure no double-free when model
teardown also frees them (add guards or null out pointers after free).
In `@sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp`:
- Line 131: Replace any uses of AttachCurrentThread that pass a void** cast with
the C++ NDK form that accepts a JNIEnv**: update calls such as
g_jvm->AttachCurrentThread(&env, nullptr) in getJNIEnv,
llm_stream_callback_token, model_assignment_http_get_callback, and
vlm_stream_callback_token so they pass &env (JNIEnv**) directly and remove
reinterpret_cast<void**> or equivalent casts; ensure the function signatures and
local env variable types are JNIEnv* so the call matches the C++
AttachCurrentThread(&env, nullptr) prototype consistently across all four call
sites.
---
Nitpick comments:
In `@sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp`:
- Around line 1090-1160: Add entry-level LOGi calls at the start of
Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentRemoveLora
and
Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentClearLora
to match racLlmComponentLoadLora; specifically, in racLlmComponentRemoveLora log
the handle and adapterPath string (use the existing getCString(env, adapterPath)
value or log path.c_str()) before calling rac_llm_component_remove_lora, and in
racLlmComponentClearLora log the handle value at the top before calling
rac_llm_component_clear_lora so both functions have symmetrical entry logging.
bool LlamaCppTextGeneration::apply_lora_adapters() {
    for (auto& entry : lora_adapters_) {
        int32_t result = llama_set_adapter_lora(context_, entry.adapter, entry.scale);
        if (result != 0) {
            LOGE("Failed to apply LoRA adapter: %s (error=%d)", entry.path.c_str(), result);
            entry.applied = false;
            return false;
        }
        entry.applied = true;
        LOGI("Applied LoRA adapter: %s (scale=%.2f)", entry.path.c_str(), entry.scale);
    }
    return true;
}
Missing rollback in apply_lora_adapters() leaves context and tracking in inconsistent state on partial failure.
If llama_set_adapter_lora fails for adapter at index K (where K is not the last entry):
- Adapters `[0..K-1]` are applied to the new context (`applied = true`).
- Adapter K remains unapplied (`applied = false`) and stays in `lora_adapters_`.
- `load_lora_adapter()` then calls `lora_adapters_.pop_back()`, which removes only the newly-added last entry, not the failing entry K.
- The context now has adapters `[0..K-1]` applied, but `lora_adapters_` has entries `[0..K]` with mixed `applied` flags — a persistent inconsistency.
At a minimum, on failure, the already-applied adapters should be rolled back with llama_clear_adapter_lora before returning false, and a consistent error state should be established.
🛡️ Proposed defensive approach
bool LlamaCppTextGeneration::apply_lora_adapters() {
for (auto& entry : lora_adapters_) {
int32_t result = llama_set_adapter_lora(context_, entry.adapter, entry.scale);
if (result != 0) {
LOGE("Failed to apply LoRA adapter: %s (error=%d)", entry.path.c_str(), result);
entry.applied = false;
+ // Roll back all adapters applied so far to restore a consistent state
+ llama_clear_adapter_lora(context_);
+ for (auto& e : lora_adapters_) {
+ e.applied = false;
+ }
return false;
}
entry.applied = true;
LOGI("Applied LoRA adapter: %s (scale=%.2f)", entry.path.c_str(), entry.scale);
}
return true;
}

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp` around
lines 813 - 825, apply_lora_adapters() currently leaves lora_adapters_ and the
context inconsistent on a partial failure; update
LlamaCppTextGeneration::apply_lora_adapters to perform a rollback when
llama_set_adapter_lora returns non-zero: call llama_clear_adapter_lora for each
adapter that was successfully applied, reset those entries' applied flags to
false, and ensure the failing adapter's entry state is consistent before
returning false (also keep load_lora_adapter/pop_back semantics intact or adjust
to remove the correct entry so lora_adapters_ and context remain in sync).
    if (!recreate_context()) {
        // Remove the adapter entry we just added on failure
        lora_adapters_.pop_back();
        return false;
    }

    // Apply all loaded adapters to the new context
    if (!apply_lora_adapters()) {
        lora_adapters_.pop_back();
        return false;
    }
recreate_context() failure leaves the instance in a permanently broken state (context_ = nullptr, model_loaded_ = true).
Inside recreate_context(), the old context_ and sampler_ are unconditionally freed before the new context is created (Lines 783-786). If llama_init_from_model fails:
- `context_` and `sampler_` are nullptr.
- `model_loaded_` remains true.
- `is_ready()` returns `false` permanently.
- The public `generate_*` APIs silently fail; callers receive no indication that recovery requires a full model reload.
Additionally, in the apply_lora_adapters() failure path at Line 868-871, the adapter object allocated by llama_adapter_lora_init has been pushed and then popped from the vector — if the deprecation of llama_adapter_lora_free cannot be confirmed (see previous comment), this is an additional leak.
Recommended fixes:
- Save and restore the old context/sampler (or attempt the new context before freeing the old one).
- If recreation truly fails, call `unload_model_internal()` to set `model_loaded_ = false` so callers can detect and handle the failure.
🛡️ Minimal mitigation for the broken-state issue
- // Recreate context so the new adapter is visible
- if (!recreate_context()) {
- // Remove the adapter entry we just added on failure
- lora_adapters_.pop_back();
- return false;
- }
+ // Recreate context so the new adapter is visible
+ if (!recreate_context()) {
+ lora_adapters_.pop_back();
+ // context_ is now null; force a clean unload so callers can detect the failure
+ LOGE("Context recreation failed; forcing model unload to maintain consistent state");
+ unload_model_internal();
+ return false;
+ }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp` around
lines 861 - 871, recreate_context() currently frees the old context_/sampler_
before creating the new one, leaving context_=nullptr and model_loaded_=true if
llama_init_from_model fails; modify recreate_context() to allocate/create the
new context and sampler first (or save copies of the old context_/sampler_), and
only free the old ones after the new context is confirmed valid; on any failure
(llama_init_from_model or apply_lora_adapters()), call unload_model_internal()
to set model_loaded_ = false and restore the old context_/sampler_ (or keep the
original intact), and ensure any adapter created by llama_adapter_lora_init is
properly freed (llama_adapter_lora_free or equivalent) before popping from
lora_adapters_ to avoid leaks.
@coderabbitai please review

@greptile please review

@ellipsis please review
PR Description Image: The image shows a mobile phone screenshot of the "RunAnywhere LoRA" Android app demonstrating the LoRA adapter functionality.
This is a substantial PR adding comprehensive LoRA (Low-Rank Adaptation) adapter support across the entire SDK stack.

I'll conduct a thorough review focusing on API design, memory safety, threading, error handling, and integration patterns.

✅ Actions performed

Review triggered.
@Siddhesh2377 merging this to dev branch to take out these changes and work on it for Swift :)
* feat(lora): add LoRA adapter support across SDK + demo app
Implement LoRA (Low-Rank Adaptation) adapter hot-swapping for llama.cpp
backend across all 6 SDK layers (C++ -> C API -> Component -> JNI ->
Kotlin Bridge -> Kotlin Public API).
- Add load/remove/clear/query LoRA adapter operations
- Use vtable dispatch in component layer to decouple librac_commons
from librac_backend_llamacpp (fixes linker errors)
- Add LoRA vtable entries to rac_llm_service_ops_t
- Fix AttachCurrentThread cast for Android NDK C++ JNI build
- Add RunAnyWhereLora Android demo app with Material 3 Q&A UI
- Add comprehensive implementation docs with C/C++ API reference
* feat(ci): add selectable build targets to Build All workflow + fix Swift concurrency errors
Rewrite build-all-test.yml with 9 boolean checkbox inputs so each build
target can be toggled independently from the GitHub Actions UI:
- C++ Android Backends (arm64-v8a, armeabi-v7a, x86_64 matrix)
- C++ iOS Backends (XCFramework)
- Kotlin SDK (JVM + Android)
- Swift SDK (iOS/macOS)
- Web SDK (TypeScript)
- Flutter SDK (Dart analyze via Melos)
- React Native SDK (TypeScript via Lerna)
- Android Example Apps (RunAnywhereAI + RunAnyWhereLora)
- IntelliJ Plugin
Fix two Swift strict-concurrency errors that fail the Swift SDK build:
- LiveTranscriptionSession: add @unchecked Sendable (safe because class
is @mainactor, all access serialized)
- RunAnywhere+VisionLanguage: add Sendable conformance to rac_vlm_image_t
so the C struct can cross the Task boundary in the streaming builder;
simplify StreamingCollector to start timing at init
* fix(swift): resolve strict concurrency errors in LiveTranscriptionSession and VLM streaming
LiveTranscriptionSession.swift:
- Replace [weak self] captures with strong `let session = self` before
closures to avoid captured var in @Sendable/@task contexts (class is
@mainactor @unchecked Sendable so strong ref is safe, bounded by
stream lifecycle)
- Wrap deprecated startStreamingTranscription call in @available helper
to silence deprecation warning until migration to transcribeStream API
RunAnywhere+VisionLanguage.swift:
- Add `let capturedCImage = cImage` before AsyncThrowingStream closure
so the Task captures an immutable let instead of a mutable var
- Add `extension rac_vlm_image_t: @unchecked Sendable {}` for the C
struct to cross Task concurrency boundaries safely
- Simplify StreamingCollector to initialize startTime at init instead
of requiring a separate async start() call
* fix(jni): address CodeRabbit review findings in LoRA JNI functions
- Replace raw -1 returns with RAC_ERROR_INVALID_HANDLE/RAC_ERROR_INVALID_ARGUMENT
to match codebase error handling conventions
- Use getCString() helper instead of raw GetStringUTFChars/ReleaseStringUTFChars
- Add missing result logging to racLlmComponentRemoveLora and racLlmComponentClearLora
- Use rac_free() instead of free() in racLlmComponentGetLoraInfo for consistency
- Clarify LoRA adapter memory ownership comments (adapters freed automatically
with model per llama.cpp b8011 API — llama_adapter_lora_free is deprecated)
* Added Lora + Fixed Build-All-Work-Flow (#389): squashed merge of the feat(lora), feat(ci), fix(swift), and fix(jni) commits listed above.

* Add lora ios (#407)
  * ios initial changes
  * minimal sample needed to test lora
  * updating docs
  * addressed the comments

* Prototype for Optimised RAG

  First version for Optimised RAG. Not polished yet. Once tested, I'll micro-optimise, bench, and finish.
* Aligning / upstream update for dev (#442)
  * chore: add AGENTS.md with Cursor Cloud specific instructions
  * chore: update AGENTS.md with Linux backend build and voice assistant instructions
  * minor fixes
  * fix: Android app UI improvements, SDK concurrency bug fixes, and LoRA download support

    Android App:
    - Redesign intro screen with minimal layout and linear progress bar
    - Improve VLM screen: use shared ModelRequiredOverlay, theme-consistent colors, fix button clipping (replace IconButton with clickable Column)
    - Fix keyboard handling: hide bottom bar when keyboard open, apply imePadding correctly
    - Add scrollable auto-scroll prompt suggestions in ChatScreen
    - Add shimmer typing indicator with "Thinking..." label
    - Fix 9 app-level bugs: think tag leak, CancellationException handling, VoiceAssistant lifecycle, ConversationStore ANR, TTS sample rate parsing, LoRA download mutex deadlock

    KMP SDK (10 bug fixes):
    - Fix cancel() deadlock: move JNI calls outside synchronized(lock) in CppBridgeLLM
    - Fix orphaned CoroutineScope leak in generateStream using callbackFlow
    - Fix initializeServices() holding lock across network I/O
    - Fix loraDownloadDir lazy val caching wrong path before pathProvider set
    - Fix setBaseDirCallback TOCTOU race condition
    - Add @volatile to DownloadTask mutable fields for thread visibility
    - Fix unescapeJson() replacement order (process \\\\ before \\n)
    - Add downloadLock for atomic cancel/pause/resume operations
    - Fix checkNativeLibrary() to actually call native method
    - Add ensureServicesReady() to generateStream
    - Add LoRA adapter download/delete/path SDK functions

    Known issue: Tool-calling may show unexpected behavior when a LoRA adapter is applied — the model detects the tool call but responds with "I can assist with this" instead of executing it. Tested with Qwen 2.5 0.5B. This only occurs when the model has a LoRA adapter loaded.

  * fix(tts): scan WAV data chunk instead of hardcoding 44-byte header offset

    WAV files with extra chunks (LIST, fact, bext) had metadata bytes fed into AudioTrack as PCM, causing distorted playback. Now walks the chunk structure to find the actual "data" chunk start.

  * fix: Android app UI bug fixes, responsive dimensions, LoRA example prompts, and darker dark mode
    - Fix nested verticalScroll inside LazyColumn (ThinkingToggle) causing broken scroll
    - Fix weight(1f) + verticalScroll overflow in VLMScreen DescriptionPanel
    - Add verticalScroll to MoreHubScreen to prevent clipping on small screens
    - Add imePadding to ConversationListSheet so keyboard doesn't cover search
    - Fix auto-scroll wrap logic in EmptyStateView using canScrollForward
    - Replace collectAsState with collectAsStateWithLifecycle in 3 screens
    - Replace deprecated STTMode.values() with .entries
    - Replace hardcoded Color.Gray with AppColors.statusGray for dark mode contrast
    - Remove redundant Color.White inside buttons with contentColor set
    - Replace hardcoded 300.dp bubble width with responsive Dimensions.messageBubbleMaxWidth
    - Add accessibility semantics role to VLMScreen clickable Column
    - Disable Image Generation card (placeholder feature)
    - Add responsive rDp/rSp utilities and convert Dimensions/AppSpacing to use them
    - Add LoRA example prompts with copy button to adapter picker and manager screens
    - Darken dark mode background colors

  * fix: Android app bug fixes - race conditions, ANR, pixel corruption, scroll, and memory safety
    - VoiceAssistantViewModel: replace runBlocking with GlobalScope.launch in onCleared to prevent ANR
    - VoiceAssistantViewModel: add synchronized audioBufferLock for thread-safe ByteArrayOutputStream access
    - VoiceAssistantViewModel: scan WAV data chunk instead of hardcoding 44-byte header offset
    - ConversationStore: use MutableStateFlow.update {} for atomic compare-and-set on all mutations
    - ToolSettingsViewModel: clear static singleton in onCleared to prevent stale references
    - VLMViewModel: advance rgbIdx by 3 in else branch to prevent pixel corruption on out-of-bounds skip
    - ChatViewModel: use CopyOnWriteArrayList for tokensPerSecondHistory thread safety
    - VoiceAssistantParticleView: remove wasted transparent drawPoints call
    - RunAnywhereApplication: capture volatile initializationError to local val before null check
    - VLMScreen: add verticalScroll to description panel for long text overflow
    - ResponsiveUtils: add designWidth <= 0 guard to prevent division by zero in rDp/rSp

  ---------
  Co-authored-by: Cursor Agent <cursoragent@cursor.com>
  Co-authored-by: Sanchit Monga <sanchitmonga22@gmail.com>
  Co-authored-by: Sanchit Monga <sm3468@g.rit.edu>
  Co-authored-by: Siddhesh2377 <siddheshsonar2377@gmail.com>
  Co-authored-by: RunAnywhere <>

* RAG rewrite

* Refactor RAG terminology to "pipeline" across scripts and source files for consistency. Update comments and logging messages to reflect the change from "backend" to "pipeline". Remove unused React Native package files related to RAG.

* Complete RAG Flutter implementation (full state) (#419)

  RAG Flutter SDK. There are a bunch of UI/UX issues with this, like the button not loading and the round download spinner not rendering properly, but the RAG pipeline works. Also, ONNX and RAG gave duplicate-symbol errors when built together, since RAG requires ONNX; a near-term task is to include ONNX in core as well, and perhaps add a conditional.

* Optimised RAG + implement a hybrid search

* fixed tnc block error.

* Changed batching parameters, similarity threshold, and optimised embedding memory+speed output

* fix(rag): close anonymous namespace in rag_chunker.cpp to fix compilation

  The anonymous namespace wrapping perform_recursive_chunking was never closed, causing DocumentChunker member definitions to be inside the anonymous namespace — resulting in "cannot define or redeclare" errors.

  Made-with: Cursor

* fix: remove stale runanywhere-core-rag module references from Android app

  RAG was moved into the core SDK but the Android example app still referenced the deleted module, breaking the build.

* fixing ios/swift - removing ragbackend - refactor

* fixing the tts for platform in voice agent

* lora fixes - to match up with kotlin

* refactor: fold RAG backend into rac_commons, remove separate RAG binary
  - Changed rac_backend_rag from SHARED/STATIC to OBJECT library (CMake)
  - RAG objects folded into rac_commons at compile time
  - Moved ONNX embedding provider to rac_backend_onnx to break shared-lib cycle
  - ONNX backend now registers embeddings provider during rac_backend_onnx_register()
  - Removed RAG as separate backend from all build scripts and SDK configs
  - Updated Android, Kotlin, Flutter, React Native build/distribution pipelines
  - RAG JNI bridge (librac_backend_rag_jni.so) remains as thin wrapper linking rac_commons

* fixing rn for rag + some permissions for vlm + npm dependencies + archive logic improved

* refactor for react - tts is causing trouble - refactoring that now - will follow with flutter once done

---------
Co-authored-by: Siddhesh <DARKWILDHACKER@gmail.com>
Co-authored-by: Sanchit Monga <sanchitmonga22@gmail.com>
Co-authored-by: VyasGuru <71374747+VyasGuru@users.noreply.github.com>
Co-authored-by: Sanchit Monga <sm3468@g.rit.edu>
Co-authored-by: Siddhesh2377 <siddheshsonar2377@gmail.com>
Description
Brief description of the changes made.
Type of Change
Testing
Platform-Specific Testing (check all that apply)
Swift SDK / iOS Sample:
Kotlin SDK / Android Sample:
Flutter SDK / Flutter Sample:
React Native SDK / React Native Sample:
Web SDK / Web Sample:
Labels
Please add the appropriate label(s): New Feat (LoRA in Kotlin SDK)
SDKs:
- Swift SDK - Changes to Swift SDK (sdk/runanywhere-swift)
- Kotlin SDK - Changes to Kotlin SDK (sdk/runanywhere-kotlin)
- Flutter SDK - Changes to Flutter SDK (sdk/runanywhere-flutter)
- React Native SDK - Changes to React Native SDK (sdk/runanywhere-react-native)
- Web SDK - Changes to Web SDK (sdk/runanywhere-web)
- Commons - Changes to shared native code (sdk/runanywhere-commons)

Sample Apps:
- iOS Sample - Changes to iOS example app (examples/ios)
- Android Sample - Changes to Android example app (examples/android)
- Flutter Sample - Changes to Flutter example app (examples/flutter)
- React Native Sample - Changes to React Native example app (examples/react-native)
- Web Sample - Changes to Web example app (examples/web)

Checklist
Screenshots
Attach relevant UI screenshots for changes (if applicable):
Important
Added LoRA adapter support in Kotlin SDK and updated build workflow for comprehensive platform verification.
- LoRAAdapterConfig and LoRAAdapterInfo data classes in LLMTypes.kt.
- loadLoraAdapter, removeLoraAdapter, clearLoraAdapters, and getLoadedLoraAdapters functions in RunAnywhere+LoRA.kt and RunAnywhere+LoRA.jvmAndroid.kt.
- RunAnywhereBridge.kt extended for LoRA operations.
- build-all-test.yml updated to include platform-specific build options and artifact uploads.
- RunAnyWhereLora Android example app demonstrating LoRA adapter usage.

This description was created by
for 638da00.
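For readers without the diff handy, here is a rough, self-contained C++ sketch of the backend-side bookkeeping behind those Kotlin functions. LoraAdapterEntry and the operation names come from the PR; the bodies are illustrative stubs, not the real llamacpp_backend.cpp (the actual code calls into llama.cpp and recreates the inference context on every change):

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Illustrative stand-in for the opaque llama.cpp adapter handle.
struct llama_adapter_lora;

// Mirrors the LoraAdapterEntry struct the PR adds to LlamaCppTextGeneration.
struct LoraAdapterEntry {
    std::string path;
    float scale = 1.0f;
    llama_adapter_lora* handle = nullptr;  // owned by llama.cpp in the real code
};

class LoraRegistry {
public:
    bool load(const std::string& path, float scale) {
        std::lock_guard<std::mutex> lock(mutex_);
        // The real backend calls llama_adapter_lora_init() here and fails if
        // it returns null; this sketch just records the entry.
        adapters_.push_back({path, scale, nullptr});
        return true;
    }
    bool remove(const std::string& path) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = std::remove_if(adapters_.begin(), adapters_.end(),
            [&](const LoraAdapterEntry& e) { return e.path == path; });
        if (it == adapters_.end()) return false;  // not loaded
        adapters_.erase(it, adapters_.end());
        return true;  // the real backend also recreates the llama context here
    }
    void clear() {
        std::lock_guard<std::mutex> lock(mutex_);
        adapters_.clear();
    }
    std::vector<LoraAdapterEntry> list() const {  // backs getLoadedLoraAdapters
        std::lock_guard<std::mutex> lock(mutex_);
        return adapters_;
    }
private:
    mutable std::mutex mutex_;
    std::vector<LoraAdapterEntry> adapters_;
};
```

The mutex matters because generation and adapter swaps can race; the PR's commit notes call out the same "lock mutex" step before touching the adapter list.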
Greptile Summary
This PR adds comprehensive LoRA (Low-Rank Adaptation) adapter support to the Kotlin SDK and updates the build workflow for better platform verification.
LoRA Adapter Implementation:
- llama.cpp backend (llamacpp_backend.cpp) with load, remove, clear, and info operations
- Plumbed through the component layer (rac_llm_component.h), JNI bridge (runanywhere_commons_jni.cpp), and Kotlin SDK (RunAnywhere+LoRA.kt)
- LoRAAdapterConfig and LoRAAdapterInfo data classes in Kotlin with proper validation

Build Workflow Improvements:
- build-all-test.yml with per-platform checkboxes for selective building

Architecture:
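The per-platform checkboxes are plain `workflow_dispatch` boolean inputs gating each job. A minimal sketch of the pattern, with illustrative input names and build commands rather than the exact contents of build-all-test.yml:

```yaml
on:
  workflow_dispatch:
    inputs:
      build_kotlin_sdk:
        description: "Kotlin SDK (JVM + Android)"
        type: boolean        # renders as a checkbox in the Actions UI
        default: true

jobs:
  kotlin-sdk:
    if: ${{ inputs.build_kotlin_sdk }}   # job is skipped when unchecked
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./gradlew build             # actual Gradle invocation assumed
```

With one such input per target, any subset of the nine builds can be run from the GitHub Actions UI without editing the workflow.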
Confidence Score: 5/5
Important Files Changed
- Adds LoraAdapterEntry struct and LoRA management methods to the LlamaCppTextGeneration class. Clean API design matching C++ patterns.
- Adds load_lora, remove_lora, clear_lora, and get_lora_info. Well-documented C API.
- Adds LoRAAdapterConfig and LoRAAdapterInfo data classes with proper validation and serialization support.
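The vtable indirection that decouples librac_commons from librac_backend_llamacpp can be sketched as below. The rac_* names follow the PR, but the exact signatures and error codes are assumptions; a toy backend stands in for the llama.cpp one:

```cpp
#include <cassert>

typedef int rac_result_t;
#define RAC_SUCCESS 0
#define RAC_ERR_NOT_SUPPORTED (-1)

// Hypothetical shape of the LoRA entries added to rac_llm_service_ops_t.
// The component layer sees only function pointers, so librac_commons never
// references librac_backend_llamacpp symbols directly (the linker-error fix).
typedef struct rac_llm_service_ops_t {
    rac_result_t (*load_lora)(void* impl, const char* path, float scale);
    rac_result_t (*clear_lora)(void* impl);
} rac_llm_service_ops_t;

typedef struct rac_llm_service_t {
    const rac_llm_service_ops_t* ops;
    void* impl;  // backend-specific state
} rac_llm_service_t;

// Component-layer dispatch: null-check, then forward through the vtable.
static rac_result_t rac_llm_component_load_lora(rac_llm_service_t* svc,
                                                const char* path, float scale) {
    if (!svc || !svc->ops || !svc->ops->load_lora) return RAC_ERR_NOT_SUPPORTED;
    return svc->ops->load_lora(svc->impl, path, scale);
}

// Toy backend implementation standing in for the llama.cpp backend.
static int g_loaded = 0;
static rac_result_t fake_load_lora(void*, const char*, float) {
    ++g_loaded;
    return RAC_SUCCESS;
}
static rac_result_t fake_clear_lora(void*) {
    g_loaded = 0;
    return RAC_SUCCESS;
}
static const rac_llm_service_ops_t fake_ops = {fake_load_lora, fake_clear_lora};
```

A backend that predates the new entries simply leaves the pointers null, and the component returns a not-supported error instead of crashing or failing to link.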
```mermaid
sequenceDiagram
    participant App as Android App
    participant KotlinAPI as RunAnywhere+LoRA.kt
    participant Bridge as CppBridgeLLM
    participant JNI as runanywhere_commons_jni
    participant Component as rac_llm_component
    participant Backend as LlamaCppTextGeneration
    participant LlamaCpp as llama.cpp
    App->>KotlinAPI: loadLoraAdapter(config)
    KotlinAPI->>Bridge: loadLoraAdapter(path, scale)
    Bridge->>Bridge: Validate state == READY
    Bridge->>JNI: racLlmComponentLoadLora(handle, path, scale)
    JNI->>Component: rac_llm_component_load_lora(handle, path, scale)
    Component->>Component: Get service via lifecycle_get_service
    Component->>Backend: load_lora (via vtable)
    Backend->>Backend: Check model_loaded, lock mutex
    Backend->>LlamaCpp: llama_adapter_lora_init(model, path)
    LlamaCpp-->>Backend: adapter handle
    Backend->>Backend: Store in lora_adapters_ vector
    Backend->>Backend: recreate_context()
    Backend->>LlamaCpp: llama_free(old_context)
    Backend->>LlamaCpp: llama_init_from_model(model)
    LlamaCpp-->>Backend: new_context
    Backend->>Backend: apply_lora_adapters()
    Backend->>LlamaCpp: llama_set_adapter_lora(context, adapter, scale)
    Backend->>LlamaCpp: llama_memory_clear(context)
    Backend-->>Component: RAC_SUCCESS
    Component-->>JNI: RAC_SUCCESS
    JNI-->>Bridge: 0
    Bridge-->>KotlinAPI: success
    KotlinAPI-->>App: adapter loaded
```

Last reviewed commit: ef98bcd
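The backend steps in that diagram have a strict order: free the old context, create a new one, re-apply every remaining adapter, then clear the KV memory. The sketch below captures just that ordering; the llama_* names match the diagram but the stubs only record call order rather than calling llama.cpp:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Call log so the ordering can be checked; each stub stands in for the
// llama.cpp call of the corresponding name in the diagram.
static std::vector<std::string> calls;
static void  llama_free_stub(void*)                    { calls.push_back("free"); }
static void* llama_init_from_model_stub(void*)         { calls.push_back("init"); return (void*)1; }
static void  llama_set_adapter_lora_stub(void*, int, float) { calls.push_back("set_adapter"); }
static void  llama_memory_clear_stub(void*)            { calls.push_back("memory_clear"); }

struct Adapter { int handle; float scale; };

// Mirrors recreate_context() + apply_lora_adapters() from the diagram:
// llama.cpp applies adapters per context, so after any load or remove the
// context is rebuilt and every still-registered adapter is re-applied.
void* recreate_context(void* model, void* old_ctx,
                       const std::vector<Adapter>& adapters) {
    if (old_ctx) llama_free_stub(old_ctx);
    void* ctx = llama_init_from_model_stub(model);
    for (const Adapter& a : adapters)
        llama_set_adapter_lora_stub(ctx, a.handle, a.scale);
    llama_memory_clear_stub(ctx);  // old KV cache is invalid for the new weights
    return ctx;
}
```

Clearing the memory last is the step that makes hot-swapping safe: any cached attention state was computed with the previous adapter set and must not leak into the next generation.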