Added LoRA + Fixed Build-All Workflow (#389) #406
* feat(lora): add LoRA adapter support across SDK + demo app
Implement LoRA (Low-Rank Adaptation) adapter hot-swapping for llama.cpp
backend across all 6 SDK layers (C++ -> C API -> Component -> JNI ->
Kotlin Bridge -> Kotlin Public API).
- Add load/remove/clear/query LoRA adapter operations
- Use vtable dispatch in component layer to decouple librac_commons
from librac_backend_llamacpp (fixes linker errors)
- Add LoRA vtable entries to rac_llm_service_ops_t
- Fix AttachCurrentThread cast for Android NDK C++ JNI build
- Add RunAnyWhereLora Android demo app with Material 3 Q&A UI
- Add comprehensive implementation docs with C/C++ API reference
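The vtable decoupling above can be sketched as follows. This is a minimal, self-contained illustration of the dispatch pattern only — the struct and function names are hypothetical stand-ins, not the SDK's actual `rac_llm_service_ops_t` layout:

```cpp
#include <string>

// Hypothetical function-pointer table; the real rac_llm_service_ops_t differs.
struct llm_service_ops {
    bool (*load_lora)(void* impl, const char* path, float scale);
    bool (*clear_lora)(void* impl);
};

// Backend side: a concrete llama.cpp-style implementation (mocked here).
struct llamacpp_backend {
    std::string last_adapter;
    float last_scale = 0.0f;
};

bool llamacpp_load_lora(void* impl, const char* path, float scale) {
    auto* be = static_cast<llamacpp_backend*>(impl);
    be->last_adapter = path;
    be->last_scale = scale;
    return true;
}

bool llamacpp_clear_lora(void* impl) {
    auto* be = static_cast<llamacpp_backend*>(impl);
    be->last_adapter.clear();
    return true;
}

// Registration: the backend fills the table at runtime, so the component
// layer never references backend symbols at link time.
llm_service_ops make_llamacpp_ops() {
    return llm_service_ops{&llamacpp_load_lora, &llamacpp_clear_lora};
}

// Component side: dispatches only through the function-pointer table.
bool component_load_lora(const llm_service_ops& ops, void* impl,
                         const char* path, float scale) {
    return ops.load_lora && ops.load_lora(impl, path, scale);
}
```

Because the component calls only through pointers filled in at registration time, librac_commons needs no link-time reference to librac_backend_llamacpp symbols, which is the property that resolves the linker errors mentioned above.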
* feat(ci): add selectable build targets to Build All workflow + fix Swift concurrency errors
Rewrite build-all-test.yml with 9 boolean checkbox inputs so each build
target can be toggled independently from the GitHub Actions UI:
- C++ Android Backends (arm64-v8a, armeabi-v7a, x86_64 matrix)
- C++ iOS Backends (XCFramework)
- Kotlin SDK (JVM + Android)
- Swift SDK (iOS/macOS)
- Web SDK (TypeScript)
- Flutter SDK (Dart analyze via Melos)
- React Native SDK (TypeScript via Lerna)
- Android Example Apps (RunAnywhereAI + RunAnyWhereLora)
- IntelliJ Plugin
Fix two Swift strict-concurrency errors that fail the Swift SDK build:
- LiveTranscriptionSession: add @unchecked Sendable (safe because the class
is @MainActor, so all access is serialized)
- RunAnywhere+VisionLanguage: add Sendable conformance to rac_vlm_image_t
so the C struct can cross the Task boundary in the streaming builder;
simplify StreamingCollector to start timing at init
* fix(swift): resolve strict concurrency errors in LiveTranscriptionSession and VLM streaming
LiveTranscriptionSession.swift:
- Replace [weak self] captures with a strong `let session = self` before
closures to avoid a captured var in @Sendable/Task contexts (the class is
@MainActor @unchecked Sendable, so the strong ref is safe, bounded by the
stream lifecycle)
- Wrap deprecated startStreamingTranscription call in @available helper
to silence deprecation warning until migration to transcribeStream API
RunAnywhere+VisionLanguage.swift:
- Add `let capturedCImage = cImage` before AsyncThrowingStream closure
so the Task captures an immutable let instead of a mutable var
- Add `extension rac_vlm_image_t: @unchecked Sendable {}` for the C
struct to cross Task concurrency boundaries safely
- Simplify StreamingCollector to initialize startTime at init instead
of requiring a separate async start() call
* fix(jni): address CodeRabbit review findings in LoRA JNI functions
- Replace raw -1 returns with RAC_ERROR_INVALID_HANDLE/RAC_ERROR_INVALID_ARGUMENT
to match codebase error handling conventions
- Use getCString() helper instead of raw GetStringUTFChars/ReleaseStringUTFChars
- Add missing result logging to racLlmComponentRemoveLora and racLlmComponentClearLora
- Use rac_free() instead of free() in racLlmComponentGetLoraInfo for consistency
- Clarify LoRA adapter memory ownership comments (adapters freed automatically
with model per llama.cpp b8011 API — llama_adapter_lora_free is deprecated)
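The error-code convention in the first bullet can be illustrated with a small sketch; the constant values and the wrapper function below are hypothetical (the real codes live in the SDK's C headers):

```cpp
#include <cstdint>

// Hypothetical values — the actual rac_* error codes are defined in the SDK.
constexpr int32_t RAC_SUCCESS = 0;
constexpr int32_t RAC_ERROR_INVALID_HANDLE = -2;
constexpr int32_t RAC_ERROR_INVALID_ARGUMENT = -3;

// Instead of returning a bare -1, a JNI-style wrapper distinguishes a bad
// handle from a bad argument so callers can react to each case differently.
int32_t load_lora_checked(void* handle, const char* path, float scale) {
    if (handle == nullptr) return RAC_ERROR_INVALID_HANDLE;
    if (path == nullptr || scale < 0.0f) return RAC_ERROR_INVALID_ARGUMENT;
    return RAC_SUCCESS;  // would dispatch to the component layer here
}
```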
📝 Walkthrough
Adds cross-language LoRA adapter support (C++ backend, C API, vtable, JNI, Kotlin, Swift) and a new Android example app (RunAnyWhereLora). Extends CI workflow to a parameterized multi-target build with per-target jobs and a consolidated summary.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App as Android/iOS App (UI)
    participant Kotlin as RunAnywhere Kotlin API
    participant JNI as JNI Bridge / RunAnywhereBridge
    participant CComp as C Component (rac_llm_component)
    participant Backend as LlamaCPP Backend
    App->>Kotlin: runAnywhere.loadLoraAdapter(config)
    Kotlin->>JNI: racLlmComponentLoadLora(handle, path, scale)
    JNI->>CComp: rac_llm_component_load_lora(handle, path, scale)
    CComp->>Backend: backend->ops->load_lora(impl, path, scale)
    Backend->>Backend: load adapter data -> store entry
    Backend->>Backend: recreate_context(), apply_lora_adapters()
    Backend-->>CComp: return status
    CComp-->>JNI: return status
    JNI-->>Kotlin: return status
    Kotlin-->>App: success / throw on error
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Important
Looks good to me! 👍
Reviewed everything up to 4cfd510 in 17 seconds.
- Reviewed 4626 lines of code in 55 files
- Skipped 10 files when reviewing
- Skipped posting 0 draft comments
```cpp
// Apply all loaded adapters to the new context
if (!apply_lora_adapters()) {
    lora_adapters_.pop_back();
    return false;
```
Inconsistent state on apply failure
If apply_lora_adapters() fails partway through the loop (e.g., the 2nd of 3 adapters fails), only the newly added adapter is popped. However, the context was already recreated at line 873, and previously loaded adapters that were re-applied before the failure point are now in an inconsistent state — their applied flags were set to true inside the loop even though the overall operation failed. The new context is left with a partial set of applied adapters.
Consider either rolling back all adapters on failure (calling llama_clear_adapter_lora on the context) or documenting that partial application is the expected behavior.
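A sketch of the rollback option, using mock types rather than the actual llama.cpp-backed structures — on a partial failure it clears every `applied` flag (the moral equivalent of `llama_clear_adapter_lora`) and drops the newest entry:

```cpp
#include <string>
#include <vector>

// Mock adapter entry; the real backend's struct will differ.
struct lora_entry {
    std::string path;
    bool applied = false;
};

// If re-applying the adapter list to a fresh context fails partway, roll
// back *all* applied flags instead of popping only the newest entry, so no
// adapter is left half-applied against the recreated context.
bool apply_all_or_rollback(std::vector<lora_entry>& adapters,
                           bool (*apply_one)(lora_entry&)) {
    for (auto& a : adapters) {
        if (!apply_one(a)) {
            for (auto& b : adapters) b.applied = false;  // full rollback
            adapters.pop_back();  // drop the entry that triggered the reload
            return false;
        }
    }
    return true;
}

// Helpers used to exercise the sketch.
bool apply_ok(lora_entry& e) { e.applied = true; return true; }
bool apply_fail_on_bad(lora_entry& e) {
    if (e.path == "bad") return false;
    e.applied = true;
    return true;
}
```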
Path: sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp
Line: 880-883

```kotlin
kotlinx.coroutines.runBlocking {
    RunAnywhere.completeServicesInitialization()
}
```
runBlocking inside a coroutine
initializeSDK() is already a suspend function called from launch(Dispatchers.IO). Using runBlocking inside it blocks the IO thread and negates the benefits of structured concurrency. If completeServicesInitialization() is a suspend function, it can be called directly here without wrapping in runBlocking.
```diff
- kotlinx.coroutines.runBlocking {
-     RunAnywhere.completeServicesInitialization()
- }
+ RunAnywhere.completeServicesInitialization()
```
Note: I see the same pattern in RunAnywhereApplication.kt, but it's worth fixing here since this is a new demo app.
Path: examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraApplication.kt
Line: 57-59

```xml
<mapping directory="$PROJECT_DIR$/sdk/runanywhere-commons/build/android/unified/arm64-v8a/_deps/llamacpp-src" vcs="Git" />
<mapping directory="$PROJECT_DIR$/sdk/runanywhere-commons/build/android/unified/arm64-v8a/_deps/nlohmann_json-src" vcs="Git" />
<mapping directory="$PROJECT_DIR$/sdk/runanywhere-commons/build/dev-core/_deps/nlohmann_json-src" vcs="Git" />
<mapping directory="$PROJECT_DIR$/sdk/runanywhere-commons/build/dev/_deps/llamacpp-src" vcs="Git" />
```
Local build paths committed to VCS
These VCS mappings reference local build artifact directories (build/android/unified/arm64-v8a/_deps/..., build/dev-core/_deps/...). These paths are developer-machine-specific and will differ per machine. They should likely not be committed — consider adding them to .gitignore or reverting this change.
Path: .idea/vcs.xml
Line: 5-8

```kotlin
color = MaterialTheme.colorScheme.onSurface.copy(alpha = 0.7f),
)
Spacer(modifier = Modifier.height(16.dp))
TextButton(onClick = { /* Would need app reference to retry */ })
```
Retry button has no effect
The "Retry" button's onClick handler is empty ({ /* Would need app reference to retry */ }). This will confuse users who encounter an error and try to retry. Consider either wiring it up to re-trigger SDK initialization or removing the button until it's functional.
Path: examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/MainActivity.kt
Line: 80
Actionable comments posted: 20
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VLM/RunAnywhere+VisionLanguage.swift (1)
108-158: ⚠️ Potential issue | 🟠 Major — Pre-existing double-release risk on `contextPtr` if the C function returns a failure code.
The reference created with `passRetained` (line 111) is released by both `completeCb`/`errorCb` via `takeRetainedValue()` and by the manual `.release()` on the failure path (line 154). The C++ wrapper invokes `errorCb` whenever `rac_vlm_process_stream` returns non-SUCCESS, which means if the underlying call fails, the context is released twice: once in the callback and again at line 154, causing a use-after-free.
Consider guarding the manual release with a flag tracking whether a callback already consumed the reference, or restructure to ensure only one path releases the reference.
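The suggested guard can be sketched with an atomic "consumed" flag; the type below is illustrative, not the SDK's actual StreamContext:

```cpp
#include <atomic>

// Illustrative only: ensures a retained reference is released exactly once,
// whichever of the callback path or the failure path runs first.
struct stream_context {
    std::atomic<bool> released{false};
    int release_count = 0;  // stand-in for the real Unmanaged.release()

    // Returns true only for the single caller that wins the exchange.
    bool release_once() {
        if (released.exchange(true)) return false;  // already consumed
        ++release_count;
        return true;
    }
};
```

With this shape, whichever of the two paths runs second becomes a no-op, so the retained reference is consumed exactly once.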
🧹 Nitpick comments (23)
examples/android/RunAnyWhereLora/app/src/main/res/values/strings.xml (1)
1-3: Consider externalising UI strings from Compose screens into this file.
The AI summary indicates `LoraScreen` and `MainActivity` contain button labels, input hints, status messages, and other user-facing text. If those strings are hardcoded in Kotlin/Compose rather than referenced via `R.string.*`, the app won't be easily localisable. For a demo app this is low priority, but extracting them here (and using `stringResource(R.string.*)` in Compose) is the idiomatic Android pattern and keeps the door open for translations.
examples/android/RunAnyWhereLora/app/src/test/java/com/runanywhere/run_anywhere_lora/ExampleUnitTest.kt (1)
12-16: Boilerplate scaffold test — consider adding meaningful LoRA coverage.
This is the default Android Studio–generated placeholder. It provides no coverage for any of the new LoRA functionality introduced in this PR (`LoRAAdapterConfig`, `LoRAAdapterInfo`, `LoraViewModel`, `loadLoraAdapter`, `removeLoraAdapter`, `clearLoraAdapters`, `getLoadedLoraAdapters`). The placeholder can remain as-is for now, but the new API surface — especially the ViewModel state transitions and the suspend-function adapters — should have unit tests before this ships.
Would you like me to generate a unit test skeleton covering the core LoRA adapter operations and `LoraViewModel` state transitions? I can open a new issue to track this if preferred.
examples/android/RunAnyWhereLora/app/src/androidTest/java/com/runanywhere/run_anywhere_lora/ExampleInstrumentedTest.kt (1)
17-23: Boilerplate placeholder — add meaningful LoRA instrumented tests.
This is the unmodified Android Studio scaffold test. With a full LoRA adapter API (`load`, `remove`, `clear`, `query`) and a new Material 3 Q&A UI being introduced in this PR, the instrumented test suite offers zero coverage of those paths. Consider replacing or supplementing this file with tests that exercise real LoRA behaviour, for example:

```diff
-class ExampleInstrumentedTest {
-    @Test
-    fun useAppContext() {
-        // Context of the app under test.
-        val appContext = InstrumentationRegistry.getInstrumentation().targetContext
-        assertEquals("com.runanywhere.run_anywhere_lora", appContext.packageName)
-    }
-}
+class LoraInstrumentedTest {
+
+    @Test
+    fun appContextHasCorrectPackage() {
+        val appContext = InstrumentationRegistry.getInstrumentation().targetContext
+        assertEquals("com.runanywhere.run_anywhere_lora", appContext.packageName)
+    }
+
+    // TODO: Add tests covering LoRA adapter load / remove / clear / query flows,
+    //       e.g. verify that MainActivity surfaces the expected UI states and
+    //       that adapter operations round-trip without crashing.
+}
```

At minimum, rename the class and method away from the generic `Example*`/`useAppContext` placeholders so future contributors know these are intentional, non-trivial tests.
examples/android/RunAnyWhereLora/app/src/main/res/values/themes.xml (1)
4-4: Use `Theme.Material3.DayNight.NoActionBar` to align with the stated Material 3 design goal.
The PR describes a "Material 3 Q&A UI", and the companion Kotlin files (`Theme.kt`, `Color.kt`, `RunAnyWhereLoraTheme`) are consistent with a Compose + Material3 setup. However, the Activity host theme inherits from `android:Theme.Material.Light.NoActionBar`, which is the legacy AOSP framework Material 1 theme — not the Material Components library's Material3 variant.
While Compose handles its own `MaterialTheme` internally, the host XML theme still governs window-level attributes such as status bar styling and edge-to-edge behavior. The MDC Compose Theme Adapter documentation explicitly states: "This relies on your Activity/Context theme extending one of the `Theme.Material3` themes." For a Compose + Material3 app, the recommended parent in `res/values/themes.xml` is `Theme.Material3.DayNight.NoActionBar`.
♻️ Proposed fix
```diff
- <style name="Theme.RunAnyWhereLora" parent="android:Theme.Material.Light.NoActionBar" />
+ <style name="Theme.RunAnyWhereLora" parent="Theme.Material3.DayNight.NoActionBar" />
```

Ensure `com.google.android.material:material` (version 1.5.0+) is included in the app's Gradle dependencies to make `Theme.Material3.DayNight.NoActionBar` resolvable at build time.
.github/workflows/build-all-test.yml (4)
195-234: Extract duplicated config generation into a reusable action or script.
This 35-line `development_config.cpp` generation block is identical to lines 108–147 in `cpp-android`. Any bug fix (e.g., the escaping issue) must be applied in both places. Consider either:
- A composite action (e.g., `.github/actions/setup-dev-config/action.yml`) that both jobs call, or
- A checked-in shell script (e.g., `sdk/runanywhere-commons/scripts/generate-dev-config.sh`) that both jobs invoke.
76-88: Consider adding dependency caching to reduce CI run times.
None of the jobs cache their dependencies. For a workflow with 9+ jobs, this can add significant cumulative download time on every run. Key candidates:
- Gradle: `actions/cache` or the built-in `setup-java` cache option (`cache: 'gradle'`) for `kotlin-sdk`, `android-apps`, and `intellij-plugin`.
- npm/yarn: `actions/setup-node` has a built-in `cache` option for `web-sdk` and `react-native-sdk`.
- Flutter/Dart: `subosito/flutter-action` supports caching via its inputs.
- Homebrew: `cpp-ios` installs cmake/ninja via brew each time.
300-315: Excessive `continue-on-error` masks build failures in the summary.
In this job, three of four build steps use `continue-on-error: true`, meaning the job almost always reports "success" regardless of actual build outcomes. The summary table will show "Pass" for Kotlin SDK even if AAR, backend modules, and Maven publishing all fail. The same pattern (most steps are swallowed) recurs in `flutter-sdk`, `react-native-sdk`, and `android-apps`.
If the intent is a "best-effort" run, consider at least collecting and reporting per-step outcomes in the summary (e.g., via step outputs), so failures are visible even when the job proceeds.
417-429: Mixing `yarn` and `npm` can produce inconsistent dependency trees.
Line 419 falls back from `yarn install --frozen-lockfile` to `npm install`. These tools resolve and lock dependencies differently, so an npm-resolved tree may not match what `yarn.lock` specifies, leading to non-reproducible builds. Prefer committing to one package manager. If yarn is the intended tool, let it fail explicitly so missing lockfile issues are caught early.
Additionally, lines 428–429 use the same double failure suppression (`|| true` + `continue-on-error`) noted above.
examples/android/RunAnyWhereLora/app/src/main/res/xml/backup_rules.xml (1)
9-13: Consider excluding large model and adapter files from backup.
With no rules defined, all app data is backed up by default. A LoRA demo app likely stores large model weight files and adapter binaries under the app's `files/` directory. These would silently fail or be skipped due to Android's 25 MB per-app backup cap, and they are not meaningful to restore anyway (models are re-downloaded on first run).
💡 Example exclusion rules for model files

```diff
 <full-backup-content>
-    <!--
-    <include domain="sharedpref" path="."/>
-    <exclude domain="sharedpref" path="device.xml"/>
--->
+    <!-- Exclude large model and LoRA adapter files from cloud backup -->
+    <exclude domain="file" path="models/"/>
+    <exclude domain="file" path="lora_adapters/"/>
 </full-backup-content>
```

Adjust the paths to match wherever the app stores model weights and adapter files.
examples/android/RunAnyWhereLora/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml (1)
5-5: Consider a dedicated monochrome drawable for themed icon surfaces.
The `<monochrome>` layer reuses `ic_launcher_foreground`, which contains a gradient shadow path and a white Android-robot vector. On Material You / themed icon launchers (Pixel Launcher, etc.), the system tints the monochrome drawable with a single hue, making the gradient path potentially render as a visible coloured blob rather than a clean silhouette. For a polished demo app a flat single-colour silhouette drawable (e.g. `ic_launcher_monochrome.xml`) would render more predictably.
This is entirely optional for an example app and has no functional impact — API 26–32 devices silently ignore the `<monochrome>` element, so there is no backward-compat risk.
examples/android/RunAnyWhereLora/app/src/main/res/drawable/ic_launcher_foreground.xml (1)
1-1: Missing XML declaration — minor inconsistency with sibling files.
All three other resource files in this PR (`ic_launcher_background.xml`, `ic_launcher.xml`, `ic_launcher_round.xml`) begin with `<?xml version="1.0" encoding="utf-8"?>`. This file omits it. Android tooling does not require it, but adding it keeps the resource files consistent.
🔧 Proposed fix

```diff
+<?xml version="1.0" encoding="utf-8"?>
 <vector xmlns:android="http://schemas.android.com/apk/res/android"
     xmlns:aapt="http://schemas.android.com/aapt"
```
examples/android/RunAnyWhereLora/build.gradle.kts (1)
1-8: `kotlin.multiplatform`, `android.library`, and `kotlin.serialization` are unused and should be removed.
The app submodule only applies `android.application`, `kotlin.android`, and `kotlin.compose`. The three other plugin declarations add unnecessary noise to the root build script and contradict the simplicity principle. Remove them:
♻️ Suggested simplification

```diff
 plugins {
     alias(libs.plugins.android.application) apply false
-    alias(libs.plugins.android.library) apply false
     alias(libs.plugins.kotlin.android) apply false
-    alias(libs.plugins.kotlin.multiplatform) apply false
     alias(libs.plugins.kotlin.compose) apply false
-    alias(libs.plugins.kotlin.serialization) apply false
 }
```
examples/android/RunAnyWhereLora/app/src/main/res/values/colors.xml (1)
3-9: Remove unused Material 1/2 template colours from colors.xml.The
purple_*andteal_*colour entries (lines 3-9) are boilerplate from the Android project wizard and are never referenced.themes.xmldoesn't use them, Compose theming (Color.kt) uses a separate colour palette, and there are no View-based layouts in this Compose-only app. Removing them reduces clutter.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/android/RunAnyWhereLora/app/src/main/res/values/colors.xml` around lines 3 - 9, Remove unused Material template colors from colors.xml: delete the <color> entries named purple_200, purple_500, purple_700, teal_200, and teal_700. Before removing, confirm there are no references to these symbols (purple_200, purple_500, purple_700, teal_200, teal_700) in the project (check themes.xml and Color.kt and any XML layouts). After deletion, run a quick project-wide search to ensure no usages remain and commit the cleaned colors.xml.examples/android/RunAnyWhereLora/gradle/wrapper/gradle-wrapper.properties (1)
3-3: Consider updating the Gradle wrapper to a more recent stable version.

Gradle 8.13 is outdated; as of February 2026, the latest stable release is 9.3.1 (released January 29, 2026). Updating provides access to bug fixes and security patches. If upgrading to 9.x is not viable, the latest 8.14.x patch release is an alternative.

Additionally, consider switching from the `-all` distribution to `-bin` unless IDE source browsing of Gradle internals is required; the binary-only distribution is lighter and is the standard for CI environments.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/android/RunAnyWhereLora/gradle/wrapper/gradle-wrapper.properties` at line 3, update the Gradle wrapper distributionUrl value (the distributionUrl entry) to a current stable release; replace the existing gradle-8.13-all.zip URL with gradle-9.3.1-bin.zip (or gradle-8.14.x-bin.zip if you must stay on 8.x), and switch the suffix from -all to -bin unless IDE source browsing of Gradle internals is required for this project.

sdk/runanywhere-kotlin/src/commonMain/kotlin/com/runanywhere/sdk/public/extensions/LLM/LLMTypes.kt (1)

202-210: Consider adding validation for `scale` in `LoRAAdapterConfig`.

The `path` is validated as non-blank, but `scale` has no lower bound check. Negative scale values would be nonsensical for a LoRA adapter. A minimal guard would be consistent with the existing `require` pattern.

🛡️ Proposed fix

```diff
 data class LoRAAdapterConfig(
     val path: String,
     val scale: Float = 1.0f,
 ) {
     init {
         require(path.isNotBlank()) { "LoRA adapter path cannot be blank" }
+        require(scale >= 0f) { "LoRA adapter scale must be non-negative" }
     }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-kotlin/src/commonMain/kotlin/com/runanywhere/sdk/public/extensions/LLM/LLMTypes.kt` around lines 202 - 210, LoRAAdapterConfig currently validates only path; add a require check for the scale field in the LoRAAdapterConfig init to enforce a sensible lower bound (e.g., require(scale >= 0f) or require(scale > 0f) depending on desired behavior) and provide a clear error message like "LoRA adapter scale must be non-negative" to match the existing require pattern; update the init block that references path to also validate scale so negative or invalid scale values are rejected at construction.

sdk/runanywhere-commons/include/rac/features/llm/rac_llm_component.h (1)
221-234: Scale range documentation is narrower than actual support.

The doc says `@param scale Adapter scale factor (0.0-1.0, default 1.0)`, but the implementation doc (lora_adapter_support.md) states values >1.0 are valid for amplified effect. Consider updating to `(0.0-1.0 typical, >1.0 = amplified)` for consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/include/rac/features/llm/rac_llm_component.h` around lines 221 - 234, update the scale parameter documentation for rac_llm_component_load_lora to reflect that values >1.0 are supported for amplification: change the @param description for scale in rac_llm_component_load_lora to something like "(0.0-1.0 typical, >1.0 = amplified; default 1.0)" so it matches lora_adapter_support.md and clarifies valid values and default behavior for callers.

docs/impl/lora_adapter_support.md (1)

467-490: Add a language identifier to the ASCII diagram code block.

The fenced code block for the layer diagram has no language specified. Use a text identifier (e.g., ```text) to satisfy linting and improve rendering.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/impl/lora_adapter_support.md` around lines 467 - 490, update the fenced ASCII diagram block that starts with "Kotlin Public API (RunAnywhere.loadLoraAdapter)" so the opening fence specifies a language identifier (e.g., change ``` to ```text) to satisfy the linter and improve rendering; keep the diagram content unchanged and only modify the opening fence (and ensure the closing ``` remains intact).

sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.h (1)

101-118: Duplicate section header: "TEXT GENERATION IMPLEMENTATION" appears twice.

The original header is at lines 101-103, and a second identical one was added at lines 116-118 after the new `LoraAdapterEntry` struct. Remove one to avoid confusion.

Proposed fix

```diff
-// =============================================================================
-// TEXT GENERATION IMPLEMENTATION
-// =============================================================================
-
 // =============================================================================
 // LORA ADAPTER ENTRY
 // =============================================================================
@@ -116,8 +112,6 @@
-// =============================================================================
-// TEXT GENERATION IMPLEMENTATION
-// =============================================================================
-
 class LlamaCppTextGeneration {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.h` around lines 101 - 118, remove the duplicated section header "TEXT GENERATION IMPLEMENTATION" that appears both before and after the new LoraAdapterEntry struct; keep a single header and place the remaining one in the appropriate location so the struct LoraAdapterEntry (llama_adapter_lora*, path, scale, applied) sits under the intended section header only.

sdk/runanywhere-commons/include/rac/features/llm/rac_llm_service.h (1)
52-63: Add memory ownership and error-code documentation for new vtable entries.

Per coding guidelines, public C API headers must document error codes and lifecycle requirements. Specifically:

- `get_lora_info`: the caller needs to know who owns the `*out_json` memory and how to free it.
- All four ops: document expected return codes (e.g., `RAC_ERROR_NOT_SUPPORTED`, `RAC_ERROR_OPERATION_FAILED`) and preconditions (e.g., model must be loaded).

Proposed documentation improvement

```diff
-    /** Load a LoRA adapter (optional, NULL if not supported) */
+    /**
+     * Load a LoRA adapter (optional, NULL if not supported).
+     * @pre Model must be loaded.
+     * @return RAC_SUCCESS, RAC_ERROR_OPERATION_FAILED, or RAC_ERROR_NOT_SUPPORTED
+     */
     rac_result_t (*load_lora)(void* impl, const char* adapter_path, float scale);

-    /** Remove a LoRA adapter by path (optional, NULL if not supported) */
+    /**
+     * Remove a LoRA adapter by path (optional, NULL if not supported).
+     * @pre Model must be loaded.
+     * @return RAC_SUCCESS, RAC_ERROR_OPERATION_FAILED, or RAC_ERROR_NOT_SUPPORTED
+     */
     rac_result_t (*remove_lora)(void* impl, const char* adapter_path);

-    /** Clear all LoRA adapters (optional, NULL if not supported) */
+    /**
+     * Clear all LoRA adapters (optional, NULL if not supported).
+     * @return RAC_SUCCESS or RAC_ERROR_NOT_SUPPORTED
+     */
     rac_result_t (*clear_lora)(void* impl);

-    /** Get loaded LoRA adapters info as JSON (optional, NULL if not supported) */
+    /**
+     * Get loaded LoRA adapters info as JSON (optional, NULL if not supported).
+     * @param[out] out_json JSON string; caller must free with rac_free().
+     * @pre Model must be loaded.
+     * @return RAC_SUCCESS, RAC_ERROR_OPERATION_FAILED, or RAC_ERROR_NOT_SUPPORTED
+     */
     rac_result_t (*get_lora_info)(void* impl, char** out_json);
```

As per coding guidelines: "Public C API headers in include/rac/ must document vtable operations, error codes, and lifecycle requirements".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/include/rac/features/llm/rac_llm_service.h` around lines 52 - 63, update the documentation comments for the vtable entries load_lora, remove_lora, clear_lora and get_lora_info to explicitly state expected return codes (e.g., RAC_ERROR_NOT_SUPPORTED, RAC_ERROR_OPERATION_FAILED, RAC_OK) and preconditions (e.g., model must be loaded before calling), and for get_lora_info add ownership/lifecycle details that the function allocates *out_json (or returns NULL on error) and specify how the caller must free it (which allocator/free function to use); keep the wording concise and consistent with other public headers.

sdk/runanywhere-kotlin/src/commonMain/kotlin/com/runanywhere/sdk/public/extensions/RunAnywhere+LoRA.kt (1)

18-18: Nit: `// MARK: -` is a Swift-ism.

Kotlin convention is to use `// region` / `// endregion` or section-divider comments without the `MARK:` prefix. Minor consistency point.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-kotlin/src/commonMain/kotlin/com/runanywhere/sdk/public/extensions/RunAnywhere+LoRA.kt` at line 18, the comment uses a Swift-style marker "// MARK: - LoRA Adapter Management"; replace it with a Kotlin-style section comment such as "// region LoRA Adapter Management" (and add a matching "// endregion" after the section) or a simple divider like "// --- LoRA Adapter Management ---" to follow Kotlin conventions and keep consistency across the file; ensure any other occurrences of "MARK:" in this file are updated similarly.

sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp (1)
787-823: Consider the scenario where `recreate_context` fails mid-way.

If `llama_init_from_model` fails at line 810, both `context_` and `sampler_` are `nullptr`, leaving the model in a degraded state (`model_loaded_` is still true, but `is_ready()` returns false). The caller (`load_lora_adapter`) only pops the new adapter; previously loaded adapters remain in `lora_adapters_` with stale `applied` flags.

This is safe (subsequent generate calls will fail cleanly via `is_ready()`), but the model is effectively unusable until unloaded. A log warning at the call site indicating the model is now in a degraded state would help with debugging.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp` around lines 787 - 823, recreate_context can fail after freeing context_ and sampler_, leaving model_loaded_ true but the backend unusable; update the caller load_lora_adapter to detect when recreate_context() returns false and log a clear warning that the model is now in a degraded/unusable state (include context_/sampler_ nullity and model_loaded_ status), and also clear the applied flags for any adapters in lora_adapters_ that were marked applied (so stale applied=true entries are not left behind); reference functions/fields: recreate_context(), load_lora_adapter(), llama_init_from_model, sampler_, context_, lora_adapters_, applied, model_loaded_, is_ready().

sdk/runanywhere-kotlin/src/jvmAndroidMain/kotlin/com/runanywhere/sdk/public/extensions/RunAnywhere+LoRA.jvmAndroid.kt (1)

62-80: Move LoRA info parsing to commonMain (and keep iOS parity).

The JSON parsing logic here is business logic living in `jvmAndroidMain`. Please move it into `commonMain` (e.g., a shared parser on `LoRAAdapterInfo`) and keep it aligned with the iOS implementation to ensure consistent schema handling across platforms.

As per coding guidelines, "Define all business logic, protocols, interfaces, and structures in `commonMain/` rather than in platform-specific modules" and "When implementing KMP code, always refer to the iOS implementation as the source of truth for business logic".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-kotlin/src/jvmAndroidMain/kotlin/com/runanywhere/sdk/public/extensions/RunAnywhere+LoRA.jvmAndroid.kt` around lines 62 - 80, the JSON parsing logic in RunAnywhere.getLoadedLoraAdapters (which currently uses loraJson, parses JsonArray and constructs LoRAAdapterInfo) must be moved from jvmAndroidMain into commonMain as shared business logic; add a parser on LoRAAdapterInfo (e.g., companion parseList/fromJsonString that mirrors the iOS implementation/schema) in commonMain that accepts the JSON string and returns List<LoRAAdapterInfo>, then update RunAnywhere.getLoadedLoraAdapters to only call CppBridgeLLM.getLoraInfo(), pass the returned JSON string to LoRAAdapterInfo.parseList(...), and keep existing error handling via loraLogger.error when parsing fails so platform code only handles IO/bridge concerns while parsing lives in the shared LoRAAdapterInfo parser.

sdk/runanywhere-swift/Sources/RunAnywhere/Public/Sessions/LiveTranscriptionSession.swift (1)
39-39: Consider narrowing `@unchecked Sendable` to specific non-Sendable properties

Class-level `@unchecked Sendable` opts out of all compiler-enforced Sendable checking. The root issue is `onPartialCallback: ((String) -> Void)?`, a function type that isn't `Sendable`. In Swift 6, the targeted fix is to annotate the closure type explicitly so the compiler can validate the rest of the class:

♻️ Proposed refactor

```diff
-private var onPartialCallback: ((String) -> Void)?
+private var onPartialCallback: (@MainActor @Sendable (String) -> Void)?
```

Any remaining non-Sendable stored dependencies (`AnyCancellable`, `AudioCaptureManager`) would still require either `nonisolated(unsafe)` per-property or `@unchecked Sendable` at the class level, but the scope of the suppression shrinks as properties become properly annotated.

As per coding guidelines: "Use the latest Swift 6 APIs always."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Sessions/LiveTranscriptionSession.swift` at line 39, the class-wide `@unchecked Sendable` suppression hides a single non-Sendable closure; change the onPartialCallback property in LiveTranscriptionSession to carry an explicit sendable function type (make onPartialCallback of type `(@Sendable (String) -> Void)?`), then remove or narrow the class-level `@unchecked Sendable` on LiveTranscriptionSession so the compiler can validate other members; if AnyCancellable or AudioCaptureManager remain non-Sendable, annotate those specific stored properties with nonisolated(unsafe) or keep a minimal `@unchecked Sendable` only for them rather than the whole class.
```yaml
run: |
  CONFIG_FILE="src/infrastructure/network/development_config.cpp"
  CLEAN_URL=$(printf '%s' "${SUPABASE_URL:-https://placeholder.supabase.co}" | tr -d '\n\r')
  CLEAN_KEY=$(printf '%s' "${SUPABASE_ANON_KEY:-placeholder_key}" | tr -d '\n\r')
  CLEAN_TOKEN=$(printf '%s' "${BUILD_TOKEN:-bt_test_build}" | tr -d '\n\r')

  cat > "$CONFIG_FILE" << CONFIGEOF
  #include <cstring>
  #include "rac/core/rac_logger.h"
  #include "rac/infrastructure/network/rac_dev_config.h"

  namespace {
  constexpr const char* SUPABASE_URL = "${CLEAN_URL}";
  constexpr const char* SUPABASE_ANON_KEY = "${CLEAN_KEY}";
  constexpr const char* BUILD_TOKEN = "${CLEAN_TOKEN}";
  constexpr const char* SENTRY_DSN = nullptr;
  }

  extern "C" {
  bool rac_dev_config_is_available(void) {
      if (SUPABASE_URL == nullptr || SUPABASE_ANON_KEY == nullptr) return false;
      if (std::strlen(SUPABASE_URL) == 0 || std::strlen(SUPABASE_ANON_KEY) == 0) return false;
      if (std::strstr(SUPABASE_URL, "YOUR_") != nullptr) return false;
      if (std::strstr(SUPABASE_ANON_KEY, "YOUR_") != nullptr) return false;
      return true;
  }
  const char* rac_dev_config_get_supabase_url(void) { return SUPABASE_URL; }
  const char* rac_dev_config_get_supabase_key(void) { return SUPABASE_ANON_KEY; }
  const char* rac_dev_config_get_build_token(void) { return BUILD_TOKEN; }
  const char* rac_dev_config_get_sentry_dsn(void) { return SENTRY_DSN; }
  bool rac_dev_config_has_supabase(void) { return rac_dev_config_is_available(); }
  bool rac_dev_config_has_build_token(void) { return BUILD_TOKEN != nullptr && std::strlen(BUILD_TOKEN) > 0; }
  }
  CONFIGEOF
```
Unescaped secret values in C++ string literals can break compilation or inject code.
The heredoc delimiter `CONFIGEOF` is unquoted, so the shell expands `${CLEAN_URL}`, `${CLEAN_KEY}`, and `${CLEAN_TOKEN}` directly into C++ string literals. If any secret value contains characters like `"`, `\`, or `$`, the generated C++ file will either fail to compile or produce incorrect string values. The `tr -d '\n\r'` sanitization only strips newlines.
Add escaping for C-string-dangerous characters before interpolation, e.g.:
Proposed fix

```diff
-          CLEAN_URL=$(printf '%s' "${SUPABASE_URL:-https://placeholder.supabase.co}" | tr -d '\n\r')
-          CLEAN_KEY=$(printf '%s' "${SUPABASE_ANON_KEY:-placeholder_key}" | tr -d '\n\r')
-          CLEAN_TOKEN=$(printf '%s' "${BUILD_TOKEN:-bt_test_build}" | tr -d '\n\r')
+          CLEAN_URL=$(printf '%s' "${SUPABASE_URL:-https://placeholder.supabase.co}" | tr -d '\n\r' | sed 's/\\/\\\\/g; s/"/\\"/g')
+          CLEAN_KEY=$(printf '%s' "${SUPABASE_ANON_KEY:-placeholder_key}" | tr -d '\n\r' | sed 's/\\/\\\\/g; s/"/\\"/g')
+          CLEAN_TOKEN=$(printf '%s' "${BUILD_TOKEN:-bt_test_build}" | tr -d '\n\r' | sed 's/\\/\\\\/g; s/"/\\"/g')
```

The same issue exists in the identical block at lines 201–234 (cpp-ios job).
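The proposed sanitization can be exercised outside CI. A minimal sketch, where the `escape_c` helper name is hypothetical but the pipeline mirrors the fix above (dollar signs are deliberately left unescaped, matching the optional hardening note):

```shell
#!/bin/sh
# escape_c: strip newlines, then escape backslashes and double quotes so the
# value is safe inside a C string literal. '$' is left as-is (optional extra).
escape_c() {
  printf '%s' "$1" | tr -d '\n\r' | sed 's/\\/\\\\/g; s/"/\\"/g'
}

escape_c 'key"with\special$chars'
# → key\"with\\special$chars
```

Note the backslash substitution must run before the quote substitution, otherwise the backslashes introduced by `\"` would be doubled again.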
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/build-all-test.yml around lines 114 - 147, The heredoc
injects raw CLEAN_URL/CLEAN_KEY/CLEAN_TOKEN into C++ string literals unescaped,
which can break compilation or inject code; before writing the file, add
escaping for C-string-special characters for each variable (CLEAN_URL,
CLEAN_KEY, CLEAN_TOKEN) by replacing backslashes with "\\\\", double quotes with
"\\\"", and encoding newlines/returns as "\\n" (and optionally escaping dollar
signs if you want to be extra safe), then use those escaped variables in the
heredoc (still writing the same variables into
SUPABASE_URL/SUPABASE_ANON_KEY/BUILD_TOKEN). Apply the same fix to the identical
block used by the cpp-ios job.
```yaml
- name: Run Tests
  working-directory: sdk/runanywhere-flutter
  run: melos run test || true
  continue-on-error: true
```
Double failure suppression hides real breakage.

`melos run test || true` on line 397 already ensures the shell exits 0 on failure, and `continue-on-error: true` on line 398 independently keeps the job green even when the step fails. Using both means this step can never surface a failure in the job summary. The same pattern appears in react-native-sdk (lines 428–429).

Pick one strategy: use `continue-on-error: true` alone so that the step status still shows ❌ (visible as a warning) even though the job continues:
Proposed fix

```diff
 - name: Run Tests
   working-directory: sdk/runanywhere-flutter
-  run: melos run test || true
+  run: melos run test
   continue-on-error: true
```
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/build-all-test.yml around lines 395 - 398, The test step
"Run Tests" in the sdk/runanywhere-flutter workflow currently suppresses
failures twice (shell fallback `melos run test || true` plus `continue-on-error:
true`), which hides real failures; remove the shell fallback (`|| true`) from
the `run: melos run test || true` command so the step relies only on
`continue-on-error: true` to allow the job to continue while still showing a
failing step, and apply the same change for the react-native-sdk test step
(remove `|| true` from its `melos run test || true` invocation).
```c
// ---- Query loaded adapters ----
// Returns a JSON array string describing all loaded adapters.
// Format: [{"path":"/path/to/file.gguf","scale":1.0,"applied":true}, ...]
// Caller MUST free the returned string with free().
//
// Returns: RAC_SUCCESS, RAC_ERROR_COMPONENT_NOT_READY
rac_result_t rac_llm_component_get_lora_info(
    rac_handle_t handle,
    char** out_json  // Output: heap-allocated JSON string
);
```
Documentation says `free()` but the header says `rac_free()`.

Line 283 says "Caller MUST free the returned string with free()" and the C usage example on line 402 also uses `free(json)`. However, both `rac_llm_component.h` and `rac_llm_llamacpp.h` document `rac_free()`. Align these to avoid confusion for integrators.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/impl/lora_adapter_support.md` around lines 280 - 289, The docs currently
state the caller must free the string with free(), but the public API headers
(rac_llm_component.h and rac_llm_llamacpp.h) expose rac_free(); update the
documentation for rac_llm_component_get_lora_info to require the caller to
release the returned heap string with rac_free() (and update the C usage example
at line ~402 to call rac_free(json) instead of free(json)), and ensure the
function signature comment mentions "Output: heap-allocated JSON string (must be
freed with rac_free())" to match the documented API.
```c
rac_handle_t component = 0;
rac_llm_component_create(&component);
rac_llm_component_load_model(component, "/path/to/model.gguf",
                             "my-model", "My Model", NULL);
```
Incorrect `rac_llm_component_load_model` call signature in the C example.

The call passes 5 arguments including a trailing NULL, but the declaration in `rac_llm_component.h` (lines 120-121) takes only 4 parameters: `(handle, model_path, model_id, model_name)`. This will cause a compilation error if someone copies this example.
Proposed fix

```diff
-    rac_llm_component_load_model(component, "/path/to/model.gguf",
-                                 "my-model", "My Model", NULL);
+    rac_llm_component_load_model(component, "/path/to/model.gguf",
+                                 "my-model", "My Model");
```
+ "my-model", "My Model");📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| rac_handle_t component = 0; | |
| rac_llm_component_create(&component); | |
| rac_llm_component_load_model(component, "/path/to/model.gguf", | |
| "my-model", "My Model", NULL); | |
| rac_handle_t component = 0; | |
| rac_llm_component_create(&component); | |
| rac_llm_component_load_model(component, "/path/to/model.gguf", | |
| "my-model", "My Model"); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/impl/lora_adapter_support.md` around lines 379 - 383, The example uses
rac_llm_component_load_model with five arguments including a trailing NULL which
doesn't match the declared signature; update the call to pass only the four
parameters (handle, model_path, model_id, model_name) used by
rac_llm_component_load_model, leaving the rac_llm_component_create and
rac_handle_t component usage intact so the example compiles against the header
declaration.
```text
@@ -0,0 +1,15 @@
*.iml
.gradle
/local.properties
```
Duplicate `local.properties` entry.

Line 3 (`/local.properties`) is redundant; the unanchored `local.properties` on line 15 already covers the project root and any subdirectory. One of the two can be removed.
🧹 Proposed fix

```diff
-/local.properties
 /.idea/caches
 ...
 local.properties
```

Also applies to: 15-15
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/android/RunAnyWhereLora/.gitignore` at line 3, Remove the duplicate
gitignore entry by deleting the redundant "/local.properties" line and keep the
unanchored "local.properties" entry so the file only contains one rule for
ignoring local.properties; reference the two entries "/local.properties" and
"local.properties" in the .gitignore and remove the anchored variant.
```cpp
// Recreate context so the new adapter is visible
if (!recreate_context()) {
    // Remove the adapter entry we just added on failure
    lora_adapters_.pop_back();
    return false;
}

// Apply all loaded adapters to the new context
if (!apply_lora_adapters()) {
    lora_adapters_.pop_back();
    return false;
```
Orphaned adapter object on failure path.

If `recreate_context()` succeeds but `apply_lora_adapters()` fails, `pop_back()` drops the tracking entry, but the `llama_adapter_lora` object created at line 859 remains allocated against the model (since adapter memory is freed with the model, not individually). This means a "ghost" adapter is loaded in the model but untracked by `lora_adapters_`.

This isn't a memory leak (the memory is reclaimed on model unload), but repeated load failures would accumulate orphaned adapters consuming memory until the model is unloaded. Consider explicitly freeing the adapter on the failure path if the llama.cpp API allows it, or at minimum documenting this behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp` around
lines 873 - 883, When apply_lora_adapters() fails after recreate_context() has
succeeded, the previously created llama_adapter_lora object remains loaded into
the model but is removed from lora_adapters_ — free that adapter explicitly
before popping the tracking entry: call the appropriate llama.cpp API to
unload/free the adapter object you created (the llama_adapter_lora instance made
just before recreate_context()), then remove it from lora_adapters_
(pop_back()); if the llama.cpp API does not expose an unload/free call, add a
clear comment documenting that failed loads leave orphaned adapters and consider
guarding repeated attempts or changing ownership so the adapter is freed on
failure.
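To make the "free before untracking" rollback concrete, here is a self-contained sketch. The `llama_adapter_lora` struct and `llama_adapter_lora_free` below are stand-in stubs, not the real llama.cpp API; the actual header should be checked for an equivalent free/remove call before adopting this pattern.

```cpp
#include <string>
#include <vector>

// Stub stand-ins for the llama.cpp types, used only to illustrate the pattern.
struct llama_adapter_lora { bool freed = false; };
static void llama_adapter_lora_free(llama_adapter_lora* a) { a->freed = true; }

struct LoraAdapterEntry {
    llama_adapter_lora* adapter;
    std::string path;
    float scale;
    bool applied;
};

// On a failed apply, release the adapter created for the newest entry before
// dropping it from the tracking vector, so no "ghost" adapter stays resident
// in the model until unload.
static void rollback_last_adapter(std::vector<LoraAdapterEntry>& adapters) {
    if (adapters.empty()) return;
    llama_adapter_lora_free(adapters.back().adapter);
    adapters.pop_back();
}
```

The key design point is ordering: free first, then `pop_back()`, so the tracking vector never holds a dangling pointer and the model never holds an untracked adapter.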
```cpp
rac_result_t rac_llm_llamacpp_get_lora_info(rac_handle_t handle, char** out_json) {
    if (handle == nullptr || out_json == nullptr) {
        return RAC_ERROR_NULL_POINTER;
    }

    auto* h = static_cast<rac_llm_llamacpp_handle_impl*>(handle);
    if (!h->text_gen) {
        return RAC_ERROR_INVALID_HANDLE;
    }

    auto info = h->text_gen->get_lora_info();
    std::string json_str = info.dump();
    *out_json = strdup(json_str.c_str());

    return RAC_SUCCESS;
}
```
`strdup` can return NULL on OOM, leading to RAC_SUCCESS with a null `out_json`.

If `strdup` fails, `*out_json` will be NULL but the function returns RAC_SUCCESS, which is misleading to callers. The same pattern exists in `rac_llm_llamacpp_get_model_info` (line 327), so this is a pre-existing issue, but it is worth addressing for the new code.
Proposed fix

```diff
     auto info = h->text_gen->get_lora_info();
     std::string json_str = info.dump();
     *out_json = strdup(json_str.c_str());
+    if (!*out_json) {
+        return RAC_ERROR_OUT_OF_MEMORY;
+    }
     return RAC_SUCCESS;
```
return RAC_SUCCESS;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| rac_result_t rac_llm_llamacpp_get_lora_info(rac_handle_t handle, char** out_json) { | |
| if (handle == nullptr || out_json == nullptr) { | |
| return RAC_ERROR_NULL_POINTER; | |
| } | |
| auto* h = static_cast<rac_llm_llamacpp_handle_impl*>(handle); | |
| if (!h->text_gen) { | |
| return RAC_ERROR_INVALID_HANDLE; | |
| } | |
| auto info = h->text_gen->get_lora_info(); | |
| std::string json_str = info.dump(); | |
| *out_json = strdup(json_str.c_str()); | |
| return RAC_SUCCESS; | |
| } | |
| rac_result_t rac_llm_llamacpp_get_lora_info(rac_handle_t handle, char** out_json) { | |
| if (handle == nullptr || out_json == nullptr) { | |
| return RAC_ERROR_NULL_POINTER; | |
| } | |
| auto* h = static_cast<rac_llm_llamacpp_handle_impl*>(handle); | |
| if (!h->text_gen) { | |
| return RAC_ERROR_INVALID_HANDLE; | |
| } | |
| auto info = h->text_gen->get_lora_info(); | |
| std::string json_str = info.dump(); | |
| *out_json = strdup(json_str.c_str()); | |
| if (!*out_json) { | |
| return RAC_ERROR_OUT_OF_MEMORY; | |
| } | |
| return RAC_SUCCESS; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-commons/src/backends/llamacpp/rac_llm_llamacpp.cpp` around
lines 388 - 403, The function rac_llm_llamacpp_get_lora_info currently calls
strdup and returns RAC_SUCCESS without checking for NULL, which can return a
success with a null out_json on OOM; change the code to check the result of
strdup (after creating json_str) and if it returns NULL set *out_json = NULL and
return an out-of-memory error (e.g., RAC_ERROR_OUT_OF_MEMORY or the project
equivalent) instead of RAC_SUCCESS; apply the same check-and-return fix to
rac_llm_llamacpp_get_model_info so callers never receive RAC_SUCCESS with a null
out_json and there are no leaks.
```cpp
JNIEXPORT jstring JNICALL
Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentGetLoraInfo(
        JNIEnv* env, jclass clazz, jlong handle) {
    if (handle == 0) {
        return nullptr;
    }

    char* json = nullptr;
    rac_result_t result = rac_llm_component_get_lora_info(
        reinterpret_cast<rac_handle_t>(handle), &json);

    if (result != RAC_SUCCESS || !json) {
        return nullptr;
    }

    jstring jresult = env->NewStringUTF(json);
    rac_free(json);
    return jresult;
}
```
Free `json` on error to avoid a potential leak.

If the backend ever allocates `json` but returns a non-success result, the current code would leak. Consider freeing `json` when `result != RAC_SUCCESS` and `json != nullptr` before returning.
Suggested fix

```diff
     char* json = nullptr;
     rac_result_t result = rac_llm_component_get_lora_info(
         reinterpret_cast<rac_handle_t>(handle), &json);
-    if (result != RAC_SUCCESS || !json) {
+    if (result != RAC_SUCCESS || !json) {
+        if (json) {
+            rac_free(json);
+        }
         return nullptr;
     }
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
JNIEXPORT jstring JNICALL
Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentGetLoraInfo(
    JNIEnv* env, jclass clazz, jlong handle) {
    if (handle == 0) {
        return nullptr;
    }
    char* json = nullptr;
    rac_result_t result = rac_llm_component_get_lora_info(
        reinterpret_cast<rac_handle_t>(handle), &json);
    if (result != RAC_SUCCESS || !json) {
        if (json) {
            rac_free(json);
        }
        return nullptr;
    }
    jstring jresult = env->NewStringUTF(json);
    rac_free(json);
    return jresult;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp` around lines
1164 - 1181, In
Java_com_runanywhere_sdk_native_bridge_RunAnywhereBridge_racLlmComponentGetLoraInfo,
ensure any memory allocated into json by rac_llm_component_get_lora_info is
freed on error: if rac_llm_component_get_lora_info returns non-success but json
!= nullptr, call rac_free(json) before returning nullptr; keep the successful
path that converts json with env->NewStringUTF, frees json via rac_free, and
returns the jstring. Use the existing symbols rac_llm_component_get_lora_info,
rac_free, and env->NewStringUTF to locate and implement the fix.
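The ownership rule behind this fix can be shown in isolation. Everything below is a stand-alone sketch: the `rac_*` names are stubs that only mimic the contract described above (the callee may allocate the out-param even when it reports failure, so the caller must free on every path), not the SDK's real implementations.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-ins for the SDK's rac_* symbols (illustration only).
typedef int rac_result_t;
enum { RAC_SUCCESS = 0, RAC_ERR_INTERNAL = -1 };

static void rac_free(char* p) { free(p); }

// Simulates a backend that allocates diagnostic JSON into *out_json
// but still reports failure, the case the review comment worries about.
static rac_result_t rac_get_lora_info(char** out_json) {
    const char* msg = "{\"error\":\"partial\"}";
    char* buf = (char*)malloc(strlen(msg) + 1);
    strcpy(buf, msg);
    *out_json = buf;
    return RAC_ERR_INTERNAL;
}

// The pattern from the suggested fix: the caller owns *out_json on every
// path, so it must be freed even when the result code is non-success.
char* fetch_lora_info() {
    char* json = nullptr;
    rac_result_t result = rac_get_lora_info(&json);
    if (result != RAC_SUCCESS || !json) {
        if (json) {
            rac_free(json);  // would leak without this branch
        }
        return nullptr;
    }
    return json;  // success: caller frees later with rac_free
}
```

The same shape applies to any C out-parameter API where the error path can still allocate.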
fun loadLoraAdapter(adapterPath: String, scale: Float = 1.0f): Int {
    synchronized(lock) {
        if (handle == 0L || state != LLMState.READY) {
            CppBridgePlatformAdapter.logCallback(
                CppBridgePlatformAdapter.LogLevel.ERROR,
                TAG,
                "Cannot load LoRA adapter: model not ready (state=${LLMState.getName(state)})",
            )
            return -1
        }

        CppBridgePlatformAdapter.logCallback(
            CppBridgePlatformAdapter.LogLevel.INFO,
            TAG,
            "Loading LoRA adapter: $adapterPath (scale=$scale)",
        )

        val result = RunAnywhereBridge.racLlmComponentLoadLora(handle, adapterPath, scale)
        if (result != 0) {
            CppBridgePlatformAdapter.logCallback(
                CppBridgePlatformAdapter.LogLevel.ERROR,
                TAG,
                "Failed to load LoRA adapter: $adapterPath (error=$result)",
            )
        } else {
            CppBridgePlatformAdapter.logCallback(
                CppBridgePlatformAdapter.LogLevel.INFO,
                TAG,
                "LoRA adapter loaded: $adapterPath",
            )
        }

        return result
    }
}

/**
 * Remove a specific LoRA adapter by path.
 *
 * @param adapterPath Path used when loading the adapter
 * @return 0 on success, error code on failure
 */
fun removeLoraAdapter(adapterPath: String): Int {
    synchronized(lock) {
        if (handle == 0L || state != LLMState.READY) {
            return -1
        }

        val result = RunAnywhereBridge.racLlmComponentRemoveLora(handle, adapterPath)
        if (result == 0) {
            CppBridgePlatformAdapter.logCallback(
                CppBridgePlatformAdapter.LogLevel.INFO,
                TAG,
                "LoRA adapter removed: $adapterPath",
            )
        }
        return result
    }
}

/**
 * Remove all LoRA adapters.
 *
 * @return 0 on success, error code on failure
 */
fun clearLoraAdapters(): Int {
    synchronized(lock) {
        if (handle == 0L) {
            return -1
        }
Use RAC error codes instead of -1 for precondition failures.
Returning -1 from loadLoraAdapter/removeLoraAdapter (and similar guards) makes error handling inconsistent with the RAC error space used by the JNI/C API. Please return the appropriate RAC error constants (e.g., invalid handle / not ready) so the Kotlin layer can map errors predictably.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@sdk/runanywhere-kotlin/src/jvmAndroidMain/kotlin/com/runanywhere/sdk/foundation/bridge/extensions/CppBridgeLLM.kt`
around lines 837 - 906, Change the hardcoded -1 precondition returns in
loadLoraAdapter, removeLoraAdapter and clearLoraAdapters to use the RAC error
constants exposed by the JNI bridge (refer to RunAnywhereBridge for the proper
constants), e.g. return the RAC_INVALID_HANDLE constant when handle == 0L and
the RAC_NOT_READY constant when state != LLMState.READY; update the checks
around synchronized(lock) in those methods (and any similar guard sites) to
return the corresponding RunAnywhereBridge.RAC_* error codes instead of -1 so
the Kotlin layer maps errors consistently with the C/JNI error space.
 // Start streaming transcription with callbacks
 do {
-    try await RunAnywhere.startStreamingTranscription(
+    try await Self.startLegacyStreaming(
@available(*, deprecated) on the private wrapper re-introduces a deprecation warning at the call site
Because start() is not deprecated, the compiler emits a warning at line 130 when it calls the deprecated Self.startLegacyStreaming(...). The comment "to silence deprecation warning" is inaccurate — the warning is preserved (or even introduced, if RunAnywhere.startStreamingTranscription is not yet deprecated) at the call site in start().
Removing @available(*, deprecated) from the wrapper is the better approach: the warning from startStreamingTranscription (if it is deprecated) will appear once inside the wrapper body — exactly where the deprecated API is used — and start() remains warning-free. The method name and // TODO comment already communicate the transitional status:
🐛 Proposed fix

-// Wrapper to silence deprecation warning until migration to transcribeStream
-@available(*, deprecated, message: "Migrate to transcribeStream API")
-private static func startLegacyStreaming(
+// TODO: Migrate to transcribeStream API — transitional wrapper isolating deprecated startStreamingTranscription usage
+private static func startLegacyStreaming(

Also applies to: 216-230
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Sessions/LiveTranscriptionSession.swift`
at line 130, The private wrapper method marked with `@available`(*, deprecated)
(the one calling Self.startLegacyStreaming(...) from start()) should have that
deprecation attribute removed so start() no longer emits a deprecation warning;
keep the wrapper body using the deprecated
RunAnywhere.startStreamingTranscription so any deprecation warning appears
inside the wrapper where the deprecated API is actually used, and retain the
existing TODO/comment indicating transitional status. Apply the same change to
the other private wrapper method referenced around lines 216-230 (the second
wrapper that forwards to startStreamingTranscription) so neither public start()
nor other callers get re-introduced deprecation warnings.
* ios initial changes
* minimal sample needed to test lora
* updating docs
* addressed the comments
Hi @Siddhesh2377 For now, Additionally, we do have full integration coming in via the console too, which could be a bit later if it takes some time.
Actionable comments posted: 10
🧹 Nitpick comments (10)
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/DemoLoRAAdapter.swift (1)
95-97: Prefer Swift-native `ByteCountFormatStyle` over legacy `ByteCountFormatter`

`ByteCountFormatter` is an Objective-C-bridged class. The `FormatStyle` protocol was introduced with iOS 15, and Apple itself now recommends that new Swift code use it instead of the old `Formatter` subclasses ported from Objective-C. `ByteCountFormatStyle` (available from Xcode 13+) is the Swift-native equivalent for formatting `Int64` values. Replace the computed property body:

♻️ Proposed refactor

 var fileSizeFormatted: String {
-    ByteCountFormatter.string(fromByteCount: fileSize, countStyle: .file)
+    fileSize.formatted(.byteCount(style: .file))
 }

As per coding guidelines: "Use the latest Swift 6 APIs always."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/DemoLoRAAdapter.swift` around lines 95 - 97, Replace the Objective-C-bridged ByteCountFormatter usage in the fileSizeFormatted computed property with the Swift-native ByteCountFormatStyle: locate the var fileSizeFormatted in DemoLoRAAdapter.swift and change its body to use ByteCountFormatStyle() (or .file) to format the Int64 fileSize, ensuring you call the format(_:) method on the style (or use fileSize.formatted(...)) so the code uses the modern Swift FormatStyle API rather than ByteCountFormatter.

sdk/runanywhere-swift/Sources/LlamaCPPRuntime/include/rac_llm_llamacpp.h (1)
231-241: Two clarifications needed in `rac_llm_llamacpp_get_lora_info`'s doc.

Empty-list behavior: The doc doesn't specify what is returned when no adapters are loaded. Does `out_json` receive an empty array (`"[]"`) and the function return `RAC_SUCCESS`, or is `out_json` left NULL? Callers need to handle these cases differently.

The `"applied"` field: The JSON schema includes `"applied":true`, which implies adapters can exist in a non-applied state. If that's a transient condition only (e.g., during partial load failure), it should be documented. If the field is always `true` for adapters returned by this call, the field is misleading and should be removed or renamed.

📝 Suggested doc additions

 /**
  * Get info about loaded LoRA adapters as JSON.
  *
  * Returns JSON array: [{"path":"...", "scale":1.0, "applied":true}, ...]
+ * Returns an empty array ("[]") with RAC_SUCCESS when no adapters are loaded.
+ * "applied" indicates the adapter was successfully set on the current context;
+ * may be false for adapters in an inconsistent partial-load state.
  *
  * @param handle Service handle
  * @param out_json Output: JSON string (caller must free with rac_free)
  * @return RAC_SUCCESS or error code
  */

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-swift/Sources/LlamaCPPRuntime/include/rac_llm_llamacpp.h` around lines 231 - 241, Update the rac_llm_llamacpp_get_lora_info docblock to (1) state exact empty-list behavior: when no LoRA adapters are loaded the function returns RAC_SUCCESS and sets out_json to an allocated JSON empty array ("[]") which the caller must free with rac_free (do not leave out_json NULL), and (2) clarify the meaning of the "applied" field by stating whether it can be false (e.g., for adapters that are loaded but not currently applied or during partial/failed apply) or, if this call only returns adapters that are actively applied, remove or rename the field and state that returned objects always have "applied": true; update the comment above rac_llm_llamacpp_get_lora_info accordingly.

sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/LLMTypes.swift (1)
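To make the proposed empty-list contract concrete, here is a hedged sketch of a caller relying on it. The `get_lora_info` stub is an assumption standing in for `rac_llm_llamacpp_get_lora_info`, and plain `malloc`/`free` stand in for the SDK allocator:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

typedef int rac_result_t;
enum { RAC_SUCCESS = 0 };

// Hypothetical stub honoring the documented contract: with no adapters
// loaded, return RAC_SUCCESS and an allocated "[]" the caller must free.
static rac_result_t get_lora_info(char** out_json) {
    *out_json = (char*)malloc(3);
    strcpy(*out_json, "[]");
    return RAC_SUCCESS;
}

// With that contract, the caller treats "no adapters" and "adapters
// present" uniformly: parse the array, then free; success never leaves
// the out-param NULL, so there is no extra NULL branch on that path.
int count_loaded_adapters() {
    char* json = nullptr;
    if (get_lora_info(&json) != RAC_SUCCESS || !json) {
        return -1;  // error path only
    }
    int count = 0;
    for (const char* p = json; *p; ++p) {
        if (*p == '{') ++count;  // crude count of array objects
    }
    free(json);  // caller-owned; real code would use rac_free
    return count;
}
```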
414-421: Consider validating `scale` with a `precondition` parallel to the `path` check.

The docstring documents `0.0 to 1.0+` but nothing prevents a caller from passing a negative scale, which would yield undefined behaviour in the C++ backend. Since `path.isEmpty` is already guarded with `precondition`, adding a matching guard for scale maintains symmetry.

🛡️ Proposed validation

 public init(path: String, scale: Float = 1.0) {
     precondition(!path.isEmpty, "LoRA adapter path cannot be empty")
+    precondition(scale >= 0.0, "LoRA adapter scale must be non-negative")
     self.path = path
     self.scale = scale
 }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/LLMTypes.swift` around lines 414 - 421, Add a validation for the scale argument in the public init(path:scale:) initializer of LLMTypes (the LoRA adapter struct) by adding a precondition that scale is non-negative (e.g., scale >= 0.0) so callers cannot pass negative values that would break the C++ backend; keep the existing path precondition and consider also ensuring scale is finite (isFinite) if you want to reject NaN/inf as well.

sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+LLM.swift (1)
94-129: Inconsistent handle access across the four LoRA methods.

`loadLoraAdapter` calls `getHandle()` (which lazily creates the component), while `removeLoraAdapter` and `clearLoraAdapters` use `guard let handle = handle` (fail-fast if no component exists). This means `loadLoraAdapter` can silently create an uninitialized component then fail at the C layer, while the other two methods fail immediately with a clear error. All three should use the same strategy — since loading a LoRA requires a live component with a model, failing fast is preferable.

♻️ Proposed fix — align `loadLoraAdapter` with the fail-fast pattern

 public func loadLoraAdapter(_ config: LoRAAdapterConfig) throws {
-    let handle = try getHandle()
+    guard let handle = handle else {
+        throw SDKError.llm(.invalidState, "No LLM component active")
+    }
     let result = config.path.withCString { pathPtr in
         rac_llm_component_load_lora(handle, pathPtr, config.scale)
     }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+LLM.swift` around lines 94 - 129, loadLoraAdapter currently calls getHandle() while removeLoraAdapter/clearLoraAdapters use guard let handle = handle, causing inconsistent behavior; update loadLoraAdapter to use the same fail-fast pattern—replace the getHandle() call with guard let handle = handle else { throw SDKError.llm(.invalidState, "No LLM component active") }, then call rac_llm_component_load_lora(handle, ...) inside path.withCString and keep the existing result check and logger; also align the error variant used on C-layer failure to the same .invalidState style used by removeLoraAdapter/clearLoraAdapters for consistency.

examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift (4)
510-546: Duplicate tap action in `loraAdapterBadge` and `loraAddButton`.

Both buttons call `viewModel.refreshAvailableDemoAdapters()` then set `showingLoRAManagement = true`. Extract this into a private method to avoid silent divergence.

♻️ Proposed fix

+private func openLoRAManagement() {
+    viewModel.refreshAvailableDemoAdapters()
+    showingLoRAManagement = true
+}

 var loraAdapterBadge: some View {
     Button {
-        viewModel.refreshAvailableDemoAdapters()
-        showingLoRAManagement = true
+        openLoRAManagement()
     } label: { ... }
 }

 var loraAddButton: some View {
     Button {
-        viewModel.refreshAvailableDemoAdapters()
-        showingLoRAManagement = true
+        openLoRAManagement()
     } label: { ... }
 }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift` around lines 510 - 546, Extract the duplicated tap logic into a single private helper (e.g., openLoRAManagement()) and call that from both loraAdapterBadge and loraAddButton; the helper should invoke viewModel.refreshAvailableDemoAdapters() and then set showingLoRAManagement = true so both Button actions use the same implementation and cannot silently diverge.
462: Padding condition is always `true` when `inputArea` is visible.

`inputArea` is only rendered when `hasModelSelected` is `true` (via `contentArea`), so the ternary always resolves to `8`. The condition containing `viewModel.useToolCalling || !viewModel.loraAdapters.isEmpty` is redundant.

♻️ Proposed simplification

-.padding(.top, (viewModel.useToolCalling || !viewModel.loraAdapters.isEmpty || hasModelSelected) ? 8 : 0)
+.padding(.top, 8)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift` at line 462, The padding condition on the input area is redundant because inputArea is only rendered when hasModelSelected is true (via contentArea), so simplify the expression by removing the unnecessary checks; update the .padding(.top, (viewModel.useToolCalling || !viewModel.loraAdapters.isEmpty || hasModelSelected) ? 8 : 0) to only check hasModelSelected (or unconditionally apply 8 when inputArea is present) in ChatInterfaceView by editing the line that references viewModel.useToolCalling, viewModel.loraAdapters, and hasModelSelected so the ternary no longer includes the redundant conditions.
92-102: Consider scoping the file picker to `.gguf` files.

`[.data]` allows every binary file, while the section footer (line 805) explicitly says "Select a .gguf LoRA adapter file". A dynamically-resolved UTType keeps the picker focused and makes the mismatch between hint and filter disappear.

♻️ Proposed fix

 import UniformTypeIdentifiers

 // At the call-site:
 .fileImporter(
     isPresented: $showingLoRAFilePicker,
-    allowedContentTypes: [.data],
+    allowedContentTypes: [UTType(filenameExtension: "gguf") ?? .data],
     allowsMultipleSelection: false
 )

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift` around lines 92 - 102, The file picker currently uses a broad [.data] UTType; replace this with a dynamically-resolved UTType for the .gguf extension so the picker is scoped to LoRA adapter files. In the fileImporter call (where showingLoRAFilePicker is used), resolve UTType(filenameExtension: "gguf") and pass it in allowedContentTypes (e.g., allowedContentTypes: [ggufType]) with a safe fallback to [.data] if resolution fails; also ensure UniformTypeIdentifiers is available so UTType can be used.
124-128: Replace `DispatchQueue.main.asyncAfter` with Swift 6 structured concurrency.

`DispatchQueue.main.asyncAfter` is the GCD API; the Swift 6 equivalent is a `Task` on the main actor with `try? await Task.sleep(for:)`.

♻️ Proposed fix

 onOpenFilePicker: {
     showingLoRAManagement = false
-    DispatchQueue.main.asyncAfter(deadline: .now() + 0.3) {
-        showingLoRAFilePicker = true
-    }
+    Task { @MainActor in
+        try? await Task.sleep(for: .seconds(0.3))
+        showingLoRAFilePicker = true
+    }
 },

As per coding guidelines: "Use the latest Swift 6 APIs always."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift` around lines 124 - 128, In the onOpenFilePicker closure (where showingLoRAManagement is set false and showingLoRAFilePicker is scheduled with DispatchQueue.main.asyncAfter), replace the GCD call with a Swift 6 structured-concurrency Task that runs on the main actor: create Task annotated to run on `@MainActor`, await a 300ms sleep using Task.sleep(for: .milliseconds(300)) (using try? if needed), then set showingLoRAFilePicker = true inside that Task; leave the initial showingLoRAManagement = false assignment untouched so UI state updates remain on the main actor.

examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel.swift (2)
455-458: `urls(for:in:)[0]` subscript will crash if the array is empty.

`FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)` always returns at least one element on iOS/macOS in practice, but the subscript gives no compile-time safety. Prefer `first` with a documented precondition.

♻️ Proposed fix

 static func loraDownloadDirectory() -> URL {
-    let docs = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
+    let docs = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask).first!
     return docs.appendingPathComponent("LoRA", isDirectory: true)
 }

Or, for full safety, return an `Optional<URL>` and propagate the nil upward.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel.swift` around lines 455 - 458, The loraDownloadDirectory() function currently force-subscripts the array returned by FileManager.default.urls(for:in:) which can crash; change it to safely unwrap using .first and a documented precondition (or alternatively return Optional<URL> and propagate nil) so that loraDownloadDirectory() either returns the first URL via guard/if let with a preconditionFailure message or becomes Optional<URL>; update references to loraDownloadDirectory() accordingly if you choose the Optional route.
340-358: `removeLoraAdapter` and `clearLoraAdapters` don't toggle `isLoadingLoRA`, unlike `loadLoraAdapter`.

Both operations call into the async C++ bridge and may take measurable time. Omitting the `isLoadingLoRA` guard leaves the UI without a loading indicator for these operations and could allow concurrent adapter mutations.

♻️ Proposed fix

 func removeLoraAdapter(path: String) async {
+    isLoadingLoRA = true
     do {
         try await RunAnywhere.removeLoraAdapter(path)
         await refreshLoraAdapters()
     } catch {
         logger.error("Failed to remove LoRA adapter: \(error)")
         self.error = error
     }
+    isLoadingLoRA = false
 }

 func clearLoraAdapters() async {
+    isLoadingLoRA = true
     do {
         try await RunAnywhere.clearLoraAdapters()
         loraAdapters = []
     } catch {
         logger.error("Failed to clear LoRA adapters: \(error)")
         self.error = error
     }
+    isLoadingLoRA = false
 }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel.swift` around lines 340 - 358, The removeLoraAdapter(path:) and clearLoraAdapters() methods lack the isLoadingLoRA guard and toggle; update both to mirror loadLoraAdapter by first checking guard (if isLoadingLoRA { return }), set isLoadingLoRA = true at the start, and ensure isLoadingLoRA = false is executed on all exit paths (use defer or finally-style pattern) so the UI shows a loading state and concurrent adapter mutations are prevented; keep existing logging/error handling and call refreshLoraAdapters() or clear loraAdapters as currently implemented.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModelTypes.swift`:
- Around line 52-54: The progress calculation can exceed 1.0 when
totalBytesExpectedToWrite is inaccurate; update the block that computes progress
(using totalBytesWritten, totalBytesExpectedToWrite and calling onProgress) to
clamp the computed Double progress into the range 0.0...1.0 before calling
onProgress (e.g., apply min/max or Swift’s clamping to the progress value) and
keep the existing early return guard for zero expected bytes.
- Around line 38-59: The DownloadProgressDelegate using
urlSession(_:downloadTask:didWriteData:...) won't be called when using
URLSession.download(from:delegate:) because that API takes a
URLSessionTaskDelegate for task-level concerns; remove the
DownloadProgressDelegate class and instead track progress inside
LLMViewModel.downloadAdapter by capturing the returned URLSessionDownloadTask
and observing its task.progress.fractionCompleted (or switch to
URLSession.bytes(from:) and accumulate bytes) to update adapterDownloadProgress;
update references to DownloadProgressDelegate and the delegate parameter usage
accordingly.
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift`:
- Around line 111-115: The sheet is being dismissed unconditionally after
awaiting viewModel.loadLoraAdapter(path:scale:), so change the Task to handle
success/failure: call await inside a do/catch (or check the returned result) and
only set showingLoRAScaleSheet = false when loadLoraAdapter succeeds; on failure
set a viewModel.error (or trigger a dedicated alert state) so the UI surfaces
the error to the user; keep the existing defer { if accessed {
url.stopAccessingSecurityScopedResource() } } to always stop access.
- Line 559: Replace the deprecated NavigationView calls with NavigationStack in
the three views that currently use it: iOSView, LoRAScaleSheetView, and
LoRAManagementSheetView (the occurrences around the functions named iOSView,
LoRAScaleSheetView, and LoRAManagementSheetView). Change the container
initializer from NavigationView { ... } to NavigationStack { ... } (no other API
changes required since these views do not use NavigationLink); keep
.navigationBarTitleDisplayMode(.inline) and .navigationTitle(...) as-is and
ensure SwiftUI is imported.
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp`:
- Around line 888-896: The calls to
llama_memory_clear(llama_get_memory(context_), true) can dereference null;
before each such call (where llama_get_memory is used: the llama_memory_clear
invocations after adapter changes and the two other new sites) retrieve
llama_memory_t mem = llama_get_memory(context_) and only call
llama_memory_clear(mem, true) if mem is non-null; update the sites that
currently call llama_memory_clear(llama_get_memory(context_), true) to follow
the existing safe pattern used elsewhere in this file.
- Around line 800-818: recreate_context() currently frees the old context and
returns false if llama_init_from_model(...) fails but leaves model_loaded_ true;
update the failure path after context_ becomes nullptr to set model_loaded_ =
false (and ensure sampler_ is cleared if applicable) and log the state change so
callers know the model is no longer loaded; locate the block around
llama_free(context_), llama_init_from_model(model_, ctx_params) and the
subsequent if (!context_) check to apply this change.
In `@sdk/runanywhere-swift/Sources/LlamaCPPRuntime/include/rac_llm_llamacpp.h`:
- Around line 195-209: Update the rac_llm_llamacpp_load_lora docblock to (1)
document that a partial-apply failure can leave already-applied adapters active
and callers must call rac_llm_llamacpp_clear_lora before retrying, (2) state
that adapter lifetime is tied to the model (do not free adapter objects
manually; use rac_llm_llamacpp_remove_lora / rac_llm_llamacpp_clear_lora for
logical removal) and reference the llama.cpp behavior, and (3) change the `@param`
scale text from "default 1.0" to clarify there is no API default (e.g.,
"recommended: 1.0"). Ensure these notes appear in the comment for
rac_llm_llamacpp_load_lora so callers see the behavioral risks and correct
cleanup functions.
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge`+LLM.swift:
- Around line 94-103: In loadLoraAdapter (CppBridge+LLM.swift) the thrown
SDKError uses .modelLoadFailed but LoRA adapter failures should use
.invalidState; update the throw in public func loadLoraAdapter(_ config:
LoRAAdapterConfig) to throw SDKError.llm(.invalidState, "...") with the same
error message, matching the other LoRA methods (e.g., the companion LoRA
load/unload methods) for consistent error handling.
- Around line 132-154: Remove the spurious throws from getLoadedLoraAdapters (so
its signature is non-throwing) and ensure any C-allocated jsonPtr is always
freed: call rac_llm_component_get_lora_info(handle, &jsonPtr), capture jsonPtr
into a local temp before checking result, and call rac_free on that temp on all
exit paths (including the non-success branch) to avoid leaks; also update
callers (e.g., the code that currently calls try getLoadedLoraAdapters in
RunAnywhere+LoRA.swift) to drop the unnecessary try. Ensure parsing and
LoRAAdapterInfo construction remain the same but without a throws declaration.
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere`+LoRA.swift:
- Around line 53-58: The bridge call
CppBridge.LLM.shared.getLoadedLoraAdapters() no longer throws, so remove the
superfluous `try` in RunAnywhere+LoRA.swift while keeping the public method
signature (static func getLoadedLoraAdapters() async throws ->
[LoRAAdapterInfo]) as-is to preserve the isInitialized guard error; update the
return line to `return await CppBridge.LLM.shared.getLoadedLoraAdapters()` (no
`try`) and run tests to ensure compilation succeeds after the bridge method's
throws removal.
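The `recreate_context()` failure path called out above (leaving `model_loaded_` true after the context is gone) can be sketched with stand-in types. `Backend`, `Ctx`, and `init_context()` below are illustrative only; they mirror the shape of the `llama_free` / `llama_init_from_model` sequence without using llama.cpp itself:

```cpp
#include <cassert>

// Minimal stand-ins mimicking the fields involved (not the real llama.cpp API).
struct Ctx {};

struct Backend {
    Ctx* context_ = nullptr;
    bool model_loaded_ = false;
    bool fail_init = false;  // test hook to simulate init failure

    Ctx* init_context() { return fail_init ? nullptr : new Ctx(); }

    // The old context is freed before the new one exists; if init then
    // fails, the backend can no longer serve requests, so model_loaded_
    // must be cleared rather than left claiming a loaded model.
    bool recreate_context() {
        delete context_;            // old context is gone either way
        context_ = init_context();
        if (!context_) {
            model_loaded_ = false;  // the fix: reflect the real state
            return false;
        }
        model_loaded_ = true;
        return true;
    }
};
```

Callers that check `model_loaded_` now see a consistent state after a failed recreation instead of dereferencing a null context later.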
---
Duplicate comments:
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp`:
- Around line 877-888: When apply_lora_adapters() fails after adding a new
adapter and recreating the context, the code only calls
lora_adapters_.pop_back() which leaves the runtime context in an inconsistent
state; update the failure path so that you restore the prior adapter state and
context: before modifying lora_adapters_ (the push/back that precedes
recreate_context()), snapshot the existing lora_adapters_ (or record the index
of the new adapter), and on any failure of recreate_context() or
apply_lora_adapters() revert the in-memory list (pop_back or restore snapshot)
AND recreate the context and re-apply the previous adapters (call
recreate_context() and apply_lora_adapters() for the prior set) to guarantee the
runtime matches the tracking list; reference recreate_context(),
apply_lora_adapters(), and lora_adapters_.pop_back() to locate where to
implement this rollback/restore logic.
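The snapshot-and-restore rollback described in this comment can be sketched abstractly. `FakeBackend` and its members are hypothetical stand-ins mirroring `lora_adapters_`, `recreate_context()`, and `apply_lora_adapters()`; the point is only that on failure both the tracking list and the runtime state are restored together:

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Adapter { std::string path; float scale; };

struct FakeBackend {
    std::vector<Adapter> lora_adapters_;  // tracking list
    std::vector<Adapter> applied_;        // what the "context" actually has
    bool fail_next_apply = false;         // test hook to simulate failure

    bool recreate_context() { applied_.clear(); return true; }

    bool apply_lora_adapters() {
        if (fail_next_apply) { fail_next_apply = false; return false; }
        applied_ = lora_adapters_;
        return true;
    }

    bool load_lora(const Adapter& a) {
        auto snapshot = lora_adapters_;   // snapshot before mutating
        lora_adapters_.push_back(a);
        if (recreate_context() && apply_lora_adapters()) return true;
        // Roll back: restore the list, then rebuild the context from the
        // prior set so the runtime matches the tracking list again.
        lora_adapters_ = std::move(snapshot);
        recreate_context();
        apply_lora_adapters();
        return false;
    }
};
```

Without the re-apply step, a pop_back alone would leave the freshly recreated context holding zero adapters while the list still claims one, which is exactly the inconsistency the comment flags.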
inconsistent behavior; update loadLoraAdapter to use the same fail-fast
pattern—replace the getHandle() call with guard let handle = handle else { throw
SDKError.llm(.invalidState, "No LLM component active") }, then call
rac_llm_component_load_lora(handle, ...) inside path.withCString and keep the
existing result check and logger; also align the error variant used on C-layer
failure to the same .invalidState style used by
removeLoraAdapter/clearLoraAdapters for consistency.
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/LLMTypes.swift`:
- Around line 414-421: Add a validation for the scale argument in the public
init(path:scale:) initializer of LLMTypes (the LoRA adapter struct) by adding a
precondition that scale is non-negative (e.g., scale >= 0.0) so callers cannot
pass negative values that would break the C++ backend; keep the existing path
precondition and consider also ensuring scale is finite (isFinite) if you want
to reject NaN/inf as well.
```swift
final class DownloadProgressDelegate: NSObject, URLSessionDownloadDelegate, @unchecked Sendable {
    private let onProgress: @Sendable (Double) -> Void

    init(onProgress: @escaping @Sendable (Double) -> Void) {
        self.onProgress = onProgress
    }

    func urlSession(
        _ session: URLSession,
        downloadTask: URLSessionDownloadTask,
        didWriteData bytesWritten: Int64,
        totalBytesWritten: Int64,
        totalBytesExpectedToWrite: Int64
    ) {
        guard totalBytesExpectedToWrite > 0 else { return }
        let progress = Double(totalBytesWritten) / Double(totalBytesExpectedToWrite)
        onProgress(progress)
    }

    func urlSession(_ session: URLSession, downloadTask: URLSessionDownloadTask, didFinishDownloadingTo location: URL) {
        // Handled by the async download call
    }
```
🧩 Analysis chain
🏁 Script executed:

```shell
git ls-files | head -20
```

Repository: RunanywhereAI/runanywhere-sdks

🏁 Script executed:

```shell
fd "LLMViewModel" --type f
```

🏁 Script executed:

```shell
fd "LLMViewModelTypes" --type f
```

🏁 Script executed:

```shell
cat -n examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModelTypes.swift | head -70
```

🏁 Script executed:

```shell
cat -n examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel.swift | grep -A 10 -B 5 "download"
```
🌐 Web query:
URLSession async download(from:delegate:) didWriteData progress callback suppressed Swift
💡 Result:
`urlSession(_:downloadTask:didWriteData:...)` is not guaranteed to fire when you use the async convenience API `URLSession.download(from:/for:delegate:)`, because the `delegate:` parameter there is a `URLSessionTaskDelegate`, not a `URLSessionDownloadDelegate`; it is intended mainly for things like authentication challenges, not download-progress callbacks. [1]

What to do instead (recommended): observe the task's Progress rather than relying on didWriteData:
- Implement a `URLSessionTaskDelegate`.
- In `urlSession(_:didCreateTask:)`, grab `task.progress` and observe `fractionCompleted` via KVO, Combine, etc. [1]

If you specifically want `didWriteData`: use the delegate-based download task API (create a URLSession with a session delegate that conforms to `URLSessionDownloadDelegate`, then call `downloadTask(with:)` and `resume()`). Apple documents `didWriteData` as a download-task delegate callback for download tasks. [2]

Also check your server response: if the response doesn't include a Content-Length (or uses transfer encodings where the total is unknown), progress may appear to "jump" to 100% at the end because the expected byte count can't be computed. [3]
Sources: [1] [2] [3]
🌐 Web query:
Swift async URLSession.download delegate callbacks behavior iOS
💡 Result:
URLSession's Swift-concurrency download APIs don't behave like "true delegate-driven" URLSessionDownloadTasks on iOS.

What happens with `try await session.download(...)`:
- The async `download` APIs are layered on top of URLSession's convenience / completion-handler style APIs, so some download-specific delegate callbacks are not triggered (notably `urlSession(_:downloadTask:didFinishDownloadingTo:)`, and in practice often `didWriteData` progress callbacks, too). Apple DTS explicitly calls this out. [1]
- The `delegate:` parameter on the async API is for `URLSessionTaskDelegate`-level concerns (commonly authentication / metrics), not a guarantee that `URLSessionDownloadDelegate` progress/finish methods will be used. This matches what people observe when trying to get progress via `URLSessionDownloadDelegate` with async/await. [2]

What happens with classic delegate-based downloads:
- If you create a download task without a completion handler, the session delivers download progress via `urlSession(_:downloadTask:didWriteData:...)` and completion via `urlSession(_:downloadTask:didFinishDownloadingTo:)`. Apple's `URLSessionDownloadTask` docs describe these callbacks for download tasks. [3]
- If you create a download task with a completion handler, the delegate is effectively bypassed for download callbacks (expected behavior). [4]

Background session gotcha:
- Because the async convenience API is built on top of completion-handler plumbing, it can fail in background sessions with errors like "Completion handler blocks are not supported in background sessions. Use a delegate instead." (reported alongside the missing delegate callbacks). [1]

Practical options (if you need progress):
- Use a delegate-driven `downloadTask(with:)` (no completion handler) and implement `URLSessionDownloadDelegate` for progress. [3][4]
- Or observe `task.progress` (works even when you're using completion handlers; often simplest for UI progress). [4]
- If you want an async/await style stream, use `bytes(for:delegate:)` and compute progress from the byte stream (Apple DTS suggests this approach). [1]

Sources: [1] Apple Developer Forums thread "URLSessionDownloadDelegate never called" (DTS answer), [2] Stack Overflow discussion of async download(for:delegate:) not calling download delegate progress, [3] Apple docs for URLSessionDownloadTask delegate behavior, [4] Stack Overflow explanation that completion-handler tasks bypass delegate + using task.progress.
🏁 Script executed:
```shell
wc -l examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel.swift
```
🏁 Script executed:
```shell
# Verify the exact signature of URLSession.download in Swift docs/code
python3 << 'EOF'
# Search for URLSession download method signatures in the codebase
import subprocess
result = subprocess.run(['rg', 'download\\(from:', '-A', '3', '-B', '1'],
                        capture_output=True, text=True)
print(result.stdout)
EOF
```
🌐 Web query:
Swift 6 URLSession.download(from:delegate:) exact parameter type URLSessionTaskDelegate
💡 Result:
In Swift 6, the delegate parameter for URLSession.download(from:delegate:) is an optional existential of URLSessionTaskDelegate:

```swift
delegate: (any URLSessionTaskDelegate)?
```

So you pass either nil or an object conforming to URLSessionTaskDelegate (and it may also conform to more specific protocols like URLSessionDownloadDelegate, but the parameter type is still URLSessionTaskDelegate). [1][2][3]

Example:

```swift
final class D: NSObject, URLSessionTaskDelegate {}
let (fileURL, response) = try await URLSession.shared.download(from: url, delegate: D())
```

Sources: [1] https://forums.swift.org/t/are-existential-types-sendable/58946 [2] https://forums.swift.org/t/how-to-call-urlsession-cache-handler-delegate-method-in-async-await/66473 [3] https://stackoverflow.com/questions/68276940/how-to-get-the-download-progress-with-the-new-try-await-urlsession-shared-downlo
URLSessionDownloadDelegate progress callbacks are not invoked by the async download(from:delegate:) API — progress will remain stuck at 0.
URLSession.download(from:delegate:) accepts URLSessionTaskDelegate?, not URLSessionDownloadDelegate?. The delegate: parameter is intended for task-level concerns like authentication challenges, not download-specific progress callbacks. As a result, urlSession(_:downloadTask:didWriteData:...) will never be called, and adapterDownloadProgress will remain 0.0 throughout the download, even though the download completes successfully. This is documented behavior from Apple DTS.
Use one of these approaches instead:
- Observe `task.progress`: capture the download task and observe its `fractionCompleted` property.
- Use `URLSession.bytes(from:)`: manually iterate the byte stream and track progress:
🔧 Proposed alternative in `LLMViewModel.downloadAdapter`

```diff
-let delegate = DownloadProgressDelegate { [weak self] progress in
-    Task { @MainActor in
-        self?.adapterDownloadProgress[adapter.id] = progress
-    }
-}
-let (tempURL, _) = try await URLSession.shared.download(from: adapter.downloadURL, delegate: delegate)
+let (asyncBytes, response) = try await URLSession.shared.bytes(from: adapter.downloadURL)
+let expectedLength = response.expectedContentLength
+var data = Data()
+if expectedLength > 0 { data.reserveCapacity(Int(expectedLength)) }
+var lastReported = 0.0
+for try await byte in asyncBytes {
+    data.append(byte)
+    if expectedLength > 0 {
+        let progress = min(1.0, Double(data.count) / Double(expectedLength))
+        // Throttle UI updates: spawning a Task per byte would flood the main actor.
+        if progress - lastReported >= 0.01 {
+            lastReported = progress
+            Task { @MainActor [weak self] in
+                self?.adapterDownloadProgress[adapter.id] = progress
+            }
+        }
+    }
+}
+Task { @MainActor [weak self] in
+    self?.adapterDownloadProgress[adapter.id] = 1.0
+}
+try data.write(to: destinationURL)
```

The DownloadProgressDelegate class in LLMViewModelTypes.swift can then be removed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModelTypes.swift`
around lines 38 - 59, the DownloadProgressDelegate implementing
urlSession(_:downloadTask:didWriteData:...) won't be called when using
URLSession.download(from:delegate:) because that API takes a
URLSessionTaskDelegate for task-level concerns; remove the
DownloadProgressDelegate class and instead track progress inside
LLMViewModel.downloadAdapter by creating a delegate-driven downloadTask(with:)
and observing its progress.fractionCompleted (the async download(from:) API
does not return a task to observe), or switch to URLSession.bytes(from:) and
accumulate bytes, to update adapterDownloadProgress; update references to
DownloadProgressDelegate and the delegate parameter usage accordingly.
```swift
guard totalBytesExpectedToWrite > 0 else { return }
let progress = Double(totalBytesWritten) / Double(totalBytesExpectedToWrite)
onProgress(progress)
```
Clamp progress to [0.0, 1.0].
totalBytesExpectedToWrite can be inaccurate for some servers (e.g., gzip-encoded responses), allowing totalBytesWritten > totalBytesExpectedToWrite and a value above 1.0.
🛡️ Proposed fix
```diff
-let progress = Double(totalBytesWritten) / Double(totalBytesExpectedToWrite)
+let progress = min(1.0, Double(totalBytesWritten) / Double(totalBytesExpectedToWrite))
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```swift
guard totalBytesExpectedToWrite > 0 else { return }
let progress = min(1.0, Double(totalBytesWritten) / Double(totalBytesExpectedToWrite))
onProgress(progress)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModelTypes.swift`
around lines 52 - 54, The progress calculation can exceed 1.0 when
totalBytesExpectedToWrite is inaccurate; update the block that computes progress
(using totalBytesWritten, totalBytesExpectedToWrite and calling onProgress) to
clamp the computed Double progress into the range 0.0...1.0 before calling
onProgress (e.g., apply min/max or Swift’s clamping to the progress value) and
keep the existing early return guard for zero expected bytes.
```swift
Task {
    defer { if accessed { url.stopAccessingSecurityScopedResource() } }
    await viewModel.loadLoraAdapter(path: url.path, scale: loraScale)
    showingLoRAScaleSheet = false
}
```
Load failure silently dismisses the sheet with no user feedback.
showingLoRAScaleSheet = false executes unconditionally after await viewModel.loadLoraAdapter(...) regardless of whether the load succeeded or failed. If the call fails, the user sees the sheet disappear with no indication of what went wrong.
Condition the dismissal on success and surface an error (e.g., via viewModel.error or a dedicated alert):
🛡️ Proposed fix
```diff
 Task {
     defer { if accessed { url.stopAccessingSecurityScopedResource() } }
     await viewModel.loadLoraAdapter(path: url.path, scale: loraScale)
-    showingLoRAScaleSheet = false
+    if viewModel.error == nil {
+        showingLoRAScaleSheet = false
+    }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift`
around lines 111 - 115, The sheet is being dismissed unconditionally after
awaiting viewModel.loadLoraAdapter(path:scale:), so change the Task to handle
success/failure: call await inside a do/catch (or check the returned result) and
only set showingLoRAScaleSheet = false when loadLoraAdapter succeeds; on failure
set a viewModel.error (or trigger a dedicated alert state) so the UI surfaces
the error to the user; keep the existing defer { if accessed {
url.stopAccessingSecurityScopedResource() } } to always stop access.
```swift
let onCancel: () -> Void

var body: some View {
    NavigationView {
```
🧩 Analysis chain
🏁 Script executed:

```shell
# First, check if the file exists and locate NavigationView usage
fd "ChatInterfaceView.swift" examples/ios/
```

🏁 Script executed:

```shell
# Get file size and examine relevant lines around 559 and 633
wc -l examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift
```

🏁 Script executed:

```shell
# Read the specific lines mentioned in the review
sed -n '555,565p' examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift
```

🏁 Script executed:

```shell
# Check lines around 633
sed -n '629,639p' examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift
```

🏁 Script executed:

```shell
# Search for all NavigationView instances in the file
rg "NavigationView" examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift -n
```

🏁 Script executed:

```shell
# Check context around line 156
sed -n '150,165p' examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift
```

🏁 Script executed:

```shell
# Search for NavigationLink usage in the file to see if any NavigationView pushes child views
rg "NavigationLink" examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift -n
```

🏁 Script executed:

```shell
# Get broader context around line 559 to identify the struct name
sed -n '540,575p' examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift
```

🏁 Script executed:

```shell
# Get broader context around line 633 to identify the struct name
sed -n '615,650p' examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift
```

🏁 Script executed:

```shell
# Check deployment target and Swift version requirements
fd "Package.swift" examples/ios/ --exec cat {} \; 2>/dev/null | head -50
```
🌐 Web query:
NavigationView deprecated iOS 16 Swift SwiftUI replacement NavigationStack
💡 Result:
NavigationView is effectively superseded starting in iOS 16. The SwiftUI replacements are:
- `NavigationStack`: for the classic iPhone-style "push" navigation (single column). Apple positions this as the direct replacement for stack-based `NavigationView` usage. [1][2]
- `NavigationSplitView`: for sidebar / multi-column navigation (common on iPad and macOS). [1]

Typical migration (NavigationView → NavigationStack)

Before (iOS 15 style):

```swift
NavigationView {
    List(items) { item in
        NavigationLink(destination: Detail(item: item)) {
            Text(item.title)
        }
    }
}
```

After (iOS 16+ style):

```swift
NavigationStack {
    List(items) { item in
        NavigationLink(value: item) {
            Text(item.title)
        }
    }
    .navigationDestination(for: Item.self) { item in
        Detail(item: item)
    }
}
```

This aligns with the iOS 16 deprecations that tell you to use NavigationLink(value:) plus navigationDestination(...) inside a NavigationStack/NavigationSplitView. [2][3]

Supporting iOS 15 and iOS 16+:

```swift
struct ContentView: View {
    var body: some View {
        if #available(iOS 16.0, *) {
            NavigationStack { RootView() }
        } else {
            NavigationView { RootView() }
        }
    }
}
```

Sources: Apple's "Migrating to new navigation types" doc (NavigationStack / NavigationSplitView) [1], plus examples/discussion of the iOS 16 navigation API and deprecation guidance. [2][3]
References
[1] Apple Developer Documentation — Migrating to new navigation types (developer.apple.com)
[2] Shipbook Blog — SwiftUI Navigation in iOS 16 (blog.shipbook.io)
[3] Wizlab Systems — Mastering the new iOS 16 Navigation API in SwiftUI (wizlabsystems.com)
Citations:
- 1: https://developer.apple.com/documentation/swiftui/migrating-to-new-navigation-types?utm_source=openai
- 2: https://blog.shipbook.io/swiftui-navigation?utm_source=openai
- 3: https://wizlabsystems.com/2022/08/03/mastering-the-new-ios-16-navigation-api-in-swiftui/?utm_source=openai
NavigationView is deprecated since iOS 16 — replace all three instances with NavigationStack.
Apple deprecated NavigationView in favor of NavigationStack for apps targeting iOS 16 and later. The codebase uses it in three locations: iOSView (line 156), LoRAScaleSheetView (line 559), and LoRAManagementSheetView (line 633). Since none of these views use NavigationLink, the replacement is a straightforward rename; .navigationBarTitleDisplayMode(.inline) and .navigationTitle() continue to work as-is with NavigationStack.
♻️ Proposed fix (all three instances)

```diff
-NavigationView {
+NavigationStack {
     ZStack { ... }
         .navigationTitle(hasModelSelected ? "Chat" : "")
         .navigationBarTitleDisplayMode(.inline)
 }
```

```diff
-NavigationView {
+NavigationStack {
     VStack(spacing: 24) { ... }
         .navigationTitle("Load LoRA Adapter")
         .navigationBarTitleDisplayMode(.inline)
 }
```

```diff
-NavigationView {
+NavigationStack {
     List { ... }
         .navigationTitle("LoRA Adapters")
         .navigationBarTitleDisplayMode(.inline)
         .toolbar { ... }
 }
```

As per coding guidelines: "Use the latest Swift 6 APIs always."
Also applies to: 156-156 and 633-633
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift`
at line 559, Replace the deprecated NavigationView calls with NavigationStack in
the three views that currently use it: iOSView, LoRAScaleSheetView, and
LoRAManagementSheetView (the occurrences around the functions named iOSView,
LoRAScaleSheetView, and LoRAManagementSheetView). Change the container
initializer from NavigationView { ... } to NavigationStack { ... } (no other API
changes required since these views do not use NavigationLink); keep
.navigationBarTitleDisplayMode(.inline) and .navigationTitle(...) as-is and
ensure SwiftUI is imported.
```cpp
if (context_) {
    llama_free(context_);
    context_ = nullptr;
}

// Create new context (adapters are now visible to it)
llama_context_params ctx_params = llama_context_default_params();
ctx_params.n_ctx = context_size_;
ctx_params.n_batch = context_size_;
ctx_params.n_ubatch = context_size_;
ctx_params.n_threads = backend_->get_num_threads();
ctx_params.n_threads_batch = backend_->get_num_threads();
ctx_params.no_perf = true;

context_ = llama_init_from_model(model_, ctx_params);
if (!context_) {
    LOGE("Failed to recreate context after LoRA adapter load");
    return false;
}
```
recreate_context() failure leaves the model permanently unusable without an explicit error state.
After llama_free(context_) (line 801) frees the old context, if llama_init_from_model fails (line 814), the method returns false with context_ = nullptr and sampler_ = nullptr, but model_loaded_ remains true. is_ready() returns false, so generation calls fail gracefully, but there is no path to recover without a full model unload+reload cycle. Callers of load_lora_adapter receive false but have no indication that the model itself is now dead.
At minimum, set model_loaded_ = false on failure so callers and the component layer can surface a clean "model not loaded" error rather than silently degrading.
🐛 Proposed fix
```diff
 context_ = llama_init_from_model(model_, ctx_params);
 if (!context_) {
     LOGE("Failed to recreate context after LoRA adapter load");
+    model_loaded_ = false;
     return false;
 }
```
}📝 Committable suggestion
```cpp
if (context_) {
    llama_free(context_);
    context_ = nullptr;
}

// Create new context (adapters are now visible to it)
llama_context_params ctx_params = llama_context_default_params();
ctx_params.n_ctx = context_size_;
ctx_params.n_batch = context_size_;
ctx_params.n_ubatch = context_size_;
ctx_params.n_threads = backend_->get_num_threads();
ctx_params.n_threads_batch = backend_->get_num_threads();
ctx_params.no_perf = true;

context_ = llama_init_from_model(model_, ctx_params);
if (!context_) {
    LOGE("Failed to recreate context after LoRA adapter load");
    model_loaded_ = false;
    return false;
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp` around
lines 800 - 818, recreate_context() currently frees the old context and returns
false if llama_init_from_model(...) fails but leaves model_loaded_ true; update
the failure path after context_ becomes nullptr to set model_loaded_ = false
(and ensure sampler_ is cleared if applicable) and log the state change so
callers know the model is no longer loaded; locate the block around
llama_free(context_), llama_init_from_model(model_, ctx_params) and the
subsequent if (!context_) check to apply this change.
```cpp
    }

    // Clear KV cache after adapter changes
    llama_memory_clear(llama_get_memory(context_), true);

    LOGI("LoRA adapter loaded and applied: %s (%zu total adapters)",
         adapter_path.c_str(), lora_adapters_.size());
    return true;
}
```
Missing null check for llama_get_memory() before llama_memory_clear() — potential null-pointer dereference.
Lines 891, 926, and 941 all call `llama_memory_clear(llama_get_memory(context_), true)` without checking the return of `llama_get_memory`. The established pattern in this same file (lines 566–569 and 750–752) always guards this call:

```cpp
// Existing safe pattern (lines 566–569):
llama_memory_t mem = llama_get_memory(context_);
if (mem) {
    llama_memory_clear(mem, true);
}
```

All three new call sites skip this guard and will dereference null if `llama_get_memory` returns nullptr (e.g., for contexts that have no KV cache).
🐛 Proposed fix (apply at all three sites: lines 891, 926, 941)

```diff
-llama_memory_clear(llama_get_memory(context_), true);
+if (llama_memory_t mem = llama_get_memory(context_)) {
+    llama_memory_clear(mem, true);
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp` around
lines 888 - 896, The calls to llama_memory_clear(llama_get_memory(context_),
true) can dereference null; before each such call (where llama_get_memory is
used: the llama_memory_clear invocations after adapter changes and the two other
new sites) retrieve llama_memory_t mem = llama_get_memory(context_) and only
call llama_memory_clear(mem, true) if mem is non-null; update the sites that
currently call llama_memory_clear(llama_get_memory(context_), true) to follow
the existing safe pattern used elsewhere in this file.
```c
/**
 * Load a LoRA adapter from a GGUF file and apply it.
 *
 * The adapter is loaded against the current model and applied to the context.
 * Context is recreated internally to accommodate the new adapter.
 * KV cache is cleared automatically.
 *
 * @param handle Service handle (from rac_llm_llamacpp_create)
 * @param adapter_path Path to the LoRA adapter GGUF file
 * @param scale Adapter scale factor (0.0-2.0, default 1.0; lower values recommended for F16 adapters on quantized models)
 * @return RAC_SUCCESS or error code
 */
RAC_LLAMACPP_API rac_result_t rac_llm_llamacpp_load_lora(rac_handle_t handle,
                                                         const char* adapter_path,
                                                         float scale);
```
Two known behavioral risks are undocumented in rac_llm_llamacpp_load_lora's doc block.
Per the PR description, there are two material facts callers need that are missing from the header:

1. Partial-apply failure leaves the context in an inconsistent state. If the load fails mid-way (e.g., the second adapter in a batch can't be applied), adapters that were already applied remain active. Callers cannot safely retry without calling `rac_llm_llamacpp_clear_lora` first.
2. Adapter lifetime is tied to the model. Per llama.cpp b8011, `llama_adapter_lora_free` is deprecated and adapters are freed automatically when the model is freed. The header should make clear that callers must not attempt to free the adapter object independently and should use `rac_llm_llamacpp_remove_lora` / `rac_llm_llamacpp_clear_lora` for logical removal instead.

Additionally, the @param scale description says "default 1.0", which is misleading: there is no API-level default, since the caller must always supply this argument. "recommended: 1.0" is more accurate.
📝 Suggested doc additions
```diff
 /**
  * Load a LoRA adapter from a GGUF file and apply it.
  *
  * The adapter is loaded against the current model and applied to the context.
  * Context is recreated internally to accommodate the new adapter.
  * KV cache is cleared automatically.
  *
+ * @note On partial failure (e.g., adapter incompatible mid-apply), previously
+ *       applied adapters from this call may remain active, leaving the context
+ *       in an inconsistent state. Call rac_llm_llamacpp_clear_lora before retrying.
+ *
+ * @note Adapter memory is owned by the model and freed automatically when the
+ *       model is unloaded (llama.cpp b8011+). Do not attempt to free adapter
+ *       handles independently; use remove_lora/clear_lora for logical removal.
+ *
  * @param handle Service handle (from rac_llm_llamacpp_create)
  * @param adapter_path Path to the LoRA adapter GGUF file
- * @param scale Adapter scale factor (0.0-2.0, default 1.0; lower values recommended for F16 adapters on quantized models)
+ * @param scale Adapter scale factor (0.0-2.0, recommended: 1.0; lower values
+ *              recommended for F16 adapters on quantized models)
  * @return RAC_SUCCESS or error code
 */
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-swift/Sources/LlamaCPPRuntime/include/rac_llm_llamacpp.h`
around lines 195 - 209, Update the rac_llm_llamacpp_load_lora docblock to (1)
document that a partial-apply failure can leave already-applied adapters active
and callers must call rac_llm_llamacpp_clear_lora before retrying, (2) state
that adapter lifetime is tied to the model (do not free adapter objects
manually; use rac_llm_llamacpp_remove_lora / rac_llm_llamacpp_clear_lora for
logical removal) and reference the llama.cpp behavior, and (3) change the `@param`
scale text from "default 1.0" to clarify there is no API default (e.g.,
"recommended: 1.0"). Ensure these notes appear in the comment for
rac_llm_llamacpp_load_lora so callers see the behavioral risks and correct
cleanup functions.
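The retry discipline the suggested `@note` describes can be sketched from the caller's side. Only the `rac_llm_llamacpp_*` names come from the header; everything below (the `fake_*` functions and the service struct) is a hypothetical stand-in that simulates the partial-apply behavior:

```c
#include <assert.h>

/* Hypothetical stand-ins for the real C API, for illustration only. */
typedef int rac_result_t;
#define RAC_SUCCESS 0
#define RAC_ERROR_INVALID_STATE (-1)

typedef struct {
    int applied_count;  /* adapters currently applied to the context */
    int fail_next;      /* force the next load to fail mid-apply */
} fake_service_t;

/* Simulates rac_llm_llamacpp_load_lora: a mid-apply failure still
 * leaves the partially applied adapter active, as the note warns. */
static rac_result_t fake_load_lora(fake_service_t *svc, const char *path, float scale) {
    (void)path; (void)scale;
    svc->applied_count++;
    if (svc->fail_next) {
        svc->fail_next = 0;
        return RAC_ERROR_INVALID_STATE;  /* adapter left active */
    }
    return RAC_SUCCESS;
}

/* Simulates rac_llm_llamacpp_clear_lora: logical removal of all adapters. */
static rac_result_t fake_clear_lora(fake_service_t *svc) {
    svc->applied_count = 0;
    return RAC_SUCCESS;
}

/* Caller-side pattern: on failure, clear adapters to reset the
 * inconsistent state before retrying, per the docblock guidance. */
static rac_result_t load_lora_with_retry(fake_service_t *svc, const char *path, float scale) {
    rac_result_t rc = fake_load_lora(svc, path, scale);
    if (rc != RAC_SUCCESS) {
        fake_clear_lora(svc);
        rc = fake_load_lora(svc, path, scale);
    }
    return rc;
}
```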
```swift
public func loadLoraAdapter(_ config: LoRAAdapterConfig) throws {
    let handle = try getHandle()
    let result = config.path.withCString { pathPtr in
        rac_llm_component_load_lora(handle, pathPtr, config.scale)
    }
    guard result == RAC_SUCCESS else {
        throw SDKError.llm(.modelLoadFailed, "Failed to load LoRA adapter: \(result)")
    }
    logger.info("LoRA adapter loaded: \(config.path) (scale=\(config.scale))")
}
```
Wrong error code — use .invalidState instead of .modelLoadFailed for LoRA load failures.
.modelLoadFailed implies the LLM model itself failed to load, not a LoRA adapter. All other LoRA methods in this same block correctly use .invalidState. This mismatch makes error handling at the call site ambiguous.
🐛 Proposed fix
```diff
-    throw SDKError.llm(.modelLoadFailed, "Failed to load LoRA adapter: \(result)")
+    throw SDKError.llm(.invalidState, "Failed to load LoRA adapter: \(result)")
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```swift
public func loadLoraAdapter(_ config: LoRAAdapterConfig) throws {
    let handle = try getHandle()
    let result = config.path.withCString { pathPtr in
        rac_llm_component_load_lora(handle, pathPtr, config.scale)
    }
    guard result == RAC_SUCCESS else {
        throw SDKError.llm(.invalidState, "Failed to load LoRA adapter: \(result)")
    }
    logger.info("LoRA adapter loaded: \(config.path) (scale=\(config.scale))")
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+LLM.swift`
around lines 94-103, in loadLoraAdapter the thrown
SDKError uses .modelLoadFailed but LoRA adapter failures should use
.invalidState; update the throw in public func loadLoraAdapter(_ config:
LoRAAdapterConfig) to throw SDKError.llm(.invalidState, "...") with the same
error message, matching the other LoRA methods (e.g., the companion LoRA
load/unload methods) for consistent error handling.
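The distinction argued above can be made concrete with a small error-mapping sketch. The category names mirror the Swift `SDKError.llm` cases, but the C enums here are hypothetical illustrations, not part of the SDK:

```c
#include <assert.h>

/* Hypothetical categories mirroring the SDKError.llm cases discussed above. */
typedef enum {
    ERR_MODEL_LOAD_FAILED,  /* the base LLM model itself failed to load */
    ERR_INVALID_STATE       /* an operation on an already-loaded model failed */
} llm_error_kind_t;

typedef enum {
    OP_LOAD_MODEL,
    OP_LOAD_LORA,
    OP_REMOVE_LORA,
    OP_CLEAR_LORA
} llm_op_t;

/* Adapter operations act on an already-loaded model, so their failures
 * are state errors; only the model load itself is a model-load error.
 * Keeping this mapping consistent makes call-site handling unambiguous. */
static llm_error_kind_t classify_failure(llm_op_t op) {
    return (op == OP_LOAD_MODEL) ? ERR_MODEL_LOAD_FAILED : ERR_INVALID_STATE;
}
```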
```swift
public func getLoadedLoraAdapters() throws -> [LoRAAdapterInfo] {
    guard let handle = handle else { return [] }
    var jsonPtr: UnsafeMutablePointer<CChar>?
    let result = rac_llm_component_get_lora_info(handle, &jsonPtr)
    guard result == RAC_SUCCESS, let ptr = jsonPtr else {
        return []
    }
    defer { rac_free(ptr) }

    let jsonString = String(cString: ptr)
    guard let data = jsonString.data(using: .utf8),
          let array = try? JSONSerialization.jsonObject(with: data) as? [[String: Any]] else {
        logger.error("Failed to parse LoRA info JSON")
        return []
    }

    return array.compactMap { dict in
        guard let path = dict["path"] as? String,
              let scale = dict["scale"] as? Double,
              let applied = dict["applied"] as? Bool else { return nil }
        return LoRAAdapterInfo(path: path, scale: Float(scale), applied: applied)
    }
}
```
getLoadedLoraAdapters is declared throws but never actually throws — and risks a memory leak on a non-success path.
Spurious throws: Every error path returns [] instead of throwing. The throws annotation is misleading and forces callers to use try unnecessarily. Removing it makes the contract accurate (downstream effect: RunAnywhere+LoRA.swift line 57 can drop try).
Memory leak: If the C API ever sets jsonPtr non-nil on a non-success result, the guard exits before defer { rac_free(ptr) } is registered, leaking the allocation. Capture jsonPtr before the guard to ensure cleanup.
♻️ Proposed fix
```diff
-public func getLoadedLoraAdapters() throws -> [LoRAAdapterInfo] {
+public func getLoadedLoraAdapters() -> [LoRAAdapterInfo] {
     guard let handle = handle else { return [] }
     var jsonPtr: UnsafeMutablePointer<CChar>?
     let result = rac_llm_component_get_lora_info(handle, &jsonPtr)
-    guard result == RAC_SUCCESS, let ptr = jsonPtr else {
-        return []
-    }
-    defer { rac_free(ptr) }
+    defer { if let p = jsonPtr { rac_free(p) } }
+    guard result == RAC_SUCCESS, let ptr = jsonPtr else { return [] }
     let jsonString = String(cString: ptr)
     guard let data = jsonString.data(using: .utf8),
           let array = try? JSONSerialization.jsonObject(with: data) as? [[String: Any]] else {
         logger.error("Failed to parse LoRA info JSON")
         return []
     }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+LLM.swift`
around lines 132-154, remove the spurious throws from getLoadedLoraAdapters
(so its signature is non-throwing) and ensure any C-allocated jsonPtr is always
freed: call rac_llm_component_get_lora_info(handle, &jsonPtr), capture jsonPtr
into a local temp before checking result, and call rac_free on that temp on all
exit paths (including the non-success branch) to avoid leaks; also update
callers (e.g., the code that currently calls try getLoadedLoraAdapters in
RunAnywhere+LoRA.swift) to drop the unnecessary try. Ensure parsing and
LoRAAdapterInfo construction remain the same but without a throws declaration.
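The leak concern translates directly to C: capture the out-pointer and free it on every exit path, even when the call reports failure. The getter below is a hypothetical stand-in for `rac_llm_component_get_lora_info` that deliberately sets the out-pointer on failure, modeling the edge case the review warns about:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: even on failure it sets *out non-NULL,
 * the edge case that would leak under a guard-before-defer layout. */
static int fake_get_lora_info(int fail, char **out) {
    const char *s = fail ? "error detail" : "[{\"path\":\"a.gguf\"}]";
    *out = malloc(strlen(s) + 1);
    if (*out) strcpy(*out, s);
    return fail ? -1 : 0;
}

/* Returns the number of adapters parsed (0 on any failure). The buffer
 * is freed on all paths, including the non-success branch, so nothing
 * leaks regardless of what the getter did with the out-pointer. */
static int count_adapters(int fail) {
    char *json = NULL;
    int rc = fake_get_lora_info(fail, &json);
    int count = 0;
    if (rc == 0 && json != NULL && json[0] == '[') {
        /* toy "parse": count object openers */
        for (const char *p = json; *p; p++)
            if (*p == '{') count++;
    }
    free(json);  /* free(NULL) is a no-op, so this is safe on every path */
    return count;
}
```

This mirrors the `defer { if let p = jsonPtr { rac_free(p) } }` shape of the proposed Swift fix: cleanup is registered before any result check.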
```swift
static func getLoadedLoraAdapters() async throws -> [LoRAAdapterInfo] {
    guard isInitialized else {
        throw SDKError.general(.notInitialized, "SDK not initialized")
    }
    return try await CppBridge.LLM.shared.getLoadedLoraAdapters()
}
```
Downstream: try on getLoadedLoraAdapters() bridge call is currently superfluous.
CppBridge.LLM.shared.getLoadedLoraAdapters() never actually throws (it only returns [] on all error paths, as flagged in CppBridge+LLM.swift). Once the bridge method's throws is removed, the try here can also be dropped, though the public method itself still correctly throws to cover the isInitialized guard.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+LoRA.swift`
around lines 53-58, the bridge call
CppBridge.LLM.shared.getLoadedLoraAdapters() no longer throws, so remove the
superfluous `try` in RunAnywhere+LoRA.swift while keeping the public method
signature (static func getLoadedLoraAdapters() async throws ->
[LoRAAdapterInfo]) as-is to preserve the isInitialized guard error; update the
return line to `return await CppBridge.LLM.shared.getLoadedLoraAdapters()` (no
`try`) and run tests to ensure compilation succeeds after the bridge method's
throws removal.
Okay @shubhammalhotra28, I am merging this PR.
feat(lora): add LoRA adapter support across SDK + demo app
Implement LoRA (Low-Rank Adaptation) adapter hot-swapping for llama.cpp
backend across all 6 SDK layers (C++ -> C API -> Component -> JNI ->
Kotlin Bridge -> Kotlin Public API).
feat(ci): add selectable build targets to Build All workflow + fix Swift concurrency errors
Rewrite build-all-test.yml with 9 boolean checkbox inputs so each build
target can be toggled independently from the GitHub Actions UI:
Fix two Swift strict-concurrency errors that fail the Swift SDK build:
fix(swift): resolve strict concurrency errors in LiveTranscriptionSession and VLM streaming
LiveTranscriptionSession.swift:
- Replace `[weak self]` captures with strong `let session = self` before closures to avoid a captured var in `@Sendable`/`@task` contexts (class is `@MainActor` `@unchecked Sendable` so the strong ref is safe, bounded by stream lifecycle)

RunAnywhere+VisionLanguage.swift:

- Capture `let capturedCImage = cImage` before the AsyncThrowingStream closure so the Task captures an immutable `let` instead of a mutable `var`
- Add `extension rac_vlm_image_t: @unchecked Sendable {}` so the C struct can cross Task concurrency boundaries safely

fix(jni): address CodeRabbit review findings in LoRA JNI functions
Description
Brief description of the changes made.
Type of Change
Testing
Platform-Specific Testing (check all that apply)
Swift SDK / iOS Sample:
Kotlin SDK / Android Sample:
Flutter SDK / Flutter Sample:
React Native SDK / React Native Sample:
Playground:
Web SDK / Web Sample:
Labels
Please add the appropriate label(s):
SDKs:
- `Swift SDK` - Changes to Swift SDK (sdk/runanywhere-swift)
- `Kotlin SDK` - Changes to Kotlin SDK (sdk/runanywhere-kotlin)
- `Flutter SDK` - Changes to Flutter SDK (sdk/runanywhere-flutter)
- `React Native SDK` - Changes to React Native SDK (sdk/runanywhere-react-native)
- `Web SDK` - Changes to Web SDK (sdk/runanywhere-web)
- `Commons` - Changes to shared native code (sdk/runanywhere-commons)

Sample Apps:

- `iOS Sample` - Changes to iOS example app (examples/ios)
- `Android Sample` - Changes to Android example app (examples/android)
- `Flutter Sample` - Changes to Flutter example app (examples/flutter)
- `React Native Sample` - Changes to React Native example app (examples/react-native)
- `Web Sample` - Changes to Web example app (examples/web)

Checklist
Screenshots
Attach relevant UI screenshots for changes (if applicable):
Important
Adds LoRA adapter support across SDK layers, updates build workflow for selective targets, and fixes Swift concurrency issues.
- Add JNI bindings for LoRA operations in `runanywhere_commons_jni.cpp`.
- Add `LoRAAdapterConfig` and `LoRAAdapterInfo` data classes in `LLMTypes.kt`.
- Extend `CppBridgeLLM` with methods for LoRA adapter management.
- Rewrite `build-all-test.yml` to allow selective build targets via GitHub Actions UI.
- Add Android demo app `RunAnyWhereLora` with Material 3 UI.
- Fix strict-concurrency errors in `LiveTranscriptionSession.swift` and `RunAnywhere+VisionLanguage.swift`.
- Add `@unchecked Sendable` conformance to `rac_vlm_image_t`.
- Update `settings.gradle.kts` to include the new Android demo app.

This description was created by
for 4cfd510. You can customize this summary. It will automatically update as commits are pushed.
Summary by CodeRabbit
New Features
Chores
Greptile Summary
This PR adds LoRA (Low-Rank Adaptation) adapter hot-swapping support across the full SDK stack — from C++ backend through C API, component layer, JNI, Kotlin bridge, and public Kotlin API — along with a new Android demo app (`RunAnyWhereLora`) and a rewritten CI workflow with selectable build targets. It also fixes Swift strict-concurrency errors in `LiveTranscriptionSession` and `RunAnywhere+VisionLanguage`.

- Vtable dispatch in the component layer decouples `librac_commons` from `librac_backend_llamacpp`
- Backend (`llamacpp_backend.cpp`): context recreation + adapter re-application on load; KV cache clearing on adapter changes. Error recovery on partial `apply_lora_adapters` failure leaves the system in an inconsistent state (see comment)
- Kotlin (`RunAnywhere+LoRA.kt`): expect/actual pattern with `LoRAAdapterConfig`/`LoRAAdapterInfo` data classes in `commonMain`
- Swift: `let capturedCImage` instead of mutable `var cImage` for Task boundary safety; adds `@unchecked Sendable` conformance; simplifies `StreamingCollector` initialization
- CI (`build-all-test.yml`): rewritten with 9 boolean checkbox inputs for selective build targets (C++ Android/iOS, Kotlin, Swift, Web, Flutter, React Native, Android apps, IntelliJ)
- `RunAnyWhereLora` Android app with Material 3 UI, file picker for models/adapters, scale slider, and streaming Q&A — though `runBlocking` inside a coroutine in `LoraApplication.kt` should be addressed

Confidence Score: 3/5
`sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp` (LoRA error recovery) and `examples/android/RunAnyWhereLora/app/src/main/java/com/runanywhere/run_anywhere_lora/LoraApplication.kt` (`runBlocking` inside coroutine).

Important Files Changed
- `RunAnywhere+VisionLanguage.swift`: uses `capturedCImage` instead of mutable `cImage`, adds `@unchecked Sendable` for the C struct, simplifies StreamingCollector initialization. Correct fixes for strict concurrency.
- `LiveTranscriptionSession.swift`: replaces `[weak self]` with a strong `let session = self` capture, adds `@unchecked Sendable`. Wraps deprecated API call in a helper. The strong reference capture is safe given the `@MainActor` and session lifecycle constraints.
```mermaid
sequenceDiagram
    participant App as Kotlin App
    participant API as RunAnywhere+LoRA
    participant Bridge as CppBridgeLLM
    participant JNI as JNI Layer
    participant Comp as llm_component.cpp
    participant VTable as rac_llm_service_ops
    participant Backend as llamacpp_backend.cpp
    App->>API: loadLoraAdapter(config)
    API->>Bridge: loadLoraAdapter(path, scale)
    Bridge->>JNI: racLlmComponentLoadLora(handle, path, scale)
    JNI->>Comp: rac_llm_component_load_lora()
    Comp->>VTable: ops->load_lora(impl, path, scale)
    VTable->>Backend: load_lora_adapter(path, scale)
    Backend->>Backend: llama_adapter_lora_init(model)
    Backend->>Backend: recreate_context()
    Backend->>Backend: apply_lora_adapters()
    Backend->>Backend: llama_memory_clear()
    Backend-->>VTable: RAC_SUCCESS
    VTable-->>Comp: RAC_SUCCESS
    Comp-->>JNI: RAC_SUCCESS
    JNI-->>Bridge: 0
    Bridge-->>API: 0
    API-->>App: success
```

Last reviewed commit: 4cfd510
Context used:
dashboard - CLAUDE.md (source)