Add memory layer with vector similarity search and RAG pipeline #337
vedantagarwal-web wants to merge 2 commits into RunanywhereAI:main from
Conversation
Replace the full production RunAnywhere app with the clean kotlin-starter-example that matches the blog tutorial series. Uses io.github.sanchitmonga22 Maven Central SDK coordinates. Add kotlin-starter-app entry to Playground README.
Implements a full memory/RAG layer using hnswlib for approximate nearest neighbor search and a flat brute-force backend for small indices. The C++ core provides a vtable-based service abstraction, and the Swift SDK wraps it with a high-level remember/recall/forget API and a RAGAgent that composes memory retrieval with LLM generation.

C++ core:
- Flat backend (exact brute-force, <10K vectors)
- HNSW backend (hnswlib v0.8.0, approximate, scalable)
- Vtable dispatch, module/service registration
- Persistence (.racm format) with save/load
- Embedding extraction via llama.cpp
- 10/10 standalone tests passing

Swift SDK:
- CppBridge.Memory actor (C FFI bridge)
- RAGMemoryService (remember/recall/forget)
- RAGAgent (full RAG: retrieve + augment + generate)
- EmbeddingProvider protocol + LlamaCpp implementation
- MemoryConfiguration, MemoryModule, MemoryTypes

Performance: 0.10ms/query at 10K vectors (128-dim, HNSW)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 Walkthrough

This PR introduces a comprehensive memory/vector search subsystem across the RunAnywhere SDK, adds a new Kotlin Android starter app demonstrating LLM chat, speech-to-text, text-to-speech, and voice pipeline capabilities, and extends iOS test infrastructure with memory testing components.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: 2 passed, 1 failed (warning)
```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>
#include <vector>
```
Missing header include
std::priority_queue is used in this file but <queue> is never included, which will fail compilation on standard toolchains.
```suggestion
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>
#include <vector>
#include <queue>
```
Path: sdk/runanywhere-commons/src/backends/memory/memory_backend_hnswlib.cpp, line 12:22
```cpp
// Allocate output buffer (caller frees with rac_free)
float* result = static_cast<float*>(malloc(embeddings.size() * sizeof(float)));
if (!result) {
    return RAC_ERROR_OUT_OF_MEMORY;
}

memcpy(result, embeddings.data(), embeddings.size() * sizeof(float));
*out_embeddings = result;
*out_dimension = static_cast<uint32_t>(embeddings.size());
```
Allocator mismatch for embeddings
rac_llm_llamacpp_get_embeddings allocates the output buffer with malloc, but the API contract (and Swift caller) frees it via rac_free. If rac_free is ever swapped to a different allocator (as intended by the wrapper), this becomes a real crash/corruption risk. Allocate with rac_alloc (or at minimum ensure the free function matches the allocator used).
Path: sdk/runanywhere-commons/src/backends/llamacpp/rac_llm_llamacpp.cpp, line 282:290
```cpp
// Get dimension from stats
rac_memory_stats_t stats = {};
if (service->ops->get_stats) {
    service->ops->get_stats(service->impl, &stats);
}

return service->ops->add(service->impl, vectors, ids, metadata, count, stats.dimension);
```
Dimension can be zero
rac_memory_add (and similarly rac_memory_search) derives dimension from get_stats but ignores the return value; if get_stats fails or isn't implemented correctly, stats.dimension stays 0 and every backend will return RAC_ERROR_MEMORY_DIMENSION_MISMATCH even though the handle is valid. This makes the public API brittle (operations can start failing after an internal stats error). At minimum, check get_stats’s result and treat failure as an error instead of passing dimension=0 into the backend.
Path: sdk/runanywhere-commons/src/features/memory/rac_memory_service.cpp, line 72:78
```swift
guard result == RAC_SUCCESS, let emb = embeddings else {
    throw SDKError.memory(.processingFailed,
                          "Failed to extract embeddings: \(result)")
}

defer { rac_free(emb) }

return Array(UnsafeBufferPointer(start: emb, count: Int(outDim)))
```
Embedding dimension not validated
LlamaCppEmbeddingProvider stores an expected dimension, but embed() returns outDim without checking it matches _dimension. If the loaded model’s embedding size differs (or changes after reload), the Swift memory bridge will try to add/search with mismatched dimensions and fail at runtime with RAC_ERROR_MEMORY_DIMENSION_MISMATCH. Validate outDim == _dimension and surface a clear error when it doesn’t match.
Path: sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/EmbeddingProvider.swift, line 60:67
```swift
/// Add vectors with IDs and optional metadata
public func add(vectors: [[Float]], ids: [UInt64], metadata: [String?]? = nil) throws {
    let handle = try getHandle()
    let count = UInt32(vectors.count)
    guard count > 0 else { return }

    // Flatten vectors into contiguous array
    let flat = vectors.flatMap { $0 }
```
Vectors/IDs length mismatch
add(vectors:ids:metadata:) computes count from vectors.count but never verifies ids.count == vectors.count (and same for metadata?.count). If these arrays differ, the bridge will read past the end of ids/metadata buffers when calling rac_memory_add, causing memory-safety issues (crash/corruption). Add explicit count checks before flattening/calling into C.
Path: sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+Memory.swift, line 68:76
Actionable comments posted: 17
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🤖 Fix all issues with AI agents
In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/services/ModelService.kt`:
- Around line 288-299: unloadAllModels currently wraps
RunAnywhere.unloadLLMModel, unloadSTTModel, and unloadTTSVoice in one try/catch
so a thrown exception stops subsequent unloads and skips refreshModelState;
change it so each unload call (RunAnywhere.unloadLLMModel(),
RunAnywhere.unloadSTTModel(), RunAnywhere.unloadTTSVoice()) is executed in its
own try/catch (or use runCatching) to log/collect individual errors without
aborting the rest, and ensure refreshModelState() is invoked after all attempts
(e.g., in a finally or after all runCatching blocks) so the UI reflects the
final state.
In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/SpeechToTextScreen.kt`:
- Around line 80-114: The background read Thread started in the recording method
isn’t tracked or joined, so stopRecording() sets isRecording=false and
immediately releases audioRecord and returns audioData while the thread may
still be in audioRecord?.read(...); fix by storing the Thread reference (e.g.,
readingThread) when starting it, then in stopRecording() after setting
isRecording = false call readingThread.join(<short timeout>) and handle
InterruptedException before stopping/releasing audioRecord and returning
audioData to ensure the reader thread has finished accessing audioRecord and
audioData.
In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/TextToSpeechScreen.kt`:
- Around line 85-111: The AudioTrack created with AudioTrack.Builder (assigned
to audioTrack) can leak if the coroutine is cancelled during the delay: ensure
the play/cleanup sequence is wrapped in a try/finally so stop() and release()
always run; specifically move audioTrack.write(...) and audioTrack.play() into a
try block and call audioTrack.stop() and audioTrack.release() in the finally
block, and apply the same change to the playWavAudio implementation in
VoicePipelineScreen.kt to guarantee cleanup on CancellationException.
In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/VoicePipelineScreen.kt`:
- Around line 326-335: When handling VoiceSessionEvent.TurnCompleted in the Flow
collector, don't call the suspending playWavAudio(audio) inline because it
blocks collection; instead launch playback in a separate coroutine (e.g.,
CoroutineScope(Dispatchers.IO).launch or viewModelScope.launch) so the collector
can continue receiving events. Set sessionState = VoiceSessionState.SPEAKING
before launching the coroutine, call playWavAudio(audio) inside that coroutine,
and after playback update sessionState = VoiceSessionState.LISTENING and
audioLevel = 0f inside the coroutine to restore UI state without blocking the
Flow collector. Ensure you reference the same symbols:
VoiceSessionEvent.TurnCompleted, playWavAudio, sessionState, and audioLevel.
In `@sdk/runanywhere-commons/include/rac/core/rac_component_types.h`:
- Line 38: Add a switch case for RAC_COMPONENT_MEMORY in component_types.cpp
inside rac_sdk_component_display_name, rac_sdk_component_raw_value, and
rac_component_to_resource_type: return the human label "Memory" from
rac_sdk_component_display_name, the raw identifier "memory" from
rac_sdk_component_raw_value, and map to the same resource-type enum/constant
used for memory in other components (do not return -1)—use the existing
resource-type symbol (e.g., RESOURCE_TYPE_MEMORY or the project’s equivalent) to
keep the pattern consistent with other cases.
In `@sdk/runanywhere-commons/include/rac/core/rac_error.h`:
- Around line 386-395: Add switch cases for the five new memory error macros
inside the rac_error_message() switch (the "OTHER ERRORS (-800 to -899)"
section): handle RAC_ERROR_MEMORY_INDEX_FULL,
RAC_ERROR_MEMORY_DIMENSION_MISMATCH, RAC_ERROR_MEMORY_INDEX_NOT_FOUND,
RAC_ERROR_MEMORY_INVALID_CONFIG, and RAC_ERROR_MEMORY_CORRUPT_INDEX by returning
descriptive string literals (e.g., "Memory index is full (max_elements
reached)", "Vector dimension does not match index dimension", "Memory index not
found at specified path", "Invalid memory index configuration", "Memory index
file is corrupt") so these codes no longer fall through to "Unknown error code".
In `@sdk/runanywhere-commons/src/backends/memory/CMakeLists.txt`:
- Around line 1-46: The CMake file unconditionally fetches hnswlib and builds
the memory backend which violates the RAC_BUILD_* gating guideline; add an
option named RAC_BUILD_MEMORY and early-return if it's OFF, then wrap the
FetchContent_Declare(hnswlib) and the creation of target rac_backend_memory (and
its
target_include_directories/target_link_libraries/target_compile_features/platform-specific
blocks) so they only run when RAC_BUILD_MEMORY is ON; ensure the option text
matches "Build the memory/vector search backend" and that all references to
hnswlib_SOURCE_DIR and target rac_backend_memory are only configured inside that
guarded block.
In `@sdk/runanywhere-commons/src/backends/memory/memory_backend_flat.cpp`:
- Around line 281-296: The save routine currently writes metadata with
fprintf("%llu\t%s\n") which breaks when metadata contains newlines; change the
format to a binary length-prefixed write: for each id in index->ids iterate the
metadata via index->metadata and write id (same type) then a uint32_t length
followed by the raw metadata bytes (use fwrite) instead of the tab/newline text
format, and update the corresponding load code that currently uses fgets (around
the fgets call referenced) to read the id, then read the uint32_t length and
then fread exactly that many bytes to reconstruct meta_it->second (unlike
line-based parsing). Ensure the length type and byte-ordering match the existing
binary header handling for vectors/IDs.
- Around line 316-346: In flat_load, several fread calls (reading version,
index_type, dimension, metric, num_vectors and the bulk reads into
index->vectors.data() and index->ids.data()) are unchecked and can leave the
index in a corrupt state; update flat_load to verify each fread returns the
expected count and on any short read fclose(f) and return
RAC_ERROR_MEMORY_CORRUPT_INDEX, ensuring you validate reads for
sizeof(uint32_t)/sizeof(uint64_t) for the header fields and for (num_vectors *
dimension) floats and num_vectors uint64_t for the bulk reads into
index->vectors.data() and index->ids.data(), and only resize/populate
index->id_to_index and index fields after successful reads.
In `@sdk/runanywhere-commons/src/backends/memory/memory_backend_hnswlib.cpp`:
- Around line 394-401: RAC_DISTANCE_COSINE is currently mapped to
hnswlib::L2Space (in rac_memory_hnsw_create and hnsw_load) which yields
incorrect results for non-normalized vectors; change the mapping so
RAC_DISTANCE_COSINE uses hnswlib::InnerProductSpace and then normalize vectors
to unit length on insertion and before searching: update rac_memory_hnsw_create
and hnsw_load to construct InnerProductSpace when config->metric ==
RAC_DISTANCE_COSINE, and modify hnsw_add and hnsw_search to perform L2
normalization of the input vectors (unit-length) before passing them to hnswlib
so cosine similarity is computed correctly via inner product.
- Around line 262-298: The header fields read after the magic (version,
index_type, dimension, metric, num_vectors, ef_search) are not validated, so
truncated files can leave those locals uninitialized and later crash when used
(e.g., creating hnswlib::L2Space(dimension) or HierarchicalNSW). Fix it by
checking the return value of each fread (or fread into a buffer and validate
total bytes read) and if any read returns fewer items than expected, fclose(f)
and return RAC_ERROR_MEMORY_CORRUPT_INDEX; ensure the checks cover version,
index_type, dimension, metric, num_vectors and ef_search before using them. Also
keep the existing index_type check (index_type != RAC_INDEX_HNSW) after these
validated reads.
In `@sdk/runanywhere-commons/src/features/memory/rac_memory_service.cpp`:
- Around line 128-169: In rac_memory_load: validate the return values of the two
fread calls that read version and index_type and return
RAC_ERROR_MEMORY_CORRUPT_INDEX on failure to avoid using uninitialized locals;
after creating the service (rac_memory_backend_create_service) check that
service->ops is non-null and that service->ops->load is set before calling it,
returning an appropriate error (and cleaning up via rac_memory_destroy(handle)
if needed) — mirror the existing dispatch-pattern used elsewhere in this file
(check service->ops and service->ops->load, then call load(service->impl,
path)).
In `@sdk/runanywhere-commons/tests/CMakeLists.txt`:
- Around line 1-4: Wrap the test target creation for test_memory in an
RAC_BACKEND_MEMORY guard so it is only added when the memory backend is enabled:
conditionally add the add_executable(test_memory ...),
target_link_libraries(test_memory PRIVATE rac_commons rac_backend_memory) and
target_include_directories(test_memory ...) inside an if(RAC_BACKEND_MEMORY) ...
endif() block to avoid linking against rac_backend_memory when the backend is
disabled.
In `@sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/MemoryTypes.swift`:
- Around line 30-45: MemorySearchResult and MemoryRecallResult incorrectly
declare Sendable while using metadata: [String: Any]?; change the metadata type
to a Sendable-safe form (e.g., metadata: [String: any Sendable]?) and update the
corresponding init parameters and stored property types in both structs
(MemorySearchResult and MemoryRecallResult) so the structs can legitimately
conform to Sendable under Swift 6.
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/RAGMemoryService.swift`:
- Line 33: The nextId counter resets because nextId (declared in
RAGMemoryService) isn't persisted or recomputed on load, causing ID collisions
when remember() generates new entries after load() or restart; fix by persisting
nextId with the index (e.g., metadata sidecar or dedicated field) during save()
and restoring it in load(), or alternatively implement logic in load() to scan
existing memory IDs (or query backend stats) to compute nextId = maxExistingId +
1 before returning, ensuring remember() uses the updated nextId.
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+Memory.swift`:
- Around line 78-93: Remove the dead preallocation for metadataPointers and
metadataBuffers in CppBridge+Memory.swift: the variables metadataPointers and
metadataBuffers (and the block that builds ptrs/Data from metadata) are never
used because the code later uses strdup-based allocation; delete that entire
if-let metadata block so only the strdup approach remains (search for
metadataPointers, metadataBuffers, and the for meta in metadata loop to locate
the code to remove).
- Around line 69-73: In add(vectors: [[Float]], ids: [UInt64], metadata:
[String?]? = nil) validate that ids.count == vectors.count and if metadata !=
nil then metadata!.count == vectors.count (and not empty for non-zero vectors)
before calling rac_memory_add; if counts mismatch throw an error/return early.
Update the metadata handling so you do not force-unwrap metaBuf.baseAddress!
when metadata is non-nil — only create and pass a metadata buffer when
metadata.count == vectors.count, otherwise pass nil to the C call; reference the
add(...) method, the rac_memory_add invocation, and the metaBuf.baseAddress
usage when making these changes.
🟡 Minor comments (16)
Playground/kotlin-starter-app/.gitignore (1)

82-84: ⚠️ Potential issue | 🟡 Minor

`**/models/` is overly broad and may silently ignore source files. Line 83 already covers the intended `app/src/main/assets/models/` path for downloaded AI models. The `**/models/` glob on line 84 would also match any Kotlin package directory named `models/` (e.g., `com/runanywhere/.../models/`), causing source files to be silently untracked. Consider removing the broad pattern and keeping only the specific path:

Proposed fix:

```diff
 # RunAnywhere downloaded models
 app/src/main/assets/models/
-**/models/
```

Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/components/FeatureCard.kt (1)
55-56: ⚠️ Potential issue | 🟡 Minor

Empty `gradientColors` list will crash at runtime. `Brush.linearGradient(gradientColors)` throws `IllegalArgumentException` if fewer than 2 colors are provided. Consider adding a `require(gradientColors.size >= 2)` precondition or documenting the constraint.

Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/TextToSpeechScreen.kt (1)
48-49: ⚠️ Potential issue | 🟡 Minor

`audioFormat` is parsed but never validated. Line 49 reads the audio format code (1 = PCM) but the value is never checked. Non-PCM WAV files (e.g., compressed formats) would be silently misinterpreted as raw PCM, producing garbled audio. A guard like `if (audioFormat != 1.toShort().toInt()) return@withContext` would make this defensive.

Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/SpeechToTextScreen.kt (1)
if (audioFormat != 1.toShort().toInt()) return@withContextwould make this defensive.Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/SpeechToTextScreen.kt-94-98 (1)
94-98:⚠️ Potential issue | 🟡 MinorSwallowed
SecurityExceptionloses diagnostic context.When
AudioRecordcreation throws aSecurityException, the function returnsfalsebut the exception message is silently discarded. This makes it difficult to diagnose permission-related failures on specific devices.🔧 Proposed fix: log before returning
} catch (e: SecurityException) { + android.util.Log.w("AudioRecorder", "Permission denied for audio recording", e) audioRecord?.release() audioRecord = null return false }Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/HomeScreen.kt-64-109 (1)
64-109: ⚠️ Potential issue | 🟡 Minor

Nested scrollable: `LazyVerticalGrid` inside a `verticalScroll` `Column` is fragile. `LazyVerticalGrid` is a scrollable container placed inside a `verticalScroll` `Column` (line 48). This forces a fixed `.height(700.dp)` (line 68), which defeats lazy loading and breaks on varying screen sizes (clips on small, wastes space on large). Since you only have 4 fixed items in a 2-column grid, replace `LazyVerticalGrid` with a simple `Column` + `Row` layout to avoid nested scrollable issues entirely.

💡 Suggested approach: replace with static grid layout

```diff
-LazyVerticalGrid(
-    columns = GridCells.Fixed(2),
-    horizontalArrangement = Arrangement.spacedBy(16.dp),
-    verticalArrangement = Arrangement.spacedBy(16.dp),
-    modifier = Modifier.height(700.dp)
-) {
-    item {
-        FeatureCard(
-            title = "Chat",
-            subtitle = "LLM Text Generation",
-            icon = Icons.Rounded.Chat,
-            gradientColors = listOf(AccentCyan, Color(0xFF0EA5E9)),
-            onClick = onNavigateToChat
-        )
-    }
-    item {
-        FeatureCard(
-            title = "Speech",
-            subtitle = "Speech to Text",
-            icon = Icons.Rounded.Mic,
-            gradientColors = listOf(AccentViolet, Color(0xFF7C3AED)),
-            onClick = onNavigateToSTT
-        )
-    }
-    item {
-        FeatureCard(
-            title = "Voice",
-            subtitle = "Text to Speech",
-            icon = Icons.Rounded.VolumeUp,
-            gradientColors = listOf(AccentPink, Color(0xFFDB2777)),
-            onClick = onNavigateToTTS
-        )
-    }
-    item {
-        FeatureCard(
-            title = "Pipeline",
-            subtitle = "Voice Agent",
-            icon = Icons.Rounded.AutoAwesome,
-            gradientColors = listOf(AccentGreen, Color(0xFF059669)),
-            onClick = onNavigateToVoicePipeline
-        )
-    }
-}
+Row(horizontalArrangement = Arrangement.spacedBy(16.dp)) {
+    FeatureCard(
+        title = "Chat",
+        subtitle = "LLM Text Generation",
+        icon = Icons.Rounded.Chat,
+        gradientColors = listOf(AccentCyan, Color(0xFF0EA5E9)),
+        onClick = onNavigateToChat,
+        modifier = Modifier.weight(1f)
+    )
+    FeatureCard(
+        title = "Speech",
+        subtitle = "Speech to Text",
+        icon = Icons.Rounded.Mic,
+        gradientColors = listOf(AccentViolet, Color(0xFF7C3AED)),
+        onClick = onNavigateToSTT,
+        modifier = Modifier.weight(1f)
+    )
+}
+Spacer(modifier = Modifier.height(16.dp))
+Row(horizontalArrangement = Arrangement.spacedBy(16.dp)) {
+    FeatureCard(
+        title = "Voice",
+        subtitle = "Text to Speech",
+        icon = Icons.Rounded.VolumeUp,
+        gradientColors = listOf(AccentPink, Color(0xFFDB2777)),
+        onClick = onNavigateToTTS,
+        modifier = Modifier.weight(1f)
+    )
+    FeatureCard(
+        title = "Pipeline",
+        subtitle = "Voice Agent",
+        icon = Icons.Rounded.AutoAwesome,
+        gradientColors = listOf(AccentGreen, Color(0xFF059669)),
+        onClick = onNavigateToVoicePipeline,
+        modifier = Modifier.weight(1f)
+    )
+}
```

Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/services/ModelService.kt (1)
277-283: ⚠️ Potential issue | 🟡 Minor

`downloadAndLoadAllModels` silently clears errors from earlier model failures. Each `downloadAndLoad*Suspend()` starts with `errorMessage = null` (e.g., lines 147, 193, 239). When called sequentially here, a failure in the LLM step sets `errorMessage`, but the subsequent STT step immediately clears it. The user loses visibility into the first failure. Consider collecting errors rather than overwriting, or short-circuiting on failure:

Option A: stop on first failure

```diff
 fun downloadAndLoadAllModels() {
     viewModelScope.launch {
-        if (!isLLMLoaded) downloadAndLoadLLMSuspend()
-        if (!isSTTLoaded) downloadAndLoadSTTSuspend()
-        if (!isTTSLoaded) downloadAndLoadTTSSuspend()
+        if (!isLLMLoaded) downloadAndLoadLLMSuspend()
+        if (errorMessage != null) return@launch
+        if (!isSTTLoaded) downloadAndLoadSTTSuspend()
+        if (errorMessage != null) return@launch
+        if (!isTTSLoaded) downloadAndLoadTTSSuspend()
     }
 }
```

sdk/runanywhere-commons/include/rac/core/rac_error.h (1)
383-398: ⚠️ Potential issue | 🟡 Minor

Error codes look well-placed; update the top-level range summary to match. The new -850 to -869 memory error sub-range is correctly carved out of the Other Errors block, but the range table at line 47 still reads "Other errors: -800 to -899". Add a Memory entry to keep the summary consistent with the actual layout.

Suggested range table update (around line 47):

```diff
 // - Event errors: -700 to -799
-// - Other errors: -800 to -899
+// - Other errors: -800 to -849, -870 to -899
+// - Memory/Vector Search: -850 to -869
 // - Reserved: -900 to -999
```

sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/TextChunker.swift (1)
39-56: ⚠️ Potential issue | 🟡 Minor

Chunks can silently exceed `maxCharacters` in two scenarios.

- A single sentence longer than `maxCharacters` is added verbatim at line 41 with no character-level fallback (contrary to the doc at line 19).
- After overlap insertion (line 52), the new `currentChunk` is `overlapTail + " " + sentence`, which is never re-checked against `maxCharacters`, so the next append at line 46 emits an oversized chunk.

Both paths can produce chunks significantly larger than the requested limit, which may cause downstream embedding models to silently truncate or error.

Proposed fix sketch: re-check after overlap and add character-level fallback

```diff
 } else {
     // Current chunk is full
     chunks.append(currentChunk)
     // Start new chunk with overlap from end of previous
     if overlap > 0 && currentChunk.count > overlap {
         let overlapStart = currentChunk.index(currentChunk.endIndex, offsetBy: -overlap)
-        currentChunk = String(currentChunk[overlapStart...]) + " " + sentence
+        let overlapText = String(currentChunk[overlapStart...])
+        if overlapText.count + 1 + sentence.count <= maxCharacters {
+            currentChunk = overlapText + " " + sentence
+        } else {
+            currentChunk = sentence
+        }
     } else {
         currentChunk = sentence
     }
+
+    // If a single sentence exceeds maxCharacters, split by characters
+    while currentChunk.count > maxCharacters {
+        let splitIdx = currentChunk.index(currentChunk.startIndex,
+                                          offsetBy: maxCharacters)
+        chunks.append(String(currentChunk[..<splitIdx]))
+        currentChunk = String(currentChunk[splitIdx...])
+    }
 }
```

sdk/runanywhere-commons/include/rac/features/memory/rac_memory.h (1)
11-12: ⚠️ Potential issue | 🟡 Minor

Use relative includes to match the Swift bridge implementation and avoid unnecessary path verbosity. This header is in `sdk/runanywhere-commons/include/rac/features/memory/` alongside `rac_memory_service.h` and `rac_memory_types.h`. Use relative includes instead of full paths:

```c
#include "rac_memory_service.h"
#include "rac_memory_types.h"
```

This matches the pattern used in the Swift bridge version (`sdk/runanywhere-swift/Sources/RunAnywhere/CRACommons/include/rac_memory.h`) and follows the standard practice for umbrella headers in the same directory.

sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/EmbeddingProvider.swift (1)
48-68: ⚠️ Potential issue | 🟡 Minor

Missing validation: returned embedding dimension vs. expected `_dimension`. `outDim` from the C API may not match the configured `_dimension`. If the model produces embeddings of a different size, this silently returns a mismatched vector that will fail downstream at `rac_memory_add` with a cryptic `DIMENSION_MISMATCH` error. Adding an early check here gives a much clearer error to the caller.

Proposed fix:

```diff
 defer { rac_free(emb) }

+guard outDim == _dimension else {
+    throw SDKError.memory(.processingFailed,
+        "Embedding dimension mismatch: expected \(_dimension), got \(outDim)")
+}
+
 return Array(UnsafeBufferPointer(start: emb, count: Int(outDim)))
```

sdk/runanywhere-commons/tests/test_memory.cpp (1)
494-509: ⚠️ Potential issue | 🟡 Minor

Benchmark silently ignores search failures. Line 498 calls `rac_memory_search` without checking the return value. If the search silently fails (e.g., returns an error), the benchmark will still report 0 ms/query and `PASS`.

Proposed fix:

```diff
 for (int q = 0; q < 100; q++) {
     rac_memory_search_results_t results = {};
-    rac_memory_search(handle, &vecs[q * D], 10, &results);
+    rac_result_t r = rac_memory_search(handle, &vecs[q * D], 10, &results);
+    if (r != RAC_SUCCESS) {
+        rac_memory_destroy(handle);
+        FAIL("search failed during benchmark");
+    }
     rac_memory_search_results_free(&results);
 }
```

sdk/runanywhere-commons/src/backends/memory/memory_backend_hnswlib.cpp (1)
126-136: ⚠️ Potential issue | 🟡 Minor

`setEf` under `shared_lock` is a data race. `index->hnsw->setEf(...)` on line 128 mutates the internal `ef_` member of the hnswlib index, but the lock held is a `shared_lock` (read lock). Concurrent `hnsw_search` calls will race on this write. Although the value is always the same here, this is UB per the C++ memory model. Move `setEf` to index creation and `hnsw_load` instead (where it's set once under exclusive lock), and remove it from the search path.

sdk/runanywhere-commons/src/backends/memory/memory_backend_hnswlib.cpp (1)
283-298:⚠️ Potential issue | 🟡 MinorFixed 65 KB stack buffer for metadata lines may silently truncate large metadata entries.
char line[65536]on line 285 means any metadata JSON longer than ~65 KB is truncated byfgets. This is unlikely for typical use but creates a silent data corruption path. Also, 64 KB on the stack is large for some embedded targets.sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/RAGAgent.swift-110-125 (1)
110-125: ⚠️ Potential issue | 🟡 Minor — Potential resource leak: `rac_llm_result_free` not called on the error path.

If `rac_llm_generate` returns `RAC_SUCCESS` but `result.text` is `nil` (or if the C layer partially allocates on failure), `rac_llm_result_free` is never called. Calling it unconditionally after extracting the text is safer, since the free function should handle zero/nil members gracefully.

Proposed fix

```diff
 guard genResult == RAC_SUCCESS, let text = result.text else {
+    rac_llm_result_free(&result)
     throw SDKError.llm(.generationFailed, "LLM generation failed: \(genResult)")
 }
 let answer = String(cString: text)
 rac_llm_result_free(&result)
```

sdk/runanywhere-swift/Sources/RunAnywhere/CRACommons/include/rac_memory_types.h (1)
92-101: ⚠️ Potential issue | 🟡 Minor — Document `score` semantics for `RAC_DISTANCE_INNER_PRODUCT`.

Line 96 only documents L2/cosine behavior ("lower is closer"). For inner product, higher scores indicate greater similarity, and the polarity is reversed. Callers interpreting scores without knowing which metric is in use could sort or threshold incorrectly.

📝 Proposed documentation fix

```diff
     /** Vector ID */
     uint64_t id;
-    /** Distance/similarity score (lower is closer for L2/cosine) */
+    /** Distance/similarity score.
+     *  For L2/cosine: lower is more similar.
+     *  For inner product: higher is more similar. */
     float score;
```

sdk/runanywhere-swift/Sources/RunAnywhere/CRACommons/include/rac_memory_types.h (1)
106-118: ⚠️ Potential issue | 🟡 Minor — Clarify sort order for inner product results.

Line 107 documents ascending sort for L2/cosine but doesn't specify the order for `RAC_DISTANCE_INNER_PRODUCT`, where "best" results have the highest score. Callers need to know whether results are still ascending (least similar first) or descending (most similar first) when using inner product.

📝 Proposed documentation fix

```diff
-    /** Array of results sorted by score (ascending for L2/cosine) */
+    /** Array of results sorted by relevance (best match first).
+     *  For L2/cosine: ascending score (lowest distance first).
+     *  For inner product: descending score (highest similarity first). */
     rac_memory_result_t* results;
```
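The metric-dependent sort order the documentation fixes describe can be shown in a stand-alone sketch. The `Result` struct, `Metric` enum, and `sort_best_first` helper below are illustrative stand-ins, not the SDK's actual C types:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical mirror of a search result for illustration only.
struct Result {
    uint64_t id;
    float score;
};

enum class Metric { L2, Cosine, InnerProduct };

// Sort results "best match first": ascending for distance metrics,
// descending for inner product similarity.
inline void sort_best_first(std::vector<Result>& rs, Metric m) {
    if (m == Metric::InnerProduct) {
        std::sort(rs.begin(), rs.end(),
                  [](const Result& a, const Result& b) { return a.score > b.score; });
    } else {
        std::sort(rs.begin(), rs.end(),
                  [](const Result& a, const Result& b) { return a.score < b.score; });
    }
}
```

A caller that sorted ascending unconditionally would rank the *least* similar vector first under inner product, which is exactly the confusion the doc comment should prevent.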
🧹 Nitpick comments (33)
Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/components/FeatureCard.kt (1)
30-34: Prefer the `Card(onClick = ...)` overload for built-in click semantics.

Using the non-clickable `Card` + `Modifier.clickable` works, but the `Card(onClick = onClick)` overload provides proper semantic role, state handling, and a consistent interaction indication out of the box.

♻️ Suggested refactor

```diff
 Card(
-    modifier = modifier
-        .fillMaxWidth()
-        .aspectRatio(0.85f)
-        .clickable(onClick = onClick),
+    onClick = onClick,
+    modifier = modifier
+        .fillMaxWidth()
+        .aspectRatio(0.85f),
     shape = RoundedCornerShape(20.dp),
```

Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/components/ModelLoaderWidget.kt (1)
76-102: Both progress indicators can be visible simultaneously if `isDownloading` and `isLoading` are both true.

If the caller ever sets both flags to `true`, two stacked progress bars will appear. Consider whether these states should be mutually exclusive or if the component should enforce precedence (e.g., show only the downloading indicator when both are true).

Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/TextToSpeechScreen.kt (1)

41-112: Extract `playWavAudio` to a shared utility — implementations have minor differences to reconcile.

This function exists in both `TextToSpeechScreen.kt` and `VoicePipelineScreen.kt`. While the core AudioTrack playback logic is identical, the implementations differ in WAV header parsing: TextToSpeechScreen uses `buffer.position(20)` without RIFF validation, whereas VoicePipelineScreen uses `buffer.position(22)` with RIFF magic byte validation and a raw PCM fallback. Consider extracting to a shared utility (e.g., `AudioPlaybackUtils.kt`) while deciding which parsing approach better suits your needs.

Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/VoicePipelineScreen.kt (2)
145-221: `playWavAudio()` is duplicated from `TextToSpeechScreen.kt`.

This function is nearly identical to the one in `TextToSpeechScreen.kt` (Lines 40-111 in that file). Extract it into a shared utility (e.g., a `utils/AudioPlayer.kt`) to avoid maintaining the same WAV-parsing and `AudioTrack` playback logic in two places.

346-353: Swallowing `CancellationException` breaks cooperative cancellation.

While the comment says "Expected when stopping," swallowing `CancellationException` in coroutine code prevents proper cancellation propagation. The idiomatic Kotlin coroutine pattern is to re-throw it.

♻️ Proposed fix

```diff
 } catch (e: CancellationException) {
-    // Expected when stopping
+    throw e // Maintain cooperative cancellation
 } catch (e: Exception) {
```

Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/ChatScreen.kt (1)
195-197: Redundant fully qualified call — `chat` is already imported.

Line 59 imports `com.runanywhere.sdk.public.extensions.chat`. The FQN on Line 196 can be simplified.

♻️ Suggested fix

```diff
-    val response = com.runanywhere.sdk.public.RunAnywhere.chat(userMessage)
+    val response = RunAnywhere.chat(userMessage)
```

Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/services/ModelService.kt (1)
139-180: Consider extracting a generic download-and-load helper to reduce duplication.

The LLM, STT, and TTS flows (lines 139–180, 185–226, 231–272) share identical structure. In a playground app this is understandable for readability, but the boilerplate still invites copy-paste divergence over time. A small generic helper parameterized by model ID, state setters, and the load function would keep it maintainable.
sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp (2)
574-587: Creating a temporary `llama_context` per embedding call is expensive — consider caching it.

Each call to `get_embeddings` allocates and tears down a full `llama_context` with embeddings enabled (Lines 575-587, 650-651). For RAG workflows where `remember()` is called in a loop (e.g., ingesting chunked documents), this repeated allocation/deallocation is a significant bottleneck — context creation involves memory allocation and potentially GPU resource setup.

Consider lazily creating a persistent embedding context (`emb_ctx_`) alongside the generation context, reusing it across calls, and tearing it down in `unload_model_internal()`.

Sketch of caching approach

```diff
 // In the class header (llamacpp_backend.h), add:
+llama_context* emb_ctx_ = nullptr;
+void ensure_embedding_context();

 // In get_embeddings(), replace temp context creation with:
-llama_context* emb_ctx = llama_init_from_model(model_, ctx_params);
-if (!emb_ctx) { ... }
+ensure_embedding_context();
+if (!emb_ctx_) {
+    LOGE("Failed to create embedding context");
+    return {};
+}

 // Remove the per-call llama_free(emb_ctx) at cleanup;
 // instead, free emb_ctx_ in unload_model_internal():
+if (emb_ctx_) {
+    llama_free(emb_ctx_);
+    emb_ctx_ = nullptr;
+}
```
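The lazy-create/reuse/teardown pattern in the sketch above generalizes to any expensive resource. A self-contained illustration, with a stand-in `FakeContext` instead of a real `llama_context` (all names below are hypothetical, chosen for the demo only):

```cpp
// Stand-in for an expensive resource such as a llama_context.
struct FakeContext { int id; };

inline int g_creations = 0;  // counts how many times the resource was built

struct EmbeddingBackend {
    FakeContext* emb_ctx_ = nullptr;

    // Lazily create the context on first use, then reuse it on later calls.
    FakeContext* ensure_embedding_context() {
        if (!emb_ctx_) {
            emb_ctx_ = new FakeContext{++g_creations};
        }
        return emb_ctx_;
    }

    // Tear down exactly once, mirroring unload_model_internal().
    void unload() {
        delete emb_ctx_;
        emb_ctx_ = nullptr;
    }
};
```

Repeated calls to `ensure_embedding_context()` pay the construction cost once; the per-call pattern in the current code would pay it on every invocation.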
604-618: Batch is allocated for full context size — could be sized to actual token count.

Line 605 allocates `llama_batch_init(n_ctx, 0, 1)` using the full context window. For embedding extraction, where inputs are typically short, sizing the batch to `tokens.size()` would reduce memory waste.

Minimal batch allocation

```diff
-llama_batch batch = llama_batch_init(n_ctx, 0, 1);
+llama_batch batch = llama_batch_init((int)tokens.size(), 0, 1);
```

sdk/runanywhere-commons/src/backends/memory/memory_backend_hnswlib.h (1)
1-7: Consider adding lifecycle/ownership documentation.

The Doxygen file header describes what the backend does but doesn't mention that the caller owns the `out_handle` from `rac_memory_hnsw_create` and must destroy it via the vtable's destroy op. Given the coding guideline about documenting lifecycle requirements, a brief note on the `rac_memory_hnsw_create` doc would be helpful. As per coding guidelines, public C API headers must document vtable operations, error codes, and lifecycle requirements.

sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/TextChunker.swift (2)
26-30: No validation on the `maxCharacters` and `overlap` relationship.

If `overlap >= maxCharacters` or `maxCharacters <= 0`, the chunking loop can produce degenerate output (infinite growth or empty chunks). A precondition or early clamp would make this safer.

Proposed guard

```diff
 public static func chunk(
     _ text: String,
     maxCharacters: Int,
     overlap: Int = 50
 ) -> [String] {
+    precondition(maxCharacters > 0, "maxCharacters must be positive")
+    let overlap = min(overlap, maxCharacters / 2)
     guard !text.isEmpty else { return [] }
     guard text.count > maxCharacters else { return [text] }
```
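To see why the clamp keeps the loop advancing, here is a stand-alone character-window chunker mirroring the clamped logic in C++ — a sketch only, since the real `TextChunker` operates on Swift strings:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Character-window chunker with the overlap clamp discussed above.
// Clamping overlap to maxCharacters/2 guarantees the step
// (maxCharacters - overlap) is at least maxCharacters/2 > 0,
// so the loop always terminates.
inline std::vector<std::string> chunk(const std::string& text,
                                      std::size_t maxCharacters,
                                      std::size_t overlap = 50) {
    std::vector<std::string> out;
    if (maxCharacters == 0 || text.empty()) return out;
    overlap = std::min(overlap, maxCharacters / 2);  // the clamp
    if (text.size() <= maxCharacters) return {text};
    std::size_t start = 0;
    while (start < text.size()) {
        out.push_back(text.substr(start, maxCharacters));
        if (start + maxCharacters >= text.size()) break;
        start += maxCharacters - overlap;  // strictly positive step
    }
    return out;
}
```

With an unclamped `overlap >= maxCharacters`, the step would be zero or negative and the window would never advance — the degenerate case the precondition guards against.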
68-90: Naive sentence splitting will mis-split on abbreviations and decimals.

Splitting on every `.` will break on "Dr. Smith", "3.14", "U.S.A.", URLs, etc. For a RAG chunker this can degrade retrieval quality. Consider using `NLTokenizer` with the `.sentence` unit from the NaturalLanguage framework, which handles these cases natively on Apple platforms.

sdk/runanywhere-commons/include/rac/backends/rac_llm_llamacpp.h (1)

184-200: Good API shape and lifecycle documentation.

The ownership contract (caller must free with `rac_free`) is clear. Per project guidelines, consider enumerating the specific error codes this function may return (NULL_POINTER, INVALID_HANDLE, INFERENCE_FAILED, OUT_OF_MEMORY based on the implementation) to complete the public API documentation.

As per coding guidelines: "Public C API headers in include/rac/ must document vtable operations, error codes, and lifecycle requirements."
sdk/runanywhere-commons/src/backends/memory/CMakeLists.txt (1)
35-38: Accelerate framework linked unconditionally despite "if available" comment.

On macOS/iOS this will typically succeed, but for cross-compilation or minimal toolchains it may fail. Consider using `find_library` to verify availability first.

Suggested conditional check

```diff
 if(CMAKE_SYSTEM_NAME STREQUAL "iOS" OR CMAKE_SYSTEM_NAME STREQUAL "Darwin")
-    # Use Accelerate framework for optimized BLAS operations if available
-    target_link_libraries(rac_backend_memory PUBLIC "-framework Accelerate")
+    include(CheckLibraryExists)
+    find_library(ACCELERATE_FRAMEWORK Accelerate)
+    if(ACCELERATE_FRAMEWORK)
+        target_link_libraries(rac_backend_memory PUBLIC "-framework Accelerate")
+        message(STATUS "  Accelerate framework found")
+    endif()
 endif()
```

sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+Memory.swift (1)
143-154: Metadata assumed to be JSON, fails silently for plain strings.

If the stored metadata is a plain string rather than JSON, `JSONSerialization` fails and `metadata` becomes `nil`, silently dropping the value. If the contract is JSON-only, document it; otherwise consider falling back to a `{"text": rawString}` wrapper.

sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/CppBridge.swift (2)
61-61: NSLock usage violates Swift 6 concurrency guidelines.

This `NSLock` is pre-existing, but new code in this PR (lines 119–127) continues to rely on it. Consider migrating to a Swift 6 concurrency primitive (e.g., an `os_unfair_lock`-based wrapper, or restructuring `CppBridge` as an actor) in a follow-up. As per coding guidelines, `**/*.swift`: "Do not use NSLock as it is outdated. Use Swift 6 concurrency primitives instead."
191-191: Consider logging if memory backend unregistration fails unexpectedly.
`rac_backend_memory_unregister()` can return `RAC_ERROR_MODULE_NOT_FOUND` if something went wrong during init. The return value is silently discarded. This is fine for normal shutdown, but a debug-level log on unexpected failure would be consistent with the registration path (line 122).

🔧 Optional: log unexpected unregister failures

```diff
-rac_backend_memory_unregister()
+let unregResult = rac_backend_memory_unregister()
+if unregResult != RAC_SUCCESS && unregResult != RAC_ERROR_MODULE_NOT_FOUND {
+    SDKLogger(category: "CppBridge").warning("Memory backend unregistration failed: \(unregResult)")
+}
```

sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.h (1)
123-124: Temporary embedding context per call may be a performance bottleneck for batch ingestion.

The implementation (in `llamacpp_backend.cpp`) creates and destroys a `llama_context` with `embeddings = true` on every `get_embeddings()` invocation. For a RAG `remember()` flow ingesting many chunks, this context allocation/deallocation overhead will dominate. Consider lazily creating and caching a dedicated embedding context (similar to the existing `context_` member for generation) and reusing it across calls.

💡 Sketch: cache the embedding context

```diff
 // In llamacpp_backend.h, add a member:
+llama_context* emb_context_ = nullptr;

 // In get_embeddings(), reuse instead of recreating:
-llama_context* emb_ctx = llama_init_from_model(model_, ctx_params);
+if (!emb_context_) {
+    emb_context_ = llama_init_from_model(model_, ctx_params);
+}
+llama_context* emb_ctx = emb_context_;

 // Clean up in unload_model_internal() and the destructor:
+if (emb_context_) { llama_free(emb_context_); emb_context_ = nullptr; }
```

examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Memory/MemoryTestView.swift (1)
11-12: Consider migrating to the `@Observable` macro.

The `@StateObject` + `ObservableObject` pattern works, but the newer `@Observable` macro (Observation framework, iOS 17+) with `@State` is the idiomatic Swift approach, giving better performance by tracking property-level access. As per coding guidelines, `**/*.swift`: Use the latest Swift 6 APIs always.

sdk/runanywhere-commons/include/rac/backends/rac_backend_memory.h (1)
19-36: Document specific error codes per coding guidelines.

The `@return` annotations say "RAC_SUCCESS or error code" without listing which error codes can be returned. From the implementation in `rac_backend_memory_register.cpp`, `register` may return `RAC_ERROR_MODULE_ALREADY_REGISTERED` (if called twice) or propagated registry errors, and `unregister` returns `RAC_ERROR_MODULE_NOT_FOUND` (if not registered). Listing these helps consumers handle failures correctly. Based on learnings: "Public C API headers in include/rac/ must document vtable operations, error codes, and lifecycle requirements."

📝 Proposed documentation improvement

```diff
 /**
  * Registers the Memory backend with the commons module and service registries.
  *
  * Should be called once during SDK initialization.
  * This registers:
  * - Module: "memory" with VECTOR_SEARCH capability
  * - Service provider: Memory vector search provider
  *
- * @return RAC_SUCCESS or error code
+ * @return RAC_SUCCESS on success, or one of:
+ *         - RAC_ERROR_MODULE_ALREADY_REGISTERED if called more than once
+ *         - Other registry error codes on internal failure
  */
 RAC_API rac_result_t rac_backend_memory_register(void);

 /**
  * Unregisters the Memory backend.
  *
- * @return RAC_SUCCESS or error code
+ * @return RAC_SUCCESS on success, or:
+ *         - RAC_ERROR_MODULE_NOT_FOUND if not currently registered
  */
 RAC_API rac_result_t rac_backend_memory_unregister(void);
```

sdk/runanywhere-commons/CMakeLists.txt (1)
279-283: Memory backend bypasses the `RAC_BUILD_BACKENDS` gate — intentional?

The other individual backend options (`RAC_BACKEND_LLAMACPP`, `RAC_BACKEND_ONNX`, `RAC_BACKEND_WHISPERCPP`) are all gated behind `RAC_BUILD_BACKENDS` (line 285). `RAC_BACKEND_MEMORY` is independent and ON by default. This means `cmake -DRAC_BUILD_BACKENDS=OFF` disables ML backends but still builds the memory backend, which may surprise users. If this is intentional (memory has no heavy external deps), the comment on line 279 is helpful, but consider also documenting this in the options section.

📝 Suggested documentation enhancement in options block

```diff
 option(RAC_BUILD_BACKENDS "Build ML backends (LlamaCPP, ONNX, WhisperCPP)" OFF)
-option(RAC_BACKEND_MEMORY "Build Memory/Vector Search backend (hnswlib)" ON)
+option(RAC_BACKEND_MEMORY "Build Memory/Vector Search backend (hnswlib) - independent of RAC_BUILD_BACKENDS" ON)
```

sdk/runanywhere-commons/src/backends/memory/rac_memory_backend.cpp (2)
48-57: Defensive cleanup: destroy `backend_handle` if `malloc` fails, but also if create partially succeeds.

Two concerns with resource management:

- Lines 48-50: If a future backend implementation sets `*out_handle` before returning an error, the handle leaks. Consider destroying the handle when it's non-null regardless of `result`.
- Line 54: `malloc` leaves memory uninitialized. If `rac_memory_service_t` gains additional fields, they'll contain garbage. Prefer `calloc` for zero-initialization.

🛡️ Proposed defensive improvements

```diff
 if (result != RAC_SUCCESS || !backend_handle) {
+    if (backend_handle && ops) {
+        ops->destroy(backend_handle);
+    }
     RAC_LOG_ERROR(LOG_CAT, "Failed to create backend: %d", result);
     return nullptr;
 }

 // Allocate service struct
-auto* service = static_cast<rac_memory_service_t*>(malloc(sizeof(rac_memory_service_t)));
+auto* service = static_cast<rac_memory_service_t*>(calloc(1, sizeof(rac_memory_service_t)));
```
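A small demonstration of the `calloc` point: zero-initialization guarantees that fields added later read as zero, whereas `malloc`'d memory is indeterminate. The `Service` struct here is a hypothetical stand-in for `rac_memory_service_t`:

```cpp
#include <cstdlib>

// Illustrative stand-in: imagine `flags` was added to the struct later.
// With malloc, existing allocation sites would hand out garbage in `flags`;
// with calloc(1, ...) every byte starts at zero.
struct Service {
    void* handle;
    int   flags;
};

inline Service* make_service_zeroed() {
    return static_cast<Service*>(calloc(1, sizeof(Service)));
}
```

This is why the proposed diff swaps `malloc(sizeof(...))` for `calloc(1, sizeof(...))`: it future-proofs the allocation against struct growth at negligible cost.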
25-28: No error code propagation to callers.

`rac_memory_backend_create_service` returns `rac_handle_t` (an opaque pointer), so callers only see `nullptr` on failure with no way to distinguish between a null config, unknown index type, backend creation failure, or OOM. The specific `rac_result_t` from the backend is logged but lost. If this function is part of the public C API surface, consider an out-parameter for the error code or a different return convention.

sdk/runanywhere-commons/tests/test_memory.cpp (1)
45-60: Resource leak on early `FAIL`/`EXPECT_OK` exits in tests.

The `FAIL` macro issues a bare `return`, and `EXPECT_OK` delegates to `FAIL`. When these trigger mid-test (e.g., Line 102 after `rac_memory_create` succeeds but before destroy), the `rac_handle_t` is leaked. This is tolerable for a short-lived test process, but it can mask bugs in the destroy path and makes valgrind/ASAN runs noisy.

A common pattern is a `goto cleanup` label or an RAII guard:

```cpp
struct HandleGuard {
    rac_handle_t h = nullptr;
    ~HandleGuard() { if (h) rac_memory_destroy(h); }
};
```

Not blocking, just a suggestion to improve test robustness.
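The guard can be exercised stand-alone. In this sketch, `rac_handle_t` and `rac_memory_destroy` are stubbed (with a counter in place of the real destroy) so that the early-return path can be shown releasing the handle:

```cpp
// Stubs standing in for the real SDK types, for demonstration only.
using rac_handle_t = void*;

inline int g_destroy_calls = 0;
inline void rac_memory_destroy(rac_handle_t h) { if (h) ++g_destroy_calls; }

// The suggested RAII guard: the destructor fires on every exit path,
// including a bare `return` from a FAIL-style macro.
struct HandleGuard {
    rac_handle_t h = nullptr;
    ~HandleGuard() { if (h) rac_memory_destroy(h); }
};

// Models a test body that acquires a handle and then bails out early.
inline bool test_body_with_early_return() {
    HandleGuard guard;
    static int dummy = 1;
    guard.h = &dummy;   // pretend rac_memory_create succeeded
    return false;       // early exit — the guard still destroys the handle
}
```

With the bare-`return` `FAIL` macro and no guard, this same path would leak the handle; the destructor makes cleanup unconditional.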
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Memory/MemoryTestViewModel.swift (2)
150-182: Loose assertion in `testSaveLoad` — consider asserting exact vector count.

Line 167 checks `stats.numVectors > 0`, but at this point in the test sequence (after `testRemove` deleted id=5), the expected count is exactly 4. A tighter assertion would catch subtle persistence bugs (e.g., duplicates on load, dropped entries).

```diff
-    guard stats.numVectors > 0 else {
-        throw TestError("No vectors after load")
+    guard stats.numVectors == 4 else {
+        throw TestError("Expected 4 vectors after load, got \(stats.numVectors)")
     }
```
35-59: Tests depend on sequential execution order — state leaks between tests.

The test functions (e.g., `testAddVectors`, `testSearchExact`) rely on index state created/mutated by prior tests. If any middle test fails, subsequent tests will also fail with confusing errors (e.g., "Memory index not created"). This is acceptable for a visual test runner, but consider adding a note in the status message when a test fails mid-chain, or short-circuiting remaining tests.

sdk/runanywhere-commons/src/backends/memory/memory_backend_flat.cpp (1)
138-209: `search_time_us` includes lock acquisition time.

The timer at Line 148 starts before the shared_lock on Line 150. Under write contention this inflates the reported search latency. Consider moving `start` after lock acquisition if the metric is meant to reflect pure computation time.

sdk/runanywhere-commons/src/backends/memory/rac_backend_memory_register.cpp (1)

122-136: Unchecked return values from `rac_service_unregister_provider`/`rac_module_unregister`.

Lines 130–131 discard the results of the unregister calls. If either fails (e.g., the provider was already removed externally), `state.registered` is set to `false` anyway, which could leave the registry in an inconsistent state — one component unregistered and the other still present.

Not blocking, but consider logging warnings if either returns an unexpected error.
sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/RAGAgent.swift (1)
106-108: LLM generation options are hardcoded.

`max_tokens` and `temperature` are not configurable per query. Consider accepting an optional configuration parameter for callers who need different generation behavior.

sdk/runanywhere-commons/src/backends/memory/memory_backend_hnswlib.cpp (2)

174-193: Remove silently swallows all exceptions — consider logging.

Line 186 catches all exceptions with `catch (...)` when marking an element for deletion. An element-not-found scenario is expected, but other exceptions (corruption, etc.) are silently lost. A debug-level log would aid troubleshooting.

195-246: Save mixes binary header with text metadata — verify metadata cannot contain raw newlines/tabs.

The format writes a binary header, then metadata as `id\tmetadata\n` lines. This relies on the metadata JSON strings never containing literal tab or newline bytes. Standard JSON serializers escape these, but there's no validation here. A corrupted or hand-crafted metadata string could break the load parser.

sdk/runanywhere-swift/Sources/RunAnywhere/CRACommons/include/rac_memory_service.h (1)
63-64: `index_id` ownership is undocumented.

`const char* index_id` in `rac_memory_service_t` is heap-allocated and freed via `free()` in `rac_memory_destroy` (see `rac_memory_service.cpp` line 194), but the header doesn't document this. Consider adding a brief ownership comment. Based on learnings, "Public C API headers in include/rac/ must document vtable operations, error codes, and lifecycle requirements."

sdk/runanywhere-commons/src/features/memory/rac_memory_service.cpp (1)

59-100: Dimension retrieved via `get_stats` on every `add` and `search` — unnecessary overhead.

`dimension` is immutable after index creation. Calling `get_stats` on every `add`/`search` acquires a shared_lock in the backend just to read the dimension. If `get_stats` is unset or fails, `stats.dimension` remains 0, silently passing a zero dimension to the backend.

Consider storing `dimension` directly in `rac_memory_service_t` at creation time.
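The suggestion amounts to caching an immutable value at construction instead of re-querying it through a locked stats call. A tiny sketch under assumed names (not the SDK's actual struct or functions):

```cpp
#include <cstddef>

// Hypothetical mirror of the service struct: the dimension is fixed when
// the index is created, so it can be read lock-free forever after.
struct MemoryService {
    std::size_t dimension;
};

inline MemoryService make_service(std::size_t dim) {
    return MemoryService{dim};
}

// Validates an incoming vector length without a get_stats() round-trip
// (and thus without taking the backend's shared_lock).
inline bool check_dim(const MemoryService& s, std::size_t vec_len) {
    return vec_len == s.dimension;
}
```

Besides removing the per-call lock, this also eliminates the silent-zero failure mode: a service constructed with a known dimension can never report `dimension == 0` because `get_stats` was unset.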
```kotlin
fun unloadAllModels() {
    viewModelScope.launch {
        try {
            RunAnywhere.unloadLLMModel()
            RunAnywhere.unloadSTTModel()
            RunAnywhere.unloadTTSVoice()
            refreshModelState()
        } catch (e: Exception) {
            errorMessage = "Failed to unload models: ${e.message}"
        }
    }
}
```
unloadAllModels: if the first unload throws, remaining models stay loaded.
The three unload calls are wrapped in a single try/catch. If unloadLLMModel() throws, unloadSTTModel() and unloadTTSVoice() are never invoked, leaving those models resident in memory. refreshModelState() is also skipped, so the UI will not reflect the actual state. For an app managing large on-device models this can silently waste significant memory.
Proposed fix: unload each model independently
```diff
 fun unloadAllModels() {
     viewModelScope.launch {
-        try {
-            RunAnywhere.unloadLLMModel()
-            RunAnywhere.unloadSTTModel()
-            RunAnywhere.unloadTTSVoice()
-            refreshModelState()
-        } catch (e: Exception) {
-            errorMessage = "Failed to unload models: ${e.message}"
-        }
+        val errors = mutableListOf<String>()
+        try { RunAnywhere.unloadLLMModel() } catch (e: Exception) { errors += "LLM: ${e.message}" }
+        try { RunAnywhere.unloadSTTModel() } catch (e: Exception) { errors += "STT: ${e.message}" }
+        try { RunAnywhere.unloadTTSVoice() } catch (e: Exception) { errors += "TTS: ${e.message}" }
+        refreshModelState()
+        if (errors.isNotEmpty()) {
+            errorMessage = "Failed to unload: ${errors.joinToString("; ")}"
+        }
     }
 }
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fun unloadAllModels() { | |
| viewModelScope.launch { | |
| try { | |
| RunAnywhere.unloadLLMModel() | |
| RunAnywhere.unloadSTTModel() | |
| RunAnywhere.unloadTTSVoice() | |
| refreshModelState() | |
| } catch (e: Exception) { | |
| errorMessage = "Failed to unload models: ${e.message}" | |
| } | |
| } | |
| } | |
| fun unloadAllModels() { | |
| viewModelScope.launch { | |
| val errors = mutableListOf<String>() | |
| try { RunAnywhere.unloadLLMModel() } catch (e: Exception) { errors += "LLM: ${e.message}" } | |
| try { RunAnywhere.unloadSTTModel() } catch (e: Exception) { errors += "STT: ${e.message}" } | |
| try { RunAnywhere.unloadTTSVoice() } catch (e: Exception) { errors += "TTS: ${e.message}" } | |
| refreshModelState() | |
| if (errors.isNotEmpty()) { | |
| errorMessage = "Failed to unload: ${errors.joinToString("; ")}" | |
| } | |
| } | |
| } |
```kotlin
        // Start reading audio in a thread
        Thread {
            val buffer = ByteArray(bufferSize)
            while (isRecording) {
                val read = audioRecord?.read(buffer, 0, buffer.size) ?: 0
                if (read > 0) {
                    synchronized(audioData) {
                        audioData.write(buffer, 0, read)
                    }
                }
            }
        }.start()

        return true
    } catch (e: SecurityException) {
        audioRecord?.release()
        audioRecord = null
        return false
    }
}

fun stopRecording(): ByteArray {
    isRecording = false
    audioRecord?.let { record ->
        if (record.recordingState == AudioRecord.RECORDSTATE_RECORDING) {
            record.stop()
        }
        record.release()
    }
    audioRecord = null

    synchronized(audioData) {
        return audioData.toByteArray()
    }
}
```
Reading thread is not joined in stopRecording(), risking incomplete data.
After setting isRecording = false, stopRecording() immediately grabs the audioData lock and returns. The background Thread (Line 81) may still be executing audioRecord?.read(...) — which can block or crash since audioRecord is stopped/released on Line 105-108 before the thread finishes. Consider keeping a reference to the thread and joining it (with a short timeout) before releasing the AudioRecord.
🔧 Proposed fix: join the reading thread before releasing
```diff
 private class AudioRecorder {
     private var audioRecord: AudioRecord? = null
     @Volatile
     private var isRecording = false
     private val audioData = ByteArrayOutputStream()
+    private var readingThread: Thread? = null
     ...
-    Thread {
+    readingThread = Thread {
         val buffer = ByteArray(bufferSize)
         while (isRecording) {
             val read = audioRecord?.read(buffer, 0, buffer.size) ?: 0
             if (read > 0) {
                 synchronized(audioData) {
                     audioData.write(buffer, 0, read)
                 }
             }
         }
-    }.start()
+    }
+    readingThread?.start()
     ...
     fun stopRecording(): ByteArray {
         isRecording = false
+        readingThread?.join(500) // Wait for the reading thread to exit
+        readingThread = null
         audioRecord?.let { record ->
```
🪛 detekt (1.23.8)
[warning] 94-94: The caught exception is swallowed. The original exception could be lost.
(detekt.exceptions.SwallowedException)
🤖 Prompt for AI Agents
In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/SpeechToTextScreen.kt`
around lines 80 - 114, The background read Thread started in the recording
method isn’t tracked or joined, so stopRecording() sets isRecording=false and
immediately releases audioRecord and returns audioData while the thread may
still be in audioRecord?.read(...); fix by storing the Thread reference (e.g.,
readingThread) when starting it, then in stopRecording() after setting
isRecording = false call readingThread.join(<short timeout>) and handle
InterruptedException before stopping/releasing audioRecord and returning
audioData to ensure the reader thread has finished accessing audioRecord and
audioData.
```kotlin
val audioTrack = AudioTrack.Builder()
    .setAudioAttributes(
        AudioAttributes.Builder()
            .setUsage(AudioAttributes.USAGE_MEDIA)
            .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
            .build()
    )
    .setAudioFormat(
        AudioFormat.Builder()
            .setSampleRate(sampleRate)
            .setEncoding(audioFormatConfig)
            .setChannelMask(channelConfig)
            .build()
    )
    .setBufferSizeInBytes(maxOf(minBufferSize, pcmData.size))
    .setTransferMode(AudioTrack.MODE_STATIC)
    .build()

audioTrack.write(pcmData, 0, pcmData.size)
audioTrack.play()

// Wait for playback to complete
val durationMs = (pcmData.size.toLong() * 1000) / (sampleRate * numChannels * (bitsPerSample / 8))
delay(durationMs + 100) // Add small buffer

audioTrack.stop()
audioTrack.release()
```
AudioTrack leaks if the coroutine is cancelled mid-playback.
If the user navigates away while audio is playing, the delay on Line 108 throws CancellationException, skipping stop() and release(). Wrap the AudioTrack lifecycle in try/finally.
🐛 Proposed fix
```diff
 val audioTrack = AudioTrack.Builder()
     .setAudioAttributes(
         AudioAttributes.Builder()
             .setUsage(AudioAttributes.USAGE_MEDIA)
             .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
             .build()
     )
     .setAudioFormat(
         AudioFormat.Builder()
             .setSampleRate(sampleRate)
             .setEncoding(audioFormatConfig)
             .setChannelMask(channelConfig)
             .build()
     )
     .setBufferSizeInBytes(maxOf(minBufferSize, pcmData.size))
     .setTransferMode(AudioTrack.MODE_STATIC)
     .build()
-audioTrack.write(pcmData, 0, pcmData.size)
-audioTrack.play()
-
-// Wait for playback to complete
-val durationMs = (pcmData.size.toLong() * 1000) / (sampleRate * numChannels * (bitsPerSample / 8))
-delay(durationMs + 100) // Add small buffer
-
-audioTrack.stop()
-audioTrack.release()
+try {
+    audioTrack.write(pcmData, 0, pcmData.size)
+    audioTrack.play()
+
+    // Wait for playback to complete
+    val durationMs = (pcmData.size.toLong() * 1000) / (sampleRate * numChannels * (bitsPerSample / 8))
+    delay(durationMs + 100) // Add small buffer
+} finally {
+    audioTrack.stop()
+    audioTrack.release()
+}
```

Note: The same issue exists in VoicePipelineScreen.kt's playWavAudio (see relevant snippet, around line 213–220).
🤖 Prompt for AI Agents
In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/TextToSpeechScreen.kt`
around lines 85 - 111, The AudioTrack created with AudioTrack.Builder (assigned
to audioTrack) can leak if the coroutine is cancelled during the delay: ensure
the play/cleanup sequence is wrapped in a try/finally so stop() and release()
always run; specifically move audioTrack.write(...) and audioTrack.play() into a
try block and call audioTrack.stop() and audioTrack.release() in the finally
block, and apply the same change to the playWavAudio implementation in
VoicePipelineScreen.kt to guarantee cleanup on CancellationException.
is VoiceSessionEvent.TurnCompleted -> {
    // Play the synthesized audio
    event.audio?.let { audio ->
        sessionState = VoiceSessionState.SPEAKING
        playWavAudio(audio)
    }
    // Resume listening state
    sessionState = VoiceSessionState.LISTENING
    audioLevel = 0f
}
playWavAudio() blocks event collection during audio playback.
playWavAudio(audio) is a suspending call with an internal delay(durationMs + 100). Since it's called inline inside collect { ... }, it blocks the entire Flow collection for the duration of playback. Any events emitted by the SDK during that time (e.g., Listening, Error) will be buffered and delivered late, causing stale UI state.
Launch playback in a separate coroutine so collection continues unblocked.
🔧 Proposed fix
is VoiceSessionEvent.TurnCompleted -> {
// Play the synthesized audio
event.audio?.let { audio ->
sessionState = VoiceSessionState.SPEAKING
- playWavAudio(audio)
+ launch { playWavAudio(audio) }
}
// Resume listening state
sessionState = VoiceSessionState.LISTENING
audioLevel = 0f
  }

🤖 Prompt for AI Agents
In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/VoicePipelineScreen.kt`
around lines 326 - 335, When handling VoiceSessionEvent.TurnCompleted in the
Flow collector, don't call the suspending playWavAudio(audio) inline because it
blocks collection; instead launch playback in a separate coroutine (e.g.,
CoroutineScope(Dispatchers.IO).launch or viewModelScope.launch) so the collector
can continue receiving events. Set sessionState = VoiceSessionState.SPEAKING
before launching the coroutine, call playWavAudio(audio) inside that coroutine,
and after playback update sessionState = VoiceSessionState.LISTENING and
audioLevel = 0f inside the coroutine to restore UI state without blocking the
Flow collector. Ensure you reference the same symbols:
VoiceSessionEvent.TurnCompleted, playWavAudio, sessionState, and audioLevel.
RAC_COMPONENT_VAD = 3,        /**< Voice Activity Detection */
RAC_COMPONENT_VOICE = 4,      /**< Voice Agent */
RAC_COMPONENT_EMBEDDING = 5,  /**< Embedding generation */
RAC_COMPONENT_MEMORY = 6,     /**< Vector memory / RAG */
🧩 Analysis chain
🏁 Script executed:
rg -n "rac_sdk_component_display_name|rac_sdk_component_raw_value" --type=cpp --type=c -B2 -A10

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 7670
🏁 Script executed:
# Also search for where RAC_COMPONENT_MEMORY might be referenced
rg -n "RAC_COMPONENT_MEMORY" --type=cpp --type=c -B2 -A2

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 603
🏁 Script executed:
# Find the header file to understand the context
find . -name "rac_component_types.h" -type f

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 223
🏁 Script executed:
cat -n sdk/runanywhere-commons/src/core/component_types.cpp | head -60

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 2374
🏁 Script executed:
# Also check if there are any default cases or what happens at the end
cat -n sdk/runanywhere-commons/src/core/component_types.cpp

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 4990
Update rac_sdk_component_display_name, rac_sdk_component_raw_value, and rac_component_to_resource_type to handle RAC_COMPONENT_MEMORY.
The enum value was added but the switch statements in component_types.cpp are missing the new case. Currently:
- rac_sdk_component_display_name returns "Unknown" for memory
- rac_sdk_component_raw_value returns "unknown" for memory
- rac_component_to_resource_type returns -1 for memory
Add cases for RAC_COMPONENT_MEMORY in all three functions to match the pattern of existing components.
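The review names the three functions but doesn't show the fix. Below is a minimal sketch of the two string-returning cases; the function signatures are assumed rather than copied from component_types.cpp, the "Memory"/"memory" strings come from the review text above, and the resource-type mapping is omitted because its constant isn't visible in this PR.

```cpp
#include <cassert>
#include <cstring>

// Illustrative stand-in for the real enum in rac_component_types.h;
// only the value under discussion is reproduced here.
enum rac_sdk_component_t { RAC_COMPONENT_MEMORY = 6 };

// Sketch of the missing switch cases (signatures assumed).
const char* rac_sdk_component_display_name(rac_sdk_component_t c) {
    switch (c) {
        case RAC_COMPONENT_MEMORY: return "Memory";
        default: return "Unknown";
    }
}

const char* rac_sdk_component_raw_value(rac_sdk_component_t c) {
    switch (c) {
        case RAC_COMPONENT_MEMORY: return "memory";
        default: return "unknown";
    }
}
```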
🤖 Prompt for AI Agents
In `@sdk/runanywhere-commons/include/rac/core/rac_component_types.h` at line 38,
Add a switch case for RAC_COMPONENT_MEMORY in component_types.cpp inside
rac_sdk_component_display_name, rac_sdk_component_raw_value, and
rac_component_to_resource_type: return the human label "Memory" from
rac_sdk_component_display_name, the raw identifier "memory" from
rac_sdk_component_raw_value, and map to the same resource-type enum/constant
used for memory in other components (do not return -1)—use the existing
resource-type symbol (e.g., RESOURCE_TYPE_MEMORY or the project’s equivalent) to
keep the pattern consistent with other cases.
# Memory layer test executable
add_executable(test_memory test_memory.cpp)
target_link_libraries(test_memory PRIVATE rac_commons rac_backend_memory)
target_include_directories(test_memory PRIVATE ${PROJECT_SOURCE_DIR}/include)
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if RAC_BUILD_MEMORY option is defined elsewhere
echo "=== Searching for RAC_BUILD_MEMORY option ==="
rg -n 'RAC_BUILD_MEMORY' --type=cmake
echo ""
echo "=== Check existing RAC_BUILD_* patterns for reference ==="
rg -n 'RAC_BUILD_' --type=cmake -C 1 | head -80
echo ""
echo "=== Check how other test targets are defined ==="
fd -t f 'CMakeLists.txt' --exec grep -l 'add_executable' {} \; | head -10

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 5650
🏁 Script executed:
#!/bin/bash
# Check the actual tests/CMakeLists.txt content
echo "=== Contents of tests/CMakeLists.txt ==="
cat -n sdk/runanywhere-commons/tests/CMakeLists.txt
echo ""
echo "=== How tests directory is included in main CMakeLists.txt ==="
rg -n 'add_subdirectory.*tests' --type=cmake
echo ""
echo "=== Check if RAC_BUILD_TESTS guards the tests inclusion ==="
rg -n -B2 -A2 'add_subdirectory.*tests' sdk/runanywhere-commons/CMakeLists.txt --type=cmake
echo ""
echo "=== Check for add_test() calls in the codebase ==="
rg -n 'add_test\(' --type=cmake

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 715
🏁 Script executed:
#!/bin/bash
# Double-check: verify that rac_backend_memory target is created only when RAC_BACKEND_MEMORY is enabled
echo "=== How rac_backend_memory target is created ==="
rg -n -B2 -A5 'add_library.*rac_backend_memory' --type=cmake
echo ""
echo "=== Memory backend CMakeLists.txt structure ==="
if [ -f sdk/runanywhere-commons/src/backends/memory/CMakeLists.txt ]; then
cat sdk/runanywhere-commons/src/backends/memory/CMakeLists.txt
else
echo "File not found"
fi

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 2364
Guard with RAC_BACKEND_MEMORY to prevent build failures when the memory backend is disabled.
The test target links against rac_backend_memory, which is only created when RAC_BACKEND_MEMORY=ON. If the memory backend is disabled while tests are enabled, the build will fail with undefined reference errors. Wrap the test definition in a RAC_BACKEND_MEMORY guard to match the optional backend pattern used elsewhere in the project.
🔧 Proposed fix
-# Memory layer test executable
-add_executable(test_memory test_memory.cpp)
-target_link_libraries(test_memory PRIVATE rac_commons rac_backend_memory)
-target_include_directories(test_memory PRIVATE ${PROJECT_SOURCE_DIR}/include)
+# Memory layer test executable
+if(RAC_BACKEND_MEMORY)
+ add_executable(test_memory test_memory.cpp)
+ target_link_libraries(test_memory PRIVATE rac_commons rac_backend_memory)
+ target_include_directories(test_memory PRIVATE ${PROJECT_SOURCE_DIR}/include)
+endif()

📝 Committable suggestion
# Memory layer test executable
if(RAC_BACKEND_MEMORY)
    add_executable(test_memory test_memory.cpp)
    target_link_libraries(test_memory PRIVATE rac_commons rac_backend_memory)
    target_include_directories(test_memory PRIVATE ${PROJECT_SOURCE_DIR}/include)
endif()
🤖 Prompt for AI Agents
In `@sdk/runanywhere-commons/tests/CMakeLists.txt` around lines 1 - 4, Wrap the
test target creation for test_memory in an RAC_BACKEND_MEMORY guard so it is
only added when the memory backend is enabled: conditionally add the
add_executable(test_memory ...), target_link_libraries(test_memory PRIVATE
rac_commons rac_backend_memory) and target_include_directories(test_memory ...)
inside an if(RAC_BACKEND_MEMORY) ... endif() block to avoid linking against
rac_backend_memory when the backend is disabled.
public struct MemorySearchResult: Sendable {
    /// Vector ID
    public let id: UInt64

    /// Distance/similarity score (lower is closer for L2/cosine)
    public let score: Float

    /// Associated metadata (parsed from JSON)
    public let metadata: [String: Any]?

    public init(id: UInt64, score: Float, metadata: [String: Any]? = nil) {
        self.id = id
        self.score = score
        self.metadata = metadata
    }
}
Sendable conformance is invalid — [String: Any]? is not Sendable.
Both MemorySearchResult and MemoryRecallResult declare Sendable conformance but store metadata: [String: Any]?. Any does not conform to Sendable, so this will produce a compiler error under Swift 6 strict concurrency checking.
Consider using [String: any Sendable]? or a dedicated Sendable-conforming metadata type.
Proposed fix
public struct MemorySearchResult: Sendable {
public let id: UInt64
public let score: Float
- public let metadata: [String: Any]?
+ public let metadata: [String: any Sendable]?
- public init(id: UInt64, score: Float, metadata: [String: Any]? = nil) {
+    public init(id: UInt64, score: Float, metadata: [String: any Sendable]? = nil) {

Apply the same change to MemoryRecallResult.
As per coding guidelines, **/*.swift: "Use the latest Swift 6 APIs always."
Also applies to: 48-67
🤖 Prompt for AI Agents
In `@sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/MemoryTypes.swift`
around lines 30 - 45, MemorySearchResult and MemoryRecallResult incorrectly
declare Sendable while using metadata: [String: Any]?; change the metadata type
to a Sendable-safe form (e.g., metadata: [String: any Sendable]?) and update the
corresponding init parameters and stored property types in both structs
(MemorySearchResult and MemoryRecallResult) so the structs can legitimately
conform to Sendable under Swift 6.
private let bridge: CppBridge.Memory
private let embedder: any EmbeddingProvider
private let config: MemoryConfiguration
private var nextId: UInt64 = 1
nextId counter resets after load() or app restart — causes silent ID collisions.
nextId starts at 1 and is not persisted. After a save()/load() round-trip (or an app relaunch with load()), new remember() calls will produce IDs starting from 1 again. Since the flat backend treats duplicate IDs as upserts, this silently overwrites previously stored memories.
Consider either:
- Persisting nextId alongside the index (e.g., in the metadata sidecar or a dedicated field), or
- After load(), querying stats or scanning existing IDs to set nextId = maxExistingId + 1.
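The second option fits in a few lines. A hedged sketch: `next_id_after_load` and the ID list it scans are stand-ins for whatever ID enumeration the backend actually exposes, not SDK API.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical sketch: after load(), derive the next insertion ID from the
// IDs already in the index instead of restarting at 1, so new remember()
// calls cannot silently upsert over persisted entries.
uint64_t next_id_after_load(const std::vector<uint64_t>& loaded_ids) {
    if (loaded_ids.empty()) return 1;  // empty index: start fresh
    return *std::max_element(loaded_ids.begin(), loaded_ids.end()) + 1;
}
```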
🤖 Prompt for AI Agents
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/RAGMemoryService.swift`
at line 33, The nextId counter resets because nextId (declared in
RAGMemoryService) isn't persisted or recomputed on load, causing ID collisions
when remember() generates new entries after load() or restart; fix by persisting
nextId with the index (e.g., metadata sidecar or dedicated field) during save()
and restoring it in load(), or alternatively implement logic in load() to scan
existing memory IDs (or query backend stats) to compute nextId = maxExistingId +
1 before returning, ensuring remember() uses the updated nextId.
public func add(vectors: [[Float]], ids: [UInt64], metadata: [String?]? = nil) throws {
    let handle = try getHandle()
    let count = UInt32(vectors.count)
    guard count > 0 else { return }
Missing count validation between vectors, ids, and metadata.
The C API (rac_memory_add) reads count entries from all three buffers. If ids.count != vectors.count or metadata?.count is non-nil but differs, this causes out-of-bounds reads on the C side (undefined behavior). Additionally, if metadata is non-nil but empty while vectors is non-empty, the force unwrap at line 104 (metaBuf.baseAddress!) will crash.
Proposed validation
public func add(vectors: [[Float]], ids: [UInt64], metadata: [String?]? = nil) throws {
let handle = try getHandle()
let count = UInt32(vectors.count)
guard count > 0 else { return }
+
+ guard vectors.count == ids.count else {
+ throw SDKError.memory(.invalidInput,
+ "vectors.count (\(vectors.count)) != ids.count (\(ids.count))")
+ }
+ if let metadata = metadata {
+ guard metadata.count == vectors.count else {
+ throw SDKError.memory(.invalidInput,
+ "metadata.count (\(metadata.count)) != vectors.count (\(vectors.count))")
+ }
+ }📝 Committable suggestion
public func add(vectors: [[Float]], ids: [UInt64], metadata: [String?]? = nil) throws {
    let handle = try getHandle()
    let count = UInt32(vectors.count)
    guard count > 0 else { return }

    guard vectors.count == ids.count else {
        throw SDKError.memory(.invalidInput,
            "vectors.count (\(vectors.count)) != ids.count (\(ids.count))")
    }
    if let metadata = metadata {
        guard metadata.count == vectors.count else {
            throw SDKError.memory(.invalidInput,
                "metadata.count (\(metadata.count)) != vectors.count (\(vectors.count))")
        }
    }
🤖 Prompt for AI Agents
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge`+Memory.swift
around lines 69 - 73, In add(vectors: [[Float]], ids: [UInt64], metadata:
[String?]? = nil) validate that ids.count == vectors.count and if metadata !=
nil then metadata!.count == vectors.count (and not empty for non-zero vectors)
before calling rac_memory_add; if counts mismatch throw an error/return early.
Update the metadata handling so you do not force-unwrap metaBuf.baseAddress!
when metadata is non-nil — only create and pass a metadata buffer when
metadata.count == vectors.count, otherwise pass nil to the C call; reference the
add(...) method, the rac_memory_add invocation, and the metaBuf.baseAddress
usage when making these changes.
var metadataPointers: [UnsafePointer<CChar>?]?
var metadataBuffers: [Data] = []

if let metadata = metadata {
    var ptrs = [UnsafePointer<CChar>?]()
    for meta in metadata {
        if let m = meta {
            let data = Data(m.utf8 + [0])
            metadataBuffers.append(data)
            ptrs.append(nil) // Will set in withUnsafeBytes
        } else {
            ptrs.append(nil)
        }
    }
    metadataPointers = ptrs
}
🛠️ Refactor suggestion | 🟠 Major
Dead code: metadataPointers and metadataBuffers are never used.
These variables are populated but immediately shadowed by the strdup-based approach at lines 96-110. Remove them to avoid confusion.
Remove dead code
- // Prepare metadata C strings
- var metadataPointers: [UnsafePointer<CChar>?]?
- var metadataBuffers: [Data] = []
-
- if let metadata = metadata {
- var ptrs = [UnsafePointer<CChar>?]()
- for meta in metadata {
- if let m = meta {
- let data = Data(m.utf8 + [0])
- metadataBuffers.append(data)
- ptrs.append(nil) // Will set in withUnsafeBytes
- } else {
- ptrs.append(nil)
- }
- }
- metadataPointers = ptrs
- }
-
 let result: rac_result_t

📝 Committable suggestion
let result: rac_result_t
🤖 Prompt for AI Agents
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge`+Memory.swift
around lines 78 - 93, Remove the dead preallocation for metadataPointers and
metadataBuffers in CppBridge+Memory.swift: the variables metadataPointers and
metadataBuffers (and the block that builds ptrs/Data from metadata) are never
used because the code later uses strdup-based allocation; delete that entire
if-let metadata block so only the strdup approach remains (search for
metadataPointers, metadataBuffers, and the for meta in metadata loop to locate
the code to remove).
Let's close this right now
Summary
- remember() → embed + store
- recall() → embed + search
- RAGAgent → retrieve + augment prompt + LLM generate

Architecture
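The architecture pairs a vtable-style ops struct (so flat and HNSW backends are swappable behind one interface) with a flat backend that answers queries by exact brute-force scan. The sketch below illustrates both ideas together; every name in it is illustrative, not the actual rac_memory_* API.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// A search hit: vector ID plus its (squared L2) distance to the query.
struct SearchHit { uint64_t id; float score; };

// "Vtable" of backend operations; a dispatcher would pick one per backend.
struct BackendOps {
    void (*add)(void* self, const float* vec, uint64_t id);
    SearchHit (*search_one)(void* self, const float* query);
};

// Flat backend state: dense copies of every vector and its ID.
struct FlatBackend {
    size_t dim;
    std::vector<std::vector<float>> vectors;
    std::vector<uint64_t> ids;
};

static void flat_add(void* self, const float* vec, uint64_t id) {
    auto* b = static_cast<FlatBackend*>(self);
    b->vectors.emplace_back(vec, vec + b->dim);  // copy dim floats
    b->ids.push_back(id);
}

// Exact nearest neighbour by brute-force scan: fine for small indices
// (<10K vectors), which is why the PR pairs it with HNSW for larger ones.
static SearchHit flat_search_one(void* self, const float* query) {
    auto* b = static_cast<FlatBackend*>(self);
    SearchHit best{0, INFINITY};
    for (size_t i = 0; i < b->vectors.size(); ++i) {
        float d = 0.0f;
        for (size_t j = 0; j < b->dim; ++j) {
            const float diff = b->vectors[i][j] - query[j];
            d += diff * diff;
        }
        if (d < best.score) best = SearchHit{b->ids[i], d};
    }
    return best;
}

static const BackendOps kFlatOps{flat_add, flat_search_one};
```

An HNSW backend would supply its own `BackendOps` with the same shape, leaving callers unaware of which index structure answered the query.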
What's included
C++ core (~1,500 lines):
- rac_memory_types.h / rac_memory_service.h — types, vtable interface, public C API (create, add, search, remove, save, load)
- Flat backend — exact brute-force search, guarded by shared_mutex
- HNSW backend — hnswlib::HierarchicalNSW<float>, auto-resize, cosine/L2/IP metrics
- Persistence — .racm format with metadata sidecar
- rac_llm_llamacpp_get_embeddings() — extract embeddings from any GGUF model

Swift SDK (~900 lines):
- CppBridge.Memory actor — wraps all C FFI calls
- MemoryConfiguration, MemoryTypes, MemoryModule
- EmbeddingProvider protocol + LlamaCppEmbeddingProvider
- RAGMemoryService — remember/recall/forget with automatic embedding
- RAGAgent — full RAG: retrieve context from memory, augment prompt, generate with LLM
- TextChunker — configurable text splitting for document ingestion

Tests & example app:
Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
New Features
Documentation
Tests
Greptile Overview
Greptile Summary
rac_memory_*) and vtable-dispatched backends (Flat brute-force + HNSW via hnswlib), including persistence.
rac_llm_llamacpp_get_embeddings) for use in retrieval workflows.
RAGMemoryService, RAGAgent, TextChunker) and a CppBridge.Memory actor wrapping the C FFI.

Confidence Score: 2/5
memory_backend_hnswlib.cpp uses std::priority_queue without including <queue> (a compile failure), the Swift bridge lacks basic length validation before passing pointers into C (out-of-bounds risk), and the new embeddings API allocates with malloc while callers free via rac_free (allocator contract mismatch).

Important Files Changed
Sequence Diagram
sequenceDiagram
    participant App as Swift App
    participant RAG as RAGAgent (Swift actor)
    participant MemSvc as RAGMemoryService (Swift actor)
    participant Bridge as CppBridge.Memory (Swift actor)
    participant CAPI as rac_memory_* (C API)
    participant Core as rac_memory_service (vtable dispatch)
    participant Backend as Flat/HNSW backend
    participant LLM as rac_llm_llamacpp_get_embeddings
    App->>RAG: query(question,k)
    RAG->>MemSvc: recall(question,k)
    MemSvc->>LLM: embed(question)
    LLM-->>MemSvc: [Float] queryEmbedding
    MemSvc->>Bridge: search(queryEmbedding,k)
    Bridge->>CAPI: rac_memory_search(handle, query, k, &out)
    CAPI->>Core: ops->get_stats() (dimension)
    Core->>Backend: ops->search(query, dimension, k)
    Backend-->>CAPI: rac_memory_search_results_t
    CAPI-->>Bridge: results + metadata C strings
    Bridge-->>MemSvc: [MemorySearchResult]
    MemSvc-->>RAG: [MemoryRecallResult]
    RAG->>App: augmented prompt -> LLM generate
    App->>RAG: ingest(text)
    RAG->>MemSvc: remember(chunk, metadata)
    MemSvc->>LLM: embed(chunk)
    LLM-->>MemSvc: [Float] embedding
    MemSvc->>Bridge: add([embedding],[id],[metadataJson])
    Bridge->>CAPI: rac_memory_add(handle, vectors, ids, metadata, count)
    CAPI->>Core: ops->get_stats() (dimension)
    Core->>Backend: ops->add(vectors, ids, metadata, count, dimension)
    Backend-->>CAPI: RAC_SUCCESS
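The retrieve → augment step shown between MemSvc and RAG above is, at its core, plain string assembly: splice the recalled chunks into the prompt before handing it to the LLM. The sketch below illustrates that step only; the function name and prompt template are assumptions, not the SDK's actual prompt format.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch of prompt augmentation: take the chunks recall()
// returned for a question and build the context-stuffed prompt the LLM sees.
std::string augment_prompt(const std::string& question,
                           const std::vector<std::string>& recalled) {
    std::string context;
    for (const auto& chunk : recalled)
        context += "- " + chunk + "\n";  // one bullet per recalled memory
    return "Use the following context to answer.\n"
           "Context:\n" + context +
           "Question: " + question + "\n";
}
```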