
Add memory layer with vector similarity search and RAG pipeline#337

Closed
vedantagarwal-web wants to merge 2 commits into RunanywhereAI:main from
vedantagarwal-web:feature/memory-layer-rag

Conversation

Contributor

@vedantagarwal-web vedantagarwal-web commented Feb 6, 2026

Summary

  • Adds a complete memory/vector similarity search layer to the RunAnywhere C++ core and Swift SDK
  • Implements two backends: Flat (exact brute-force, ideal for <10K vectors) and HNSW (approximate nearest neighbor via hnswlib, scales to 100K+)
  • Provides a full RAG pipeline in Swift: remember() → embed + store, recall() → embed + search, RAGAgent → retrieve + augment prompt + LLM generate
  • All 10 C++ tests pass, including a 10K-vector HNSW benchmark at 0.10ms/query
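The Flat backend described above can be sketched as a plain linear scan: score every stored vector against the query with cosine similarity and keep the top-k. This is a minimal illustrative sketch, not the PR's actual `rac_memory_*` implementation; the `Hit` struct and `flat_search` name are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical sketch of what an exact "Flat" backend does conceptually:
// scan every stored vector, score against the query, keep the top-k.
struct Hit { uint64_t id; float score; };

static float cosine(const std::vector<float>& a, const std::vector<float>& b) {
    float dot = 0.f, na = 0.f, nb = 0.f;
    for (size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    return dot / (std::sqrt(na) * std::sqrt(nb) + 1e-12f);
}

std::vector<Hit> flat_search(const std::vector<std::vector<float>>& store,
                             const std::vector<uint64_t>& ids,
                             const std::vector<float>& query, size_t k) {
    std::vector<Hit> hits;
    for (size_t i = 0; i < store.size(); ++i)
        hits.push_back({ids[i], cosine(store[i], query)});
    // Partial sort is enough: only the first k results need to be ordered.
    std::partial_sort(hits.begin(), hits.begin() + std::min(k, hits.size()),
                      hits.end(),
                      [](const Hit& x, const Hit& y) { return x.score > y.score; });
    hits.resize(std::min(k, hits.size()));
    return hits;
}
```

An O(n·d) scan like this is exact and cache-friendly, which is why brute force stays competitive below roughly 10K vectors before an ANN index such as HNSW pays off.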

Architecture

Swift SDK: RAGAgent → RAGMemoryService → CppBridge.Memory (C FFI)
C++ Core:  rac_memory_* C API → vtable dispatch → Flat or HNSW backend
Storage:   .racm binary format (hnswlib native + JSON metadata sidecar)

What's included

C++ core (~1,500 lines):

  • rac_memory_types.h / rac_memory_service.h — types, vtable interface, public C API (create, add, search, remove, save, load)
  • Flat backend — contiguous vector buffer, linear scan, thread-safe with shared_mutex
  • HNSW backend — wraps hnswlib::HierarchicalNSW<float>, auto-resize, cosine/L2/IP metrics
  • Persistence in .racm format with metadata sidecar
  • rac_llm_llamacpp_get_embeddings() — extract embeddings from any GGUF model

Swift SDK (~900 lines):

  • CppBridge.Memory actor — wraps all C FFI calls
  • MemoryConfiguration, MemoryTypes, MemoryModule
  • EmbeddingProvider protocol + LlamaCppEmbeddingProvider
  • RAGMemoryService — remember/recall/forget with automatic embedding
  • RAGAgent — full RAG: retrieve context from memory, augment prompt, generate with LLM
  • TextChunker — configurable text splitting for document ingestion
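The TextChunker idea above (fixed-size windows with overlap so context survives chunk boundaries) can be sketched as follows. This is a C++ sketch under assumptions, not the SDK's Swift API; `chunk_text`, `chunkSize`, and `overlap` are illustrative names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch of overlap-based chunking for document ingestion:
// slide a window of chunkSize characters, advancing by chunkSize - overlap.
std::vector<std::string> chunk_text(const std::string& text,
                                    size_t chunkSize, size_t overlap) {
    std::vector<std::string> chunks;
    if (chunkSize == 0 || overlap >= chunkSize) return chunks;  // invalid config
    const size_t step = chunkSize - overlap;
    for (size_t pos = 0; pos < text.size(); pos += step) {
        chunks.push_back(text.substr(pos, chunkSize));
        if (pos + chunkSize >= text.size()) break;  // last window reached the end
    }
    return chunks;
}
```

A real chunker would typically split on token or sentence boundaries rather than raw characters; the overlap mechanics are the same.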

Tests & example app:

  • Standalone C++ test suite (10 tests covering both backends, persistence, deletion, 10K perf benchmark)
  • iOS example app Memory tab with visual test runner

Test plan

  • C++ build verification — all targets compile (rac_commons, rac_backend_memory, rac_backend_llamacpp)
  • Standalone C++ test — 10/10 tests pass (flat CRUD, HNSW CRUD, save/load, 10K benchmark)
  • iOS simulator build verification
  • Integration test with embedding model + LLM for full RAG pipeline

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Kotlin Android starter app with LLM Chat, Speech-to-Text, Text-to-Speech, and Voice Pipeline integration
    • Introduced vector memory and retrieval-augmented generation (RAG) capabilities with multiple index backends
    • Added text embedding extraction from LLM models
    • Implemented RAG agent for context-aware, memory-enhanced responses
    • Added text chunking utility for RAG ingestion
  • Documentation

    • Added comprehensive Kotlin starter app README with setup instructions and architecture overview
  • Tests

    • Added vector memory layer tests with various backend scenarios

Greptile Overview

Greptile Summary

  • Adds a new C++ memory/vector-search feature with a C API (rac_memory_*) and vtable-dispatched backends (Flat brute-force + HNSW via hnswlib), including persistence.
  • Extends the llama.cpp backend with an embedding-extraction API (rac_llm_llamacpp_get_embeddings) for use in retrieval workflows.
  • Introduces a Swift-side RAG pipeline (RAGMemoryService, RAGAgent, TextChunker) and a CppBridge.Memory actor wrapping the C FFI.
  • Registers the new vector-search capability/backend during Swift SDK initialization and adds a standalone C++ test target for the memory layer.

Confidence Score: 2/5

  • This PR is not safe to merge yet due to a definite C++ compile break and a couple of memory-safety/ABI contract issues.
  • Confidence is reduced because memory_backend_hnswlib.cpp uses std::priority_queue without including <queue> (compile failure), the Swift bridge lacks basic length validation before passing pointers into C (OOB risk), and the new embeddings API allocates with malloc while callers free via rac_free (allocator contract mismatch).
  • sdk/runanywhere-commons/src/backends/memory/memory_backend_hnswlib.cpp; sdk/runanywhere-commons/src/backends/llamacpp/rac_llm_llamacpp.cpp; sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+Memory.swift

Important Files Changed

  • sdk/runanywhere-commons/src/backends/memory/memory_backend_hnswlib.cpp – Adds HNSW vector search backend; compile break due to missing include for std::priority_queue, plus general correctness risk if stats/dim mismatches propagate.
  • sdk/runanywhere-commons/src/backends/llamacpp/rac_llm_llamacpp.cpp – Adds embeddings extraction API; output buffer allocated with malloc while API/Swift frees via rac_free, risking allocator mismatch if rac_free changes.
  • sdk/runanywhere-commons/src/features/memory/rac_memory_service.cpp – Adds C memory service dispatch layer; add/search obtain dimension via get_stats but ignore its return value, so dimension may be 0 and cause backend calls to fail.
  • sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+Memory.swift – Adds Swift memory bridge; missing length validation between vectors/ids/metadata can cause out-of-bounds reads when calling into C.
  • sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/EmbeddingProvider.swift – Adds embedding provider protocol and llama.cpp implementation; embed() returns outDim without validating it matches expected dimension, leading to runtime dimension mismatch errors downstream.

Sequence Diagram

sequenceDiagram
    participant App as Swift App
    participant RAG as RAGAgent (Swift actor)
    participant MemSvc as RAGMemoryService (Swift actor)
    participant Bridge as CppBridge.Memory (Swift actor)
    participant CAPI as rac_memory_* (C API)
    participant Core as rac_memory_service (vtable dispatch)
    participant Backend as Flat/HNSW backend
    participant LLM as rac_llm_llamacpp_get_embeddings

    App->>RAG: query(question,k)
    RAG->>MemSvc: recall(question,k)
    MemSvc->>LLM: embed(question)
    LLM-->>MemSvc: [Float] queryEmbedding
    MemSvc->>Bridge: search(queryEmbedding,k)
    Bridge->>CAPI: rac_memory_search(handle, query, k, &out)
    CAPI->>Core: ops->get_stats() (dimension)
    Core->>Backend: ops->search(query, dimension, k)
    Backend-->>CAPI: rac_memory_search_results_t
    CAPI-->>Bridge: results + metadata C strings
    Bridge-->>MemSvc: [MemorySearchResult]
    MemSvc-->>RAG: [MemoryRecallResult]
    RAG->>App: augmented prompt -> LLM generate

    App->>RAG: ingest(text)
    RAG->>MemSvc: remember(chunk, metadata)
    MemSvc->>LLM: embed(chunk)
    LLM-->>MemSvc: [Float] embedding
    MemSvc->>Bridge: add([embedding],[id],[metadataJson])
    Bridge->>CAPI: rac_memory_add(handle, vectors, ids, metadata, count)
    CAPI->>Core: ops->get_stats() (dimension)
    Core->>Backend: ops->add(vectors, ids, metadata, count, dimension)
    Backend-->>CAPI: RAC_SUCCESS

vedantagarwal-web and others added 2 commits February 4, 2026 01:56
Replace the full production RunAnywhere app with the clean
kotlin-starter-example that matches the blog tutorial series.
Uses io.github.sanchitmonga22 Maven Central SDK coordinates.
Add kotlin-starter-app entry to Playground README.
Implements a full memory/RAG layer using hnswlib for approximate nearest
neighbor search and a flat brute-force backend for small indices. The C++
core provides a vtable-based service abstraction, and the Swift SDK wraps
it with a high-level remember/recall/forget API and a RAGAgent that
composes memory retrieval with LLM generation.

C++ core:
- Flat backend (exact brute-force, <10K vectors)
- HNSW backend (hnswlib v0.8.0, approximate, scalable)
- Vtable dispatch, module/service registration
- Persistence (.racm format) with save/load
- Embedding extraction via llama.cpp
- 10/10 standalone tests passing

Swift SDK:
- CppBridge.Memory actor (C FFI bridge)
- RAGMemoryService (remember/recall/forget)
- RAGAgent (full RAG: retrieve + augment + generate)
- EmbeddingProvider protocol + LlamaCpp implementation
- MemoryConfiguration, MemoryModule, MemoryTypes

Performance: 0.10ms/query at 10K vectors (128-dim, HNSW)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot commented Feb 6, 2026

📝 Walkthrough

Walkthrough

This PR introduces a comprehensive memory/vector search subsystem across the RunAnywhere SDK, adds a new Kotlin Android starter app demonstrating LLM chat, speech-to-text, text-to-speech, and voice pipeline capabilities, and extends iOS test infrastructure with memory testing components.

Changes

Cohort / File(s) Summary
Kotlin Starter App Documentation
Playground/README.md, Playground/kotlin-starter-app/README.md
Added documentation for new Kotlin starter app including features, requirements, and SDK integration guide.
Kotlin Starter App Build & Configuration
Playground/kotlin-starter-app/build.gradle.kts, Playground/kotlin-starter-app/app/build.gradle.kts, Playground/kotlin-starter-app/gradle/..., Playground/kotlin-starter-app/settings.gradle.kts, Playground/kotlin-starter-app/app/proguard-rules.pro, Playground/kotlin-starter-app/.gitignore, Playground/kotlin-starter-app/app/.gitignore
Gradle build configuration, wrapper scripts, version catalog, ProGuard rules, and ignore files for the Kotlin starter app.
Kotlin Starter App Manifest & Resources
Playground/kotlin-starter-app/app/src/main/AndroidManifest.xml, Playground/kotlin-starter-app/app/src/main/res/...
Android manifest with permissions (INTERNET, RECORD_AUDIO, MODIFY_AUDIO_SETTINGS), app theme configuration, drawable resources for launcher icons, and string/color resource definitions.
Kotlin Starter App Core & Services
Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/MainActivity.kt, Playground/kotlin-starter-app/app/src/main/java/.../services/ModelService.kt
MainActivity entry point with SDK initialization, backend registration, and Compose navigation setup; ModelService ViewModel managing LLM, STT, and TTS model lifecycle with download/load/unload operations.
Kotlin Starter App UI Screens
Playground/kotlin-starter-app/app/src/main/java/.../ui/screens/ChatScreen.kt, .../HomeScreen.kt, .../SpeechToTextScreen.kt, .../TextToSpeechScreen.kt, .../VoicePipelineScreen.kt
Jetpack Compose screens for home navigation, LLM chat, speech-to-text transcription, text-to-speech synthesis, and end-to-end voice pipeline integrating STT→LLM→TTS.
Kotlin Starter App UI Components & Theme
Playground/kotlin-starter-app/app/src/main/java/.../ui/components/FeatureCard.kt, .../ModelLoaderWidget.kt, .../ui/theme/Theme.kt, .../ui/theme/Type.kt
Reusable Compose UI components (FeatureCard, ModelLoaderWidget) and theme/typography definitions for Material 3 design.
Kotlin Starter App Tests
Playground/kotlin-starter-app/app/src/androidTest/java/..., Playground/kotlin-starter-app/app/src/test/java/...
Example instrumented and unit test files for app validation.
Memory Backend C/C++ API Headers
sdk/runanywhere-commons/include/rac/backends/rac_backend_memory.h, .../rac/features/memory/rac_memory.h, .../rac_memory_service.h, .../rac_memory_types.h, sdk/runanywhere-commons/include/rac/core/rac_component_types.h, .../rac_error.h, .../rac_types.h
Public C API headers defining memory backend registration, service operations (create, add, search, remove, save, load, destroy), configuration types (distance metrics, index types, HNSW parameters), and error codes.
Memory Backend C/C++ Implementation
sdk/runanywhere-commons/src/backends/memory/memory_backend_flat.cpp, .../memory_backend_flat.h, .../memory_backend_hnswlib.cpp, .../memory_backend_hnswlib.h, .../rac_backend_memory_register.cpp, .../rac_memory_backend.cpp
Flat (brute-force) and HNSW-based vector search implementations with persistence, metadata storage, thread-safe operations, and backend dispatcher logic.
Memory Backend Integration
sdk/runanywhere-commons/src/features/memory/rac_memory_service.cpp, sdk/runanywhere-commons/CMakeLists.txt
Memory service dispatcher exposing unified C API for backend-agnostic index operations; CMake configuration enabling memory backend build with platform-specific optimizations.
Memory Backend Tests
sdk/runanywhere-commons/tests/test_memory.cpp, .../CMakeLists.txt
Comprehensive test suite for flat and HNSW backends covering lifecycle, operations, persistence, and performance scenarios.
LLaMA Embeddings API
sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp, .../llamacpp_backend.h, .../rac_llm_llamacpp.cpp, sdk/runanywhere-commons/include/rac/backends/rac_llm_llamacpp.h
New embedding extraction functionality in LlamaCPP backend via get_embeddings() and public C API rac_llm_llamacpp_get_embeddings().
Swift SDK Memory Layer - Core
sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/MemoryConfiguration.swift, .../MemoryTypes.swift, .../MemoryModule.swift
Swift types and configuration for memory subsystem: distance metrics, index types, search/recall/stats result structures, and module registration.
Swift SDK Memory Layer - High-Level APIs
sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/EmbeddingProvider.swift, .../RAGMemoryService.swift, .../RAGAgent.swift, .../TextChunker.swift
Public actors and protocols for text embeddings, RAG-driven memory operations (remember/recall/forget), retrieval-augmented generation with LLM integration, and text chunking utilities.
Swift SDK Memory Bridge
sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+Memory.swift
Actor-based bridge exposing C memory backend operations (create, add, search, remove, save/load, stats, destroy) with error handling and lifecycle management.
Swift SDK Initialization & Error Handling
sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/CppBridge.swift, .../Errors/ErrorCategory.swift, .../Errors/SDKError.swift, sdk/runanywhere-swift/Sources/RunAnywhere/CRACommons/include/CRACommons.h
Backend memory registration during SDK initialization, new memory error category, factory method for memory errors, and inclusion of memory headers in public umbrella header.
Swift SDK C Headers (Memory)
sdk/runanywhere-swift/Sources/RunAnywhere/CRACommons/include/rac_backend_memory.h, .../rac_memory_service.h, .../rac_memory_types.h, .../rac_memory.h, .../rac_llm_llamacpp.h
Mirrored C header definitions for memory backend, service, and types in Swift SDK; embedding function addition to LlamaCPP header.
iOS Memory Testing
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Memory/MemoryTestView.swift, .../MemoryTestViewModel.swift, .../App/ContentView.swift
New MemoryTestView and viewmodel for comprehensive memory backend testing (flat/HNSW lifecycle, add, search, remove, save/load, HNSW-specific scenarios) with UI for test execution and results display; ContentView updated with Memory test tab.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • Both this PR and #239 introduce substantial memory/vector-search subsystem changes affecting Swift SDK memory management, RAG APIs, and high-level service integration across overlapping files and modules.
  • Both this PR and #46 add memory backend registration steps (rac_backend_memory_register()) to the core Swift SDK initialization flow in CppBridge.swift.
  • Both this PR and #203 implement Android starter apps with speech-to-text and text-to-speech screens, model-loading ViewModels, and Compose navigation patterns.

Suggested labels

kotlin-sdk, memory-backend, swift-sdk, vector-search

Suggested reviewers

  • sanchitmonga22
  • shubhammalhotra28

🐰 A flutter of joy, a burst of code,
Memory and voices down the road,
LLMs chat, embeddings align,
RAG agents learn, so fine!
Kotlin hops with screens so bright, 🚀

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage – ⚠️ Warning: Docstring coverage is 15.32%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Title check – ✅ Passed: The title 'Add memory layer with vector similarity search and RAG pipeline' clearly summarizes the main change in the PR, which is the addition of a complete memory/vector similarity search layer with RAG capabilities.
  • Description check – ✅ Passed: The PR description provides comprehensive details about the changes including architecture, components, testing status, and implementation details. It covers most required sections from the template, though some testing checkboxes remain unchecked.



@greptile-apps greptile-apps bot left a comment


5 files reviewed, 5 comments

Edit Code Review Agent Settings | Greptile

Comment on lines +12 to +22
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>
#include <vector>

Missing header include

std::priority_queue is used in this file but <queue> is never included, which will fail compilation on standard toolchains.

Suggested change
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>
#include <vector>
#include <queue>
Path: sdk/runanywhere-commons/src/backends/memory/memory_backend_hnswlib.cpp, lines 12-22

Comment on lines +282 to +290
// Allocate output buffer (caller frees with rac_free)
float* result = static_cast<float*>(malloc(embeddings.size() * sizeof(float)));
if (!result) {
return RAC_ERROR_OUT_OF_MEMORY;
}

memcpy(result, embeddings.data(), embeddings.size() * sizeof(float));
*out_embeddings = result;
*out_dimension = static_cast<uint32_t>(embeddings.size());

Allocator mismatch for embeddings

rac_llm_llamacpp_get_embeddings allocates the output buffer with malloc, but the API contract (and Swift caller) frees it via rac_free. If rac_free is ever swapped to a different allocator (as intended by the wrapper), this becomes a real crash/corruption risk. Allocate with rac_alloc (or at minimum ensure the free function matches the allocator used).
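The matched-allocator contract the comment asks for can be sketched like this. `rac_alloc` and `rac_free` here are hypothetical stand-ins for the SDK's wrappers; the point is that a buffer handed across the API boundary must come from the same allocator pair that the caller's free function wraps, so swapping the underlying allocator later stays safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical allocator pair: whatever rac_alloc wraps, rac_free must wrap too.
static void* rac_alloc(size_t n) { return std::malloc(n); }
static void  rac_free(void* p)   { std::free(p); }

// Sketch of an embeddings getter that honors the contract: allocate with
// rac_alloc so the documented "caller frees with rac_free" stays valid even
// if the wrappers are later rebased onto a different allocator.
int get_embeddings(const float* src, size_t n, float** out) {
    float* buf = static_cast<float*>(rac_alloc(n * sizeof(float)));
    if (!buf) return -1;                        // e.g. RAC_ERROR_OUT_OF_MEMORY
    std::memcpy(buf, src, n * sizeof(float));
    *out = buf;                                 // caller releases via rac_free
    return 0;
}
```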

Path: sdk/runanywhere-commons/src/backends/llamacpp/rac_llm_llamacpp.cpp, lines 282-290

Comment on lines +72 to +78
// Get dimension from stats
rac_memory_stats_t stats = {};
if (service->ops->get_stats) {
service->ops->get_stats(service->impl, &stats);
}

return service->ops->add(service->impl, vectors, ids, metadata, count, stats.dimension);

Dimension can be zero

rac_memory_add (and similarly rac_memory_search) derives dimension from get_stats but ignores the return value; if get_stats fails or isn't implemented correctly, stats.dimension stays 0 and every backend will return RAC_ERROR_MEMORY_DIMENSION_MISMATCH even though the handle is valid. This makes the public API brittle (operations can start failing after an internal stats error). At minimum, check get_stats’s result and treat failure as an error instead of passing dimension=0 into the backend.
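A minimal sketch of the suggested fix, with stand-in types (the real `rac_memory_stats_t` and vtable live in the PR): propagate a `get_stats` failure instead of silently passing `dimension == 0` to the backend.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for rac_memory_stats_t.
struct Stats { uint32_t dimension; };

int get_stats_ok(Stats* s)   { s->dimension = 128; return 0; }
int get_stats_fail(Stats* s) { (void)s; return -1; }

// Returns the dimension to use, or 0 on error; a real dispatcher would map
// the 0 case to an explicit error code instead of calling the backend with
// dimension == 0 and letting it fail with a misleading mismatch error.
uint32_t resolve_dimension(int (*get_stats)(Stats*)) {
    Stats stats = {};
    if (get_stats(&stats) != 0 || stats.dimension == 0)
        return 0;  // surface the stats failure, don't forward dimension=0
    return stats.dimension;
}
```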

Path: sdk/runanywhere-commons/src/features/memory/rac_memory_service.cpp, lines 72-78

Comment on lines +60 to +67
guard result == RAC_SUCCESS, let emb = embeddings else {
throw SDKError.memory(.processingFailed,
"Failed to extract embeddings: \(result)")
}

defer { rac_free(emb) }

return Array(UnsafeBufferPointer(start: emb, count: Int(outDim)))

Embedding dimension not validated

LlamaCppEmbeddingProvider stores an expected dimension, but embed() returns outDim without checking it matches _dimension. If the loaded model’s embedding size differs (or changes after reload), the Swift memory bridge will try to add/search with mismatched dimensions and fail at runtime with RAC_ERROR_MEMORY_DIMENSION_MISMATCH. Validate outDim == _dimension and surface a clear error when it doesn’t match.
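The validation the comment asks for is a one-line guard; sketched here in C++ as a stand-in for the Swift code, with a hypothetical `validate_embedding` helper.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Reject an embedding whose returned dimension differs from the expected
// index dimension, so the mismatch surfaces at the provider boundary with a
// clear message rather than deep inside the memory backend.
std::vector<float> validate_embedding(std::vector<float> emb, size_t expected) {
    if (emb.size() != expected)
        throw std::runtime_error("embedding dimension mismatch");
    return emb;
}
```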

Path: sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/EmbeddingProvider.swift, lines 60-67

Comment on lines +68 to +76
/// Add vectors with IDs and optional metadata
public func add(vectors: [[Float]], ids: [UInt64], metadata: [String?]? = nil) throws {
let handle = try getHandle()
let count = UInt32(vectors.count)
guard count > 0 else { return }

// Flatten vectors into contiguous array
let flat = vectors.flatMap { $0 }


Vectors/IDs length mismatch

add(vectors:ids:metadata:) computes count from vectors.count but never verifies ids.count == vectors.count (and same for metadata?.count). If these arrays differ, the bridge will read past the end of ids/metadata buffers when calling rac_memory_add, causing memory-safety issues (crash/corruption). Add explicit count checks before flattening/calling into C.
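The precondition the comment describes can be sketched as a single check before any flattening or FFI call (C++ stand-in for the Swift bridge; `lengths_match` is a hypothetical helper): every parallel array must have the same length, otherwise the C callee reads past the end of the shorter buffers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Verify the parallel arrays agree in length before handing raw pointers to C.
bool lengths_match(const std::vector<std::vector<float>>& vectors,
                   const std::vector<uint64_t>& ids,
                   const std::vector<std::string>* metadata /* optional */) {
    if (ids.size() != vectors.size()) return false;
    if (metadata && metadata->size() != vectors.size()) return false;
    return true;
}
```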

Path: sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+Memory.swift, lines 68-76


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 17

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

🤖 Fix all issues with AI agents
In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/services/ModelService.kt`:
- Around line 288-299: unloadAllModels currently wraps
RunAnywhere.unloadLLMModel, unloadSTTModel, and unloadTTSVoice in one try/catch
so a thrown exception stops subsequent unloads and skips refreshModelState;
change it so each unload call (RunAnywhere.unloadLLMModel(),
RunAnywhere.unloadSTTModel(), RunAnywhere.unloadTTSVoice()) is executed in its
own try/catch (or use runCatching) to log/collect individual errors without
aborting the rest, and ensure refreshModelState() is invoked after all attempts
(e.g., in a finally or after all runCatching blocks) so the UI reflects the
final state.

In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/SpeechToTextScreen.kt`:
- Around line 80-114: The background read Thread started in the recording method
isn’t tracked or joined, so stopRecording() sets isRecording=false and
immediately releases audioRecord and returns audioData while the thread may
still be in audioRecord?.read(...); fix by storing the Thread reference (e.g.,
readingThread) when starting it, then in stopRecording() after setting
isRecording = false call readingThread.join(<short timeout>) and handle
InterruptedException before stopping/releasing audioRecord and returning
audioData to ensure the reader thread has finished accessing audioRecord and
audioData.

In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/TextToSpeechScreen.kt`:
- Around line 85-111: The AudioTrack created with AudioTrack.Builder (assigned
to audioTrack) can leak if the coroutine is cancelled during the delay: ensure
the play/cleanup sequence is wrapped in a try/finally so stop() and release()
always run; specifically move audioTrack.write(...) and audioTrack.play() into a
try block and call audioTrack.stop() and audioTrack.release() in the finally
block, and apply the same change to the playWavAudio implementation in
VoicePipelineScreen.kt to guarantee cleanup on CancellationException.

In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/VoicePipelineScreen.kt`:
- Around line 326-335: When handling VoiceSessionEvent.TurnCompleted in the Flow
collector, don't call the suspending playWavAudio(audio) inline because it
blocks collection; instead launch playback in a separate coroutine (e.g.,
CoroutineScope(Dispatchers.IO).launch or viewModelScope.launch) so the collector
can continue receiving events. Set sessionState = VoiceSessionState.SPEAKING
before launching the coroutine, call playWavAudio(audio) inside that coroutine,
and after playback update sessionState = VoiceSessionState.LISTENING and
audioLevel = 0f inside the coroutine to restore UI state without blocking the
Flow collector. Ensure you reference the same symbols:
VoiceSessionEvent.TurnCompleted, playWavAudio, sessionState, and audioLevel.

In `@sdk/runanywhere-commons/include/rac/core/rac_component_types.h`:
- Line 38: Add a switch case for RAC_COMPONENT_MEMORY in component_types.cpp
inside rac_sdk_component_display_name, rac_sdk_component_raw_value, and
rac_component_to_resource_type: return the human label "Memory" from
rac_sdk_component_display_name, the raw identifier "memory" from
rac_sdk_component_raw_value, and map to the same resource-type enum/constant
used for memory in other components (do not return -1)—use the existing
resource-type symbol (e.g., RESOURCE_TYPE_MEMORY or the project’s equivalent) to
keep the pattern consistent with other cases.

In `@sdk/runanywhere-commons/include/rac/core/rac_error.h`:
- Around line 386-395: Add switch cases for the five new memory error macros
inside the rac_error_message() switch (the "OTHER ERRORS (-800 to -899)"
section): handle RAC_ERROR_MEMORY_INDEX_FULL,
RAC_ERROR_MEMORY_DIMENSION_MISMATCH, RAC_ERROR_MEMORY_INDEX_NOT_FOUND,
RAC_ERROR_MEMORY_INVALID_CONFIG, and RAC_ERROR_MEMORY_CORRUPT_INDEX by returning
descriptive string literals (e.g., "Memory index is full (max_elements
reached)", "Vector dimension does not match index dimension", "Memory index not
found at specified path", "Invalid memory index configuration", "Memory index
file is corrupt") so these codes no longer fall through to "Unknown error code".

In `@sdk/runanywhere-commons/src/backends/memory/CMakeLists.txt`:
- Around line 1-46: The CMake file unconditionally fetches hnswlib and builds
the memory backend which violates the RAC_BUILD_* gating guideline; add an
option named RAC_BUILD_MEMORY and early-return if it's OFF, then wrap the
FetchContent_Declare(hnswlib) and the creation of target rac_backend_memory (and
its
target_include_directories/target_link_libraries/target_compile_features/platform-specific
blocks) so they only run when RAC_BUILD_MEMORY is ON; ensure the option text
matches "Build the memory/vector search backend" and that all references to
hnswlib_SOURCE_DIR and target rac_backend_memory are only configured inside that
guarded block.

In `@sdk/runanywhere-commons/src/backends/memory/memory_backend_flat.cpp`:
- Around line 281-296: The save routine currently writes metadata with
fprintf("%llu\t%s\n") which breaks when metadata contains newlines; change the
format to a binary length-prefixed write: for each id in index->ids iterate the
metadata via index->metadata and write id (same type) then a uint32_t length
followed by the raw metadata bytes (use fwrite) instead of the tab/newline text
format, and update the corresponding load code that currently uses fgets (around
the fgets call referenced) to read the id, then read the uint32_t length and
then fread exactly that many bytes to reconstruct meta_it->second (unlike
line-based parsing). Ensure the length type and byte-ordering match the existing
binary header handling for vectors/IDs.
- Around line 316-346: In flat_load, several fread calls (the version,
index_type, dimension, metric, and num_vectors header fields, plus the bulk
reads into index->vectors.data() and index->ids.data()) are unchecked and can
leave the index in a corrupt state. Verify that each fread returns the
expected count, and on any short read fclose(f) and return
RAC_ERROR_MEMORY_CORRUPT_INDEX. Validate the sizeof(uint32_t)/sizeof(uint64_t)
reads for the header fields as well as the (num_vectors * dimension) floats
and num_vectors uint64_t values for the bulk reads, and only resize/populate
index->id_to_index and the index fields after all reads succeed.
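A hedged sketch of the fail-closed read pattern (the error values and load_header are illustrative; the real codes live in rac_error.h and the real loader reads more fields):

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <vector>

// Assumed error values for illustration only.
#define RAC_SUCCESS 0
#define RAC_ERROR_MEMORY_CORRUPT_INDEX (-852)

// Fail closed on any short read: close the file and report a corrupt index
// instead of continuing with uninitialized locals.
#define CHECK_READ(ptr, size, count, stream)                        \
    do {                                                            \
        if (fread((ptr), (size), (count), (stream)) != (count)) {   \
            fclose(stream);                                         \
            return RAC_ERROR_MEMORY_CORRUPT_INDEX;                  \
        }                                                           \
    } while (0)

static int load_header(FILE* f, uint32_t* dimension, uint64_t* num_vectors,
                       std::vector<float>* vectors) {
    CHECK_READ(dimension, sizeof(uint32_t), (size_t)1, f);
    CHECK_READ(num_vectors, sizeof(uint64_t), (size_t)1, f);
    // Only resize/populate after the header fields are known-good.
    vectors->resize((size_t)*num_vectors * *dimension);
    CHECK_READ(vectors->data(), sizeof(float), vectors->size(), f);
    return RAC_SUCCESS;
}
```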

In `@sdk/runanywhere-commons/src/backends/memory/memory_backend_hnswlib.cpp`:
- Around line 394-401: RAC_DISTANCE_COSINE is currently mapped to
hnswlib::L2Space (in rac_memory_hnsw_create and hnsw_load), which yields
incorrect results for non-normalized vectors. Map RAC_DISTANCE_COSINE to
hnswlib::InnerProductSpace instead: update rac_memory_hnsw_create and
hnsw_load to construct InnerProductSpace when config->metric ==
RAC_DISTANCE_COSINE, and modify hnsw_add and hnsw_search to L2-normalize
input vectors to unit length before passing them to hnswlib, so cosine
similarity is computed correctly via inner product.
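The normalization step can be sketched as follows (l2_normalize/inner_product are illustrative helpers, not hnswlib API; the identity they rely on is that the inner product of two unit-length vectors equals their cosine similarity):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// L2-normalize a vector to unit length. Once both stored and query vectors
// are normalized, their inner product equals their cosine similarity, which
// is what lets InnerProductSpace stand in for a true cosine metric.
static void l2_normalize(std::vector<float>& v) {
    float norm_sq = 0.0f;
    for (float x : v) norm_sq += x * x;
    if (norm_sq <= 0.0f) return;  // leave all-zero vectors untouched
    float inv = 1.0f / std::sqrt(norm_sq);
    for (float& x : v) x *= inv;
}

static float inner_product(const std::vector<float>& a, const std::vector<float>& b) {
    float dot = 0.0f;
    for (size_t i = 0; i < a.size(); ++i) dot += a[i] * b[i];
    return dot;
}
```

In the backend this normalization would run in hnsw_add before addPoint and in hnsw_search before the query, so stored and query vectors are consistently unit-length.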
- Around line 262-298: The header fields read after the magic (version,
index_type, dimension, metric, num_vectors, ef_search) are not validated, so
truncated files can leave those locals uninitialized and later crash when used
(e.g., creating hnswlib::L2Space(dimension) or HierarchicalNSW). Fix it by
checking the return value of each fread (or fread into a buffer and validate
total bytes read) and if any read returns fewer items than expected, fclose(f)
and return RAC_ERROR_MEMORY_CORRUPT_INDEX; ensure the checks cover version,
index_type, dimension, metric, num_vectors and ef_search before using them. Also
keep the existing index_type check (index_type != RAC_INDEX_HNSW) after these
validated reads.

In `@sdk/runanywhere-commons/src/features/memory/rac_memory_service.cpp`:
- Around line 128-169: In rac_memory_load, validate the return values of the
two fread calls that read version and index_type, returning
RAC_ERROR_MEMORY_CORRUPT_INDEX on failure to avoid using uninitialized
locals. After creating the service (rac_memory_backend_create_service),
check that service->ops is non-null and that service->ops->load is set
before calling it, returning an appropriate error (and cleaning up via
rac_memory_destroy(handle) if needed). Mirror the dispatch pattern used
elsewhere in this file: check service->ops and service->ops->load, then call
load(service->impl, path).
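The null-checked dispatch can be sketched with simplified stand-ins for the service and vtable types (memory_ops_t/memory_service_t and the error value here are illustrative, not the real rac_* definitions):

```cpp
#include <cassert>
#include <cstddef>

// Assumed error values for illustration only.
#define RAC_SUCCESS 0
#define RAC_ERROR_NOT_SUPPORTED (-3)

// Simplified stand-ins for the real service/vtable types.
typedef struct {
    int (*load)(void* impl, const char* path);
} memory_ops_t;

typedef struct {
    const memory_ops_t* ops;
    void* impl;
} memory_service_t;

// Dispatch only when both the vtable and the specific op are present, so a
// backend that doesn't implement load fails cleanly instead of crashing.
static int dispatch_load(memory_service_t* svc, const char* path) {
    if (!svc || !svc->ops || !svc->ops->load) {
        return RAC_ERROR_NOT_SUPPORTED;
    }
    return svc->ops->load(svc->impl, path);
}
```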

In `@sdk/runanywhere-commons/tests/CMakeLists.txt`:
- Around line 1-4: Wrap the test target creation for test_memory in an
RAC_BACKEND_MEMORY guard so it is only added when the memory backend is enabled:
conditionally add the add_executable(test_memory ...),
target_link_libraries(test_memory PRIVATE rac_commons rac_backend_memory) and
target_include_directories(test_memory ...) inside an if(RAC_BACKEND_MEMORY) ...
endif() block to avoid linking against rac_backend_memory when the backend is
disabled.

In `@sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/MemoryTypes.swift`:
- Around line 30-45: MemorySearchResult and MemoryRecallResult incorrectly
declare Sendable while using metadata: [String: Any]?; change the metadata type
to a Sendable-safe form (e.g., metadata: [String: any Sendable]?) and update the
corresponding init parameters and stored property types in both structs
(MemorySearchResult and MemoryRecallResult) so the structs can legitimately
conform to Sendable under Swift 6.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/RAGMemoryService.swift`:
- Line 33: The nextId counter resets because nextId (declared in
RAGMemoryService) isn't persisted or recomputed on load, causing ID collisions
when remember() generates new entries after load() or restart; fix by persisting
nextId with the index (e.g., metadata sidecar or dedicated field) during save()
and restoring it in load(), or alternatively implement logic in load() to scan
existing memory IDs (or query backend stats) to compute nextId = maxExistingId +
1 before returning, ensuring remember() uses the updated nextId.

In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+Memory.swift`:
- Around line 78-93: Remove the dead preallocation for metadataPointers and
metadataBuffers in CppBridge+Memory.swift: the variables metadataPointers and
metadataBuffers (and the block that builds ptrs/Data from metadata) are never
used because the code later uses strdup-based allocation; delete that entire
if-let metadata block so only the strdup approach remains (search for
metadataPointers, metadataBuffers, and the for meta in metadata loop to locate
the code to remove).
- Around line 69-73: In add(vectors: [[Float]], ids: [UInt64], metadata:
[String?]? = nil), validate that ids.count == vectors.count and, when
metadata is non-nil, that metadata.count == vectors.count before calling
rac_memory_add; on any mismatch, throw an error or return early. Also rework
the metadata handling so metaBuf.baseAddress is never force-unwrapped: only
create and pass a metadata buffer when metadata.count == vectors.count, and
pass nil to the C call otherwise. See the add(...) method, the
rac_memory_add invocation, and the metaBuf.baseAddress usage when making
these changes.
🟡 Minor comments (16)
Playground/kotlin-starter-app/.gitignore-82-84 (1)

82-84: ⚠️ Potential issue | 🟡 Minor

**/models/ is overly broad and may silently ignore source files.

Line 83 already covers the intended app/src/main/assets/models/ path for downloaded AI models. The **/models/ glob on line 84 would also match any Kotlin package directory named models/ (e.g., com/runanywhere/.../models/), causing source files to be silently untracked.

Consider removing the broad pattern and keeping only the specific path:

Proposed fix
 # RunAnywhere downloaded models
 app/src/main/assets/models/
-**/models/
Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/components/FeatureCard.kt-55-56 (1)

55-56: ⚠️ Potential issue | 🟡 Minor

Empty gradientColors list will crash at runtime.

Brush.linearGradient(gradientColors) throws IllegalArgumentException if fewer than 2 colors are provided. Consider adding a require(gradientColors.size >= 2) precondition or documenting the constraint.

Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/TextToSpeechScreen.kt-48-49 (1)

48-49: ⚠️ Potential issue | 🟡 Minor

audioFormat is parsed but never validated.

Line 49 reads the audio format code (1 = PCM) but the value is never checked. Non-PCM WAV files (e.g., compressed formats) would be silently misinterpreted as raw PCM, producing garbled audio. A guard like if (audioFormat != 1) return@withContext would make this defensive.

Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/SpeechToTextScreen.kt-94-98 (1)

94-98: ⚠️ Potential issue | 🟡 Minor

Swallowed SecurityException loses diagnostic context.

When AudioRecord creation throws a SecurityException, the function returns false but the exception message is silently discarded. This makes it difficult to diagnose permission-related failures on specific devices.

🔧 Proposed fix: log before returning
         } catch (e: SecurityException) {
+            android.util.Log.w("AudioRecorder", "Permission denied for audio recording", e)
             audioRecord?.release()
             audioRecord = null
             return false
         }
Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/HomeScreen.kt-64-109 (1)

64-109: ⚠️ Potential issue | 🟡 Minor

Nested scrollable: LazyVerticalGrid inside a verticalScroll Column is fragile.

LazyVerticalGrid is a scrollable container placed inside a verticalScroll Column (Line 48). This forces a fixed .height(700.dp) (Line 68), which defeats lazy loading and breaks on varying screen sizes (clipping on small screens, wasting space on large ones).

Since you only have 4 fixed items in a 2-column grid, replace LazyVerticalGrid with a simple Column + Row layout to avoid nested scrollable issues entirely.

💡 Suggested approach: replace with static grid layout
-            LazyVerticalGrid(
-                columns = GridCells.Fixed(2),
-                horizontalArrangement = Arrangement.spacedBy(16.dp),
-                verticalArrangement = Arrangement.spacedBy(16.dp),
-                modifier = Modifier.height(700.dp)
-            ) {
-                item {
-                    FeatureCard(
-                        title = "Chat",
-                        subtitle = "LLM Text Generation",
-                        icon = Icons.Rounded.Chat,
-                        gradientColors = listOf(AccentCyan, Color(0xFF0EA5E9)),
-                        onClick = onNavigateToChat
-                    )
-                }
-                
-                item {
-                    FeatureCard(
-                        title = "Speech",
-                        subtitle = "Speech to Text",
-                        icon = Icons.Rounded.Mic,
-                        gradientColors = listOf(AccentViolet, Color(0xFF7C3AED)),
-                        onClick = onNavigateToSTT
-                    )
-                }
-                
-                item {
-                    FeatureCard(
-                        title = "Voice",
-                        subtitle = "Text to Speech",
-                        icon = Icons.Rounded.VolumeUp,
-                        gradientColors = listOf(AccentPink, Color(0xFFDB2777)),
-                        onClick = onNavigateToTTS
-                    )
-                }
-                
-                item {
-                    FeatureCard(
-                        title = "Pipeline",
-                        subtitle = "Voice Agent",
-                        icon = Icons.Rounded.AutoAwesome,
-                        gradientColors = listOf(AccentGreen, Color(0xFF059669)),
-                        onClick = onNavigateToVoicePipeline
-                    )
-                }
-            }
+            Row(horizontalArrangement = Arrangement.spacedBy(16.dp)) {
+                FeatureCard(
+                    title = "Chat",
+                    subtitle = "LLM Text Generation",
+                    icon = Icons.Rounded.Chat,
+                    gradientColors = listOf(AccentCyan, Color(0xFF0EA5E9)),
+                    onClick = onNavigateToChat,
+                    modifier = Modifier.weight(1f)
+                )
+                FeatureCard(
+                    title = "Speech",
+                    subtitle = "Speech to Text",
+                    icon = Icons.Rounded.Mic,
+                    gradientColors = listOf(AccentViolet, Color(0xFF7C3AED)),
+                    onClick = onNavigateToSTT,
+                    modifier = Modifier.weight(1f)
+                )
+            }
+            Spacer(modifier = Modifier.height(16.dp))
+            Row(horizontalArrangement = Arrangement.spacedBy(16.dp)) {
+                FeatureCard(
+                    title = "Voice",
+                    subtitle = "Text to Speech",
+                    icon = Icons.Rounded.VolumeUp,
+                    gradientColors = listOf(AccentPink, Color(0xFFDB2777)),
+                    onClick = onNavigateToTTS,
+                    modifier = Modifier.weight(1f)
+                )
+                FeatureCard(
+                    title = "Pipeline",
+                    subtitle = "Voice Agent",
+                    icon = Icons.Rounded.AutoAwesome,
+                    gradientColors = listOf(AccentGreen, Color(0xFF059669)),
+                    onClick = onNavigateToVoicePipeline,
+                    modifier = Modifier.weight(1f)
+                )
+            }
Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/services/ModelService.kt-277-283 (1)

277-283: ⚠️ Potential issue | 🟡 Minor

downloadAndLoadAllModels silently clears errors from earlier model failures.

Each downloadAndLoad*Suspend() starts with errorMessage = null (e.g., lines 147, 193, and 239). When called sequentially here, a failure in the LLM step sets errorMessage, but the subsequent STT step immediately clears it. The user loses visibility into the first failure.

Consider collecting errors rather than overwriting, or short-circuiting on failure:

Option A: stop on first failure
 fun downloadAndLoadAllModels() {
     viewModelScope.launch {
-        if (!isLLMLoaded) downloadAndLoadLLMSuspend()
-        if (!isSTTLoaded) downloadAndLoadSTTSuspend()
-        if (!isTTSLoaded) downloadAndLoadTTSSuspend()
+        if (!isLLMLoaded) downloadAndLoadLLMSuspend()
+        if (errorMessage != null) return@launch
+        if (!isSTTLoaded) downloadAndLoadSTTSuspend()
+        if (errorMessage != null) return@launch
+        if (!isTTSLoaded) downloadAndLoadTTSSuspend()
     }
 }
sdk/runanywhere-commons/include/rac/core/rac_error.h-383-398 (1)

383-398: ⚠️ Potential issue | 🟡 Minor

Error codes look well-placed; update the top-level range summary to match.

The new -850 to -869 memory error sub-range is correctly carved out of the Other Errors block, but the range table at line 47 still reads Other errors: -800 to -899. Add a Memory entry to keep the summary consistent with the actual layout.

Suggested range table update (around line 47)
 //   - Event errors:             -700 to -799
-//   - Other errors:             -800 to -899
+//   - Other errors:             -800 to -849, -870 to -899
+//   - Memory/Vector Search:     -850 to -869
 //   - Reserved:                 -900 to -999
sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/TextChunker.swift-39-56 (1)

39-56: ⚠️ Potential issue | 🟡 Minor

Chunks can silently exceed maxCharacters in two scenarios.

  1. A single sentence longer than maxCharacters is added verbatim at line 41 with no character-level fallback (contrary to the doc at line 19).
  2. After overlap insertion (line 52), the new currentChunk is overlapTail + " " + sentence, which is never re-checked against maxCharacters, so the next append at line 46 emits an oversized chunk.

Both paths can produce chunks significantly larger than the requested limit, which may cause downstream embedding models to silently truncate or error.

Proposed fix sketch: re-check after overlap and add character-level fallback
             } else {
                 // Current chunk is full
                 chunks.append(currentChunk)

                 // Start new chunk with overlap from end of previous
                 if overlap > 0 && currentChunk.count > overlap {
                     let overlapStart = currentChunk.index(currentChunk.endIndex,
                                                           offsetBy: -overlap)
-                    currentChunk = String(currentChunk[overlapStart...]) + " " + sentence
+                    let overlapText = String(currentChunk[overlapStart...])
+                    if overlapText.count + 1 + sentence.count <= maxCharacters {
+                        currentChunk = overlapText + " " + sentence
+                    } else {
+                        currentChunk = sentence
+                    }
                 } else {
                     currentChunk = sentence
                 }
+
+                // If a single sentence exceeds maxCharacters, split by characters
+                while currentChunk.count > maxCharacters {
+                    let splitIdx = currentChunk.index(currentChunk.startIndex,
+                                                      offsetBy: maxCharacters)
+                    chunks.append(String(currentChunk[..<splitIdx]))
+                    currentChunk = String(currentChunk[splitIdx...])
+                }
             }
sdk/runanywhere-commons/include/rac/features/memory/rac_memory.h-11-12 (1)

11-12: ⚠️ Potential issue | 🟡 Minor

Use relative includes to match the Swift bridge implementation and avoid unnecessary path verbosity.

This header is in sdk/runanywhere-commons/include/rac/features/memory/ alongside rac_memory_service.h and rac_memory_types.h. Use relative includes instead of full paths:

`#include "rac_memory_service.h"`
`#include "rac_memory_types.h"`

This matches the pattern used in the Swift bridge version (sdk/runanywhere-swift/Sources/RunAnywhere/CRACommons/include/rac_memory.h) and follows the standard practice for umbrella headers in the same directory.

sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/EmbeddingProvider.swift-48-68 (1)

48-68: ⚠️ Potential issue | 🟡 Minor

Missing validation: returned embedding dimension vs. expected _dimension.

outDim from the C API may not match the configured _dimension. If the model produces embeddings of a different size, this silently returns a mismatched vector that will fail downstream at rac_memory_add with a cryptic DIMENSION_MISMATCH error.

Adding an early check here gives a much clearer error to the caller.

Proposed fix
         defer { rac_free(emb) }
 
+        guard outDim == _dimension else {
+            throw SDKError.memory(.processingFailed,
+                "Embedding dimension mismatch: expected \(_dimension), got \(outDim)")
+        }
+
         return Array(UnsafeBufferPointer(start: emb, count: Int(outDim)))
sdk/runanywhere-commons/tests/test_memory.cpp-494-509 (1)

494-509: ⚠️ Potential issue | 🟡 Minor

Benchmark silently ignores search failures.

Line 498 calls rac_memory_search without checking the return value. If the search silently fails (e.g., returns an error), the benchmark will still report 0 ms/query and PASS.

Proposed fix
     for (int q = 0; q < 100; q++) {
         rac_memory_search_results_t results = {};
-        rac_memory_search(handle, &vecs[q * D], 10, &results);
+        rac_result_t r = rac_memory_search(handle, &vecs[q * D], 10, &results);
+        if (r != RAC_SUCCESS) {
+            rac_memory_destroy(handle);
+            FAIL("search failed during benchmark");
+        }
         rac_memory_search_results_free(&results);
     }
sdk/runanywhere-commons/src/backends/memory/memory_backend_hnswlib.cpp-126-136 (1)

126-136: ⚠️ Potential issue | 🟡 Minor

setEf under shared_lock is a data race.

index->hnsw->setEf(...) on line 128 mutates the internal ef_ member of the hnswlib index, but the lock held is a shared_lock (read lock). Concurrent hnsw_search calls will race on this write. Although the value is always the same here, this is UB per the C++ memory model.

Move setEf to index creation and hnsw_load instead (where it's set once under exclusive lock), and remove it from the search path.

sdk/runanywhere-commons/src/backends/memory/memory_backend_hnswlib.cpp-283-298 (1)

283-298: ⚠️ Potential issue | 🟡 Minor

Fixed 64 KB stack buffer for metadata lines may silently truncate large metadata entries.

char line[65536] on line 285 means any metadata JSON longer than ~64 KB is truncated by fgets. This is unlikely for typical use but creates a silent data corruption path. Also, 64 KB on the stack is large for some embedded targets.

sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/RAGAgent.swift-110-125 (1)

110-125: ⚠️ Potential issue | 🟡 Minor

Potential resource leak: rac_llm_result_free not called on the error path.

If rac_llm_generate returns RAC_SUCCESS but result.text is nil (or if the C layer partially allocates on failure), rac_llm_result_free is never called. Calling it unconditionally after extracting the text is safer, since the free function should handle zero/nil members gracefully.

Proposed fix
         guard genResult == RAC_SUCCESS, let text = result.text else {
+            rac_llm_result_free(&result)
             throw SDKError.llm(.generationFailed,
                               "LLM generation failed: \(genResult)")
         }
 
         let answer = String(cString: text)
         rac_llm_result_free(&result)
sdk/runanywhere-swift/Sources/RunAnywhere/CRACommons/include/rac_memory_types.h-92-101 (1)

92-101: ⚠️ Potential issue | 🟡 Minor

Document score semantics for RAC_DISTANCE_INNER_PRODUCT.

Line 96 only documents L2/cosine behavior ("lower is closer"). For inner product, higher scores indicate greater similarity, and the polarity is reversed. Callers interpreting scores without knowing which metric is in use could sort or threshold incorrectly.

📝 Proposed documentation fix
     /** Vector ID */
     uint64_t id;

-    /** Distance/similarity score (lower is closer for L2/cosine) */
+    /** Distance/similarity score.
+     *  For L2/cosine: lower is more similar.
+     *  For inner product: higher is more similar. */
     float score;
sdk/runanywhere-swift/Sources/RunAnywhere/CRACommons/include/rac_memory_types.h-106-118 (1)

106-118: ⚠️ Potential issue | 🟡 Minor

Clarify sort order for inner product results.

Line 107 documents ascending sort for L2/cosine but doesn't specify the order for RAC_DISTANCE_INNER_PRODUCT, where "best" results have the highest score. Callers need to know whether results are still ascending (least similar first) or descending (most similar first) when using inner product.

📝 Proposed documentation fix
-    /** Array of results sorted by score (ascending for L2/cosine) */
+    /** Array of results sorted by relevance (best match first).
+     *  For L2/cosine: ascending score (lowest distance first).
+     *  For inner product: descending score (highest similarity first). */
     rac_memory_result_t* results;
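To make the documented contract concrete, a small sketch of a metric-aware sort (Metric and Result are illustrative stand-ins for rac_distance_metric_t and rac_memory_result_t):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative types; the real ones are rac_distance_metric_t and
// rac_memory_result_t.
enum Metric { METRIC_L2, METRIC_COSINE, METRIC_INNER_PRODUCT };

struct Result {
    uint64_t id;
    float score;
};

// Best match first: ascending score for distance-like metrics (L2/cosine),
// descending score for inner product, where higher means more similar.
static void sort_by_relevance(std::vector<Result>& r, Metric m) {
    if (m == METRIC_INNER_PRODUCT) {
        std::sort(r.begin(), r.end(),
                  [](const Result& a, const Result& b) { return a.score > b.score; });
    } else {
        std::sort(r.begin(), r.end(),
                  [](const Result& a, const Result& b) { return a.score < b.score; });
    }
}
```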
🧹 Nitpick comments (33)
Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/components/FeatureCard.kt (1)

30-34: Prefer the Card(onClick = ...) overload for built-in click semantics.

Using the non-clickable Card + Modifier.clickable works, but the Card(onClick = onClick) overload provides proper semantic role, state handling, and a consistent interaction indication out of the box.

♻️ Suggested refactor
     Card(
-        modifier = modifier
-            .fillMaxWidth()
-            .aspectRatio(0.85f)
-            .clickable(onClick = onClick),
+        onClick = onClick,
+        modifier = modifier
+            .fillMaxWidth()
+            .aspectRatio(0.85f),
         shape = RoundedCornerShape(20.dp),
Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/components/ModelLoaderWidget.kt (1)

76-102: Both progress indicators can be visible simultaneously if isDownloading and isLoading are both true.

If the caller ever sets both flags to true, two stacked progress bars will appear. Consider whether these states should be mutually exclusive or if the component should enforce precedence (e.g., show only the downloading indicator when both are true).

Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/TextToSpeechScreen.kt (1)

41-112: Extract playWavAudio to a shared utility — implementations have minor differences to reconcile.

This function exists in both TextToSpeechScreen.kt and VoicePipelineScreen.kt. While the core AudioTrack playback logic is identical, the implementations differ in WAV header parsing: TextToSpeechScreen uses buffer.position(20) without RIFF validation, whereas VoicePipelineScreen uses buffer.position(22) with RIFF magic byte validation and a raw PCM fallback. Consider extracting to a shared utility (e.g., AudioPlaybackUtils.kt) while deciding which parsing approach better suits your needs.

Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/VoicePipelineScreen.kt (2)

145-221: playWavAudio() is duplicated from TextToSpeechScreen.kt.

This function is nearly identical to the one in TextToSpeechScreen.kt (Lines 40-111 in that file). Extract it into a shared utility (e.g., a utils/AudioPlayer.kt) to avoid maintaining the same WAV-parsing and AudioTrack playback logic in two places.


346-353: Swallowing CancellationException breaks cooperative cancellation.

While the comment says "Expected when stopping," swallowing CancellationException in coroutine code prevents proper cancellation propagation. The idiomatic Kotlin coroutine pattern is to re-throw it.

♻️ Proposed fix
             } catch (e: CancellationException) {
-                // Expected when stopping
+                throw e // Maintain cooperative cancellation
             } catch (e: Exception) {
Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/ChatScreen.kt (1)

195-197: Redundant fully qualified call — chat is already imported.

Line 59 imports com.runanywhere.sdk.public.extensions.chat. The FQN on Line 196 can be simplified.

♻️ Suggested fix
-                                            val response = com.runanywhere.sdk.public.RunAnywhere.chat(userMessage)
+                                            val response = RunAnywhere.chat(userMessage)
Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/services/ModelService.kt (1)

139-180: Consider extracting a generic download-and-load helper to reduce duplication.

The LLM, STT, and TTS flows (lines 139–180, 185–226, 231–272) share identical structure. In a playground app this is understandable for readability, but the boilerplate still invites copy-paste divergence over time. A small generic helper parameterized by model ID, state setters, and the load function would keep it maintainable.

sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp (2)

574-587: Creating a temporary llama_context per embedding call is expensive — consider caching it.

Each call to get_embeddings allocates and tears down a full llama_context with embeddings enabled (Lines 575-587, 650-651). For RAG workflows where remember() is called in a loop (e.g., ingesting chunked documents), this repeated allocation/deallocation is a significant bottleneck — context creation involves memory allocation and potentially GPU resource setup.

Consider lazily creating a persistent embedding context (emb_ctx_) alongside the generation context, reusing it across calls, and tearing it down in unload_model_internal().

Sketch of caching approach
 // In the class header (llamacpp_backend.h), add:
+    llama_context* emb_ctx_ = nullptr;
+    void ensure_embedding_context();

 // In get_embeddings(), replace temp context creation with:
-    llama_context* emb_ctx = llama_init_from_model(model_, ctx_params);
-    if (!emb_ctx) { ... }
+    ensure_embedding_context();
+    if (!emb_ctx_) {
+        LOGE("Failed to create embedding context");
+        return {};
+    }

 // Remove per-call llama_free(emb_ctx) at cleanup;
 // Instead, free emb_ctx_ in unload_model_internal():
+    if (emb_ctx_) {
+        llama_free(emb_ctx_);
+        emb_ctx_ = nullptr;
+    }

604-618: Batch is allocated for full context size — could be sized to actual token count.

Line 605 allocates llama_batch_init(n_ctx, 0, 1) using the full context window. For embedding extraction, where inputs are typically short, sizing the batch to tokens.size() would reduce memory waste.

Minimal batch allocation
-    llama_batch batch = llama_batch_init(n_ctx, 0, 1);
+    llama_batch batch = llama_batch_init((int)tokens.size(), 0, 1);
sdk/runanywhere-commons/src/backends/memory/memory_backend_hnswlib.h (1)

1-7: Consider adding lifecycle/ownership documentation.

The Doxygen file header describes what the backend does but doesn't mention that the caller owns the out_handle from rac_memory_hnsw_create and must destroy it via the vtable's destroy op. As per coding guidelines, public C API headers must document vtable operations, error codes, and lifecycle requirements, so a brief note on the rac_memory_hnsw_create doc would be helpful.

sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/TextChunker.swift (2)

26-30: No validation on maxCharacters and overlap relationship.

If overlap >= maxCharacters or maxCharacters <= 0, the chunking loop can produce degenerate output (infinite growth or empty chunks). A precondition or early clamp would make this safer.

Proposed guard
     public static func chunk(
         _ text: String,
         maxCharacters: Int,
         overlap: Int = 50
     ) -> [String] {
+        precondition(maxCharacters > 0, "maxCharacters must be positive")
+        let overlap = min(overlap, maxCharacters / 2)
         guard !text.isEmpty else { return [] }
         guard text.count > maxCharacters else { return [text] }

68-90: Naive sentence splitting will mis-split on abbreviations and decimals.

Splitting on every . will break on "Dr. Smith", "3.14", "U.S.A.", URLs, etc. For a RAG chunker this can degrade retrieval quality. Consider using NLTokenizer with the .sentence unit from the NaturalLanguage framework, which handles these cases natively on Apple platforms.

sdk/runanywhere-commons/include/rac/backends/rac_llm_llamacpp.h (1)

184-200: Good API shape and lifecycle documentation.

The ownership contract (caller must free with rac_free) is clear. Per project guidelines, consider enumerating the specific error codes this function may return (NULL_POINTER, INVALID_HANDLE, INFERENCE_FAILED, OUT_OF_MEMORY based on the implementation) to complete the public API documentation.

As per coding guidelines: "Public C API headers in include/rac/ must document vtable operations, error codes, and lifecycle requirements."

sdk/runanywhere-commons/src/backends/memory/CMakeLists.txt (1)

35-38: Accelerate framework linked unconditionally despite "if available" comment.

On macOS/iOS this will typically succeed, but for cross-compilation or minimal toolchains it may fail. Consider using find_library to verify availability first.

Suggested conditional check
 if(CMAKE_SYSTEM_NAME STREQUAL "iOS" OR CMAKE_SYSTEM_NAME STREQUAL "Darwin")
-    # Use Accelerate framework for optimized BLAS operations if available
-    target_link_libraries(rac_backend_memory PUBLIC "-framework Accelerate")
+    find_library(ACCELERATE_FRAMEWORK Accelerate)
+    if(ACCELERATE_FRAMEWORK)
+        target_link_libraries(rac_backend_memory PUBLIC "-framework Accelerate")
+        message(STATUS "    Accelerate framework found")
+    endif()
 endif()
sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+Memory.swift (1)

143-154: Metadata assumed to be JSON, fails silently for plain strings.

If the stored metadata is a plain string rather than JSON, JSONSerialization fails and metadata becomes nil, silently dropping the value. If the contract is JSON-only, document it; otherwise consider falling back to a {"text": rawString} wrapper.

sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/CppBridge.swift (2)

61-61: NSLock usage violates Swift 6 concurrency guidelines.

This NSLock is pre-existing, but new code in this PR (lines 119–127) continues to rely on it. Consider migrating to a Swift 6 concurrency primitive (e.g., an os_unfair_lock-based wrapper, or restructuring CppBridge as an actor) in a follow-up. As per coding guidelines, **/*.swift: "Do not use NSLock as it is outdated. Use Swift 6 concurrency primitives instead."


191-191: Consider logging if memory backend unregistration fails unexpectedly.

rac_backend_memory_unregister() can return RAC_ERROR_MODULE_NOT_FOUND if something went wrong during init. The return value is silently discarded. This is fine for normal shutdown, but a debug-level log on unexpected failure would be consistent with the registration path (line 122).

🔧 Optional: log unexpected unregister failures
-        rac_backend_memory_unregister()
+        let unregResult = rac_backend_memory_unregister()
+        if unregResult != RAC_SUCCESS && unregResult != RAC_ERROR_MODULE_NOT_FOUND {
+            SDKLogger(category: "CppBridge").warning("Memory backend unregistration failed: \(unregResult)")
+        }
sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.h (1)

123-124: Temporary embedding context per call may be a performance bottleneck for batch ingestion.

The implementation (in llamacpp_backend.cpp) creates and destroys a llama_context with embeddings = true on every get_embeddings() invocation. For a RAG remember() flow ingesting many chunks, this context allocation/deallocation overhead will dominate. Consider lazily creating and caching a dedicated embedding context (similar to the existing context_ member for generation) and reusing it across calls.

💡 Sketch: cache the embedding context
 // In llamacpp_backend.h, add a member:
+    llama_context* emb_context_ = nullptr;

 // In get_embeddings(), reuse instead of recreating:
-    llama_context* emb_ctx = llama_init_from_model(model_, ctx_params);
+    if (!emb_context_) {
+        emb_context_ = llama_init_from_model(model_, ctx_params);
+    }
+    llama_context* emb_ctx = emb_context_;

 // Clean up in unload_model_internal() and destructor:
+    if (emb_context_) { llama_free(emb_context_); emb_context_ = nullptr; }
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Memory/MemoryTestView.swift (1)

11-12: Consider migrating to @Observable macro.

The @StateObject + ObservableObject pattern works but the newer @Observable macro (Observation framework, iOS 17+) with @State is the idiomatic Swift approach, giving better performance by tracking property-level access. As per coding guidelines, **/*.swift: Use the latest Swift 6 APIs always.

sdk/runanywhere-commons/include/rac/backends/rac_backend_memory.h (1)

19-36: Document specific error codes per coding guidelines.

The @return annotations say "RAC_SUCCESS or error code" without listing which error codes can be returned. From the implementation in rac_backend_memory_register.cpp, register may return RAC_ERROR_MODULE_ALREADY_REGISTERED (if called twice) or propagated registry errors, and unregister returns RAC_ERROR_MODULE_NOT_FOUND (if not registered). Listing these helps consumers handle failures correctly. Based on learnings: "Public C API headers in include/rac/ must document vtable operations, error codes, and lifecycle requirements."

📝 Proposed documentation improvement
 /**
  * Registers the Memory backend with the commons module and service registries.
  *
  * Should be called once during SDK initialization.
  * This registers:
  * - Module: "memory" with VECTOR_SEARCH capability
  * - Service provider: Memory vector search provider
  *
- * `@return` RAC_SUCCESS or error code
+ * `@return` RAC_SUCCESS on success, or one of:
+ *   - RAC_ERROR_MODULE_ALREADY_REGISTERED if called more than once
+ *   - Other registry error codes on internal failure
  */
 RAC_API rac_result_t rac_backend_memory_register(void);

 /**
  * Unregisters the Memory backend.
  *
- * `@return` RAC_SUCCESS or error code
+ * `@return` RAC_SUCCESS on success, or one of:
+ *   - RAC_ERROR_MODULE_NOT_FOUND if not currently registered
  */
 RAC_API rac_result_t rac_backend_memory_unregister(void);
sdk/runanywhere-commons/CMakeLists.txt (1)

279-283: Memory backend bypasses the RAC_BUILD_BACKENDS gate — intentional?

The other individual backend options (RAC_BACKEND_LLAMACPP, RAC_BACKEND_ONNX, RAC_BACKEND_WHISPERCPP) are all gated behind RAC_BUILD_BACKENDS (line 285). RAC_BACKEND_MEMORY is independent and ON by default. This means cmake -DRAC_BUILD_BACKENDS=OFF disables ML backends but still builds the memory backend, which may surprise users. If this is intentional (memory has no heavy external deps), the comment on line 279 is helpful but consider also documenting this in the options section.

📝 Suggested documentation enhancement in options block
 option(RAC_BUILD_BACKENDS "Build ML backends (LlamaCPP, ONNX, WhisperCPP)" OFF)
-option(RAC_BACKEND_MEMORY "Build Memory/Vector Search backend (hnswlib)" ON)
+option(RAC_BACKEND_MEMORY "Build Memory/Vector Search backend (hnswlib) - independent of RAC_BUILD_BACKENDS" ON)
sdk/runanywhere-commons/src/backends/memory/rac_memory_backend.cpp (2)

48-57: Defensive cleanup: destroy backend_handle if malloc fails but also if create partially succeeds.

Two concerns with resource management:

  1. Lines 48-50: If a future backend implementation sets *out_handle before returning an error, the handle leaks. Consider destroying the handle when it's non-null regardless of result.
  2. Line 54: malloc leaves memory uninitialized. If rac_memory_service_t gains additional fields, they'll contain garbage. Prefer calloc for zero-initialization.
🛡️ Proposed defensive improvements
     if (result != RAC_SUCCESS || !backend_handle) {
+        if (backend_handle && ops) {
+            ops->destroy(backend_handle);
+        }
         RAC_LOG_ERROR(LOG_CAT, "Failed to create backend: %d", result);
         return nullptr;
     }

     // Allocate service struct
-    auto* service = static_cast<rac_memory_service_t*>(malloc(sizeof(rac_memory_service_t)));
+    auto* service = static_cast<rac_memory_service_t*>(calloc(1, sizeof(rac_memory_service_t)));

25-28: No error code propagation to callers.

rac_memory_backend_create_service returns rac_handle_t (an opaque pointer), so callers only see nullptr on failure with no way to distinguish between a null config, unknown index type, backend creation failure, or OOM. The specific rac_result_t from the backend is logged but lost. If this function is part of the public C API surface, consider an out-parameter for the error code or a different return convention.
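One possible shape for the out-parameter convention, sketched with stand-in types and error values (the real rac_* definitions differ):

```cpp
#include <cstdint>
#include <new>

// Illustrative stand-ins for the real opaque handle and result types.
typedef void* rac_handle_t;
typedef int32_t rac_result_t;
enum { RAC_SUCCESS = 0, RAC_ERROR_INVALID_ARGUMENT = -1, RAC_ERROR_OUT_OF_MEMORY = -2 };

// Create returns the handle as before, but reports the failure reason
// through an optional out-parameter instead of collapsing every error
// to a bare nullptr.
static rac_handle_t create_service(const void* config, rac_result_t* out_error) {
    rac_result_t err = RAC_SUCCESS;
    rac_handle_t handle = nullptr;
    if (!config) {
        err = RAC_ERROR_INVALID_ARGUMENT;        // caller can now tell *why* it failed
    } else {
        handle = new (std::nothrow) int(42);     // stand-in for real backend creation
        if (!handle) err = RAC_ERROR_OUT_OF_MEMORY;
    }
    if (out_error) *out_error = err;             // out-parameter stays optional
    return handle;
}
```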

sdk/runanywhere-commons/tests/test_memory.cpp (1)

45-60: Resource leak on early FAIL/EXPECT_OK exits in tests.

The FAIL macro issues a bare return, and EXPECT_OK delegates to FAIL. When these trigger mid-test (e.g., Line 102 after rac_memory_create succeeds but before destroy), the rac_handle_t is leaked. This is tolerable for a short-lived test process, but it can mask bugs in the destroy path and makes valgrind/ASAN runs noisy.

A common pattern is a goto cleanup label or an RAII guard:

struct HandleGuard {
    rac_handle_t h = nullptr;
    ~HandleGuard() { if (h) rac_memory_destroy(h); }
};

Not blocking, just a suggestion to improve test robustness.

examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Memory/MemoryTestViewModel.swift (2)

150-182: Loose assertion in testSaveLoad — consider asserting exact vector count.

Line 167 checks stats.numVectors > 0, but at this point in the test sequence (after testRemove deleted id=5), the expected count is exactly 4. A tighter assertion would catch subtle persistence bugs (e.g., duplicates on load, dropped entries).

-        guard stats.numVectors > 0 else {
-            throw TestError("No vectors after load")
+        guard stats.numVectors == 4 else {
+            throw TestError("Expected 4 vectors after load, got \(stats.numVectors)")
         }

35-59: Tests depend on sequential execution order — state leaks between tests.

The test functions (e.g., testAddVectors, testSearchExact) rely on index state created/mutated by prior tests. If any middle test fails, subsequent tests will also fail with confusing errors (e.g., "Memory index not created"). This is acceptable for a visual test runner, but consider adding a note in the status message when a test fails mid-chain, or short-circuiting remaining tests.

sdk/runanywhere-commons/src/backends/memory/memory_backend_flat.cpp (1)

138-209: search_time_us includes lock acquisition time.

The timer at Line 148 starts before the shared_lock on Line 150. Under write contention this inflates the reported search latency. Consider moving start after lock acquisition if the metric is meant to reflect pure computation time.
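A sketch of the reordering, with a trivial stand-in workload in place of the linear scan (names are illustrative):

```cpp
#include <chrono>
#include <shared_mutex>

static std::shared_mutex index_mutex;

// Start the clock only after the shared lock is held, so the reported
// latency reflects computation rather than lock contention.
static long long timed_search_us() {
    std::shared_lock<std::shared_mutex> lock(index_mutex);  // may block under writers
    auto start = std::chrono::steady_clock::now();          // start *after* acquisition
    volatile long long sum = 0;                              // stand-in for the linear scan
    for (int i = 0; i < 1000; ++i) sum += i;
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
}
```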

sdk/runanywhere-commons/src/backends/memory/rac_backend_memory_register.cpp (1)

122-136: Unchecked return values from rac_service_unregister_provider / rac_module_unregister.

Lines 130–131 discard the results of the unregister calls. If either fails (e.g., the provider was already removed externally), state.registered is set to false anyway, which could leave the registry in an inconsistent state — one component unregistered and the other still present.

Not blocking, but consider logging warnings if either returns an unexpected error.

sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/RAGAgent.swift (1)

106-108: LLM generation options are hardcoded.

max_tokens and temperature are not configurable per query. Consider accepting an optional configuration parameter for callers who need different generation behavior.

sdk/runanywhere-commons/src/backends/memory/memory_backend_hnswlib.cpp (2)

174-193: Remove silently swallows all exceptions — consider logging.

Line 186 catches all exceptions with catch (...) when marking an element for deletion. An element-not-found scenario is expected, but other exceptions (corruption, etc.) are silently lost. A debug-level log would aid troubleshooting.
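A hedged sketch of narrowing the catch: `mark_deleted` stands in for the hnswlib call, and `fprintf` stands in for RAC_LOG_DEBUG.

```cpp
#include <cstdio>
#include <stdexcept>

static bool remove_logged = false;  // observable stand-in for the log sink

// Stand-in for hnswlib's markDelete: throws on unexpected internal state.
static void mark_deleted(unsigned long id) {
    if (id == 999) throw std::runtime_error("index corrupted");
    // a missing id is modeled as a silent no-op here
}

static void remove_vector(unsigned long id) {
    try {
        mark_deleted(id);
    } catch (const std::exception& e) {
        remove_logged = true;  // unexpected failure is no longer silent
        std::fprintf(stderr, "remove(%lu) failed: %s\n", id, e.what());
    } catch (...) {
        remove_logged = true;
        std::fprintf(stderr, "remove(%lu) failed: unknown exception\n", id);
    }
}
```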


195-246: Save mixes binary header with text metadata — verify metadata cannot contain raw newlines/tabs.

The format writes a binary header, then metadata as id\tmetadata\n lines. This relies on the metadata JSON strings never containing literal tab or newline bytes. Standard JSON serializers escape these, but there's no validation here. A corrupted or hand-crafted metadata string could break the load parser.
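A small validation helper along these lines (names illustrative) could reject unsafe metadata before the `id\tmetadata\n` write; escaping on save and unescaping on load is a stricter alternative:

```cpp
#include <string>

// The sidecar line format uses '\t' as the field separator and '\n' as the
// record terminator, so metadata must not contain either byte literally.
static bool metadata_is_line_safe(const std::string& meta) {
    return meta.find('\t') == std::string::npos &&
           meta.find('\n') == std::string::npos &&
           meta.find('\r') == std::string::npos;
}
```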

sdk/runanywhere-swift/Sources/RunAnywhere/CRACommons/include/rac_memory_service.h (1)

63-64: index_id ownership is undocumented.

const char* index_id in rac_memory_service_t is heap-allocated and freed via free() in rac_memory_destroy (see rac_memory_service.cpp line 194), but the header doesn't document this. Consider adding a brief ownership comment. Based on learnings, "Public C API headers in include/rac/ must document vtable operations, error codes, and lifecycle requirements."
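An ownership note along these lines (wording is only a suggestion) would make the contract explicit in the header:

```c
typedef struct rac_memory_service {
    /**
     * Unique index identifier.
     *
     * Ownership: heap-allocated copy owned by the service; allocated in
     * rac_memory_create() and released with free() in rac_memory_destroy().
     * Callers must not free this pointer or retain it past the service
     * lifetime.
     */
    const char* index_id;
    /* ... */
} rac_memory_service_t;
```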

sdk/runanywhere-commons/src/features/memory/rac_memory_service.cpp (1)

59-100: Dimension retrieved via get_stats on every add and search — unnecessary overhead.

dimension is immutable after index creation. Calling get_stats on every add/search acquires a shared_lock in the backend just to read the dimension. If get_stats is unset or fails, stats.dimension remains 0, silently passing a zero dimension to the backend.

Consider storing dimension directly in rac_memory_service_t at creation time.
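A sketch of the suggested caching, with illustrative field and struct names:

```cpp
#include <cstdint>

// Dimension is immutable after index creation, so capture it once at
// create time instead of calling get_stats (and taking its lock) on
// every add/search.
struct memory_service {
    uint32_t dimension;   // cached at create time; never changes afterwards
    void*    backend;     // opaque backend handle
};

static memory_service make_service(uint32_t dim) {
    return memory_service{dim, nullptr};
}

// Hot path: no lock, no get_stats round-trip, and a zero dimension can
// never be passed through silently.
static bool check_query_dim(const memory_service& svc, uint32_t query_dim) {
    return svc.dimension != 0 && query_dim == svc.dimension;
}
```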

Comment on lines +288 to +299
fun unloadAllModels() {
    viewModelScope.launch {
        try {
            RunAnywhere.unloadLLMModel()
            RunAnywhere.unloadSTTModel()
            RunAnywhere.unloadTTSVoice()
            refreshModelState()
        } catch (e: Exception) {
            errorMessage = "Failed to unload models: ${e.message}"
        }
    }
}

⚠️ Potential issue | 🟠 Major

unloadAllModels: if the first unload throws, remaining models stay loaded.

The three unload calls are wrapped in a single try/catch. If unloadLLMModel() throws, unloadSTTModel() and unloadTTSVoice() are never invoked, leaving those models resident in memory. refreshModelState() is also skipped, so the UI will not reflect the actual state. For an app managing large on-device models this can silently waste significant memory.

Proposed fix: unload each model independently
 fun unloadAllModels() {
     viewModelScope.launch {
-        try {
-            RunAnywhere.unloadLLMModel()
-            RunAnywhere.unloadSTTModel()
-            RunAnywhere.unloadTTSVoice()
-            refreshModelState()
-        } catch (e: Exception) {
-            errorMessage = "Failed to unload models: ${e.message}"
-        }
+        val errors = mutableListOf<String>()
+        try { RunAnywhere.unloadLLMModel() } catch (e: Exception) { errors += "LLM: ${e.message}" }
+        try { RunAnywhere.unloadSTTModel() } catch (e: Exception) { errors += "STT: ${e.message}" }
+        try { RunAnywhere.unloadTTSVoice() } catch (e: Exception) { errors += "TTS: ${e.message}" }
+        refreshModelState()
+        if (errors.isNotEmpty()) {
+            errorMessage = "Failed to unload: ${errors.joinToString("; ")}"
+        }
     }
 }
🤖 Prompt for AI Agents
In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/services/ModelService.kt`
around lines 288 - 299, unloadAllModels currently wraps
RunAnywhere.unloadLLMModel, unloadSTTModel, and unloadTTSVoice in one try/catch
so a thrown exception stops subsequent unloads and skips refreshModelState;
change it so each unload call (RunAnywhere.unloadLLMModel(),
RunAnywhere.unloadSTTModel(), RunAnywhere.unloadTTSVoice()) is executed in its
own try/catch (or use runCatching) to log/collect individual errors without
aborting the rest, and ensure refreshModelState() is invoked after all attempts
(e.g., in a finally or after all runCatching blocks) so the UI reflects the
final state.

Comment on lines +80 to +114
        // Start reading audio in a thread
        Thread {
            val buffer = ByteArray(bufferSize)
            while (isRecording) {
                val read = audioRecord?.read(buffer, 0, buffer.size) ?: 0
                if (read > 0) {
                    synchronized(audioData) {
                        audioData.write(buffer, 0, read)
                    }
                }
            }
        }.start()

        return true
    } catch (e: SecurityException) {
        audioRecord?.release()
        audioRecord = null
        return false
    }
}

fun stopRecording(): ByteArray {
    isRecording = false
    audioRecord?.let { record ->
        if (record.recordingState == AudioRecord.RECORDSTATE_RECORDING) {
            record.stop()
        }
        record.release()
    }
    audioRecord = null

    synchronized(audioData) {
        return audioData.toByteArray()
    }
}

⚠️ Potential issue | 🟠 Major

Reading thread is not joined in stopRecording(), risking incomplete data.

After setting isRecording = false, stopRecording() immediately grabs the audioData lock and returns. The background Thread (Line 81) may still be executing audioRecord?.read(...), which can block or crash because audioRecord is stopped/released on Lines 105-108 before the thread finishes. Consider keeping a reference to the thread and joining it (with a short timeout) before releasing the AudioRecord.

🔧 Proposed fix: join the reading thread before releasing
 private class AudioRecorder {
     private var audioRecord: AudioRecord? = null
     `@Volatile`
     private var isRecording = false
     private val audioData = ByteArrayOutputStream()
+    private var readingThread: Thread? = null
     
     ...
     
-            Thread {
+            readingThread = Thread {
                 val buffer = ByteArray(bufferSize)
                 while (isRecording) {
                     val read = audioRecord?.read(buffer, 0, buffer.size) ?: 0
                     if (read > 0) {
                         synchronized(audioData) {
                             audioData.write(buffer, 0, read)
                         }
                     }
                 }
-            }.start()
+            }
+            readingThread?.start()
     ...
     
     fun stopRecording(): ByteArray {
         isRecording = false
+        readingThread?.join(500) // Wait for the reading thread to exit
+        readingThread = null
         audioRecord?.let { record ->
🧰 Tools
🪛 detekt (1.23.8)

[warning] 94-94: The caught exception is swallowed. The original exception could be lost.

(detekt.exceptions.SwallowedException)

🤖 Prompt for AI Agents
In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/SpeechToTextScreen.kt`
around lines 80 - 114, The background read Thread started in the recording
method isn’t tracked or joined, so stopRecording() sets isRecording=false and
immediately releases audioRecord and returns audioData while the thread may
still be in audioRecord?.read(...); fix by storing the Thread reference (e.g.,
readingThread) when starting it, then in stopRecording() after setting
isRecording = false call readingThread.join(<short timeout>) and handle
InterruptedException before stopping/releasing audioRecord and returning
audioData to ensure the reader thread has finished accessing audioRecord and
audioData.

Comment on lines +85 to +111
val audioTrack = AudioTrack.Builder()
    .setAudioAttributes(
        AudioAttributes.Builder()
            .setUsage(AudioAttributes.USAGE_MEDIA)
            .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
            .build()
    )
    .setAudioFormat(
        AudioFormat.Builder()
            .setSampleRate(sampleRate)
            .setEncoding(audioFormatConfig)
            .setChannelMask(channelConfig)
            .build()
    )
    .setBufferSizeInBytes(maxOf(minBufferSize, pcmData.size))
    .setTransferMode(AudioTrack.MODE_STATIC)
    .build()

audioTrack.write(pcmData, 0, pcmData.size)
audioTrack.play()

// Wait for playback to complete
val durationMs = (pcmData.size.toLong() * 1000) / (sampleRate * numChannels * (bitsPerSample / 8))
delay(durationMs + 100) // Add small buffer

audioTrack.stop()
audioTrack.release()

⚠️ Potential issue | 🟠 Major

AudioTrack leaks if the coroutine is cancelled mid-playback.

If the user navigates away while audio is playing, the delay on Line 108 throws CancellationException, skipping stop() and release(). Wrap the AudioTrack lifecycle in try/finally.

🐛 Proposed fix
     val audioTrack = AudioTrack.Builder()
         .setAudioAttributes(
             AudioAttributes.Builder()
                 .setUsage(AudioAttributes.USAGE_MEDIA)
                 .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
                 .build()
         )
         .setAudioFormat(
             AudioFormat.Builder()
                 .setSampleRate(sampleRate)
                 .setEncoding(audioFormatConfig)
                 .setChannelMask(channelConfig)
                 .build()
         )
         .setBufferSizeInBytes(maxOf(minBufferSize, pcmData.size))
         .setTransferMode(AudioTrack.MODE_STATIC)
         .build()
 
-    audioTrack.write(pcmData, 0, pcmData.size)
-    audioTrack.play()
-    
-    // Wait for playback to complete
-    val durationMs = (pcmData.size.toLong() * 1000) / (sampleRate * numChannels * (bitsPerSample / 8))
-    delay(durationMs + 100) // Add small buffer
-    
-    audioTrack.stop()
-    audioTrack.release()
+    try {
+        audioTrack.write(pcmData, 0, pcmData.size)
+        audioTrack.play()
+
+        // Wait for playback to complete
+        val durationMs = (pcmData.size.toLong() * 1000) / (sampleRate * numChannels * (bitsPerSample / 8))
+        delay(durationMs + 100) // Add small buffer
+    } finally {
+        audioTrack.stop()
+        audioTrack.release()
+    }

Note: The same issue exists in VoicePipelineScreen.kt's playWavAudio (see relevant snippet, around line 213–220).

🤖 Prompt for AI Agents
In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/TextToSpeechScreen.kt`
around lines 85 - 111, The AudioTrack created with AudioTrack.Builder (assigned
to audioTrack) can leak if the coroutine is cancelled during the delay: ensure
the play/cleanup sequence is wrapped in a try/finally so stop() and release()
always run; specifically move audioTrack.write(...) and audioTrack.play() into a
try block and call audioTrack.stop() and audioTrack.release() in the finally
block, and apply the same change to the playWavAudio implementation in
VoicePipelineScreen.kt to guarantee cleanup on CancellationException.

Comment on lines +326 to +335
is VoiceSessionEvent.TurnCompleted -> {
    // Play the synthesized audio
    event.audio?.let { audio ->
        sessionState = VoiceSessionState.SPEAKING
        playWavAudio(audio)
    }
    // Resume listening state
    sessionState = VoiceSessionState.LISTENING
    audioLevel = 0f
}

⚠️ Potential issue | 🟠 Major

playWavAudio() blocks event collection during audio playback.

playWavAudio(audio) is a suspending call with an internal delay(durationMs + 100). Since it's called inline inside collect { ... }, it blocks the entire Flow collection for the duration of playback. Any events emitted by the SDK during that time (e.g., Listening, Error) will be buffered and delivered late, causing stale UI state.

Launch playback in a separate coroutine so collection continues unblocked.

🔧 Proposed fix
                         is VoiceSessionEvent.TurnCompleted -> {
                             // Play the synthesized audio
                             event.audio?.let { audio ->
                                 sessionState = VoiceSessionState.SPEAKING
-                                playWavAudio(audio)
+                                launch { playWavAudio(audio) }
                             }
                             // Resume listening state
                             sessionState = VoiceSessionState.LISTENING
                             audioLevel = 0f
                         }
🤖 Prompt for AI Agents
In
`@Playground/kotlin-starter-app/app/src/main/java/com/runanywhere/kotlin_starter_example/ui/screens/VoicePipelineScreen.kt`
around lines 326 - 335, When handling VoiceSessionEvent.TurnCompleted in the
Flow collector, don't call the suspending playWavAudio(audio) inline because it
blocks collection; instead launch playback in a separate coroutine (e.g.,
CoroutineScope(Dispatchers.IO).launch or viewModelScope.launch) so the collector
can continue receiving events. Set sessionState = VoiceSessionState.SPEAKING
before launching the coroutine, call playWavAudio(audio) inside that coroutine,
and after playback update sessionState = VoiceSessionState.LISTENING and
audioLevel = 0f inside the coroutine to restore UI state without blocking the
Flow collector. Ensure you reference the same symbols:
VoiceSessionEvent.TurnCompleted, playWavAudio, sessionState, and audioLevel.

RAC_COMPONENT_VAD = 3, /**< Voice Activity Detection */
RAC_COMPONENT_VOICE = 4, /**< Voice Agent */
RAC_COMPONENT_EMBEDDING = 5, /**< Embedding generation */
RAC_COMPONENT_MEMORY = 6, /**< Vector memory / RAG */

⚠️ Potential issue | 🟠 Major


Update rac_sdk_component_display_name, rac_sdk_component_raw_value, and rac_component_to_resource_type to handle RAC_COMPONENT_MEMORY.

The enum value was added but the switch statements in component_types.cpp are missing the new case. Currently:

  • rac_sdk_component_display_name returns "Unknown" for memory
  • rac_sdk_component_raw_value returns "unknown" for memory
  • rac_component_to_resource_type returns -1 for memory

Add cases for RAC_COMPONENT_MEMORY in all three functions to match the pattern of existing components.

🤖 Prompt for AI Agents
In `@sdk/runanywhere-commons/include/rac/core/rac_component_types.h` at line 38,
Add a switch case for RAC_COMPONENT_MEMORY in component_types.cpp inside
rac_sdk_component_display_name, rac_sdk_component_raw_value, and
rac_component_to_resource_type: return the human label "Memory" from
rac_sdk_component_display_name, the raw identifier "memory" from
rac_sdk_component_raw_value, and map to the same resource-type enum/constant
used for memory in other components (do not return -1)—use the existing
resource-type symbol (e.g., RESOURCE_TYPE_MEMORY or the project’s equivalent) to
keep the pattern consistent with other cases.

Comment on lines +1 to +4
# Memory layer test executable
add_executable(test_memory test_memory.cpp)
target_link_libraries(test_memory PRIVATE rac_commons rac_backend_memory)
target_include_directories(test_memory PRIVATE ${PROJECT_SOURCE_DIR}/include)

⚠️ Potential issue | 🟠 Major



Guard with RAC_BACKEND_MEMORY to prevent build failures when the memory backend is disabled.

The test target links against rac_backend_memory, which is only created when RAC_BACKEND_MEMORY=ON. If the memory backend is disabled while tests are enabled, the build will fail with undefined reference errors. Wrap the test definition in a RAC_BACKEND_MEMORY guard to match the optional backend pattern used elsewhere in the project.

🔧 Proposed fix
-# Memory layer test executable
-add_executable(test_memory test_memory.cpp)
-target_link_libraries(test_memory PRIVATE rac_commons rac_backend_memory)
-target_include_directories(test_memory PRIVATE ${PROJECT_SOURCE_DIR}/include)
+# Memory layer test executable
+if(RAC_BACKEND_MEMORY)
+  add_executable(test_memory test_memory.cpp)
+  target_link_libraries(test_memory PRIVATE rac_commons rac_backend_memory)
+  target_include_directories(test_memory PRIVATE ${PROJECT_SOURCE_DIR}/include)
+endif()
🤖 Prompt for AI Agents
In `@sdk/runanywhere-commons/tests/CMakeLists.txt` around lines 1 - 4, Wrap the
test target creation for test_memory in an RAC_BACKEND_MEMORY guard so it is
only added when the memory backend is enabled: conditionally add the
add_executable(test_memory ...), target_link_libraries(test_memory PRIVATE
rac_commons rac_backend_memory) and target_include_directories(test_memory ...)
inside an if(RAC_BACKEND_MEMORY) ... endif() block to avoid linking against
rac_backend_memory when the backend is disabled.

Comment on lines +30 to +45
public struct MemorySearchResult: Sendable {
    /// Vector ID
    public let id: UInt64

    /// Distance/similarity score (lower is closer for L2/cosine)
    public let score: Float

    /// Associated metadata (parsed from JSON)
    public let metadata: [String: Any]?

    public init(id: UInt64, score: Float, metadata: [String: Any]? = nil) {
        self.id = id
        self.score = score
        self.metadata = metadata
    }
}

⚠️ Potential issue | 🟠 Major

Sendable conformance is invalid — [String: Any]? is not Sendable.

Both MemorySearchResult and MemoryRecallResult declare Sendable conformance but store metadata: [String: Any]?. Any does not conform to Sendable, so this will produce a compiler error under Swift 6 strict concurrency checking.

Consider using [String: any Sendable]? or a dedicated Sendable-conforming metadata type.

Proposed fix
 public struct MemorySearchResult: Sendable {
     public let id: UInt64
     public let score: Float
-    public let metadata: [String: Any]?
+    public let metadata: [String: any Sendable]?
 
-    public init(id: UInt64, score: Float, metadata: [String: Any]? = nil) {
+    public init(id: UInt64, score: Float, metadata: [String: any Sendable]? = nil) {

Apply the same change to MemoryRecallResult.

As per coding guidelines, **/*.swift: "Use the latest Swift 6 APIs always."

Also applies to: 48-67

🤖 Prompt for AI Agents
In `@sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/MemoryTypes.swift`
around lines 30 - 45, MemorySearchResult and MemoryRecallResult incorrectly
declare Sendable while using metadata: [String: Any]?; change the metadata type
to a Sendable-safe form (e.g., metadata: [String: any Sendable]?) and update the
corresponding init parameters and stored property types in both structs
(MemorySearchResult and MemoryRecallResult) so the structs can legitimately
conform to Sendable under Swift 6.
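The dedicated metadata type mentioned above could be sketched as follows. This is only an illustration of the suggested fix; `MetadataValue` is a hypothetical name and shape, not something the PR defines:

```swift
// Sketch of a dedicated Sendable-conforming metadata type.
// `MetadataValue` is a hypothetical name; the PR itself does not define it.
public enum MetadataValue: Sendable, Equatable {
    case string(String)
    case number(Double)
    case bool(Bool)
    case null
    indirect case array([MetadataValue])
    indirect case dictionary([String: MetadataValue])
}

// With metadata stored as [String: MetadataValue]?, the struct can
// legitimately conform to Sendable under strict concurrency checking.
public struct MemorySearchResult: Sendable {
    public let id: UInt64
    public let score: Float
    public let metadata: [String: MetadataValue]?

    public init(id: UInt64, score: Float, metadata: [String: MetadataValue]? = nil) {
        self.id = id
        self.score = score
        self.metadata = metadata
    }
}
```

The same pattern would apply to MemoryRecallResult; a JSON-like enum keeps the "parsed from JSON" semantics without falling back to `Any`.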

private let bridge: CppBridge.Memory
private let embedder: any EmbeddingProvider
private let config: MemoryConfiguration
private var nextId: UInt64 = 1
⚠️ Potential issue | 🟠 Major

nextId counter resets after load() or app restart — causes silent ID collisions.

nextId starts at 1 and is not persisted. After a save()/load() round-trip (or an app relaunch with load()), new remember() calls will produce IDs starting from 1 again. Since the flat backend treats duplicate IDs as upserts, this silently overwrites previously stored memories.

Consider either:

  1. Persisting nextId alongside the index (e.g., in the metadata sidecar or a dedicated field), or
  2. After load(), querying stats or scanning existing IDs to set nextId = maxExistingId + 1.
🤖 Prompt for AI Agents
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Features/Memory/RAGMemoryService.swift`
at line 33, The nextId counter resets because nextId (declared in
RAGMemoryService) isn't persisted or recomputed on load, causing ID collisions
when remember() generates new entries after load() or restart; fix by persisting
nextId with the index (e.g., metadata sidecar or dedicated field) during save()
and restoring it in load(), or alternatively implement logic in load() to scan
existing memory IDs (or query backend stats) to compute nextId = maxExistingId +
1 before returning, ensuring remember() uses the updated nextId.
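Option 2 above amounts to a one-liner. In this sketch, `existingIds` stands in for whatever ID enumeration or stats call the backend actually exposes (an assumption; the PR's C API surface may differ):

```swift
// Sketch of option 2: recompute the counter after load() instead of
// persisting it. `existingIds` is a stand-in for however the backend
// reports the IDs already stored in the index.
func nextFreeId(existingIds: [UInt64]) -> UInt64 {
    (existingIds.max() ?? 0) + 1
}
```

After load(), RAGMemoryService would set `nextId = nextFreeId(existingIds: ...)` before accepting new remember() calls, so fresh IDs can never collide with persisted ones.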

Comment on lines +69 to +73
public func add(vectors: [[Float]], ids: [UInt64], metadata: [String?]? = nil) throws {
    let handle = try getHandle()
    let count = UInt32(vectors.count)
    guard count > 0 else { return }

⚠️ Potential issue | 🟠 Major

Missing count validation between vectors, ids, and metadata.

The C API (rac_memory_add) reads count entries from all three buffers. If ids.count != vectors.count or metadata?.count is non-nil but differs, this causes out-of-bounds reads on the C side (undefined behavior). Additionally, if metadata is non-nil but empty while vectors is non-empty, the force unwrap at line 104 (metaBuf.baseAddress!) will crash.

Proposed validation
     public func add(vectors: [[Float]], ids: [UInt64], metadata: [String?]? = nil) throws {
         let handle = try getHandle()
         let count = UInt32(vectors.count)
         guard count > 0 else { return }
+
+        guard vectors.count == ids.count else {
+            throw SDKError.memory(.invalidInput,
+                                   "vectors.count (\(vectors.count)) != ids.count (\(ids.count))")
+        }
+        if let metadata = metadata {
+            guard metadata.count == vectors.count else {
+                throw SDKError.memory(.invalidInput,
+                                       "metadata.count (\(metadata.count)) != vectors.count (\(vectors.count))")
+            }
+        }
📝 Committable suggestion


Suggested change
-public func add(vectors: [[Float]], ids: [UInt64], metadata: [String?]? = nil) throws {
-    let handle = try getHandle()
-    let count = UInt32(vectors.count)
-    guard count > 0 else { return }
+public func add(vectors: [[Float]], ids: [UInt64], metadata: [String?]? = nil) throws {
+    let handle = try getHandle()
+    let count = UInt32(vectors.count)
+    guard count > 0 else { return }
+
+    guard vectors.count == ids.count else {
+        throw SDKError.memory(.invalidInput,
+                              "vectors.count (\(vectors.count)) != ids.count (\(ids.count))")
+    }
+    if let metadata = metadata {
+        guard metadata.count == vectors.count else {
+            throw SDKError.memory(.invalidInput,
+                                  "metadata.count (\(metadata.count)) != vectors.count (\(vectors.count))")
+        }
+    }
🤖 Prompt for AI Agents
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+Memory.swift`
around lines 69-73: in add(vectors: [[Float]], ids: [UInt64], metadata:
[String?]? = nil), validate that ids.count == vectors.count and, if metadata is
non-nil, that metadata!.count == vectors.count (and not empty when vectors is
non-empty) before calling rac_memory_add; if the counts mismatch, throw an error
or return early. Also update the metadata handling so metaBuf.baseAddress! is
never force-unwrapped when metadata is non-nil: only create and pass a metadata
buffer when metadata.count == vectors.count, and otherwise pass nil to the C
call. See the add(...) method, the rac_memory_add invocation, and the
metaBuf.baseAddress usage when making these changes.

Comment on lines +78 to +93
var metadataPointers: [UnsafePointer<CChar>?]?
var metadataBuffers: [Data] = []

if let metadata = metadata {
    var ptrs = [UnsafePointer<CChar>?]()
    for meta in metadata {
        if let m = meta {
            let data = Data(m.utf8 + [0])
            metadataBuffers.append(data)
            ptrs.append(nil) // Will set in withUnsafeBytes
        } else {
            ptrs.append(nil)
        }
    }
    metadataPointers = ptrs
}
🛠️ Refactor suggestion | 🟠 Major

Dead code: metadataPointers and metadataBuffers are never used.

These variables are populated but immediately shadowed by the strdup-based approach at lines 96-110. Remove them to avoid confusion.

Remove dead code
-            // Prepare metadata C strings
-            var metadataPointers: [UnsafePointer<CChar>?]?
-            var metadataBuffers: [Data] = []
-
-            if let metadata = metadata {
-                var ptrs = [UnsafePointer<CChar>?]()
-                for meta in metadata {
-                    if let m = meta {
-                        let data = Data(m.utf8 + [0])
-                        metadataBuffers.append(data)
-                        ptrs.append(nil) // Will set in withUnsafeBytes
-                    } else {
-                        ptrs.append(nil)
-                    }
-                }
-                metadataPointers = ptrs
-            }
-
             let result: rac_result_t
📝 Committable suggestion


Suggested change
-var metadataPointers: [UnsafePointer<CChar>?]?
-var metadataBuffers: [Data] = []
-
-if let metadata = metadata {
-    var ptrs = [UnsafePointer<CChar>?]()
-    for meta in metadata {
-        if let m = meta {
-            let data = Data(m.utf8 + [0])
-            metadataBuffers.append(data)
-            ptrs.append(nil) // Will set in withUnsafeBytes
-        } else {
-            ptrs.append(nil)
-        }
-    }
-    metadataPointers = ptrs
-}
 let result: rac_result_t
🤖 Prompt for AI Agents
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/Extensions/CppBridge+Memory.swift`
around lines 78-93: remove the dead preallocation in CppBridge+Memory.swift.
The variables metadataPointers and metadataBuffers (and the block that builds
ptrs/Data from metadata) are never used because the code later uses
strdup-based allocation; delete the entire if-let metadata block so only the
strdup approach remains (search for metadataPointers, metadataBuffers, and the
"for meta in metadata" loop to locate the code to remove).
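For reference, the surviving strdup-based approach could be wrapped in a scoped helper like the sketch below. The helper name is hypothetical and the real C call site is elided; `body` stands in for the rac_memory_add invocation:

```swift
import Foundation

// Sketch (hypothetical helper, not the PR's actual code): duplicate each
// optional metadata string with strdup, hand the pointer array to `body`
// (standing in for the rac_memory_add call), then free every duplicate
// once the call returns.
func withMetadataPointers<R>(
    _ metadata: [String?],
    _ body: ([UnsafePointer<CChar>?]) -> R
) -> R {
    // nil entries stay nil; non-nil strings become heap-allocated C strings.
    let dups: [UnsafeMutablePointer<CChar>?] = metadata.map { $0.flatMap { strdup($0) } }
    defer { dups.forEach { free($0) } }
    return body(dups.map { $0.map { UnsafePointer($0) } })
}
```

Scoping the allocation and the `defer { free }` in one function makes it hard to leak the duplicates or use them after free, which is the main hazard of ad-hoc strdup calls at the FFI boundary.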

@shubhammalhotra28
Contributor

Let's close this right now. We will come back to this later and add the support.