Conversation
…amaCPP backends

Implement complete on-device RAG with document management, semantic search, and contextual answer generation.

Core Implementation (runanywhere-commons):
- Implement RAG pipeline with document chunking and vector retrieval
- Implement ONNX embedding provider (all-MiniLM-L6-v2)
- Implement LlamaCPP and ONNX text generators
- Implement USearch-based vector store for fast similarity search
- Implement structured logging with metadata support
- Update CMake build system for RAG backend integration

React Native Integration:
- Create @runanywhere/rag package with TypeScript API
- Add NitroModules global initialization for proper JSI binding
- Generate Nitro bindings for Android and iOS platforms
- Improve model path resolution for ONNX single-file models
- Implement RAG configuration interface with tunable parameters
- Update build scripts for RAG module compilation

Example Application:
- Implement RAGScreen demo with interactive document Q&A
- Implement model selection UI for embedding and LLM models
- Implement document management (add, clear, batch operations)
- Display retrieval sources with similarity scores
- Show timing metrics (retrieval, generation, total)
- Implement model catalog entries for embedding models
- Update navigation to include RAG tab

This enables privacy-preserving RAG with all processing happening on-device, supporting both ONNX and LlamaCPP backends.
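The retrieve-then-generate flow this commit describes can be sketched in a few lines. This is an illustrative reduction only — `Chunk`, `cosine`, and `retrieve_top_k` are hypothetical names, not the SDK's API; the real pipeline uses a USearch index rather than a linear scan.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical sketch of semantic retrieval: rank stored chunks by cosine
// similarity to the query embedding and keep the top-k as LLM context.
struct Chunk {
    std::string text;
    std::vector<float> embedding;
};

float cosine(const std::vector<float>& a, const std::vector<float>& b) {
    float dot = 0, na = 0, nb = 0;
    for (size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    return dot / (std::sqrt(na) * std::sqrt(nb) + 1e-9f);
}

std::vector<Chunk> retrieve_top_k(const std::vector<Chunk>& store,
                                  const std::vector<float>& query,
                                  size_t k) {
    std::vector<Chunk> ranked = store;
    std::sort(ranked.begin(), ranked.end(), [&](const Chunk& x, const Chunk& y) {
        return cosine(x.embedding, query) > cosine(y.embedding, query);
    });
    if (ranked.size() > k) ranked.resize(k);
    return ranked;  // these chunks become the context for generation
}
```

A production store replaces the linear scan with an approximate nearest-neighbour index (USearch here), but the contract — query embedding in, top-k chunks out — is the same.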
- Remove postinstall-postinstall dependency from example app
- Update example app package-lock.json with refreshed dependencies
- Resolved 12 conflicting files
- Kept RAG implementation as primary tab 4 (vs Vision in origin/main)
- Merged package.json: using patch-package for postinstall
- Merged build.gradle: kept arm64-v8a ABI filters + added librac_backend_onnx.so
- Merged App.tsx: combined error handling with optional LlamaCPP import
- Merged react-native.config.js: combined platform configurations
- Kept lock files from RAG-improvements branch (HEAD)
- Kept SecureStorageService.ts from HEAD (tested version)
- Updated build-react-native.sh: combined documentation with RAG notes

All conflicts resolved successfully. Branch ready for merge to main.
- Added Vision imports (VisionHubScreen, VLMScreen) to TabNavigator
- Extended tab icons with Vision eye icon
- Extended tab labels with Vision label
- Added VisionStackScreen component for Vision hub → VLM navigation
- Updated RootTabParamList type to include Vision tab

Navigation now includes:
1. Chat (default)
2. STT (speech-to-text)
3. TTS (text-to-speech)
4. Voice (voice chat)
5. RAG (retrieval-augmented generation)
6. Vision (image understanding)
7. Settings

Both RAG and Vision features are now accessible from bottom tab navigation.
- Copied rac_tool_calling.h header to core package includes
- Stubbed out ToolCallingBridge function calls in HybridRunAnywhereCore
- Tool calling functions (rac_tool_call_*) not yet available in commons v0.1.4
- Excludes ToolCallingBridge.cpp from Android build until commons is updated
- RAG functionality unaffected — tool calling is a separate feature
- Allows Android build to complete successfully with 7-tab navigation
Clean up unused imports in the React Native RunAnywhere example: remove LLMFramework from the @runanywhere/core import in App.tsx, and drop IconSize and ModelRequiredOverlay imports from RAGScreen.tsx (keeping Spacing, Padding, BorderRadius). This removes unused symbols and addresses related import warnings.
Delete the backup file sdk/runanywhere-commons/src/backends/rag/llamacpp_generator.cpp.bak which contained an older LlamaCPP generator implementation. The file was redundant and is removed to clean up the repository and avoid confusion with the active implementation.
Fixes a critical bug where ORT API return statuses were silently ignored in onnx_embedding_provider.cpp and onnx_generator.cpp, which could cause undefined behavior and resource leaks on allocation failures.

Solution:
- Create shared ort_guards.h with RAII wrappers (OrtStatusGuard, OrtValueGuard, OrtMemoryInfoGuard)
- Refactor onnx_embedding_provider.cpp to use RAII guards
- Refactor onnx_generator.cpp to use RAII guards
- Remove duplicate RAII class definitions

Benefits:
- All ORT API errors now properly checked
- Automatic resource cleanup (exception-safe)
- Eliminates code duplication
- Zero-cost abstraction with compiler optimizations
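The RAII idea behind these guards can be sketched without ONNX Runtime. The sketch below is a simplified stand-in: `FakeStatus` and `release_status` mirror the shape of ORT's `OrtStatus*`/`ReleaseStatus` (heap status on failure, null on success), and `StatusGuard` is illustrative — it is not the actual `ort_guards.h` API.

```cpp
#include <cassert>

// Stand-in for a C API status: heap-allocated on failure, nullptr on success.
struct FakeStatus { int code; };

static int g_released = 0;  // counts releases, to demonstrate cleanup
void release_status(FakeStatus* s) {
    if (s) { ++g_released; delete s; }
}

// RAII wrapper: whatever status ends up in the slot is released exactly once,
// either on reuse (explicit reset before handing out the slot again) or on
// scope exit — so an early return or exception cannot leak it.
class StatusGuard {
public:
    StatusGuard() = default;
    ~StatusGuard() { reset(); }
    StatusGuard(const StatusGuard&) = delete;
    StatusGuard& operator=(const StatusGuard&) = delete;

    // Free any previously held status, then expose the slot for the next call.
    FakeStatus** receive() { reset(); return &status_; }
    bool ok() const { return status_ == nullptr; }
    void reset() {
        if (status_) { release_status(status_); status_ = nullptr; }
    }
private:
    FakeStatus* status_ = nullptr;
};
```

The compiler inlines the wrapper away in optimized builds, which is what the commit means by "zero-cost abstraction".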
…RAG backend

This commit addresses four critical bugs in the RAG C++ implementation:

1. ORT status guard memory leak (ort_guards.h, onnx_generator.cpp)
   - Refactored get_address() to be a pure accessor
   - Changed to explicit reset() pattern to prevent memory leaks
   - Fixed dangling pointer issues in onnx_generator.cpp from vector reallocation
2. Hash collision vulnerability (vector_store_usearch.cpp)
   - Replaced std::hash<std::string> with monotonic counter (next_key_)
   - Ensures collision-free key generation for vector store entries
   - Added duplicate detection in add_chunks_batch()
3. Metadata persistence bug (vector_store_usearch.cpp)
   - Implemented JSON serialization for save/load operations
   - Now persists chunks_, id_to_key_, and next_key_ alongside the USearch index
   - Added proper deserialization with state reconstruction

These fixes ensure memory safety, prevent silent data loss, and enable proper state persistence across save/load cycles.
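The monotonic-counter fix (item 2) is worth illustrating: hashing chunk IDs with `std::hash<std::string>` can produce colliding keys, silently overwriting entries, whereas a counter never collides. The sketch below is hypothetical — `KeyAllocator` is not the `vector_store_usearch.cpp` interface, just the scheme it describes, including the duplicate detection added to `add_chunks_batch()`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Collision-free key assignment: each new id gets the next counter value.
class KeyAllocator {
public:
    // Returns false for a duplicate id (mirrors the batch duplicate check).
    bool add(const std::string& id) {
        if (id_to_key_.count(id)) return false;
        id_to_key_[id] = next_key_++;
        return true;
    }
    std::uint64_t key_for(const std::string& id) const {
        return id_to_key_.at(id);
    }
    std::uint64_t next_key() const { return next_key_; }
private:
    std::unordered_map<std::string, std::uint64_t> id_to_key_;
    std::uint64_t next_key_ = 0;  // persisted with the index (item 3)
};
```

Persisting `next_key_` alongside `id_to_key_` (item 3) is what keeps the scheme collision-free across save/load cycles: a reloaded store resumes counting where it left off instead of reissuing old keys.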
- Implement OrtSessionOptionsGuard class for automatic session options cleanup
- Fix GetTensorMutableData null-pointer dereference with proper status checking
- Implement error handling for session options creation/configuration
- Replace manual cleanup with RAII pattern (eliminates 3 ReleaseSessionOptions calls)
- Rename status_guard to output_status_guard to fix variable shadowing

This ensures robust error handling, automatic resource cleanup, and proper C++ exception safety in the ONNX embedding provider used by the RAG backend. All fixes verified with a successful Android APK build (arm64-v8a).
Comprehensive improvements to RAG module reliability and debuggability:

**TypeScript (Frontend)**
- Add step-by-step logging in RAG.ts constructor for module initialization
- Track NitroModules proxy availability and hybrid object creation
- Include detailed error messages with stack traces for troubleshooting
- Wrap createRAG() function in try-catch with logging

**C++ Backend (Error Handling)**
- vector_store_usearch.cpp: Add error checking for USearch operations
  * Validate return values for add(), remove(), save(), load()
  * Log detailed error messages for failed vector store operations
  * Gracefully handle batch operation failures
- onnx_generator.cpp: Add robust error checking for ONNX Runtime
  * Validate OrtStatus for all CreateTensorWithDataAsOrtValue calls
  * Check for null tensors (input_ids, attention_mask, position_ids)
  * Release status objects properly
- fix arm_fp16 include guard precedence for Apple targets
- make add_chunks_batch report success only when chunks added
- remove noexcept from search to avoid terminate on exceptions
- make load() robust with JSON try/catch and atomic state update
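The "atomic state update" in the last bullet is a parse-into-temporaries pattern: deserialize everything into locals first and only swap into the live members once parsing fully succeeded, so a corrupt file leaves the store untouched. The sketch below assumes a toy line-based format in place of the real JSON deserialization; `Store` and `parse_lines` are illustrative names.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

struct Store {
    std::vector<std::string> chunks;

    // Robust load(): parse first, commit only on full success.
    bool load(const std::string& payload) {
        try {
            std::vector<std::string> parsed = parse_lines(payload);  // may throw
            chunks.swap(parsed);  // commit — happens only after parsing succeeded
            return true;
        } catch (const std::exception&) {
            return false;         // live state unchanged on any failure
        }
    }

    // Toy stand-in for JSON deserialization: newline-separated chunks,
    // with '!' treated as corruption to exercise the failure path.
    static std::vector<std::string> parse_lines(const std::string& s) {
        if (s.find('!') != std::string::npos)
            throw std::runtime_error("corrupt payload");
        std::vector<std::string> out;
        std::string cur;
        for (char c : s) {
            if (c == '\n') { out.push_back(cur); cur.clear(); }
            else cur += c;
        }
        if (!cur.empty()) out.push_back(cur);
        return out;
    }
};
```

This also explains the neighbouring `noexcept` removal: a throwing parse inside a `noexcept` function would call `std::terminate` instead of reaching the catch block.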
- load WordPiece vocab and enforce presence before model init
- add tokenization cache for repeat words
- pass vocab path via embeddingConfigJson
- download vocab.txt as separate ONNX model
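The tokenization cache for repeat words is plain memoization: the first occurrence of a word pays for the vocab lookup, later occurrences return the cached token ids. A minimal sketch, assuming a hypothetical `CachedTokenizer` with a toy one-id-per-character lookup standing in for the real WordPiece vocab walk; `cache_hits` is exposed only to make the effect observable.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

class CachedTokenizer {
public:
    std::vector<int> tokenize(const std::string& word) {
        auto it = cache_.find(word);
        if (it != cache_.end()) {
            ++cache_hits;           // repeat word: skip the expensive path
            return it->second;
        }
        std::vector<int> ids = lookup_ids(word);  // expensive vocab lookup
        cache_.emplace(word, ids);
        return ids;
    }

    int cache_hits = 0;

private:
    // Toy stand-in for WordPiece: one "token id" per character.
    static std::vector<int> lookup_ids(const std::string& w) {
        std::vector<int> ids;
        for (char c : w) ids.push_back(static_cast<int>(c));
        return ids;
    }
    std::unordered_map<std::string, std::vector<int>> cache_;
};
```

Because WordPiece tokenization is deterministic per word, the cache is always safe; it pays off on RAG documents, where the same words recur across many chunks.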
- Replace deprecated exclusionList/blacklistRE with blockList API (metro-config)
- Fixes React Native 0.83.1 build compatibility
- Add dual-method support in SecureStorageService for TypeScript/C++ interop
- Enables fallback between secureStorageStore/Retrieve and secureStorageSet/Get
…ctionality

Main fix:
- Fix RAG package to use TypeScript source instead of compiled JS
  - Changed package.json main entry from 'lib/index.js' to 'src/index.ts'
  - This resolves the 'Cannot find module' error when initializing RAG
  - Aligns with core package structure for proper Metro bundling

Additional improvements:
- Improve Metro module resolution with explicit extraNodeModules mapping
- Add safety flag to prevent duplicate NitroModules native install() calls
- Add semantic aliases for secure storage (Store/Retrieve)
- Remove pre-bundled JS asset to force Metro dev server usage

Fixes React Native Android and iOS RAG module initialization.
…stem

Replaces direct Xcode generator CMake invocations with commons build-ios.sh to fix CXX compiler identification failures and create proper XCFrameworks.

Changes:
- Refactored build-ios-libs.sh to call commons/scripts/build-ios.sh instead of manually configuring CMake with the Xcode generator
- Added a lipo step to create universal simulator static libraries (arm64+x86_64) before XCFramework creation to avoid duplicate architecture errors
- Removed ~70 lines of duplicate CMake configuration
- Fixed "No known features for CXX compiler" error during simulator builds
- Fixed "Both ios-arm64-simulator and ios-x86_64-simulator represent two equivalent library definitions" XCFramework error

Technical Details:
- Leverages proven toolchain-based builds (ios.toolchain.cmake) instead of the Xcode generator, which had compiler identification issues
- Creates universal simulator libs with lipo before bundling into the XCFramework
- XCFrameworks now properly contain:
  - ios-arm64/ (device: 1.2M librac_backend_rag.a)
  - ios-arm64_x86_64-simulator/ (universal: 2.4M librac_backend_rag.a)
- Build output verified: iOS 3.6M XCFramework, Android 41M .so libs

Impact:
- iOS builds: More reliable, reuses commons infrastructure
- Android builds: No changes, release APKs unaffected
- React Native integration: Podspec correctly references bundled XCFrameworks

Testing:
- Full --setup --clean build succeeds
- All architectures built (OS arm64, SIMULATORARM64, SIMULATOR x86_64)
- XCFrameworks created successfully with proper structure
- Android JNI libraries unchanged in jniLibs/arm64-v8a/
…amework

The iOS build was failing because RACommons.xcframework headers use framework-relative includes (e.g., `<RACommons/rac_error.h>`), but Xcode couldn't locate the framework during compilation.

Changes:
- Added HEADER_SEARCH_PATHS for both device (ios-arm64) and simulator (ios-arm64_x86_64-simulator) slices of RACommons.xcframework
- Added FRAMEWORK_SEARCH_PATHS pointing to the ios/Binaries directory

This allows both:
1. Direct includes from C++ bridge files (`#include "rac_telemetry_manager.h"`)
2. Framework-relative includes from RACommons headers (`#include <RACommons/rac_error.h>`)

to resolve correctly during the build process.

Fixes: iOS build errors related to missing RACommons headers
Related: PR RunanywhereAI#349 (RAG improvements)
… iOS build

- Rebuild RACommons.xcframework with the ios-arm64_x86_64-simulator slice and embedding category definitions
- Update RAG module to use <RACommons/...> headers on iOS for proper framework resolution
- Add nlohmann/json.hpp header search path to RunAnywhereRAG.podspec
- Copy RABackendLLAMACPP.xcframework from build artifacts to SDK and example node_modules

These changes fix iOS simulator build failures due to missing headers, missing simulator architecture slices, and undefined embedding category symbols.
- Extract config values under lock, minimize critical section
- Move expensive operations (search, generation) outside the mutex
- Change embedding_provider and text_generator to shared_ptr for safe copying
- Add search_with_provider() helper to reduce pointer dereferencing
- Inline prompt formatting for performance
- Set ORT_API_VERSION=17 to match bundled libonnxruntime
- Remove redundant library loading in RAGPackage

This reduces blocking time for concurrent access and improves throughput.
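The lock-narrowing plus `shared_ptr` change combines into one pattern: under the mutex, copy only the pointer (and any config values); run search/generation outside the lock, with the copied `shared_ptr` keeping the provider alive even if another thread swaps it out mid-call. A minimal sketch — `Provider` and `Pipeline` are illustrative stand-ins, not the pipeline's real types.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the embedding provider's expensive call.
struct Provider {
    std::string embed(const std::string& q) const { return "vec(" + q + ")"; }
};

class Pipeline {
public:
    void set_provider(std::shared_ptr<Provider> p) {
        std::lock_guard<std::mutex> lk(mu_);
        provider_ = std::move(p);
    }

    std::string query(const std::string& q) {
        std::shared_ptr<Provider> local;
        {   // critical section: only a refcount bump, no inference
            std::lock_guard<std::mutex> lk(mu_);
            local = provider_;
        }
        // Expensive work runs outside the lock; the copied shared_ptr keeps
        // the provider alive even if set_provider() replaces it concurrently.
        return local ? local->embed(q) : "";
    }

private:
    std::mutex mu_;
    std::shared_ptr<Provider> provider_;
};
```

With a raw pointer instead of `shared_ptr`, the object could be destroyed between unlocking and calling `embed()`; the copied `shared_ptr` is what makes it safe to leave the critical section early.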
- Add chunker_test.cpp: 24 test cases covering DocumentChunker functionality
  * Basic text processing (empty, single/multi-line)
  * Token estimation and proportionality
  * Configuration customization (chunk size, overlap)
  * Boundary conditions (punctuation, whitespace, special chars)
  * Memory efficiency with large text (100KB+)
  * Move semantics and thread safety
- Add simple_tokenizer_test.cpp: Placeholder for SimpleTokenizer tests
  * Ready for expansion when the class is extracted to a public interface
- Update CMakeLists.txt: Integrate test targets into the build system
  * rac_chunker_test executable with GoogleTest discovery
  * rac_simple_tokenizer_test executable with GoogleTest discovery
  * Proper linking to rac_backend_rag, threads, and GTest

All 29 tests passing (24 chunker + 1 tokenizer placeholder + 1 thread safety + 3 executable tests)
C++17 standard with proper memory management and best practices
Add REACT_NATIVE_ANDROID_RAG_APP_BUILD.md with:
- rebuild-android-ndk26.sh reference and execution flow
- 4-step build process: clean, build natives, distribute, build APK
- Prerequisites and environment configuration
- Manual step-by-step alternative
- Development iteration cycle (2-3 min vs 15+ min full rebuild)
- Verification and device testing
- Troubleshooting common issues
- CI/CD integration examples (GitHub Actions, pre-commit hooks)

Targets developers working on RAG+embedding+storage features.
Had to change the build scripts a bit to make this work: LlamaCPP wasn't included earlier. This currently matches the Android React Native app; the documents are hardcoded and the user can't yet input their own.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 Walkthrough

This PR implements a complete on-device Retrieval-Augmented Generation (RAG) system spanning native C++ backends, React Native SDK integration, and an example application. It adds document chunking, ONNX embeddings, LlamaCPP text generation, USearch vector storage, and accompanying build infrastructure with corresponding TypeScript/Nitrogen bindings for cross-platform deployment.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant RAGScreen as RAGScreen.tsx
    participant RAG_API as RAG.ts API
    participant Hybrid as HybridRunAnywhereRAG
    participant C_API as rac_rag_pipeline.cpp
    participant RAGBackend as RAGBackend
    participant VectorStore as VectorStoreUSearch
    participant EmbedProvider as ONNXEmbeddingProvider
    participant TextGen as LlamaCppGenerator
    User->>RAGScreen: addDocument(text, metadata)
    RAGScreen->>RAG_API: addDocument()
    RAG_API->>Hybrid: addDocument(text, metadata)
    Hybrid->>C_API: rac_rag_add_document()
    C_API->>RAGBackend: add_document()
    RAGBackend->>RAGBackend: chunk_document()
    loop for each chunk
        RAGBackend->>EmbedProvider: embed(chunk_text)
        EmbedProvider->>EmbedProvider: tokenize + ONNX inference
        EmbedProvider-->>RAGBackend: embedding vector
        RAGBackend->>VectorStore: add_chunk()
        VectorStore-->>RAGBackend: success
    end
    RAGBackend-->>C_API: result
    C_API-->>Hybrid: status
    Hybrid-->>RAG_API: Promise<boolean>
    RAG_API-->>RAGScreen: success
    RAGScreen->>User: document indexed
```
```mermaid
sequenceDiagram
    actor User
    participant RAGScreen as RAGScreen.tsx
    participant RAG_API as RAG.ts API
    participant Hybrid as HybridRunAnywhereRAG
    participant C_API as rac_rag_pipeline.cpp
    participant RAGBackend as RAGBackend
    participant VectorStore as VectorStoreUSearch
    participant EmbedProvider as ONNXEmbeddingProvider
    participant TextGen as LlamaCppGenerator
    User->>RAGScreen: query(question)
    RAGScreen->>RAG_API: query(question, options)
    RAG_API->>Hybrid: query(RAGQuery)
    Hybrid->>C_API: rac_rag_query()
    C_API->>RAGBackend: query()
    RAGBackend->>EmbedProvider: embed(question)
    EmbedProvider-->>RAGBackend: query_embedding
    RAGBackend->>VectorStore: search(query_embedding, top_k)
    VectorStore-->>RAGBackend: SearchResult[]
    RAGBackend->>RAGBackend: build_context(results)
    RAGBackend->>RAGBackend: format_prompt(question, context)
    RAGBackend->>TextGen: generate(prompt, options)
    TextGen->>TextGen: tokenize + LlamaCPP inference loop
    TextGen-->>RAGBackend: GenerationResult
    RAGBackend-->>C_API: RAGResult
    C_API-->>Hybrid: result + chunks + timing
    Hybrid-->>RAG_API: Promise<RAGResult>
    RAG_API-->>RAGScreen: result
    RAGScreen->>User: answer + sources + timing
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Generated with ❤️ by ellipsis.dev
@shubhammalhotra28 , as discussed, iOS now works. The data itself is still hardcoded. I'll continue working on getting RAG done for Swift now.
Actionable comments posted: 4
Note: Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (16)
sdk/runanywhere-react-native/packages/core/android/build.gradle (1)
119-131: ⚠️ Potential issue | 🟠 Major — Hardcoded `arm64-v8a` breaks emulator builds and violates the multi-ABI requirement

Two distinct problems introduced by these changes:

1. Emulator breakage: Dropping `x86_64` from both `ndk.abiFilters` and `cmake.abiFilters` means the JNI layer will not be compiled for Android emulators (AVDs). The comment at line 41 explicitly acknowledges x86_64 is required for emulators — the implementation now contradicts that comment.
2. Inconsistency with download task: `reactNativeArchitectures()` (lines 43–48) still returns `["arm64-v8a", "x86_64"]` by default and is still consumed at line 228 (`def requestedAbis = reactNativeArchitectures()`). The download task will check/fetch x86_64 libs, but the build system won't compile native code for that ABI. The `gradle.properties` override described in line 42's comment is also now silently inert for the compilation step.

Both the `ndk` and `cmake` blocks should be restored to use the dynamic function:

🔧 Proposed fix

```diff
 ndk {
-    abiFilters 'arm64-v8a'
+    abiFilters(*reactNativeArchitectures())
 }
 externalNativeBuild {
     cmake {
         cppFlags "-frtti -fexceptions -Wall -fstack-protector-all"
         arguments "-DANDROID_STL=c++_shared",
                   "-DREACT_NATIVE_NITRO_BUILD_DIR=${rootProject.buildDir}"
-        abiFilters 'arm64-v8a'
+        abiFilters(*reactNativeArchitectures())
     }
 }
```

If the intent is to permanently ship arm64-v8a only (e.g., as a temporary workaround while the iOS PR stabilises), then `reactNativeArchitectures()` and its call site in `downloadNativeLibs` should be updated to match so the two don't diverge silently.

Based on learnings, Android outputs must include JNI libraries with proper ABI subdirectories (arm64-v8a, x86_64, armeabi-v7a, x86), which this change violates.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/core/android/build.gradle` around lines 119 - 131, The change hardcodes abiFilters to 'arm64-v8a' in the ndk and externalNativeBuild.cmake blocks, breaking emulator builds and creating a mismatch with reactNativeArchitectures()/downloadNativeLibs which still expect x86_64; restore dynamic ABI selection by replacing the literal 'arm64-v8a' in ndk.abiFilters and cmake.abiFilters with the function/variable used elsewhere (reactNativeArchitectures() or the same property used to compute requestedAbis) so both compilation and download steps use the same ABI list, or alternatively update reactNativeArchitectures() and requestedAbis in downloadNativeLibs to intentionally and explicitly limit to arm64-v8a if you truly intend a single-ABI build.

sdk/runanywhere-react-native/packages/core/src/Public/Extensions/RunAnywhere+Models.ts (3)
246-251: ⚠️ Potential issue | 🟡 Minor — `generateModelId` regex omits `.onnx`, producing a double-extension filename when auto-generating an id

If a caller passes an ONNX URL to `registerModel` without an explicit `id`, the generated id retains `.onnx` (e.g. `embedding.onnx`). `downloadModel` then appends the detected extension again, writing the file as `embedding.onnx.onnx`.

🐛 Proposed fix

```diff
- return filename.replace(/\.(gguf|bin|safetensors|tar\.gz|zip)$/i, '');
+ return filename.replace(/\.(gguf|onnx|bin|safetensors|tar\.gz|tar\.bz2|zip)$/i, '');
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/core/src/Public/Extensions/RunAnywhere`+Models.ts around lines 246 - 251, The generateModelId function's regex misses the .onnx extension so filenames like "embedding.onnx" keep the extension and later get double-appended; update the regex in generateModelId (referencing the function name generateModelId and its use in registerModel/downloadModel flows) to include '\.onnx' (case-insensitive) among the alternations so the generated id strips .onnx as well, ensuring downstream downloadModel won't append the extension twice.
367-398: ⚠️ Potential issue | 🟠 Major — `deleteModel` leaks `.onnx` files on disk after download

The extension lookup on lines 370–372 only covers `.gguf`; all other extensions (`.onnx`, `.tar.bz2`, `.tar.gz`, `.zip`) resolve to `''`. When `downloadModel` saves an ONNX model as `${modelId}.onnx`, `deleteModel` attempts `FileSystem.deleteModel("${modelId}")` and the bare-id fallback on line 382 — neither of which matches the actual filename — leaving the file on disk while the registry is updated to `isDownloaded: false`.

The fix should mirror the same extension-detection chain used in `downloadModel`:

🐛 Proposed fix

```diff
- const extension = modelInfo?.downloadURL?.includes('.gguf')
-   ? '.gguf'
-   : '';
+ const url = modelInfo?.downloadURL ?? '';
+ let extension = '';
+ if (url.includes('.gguf')) {
+   extension = '.gguf';
+ } else if (url.includes('.onnx')) {
+   extension = '.onnx';
+ } else if (url.includes('.tar.bz2')) {
+   extension = '.tar.bz2';
+ } else if (url.includes('.tar.gz')) {
+   extension = '.tar.gz';
+ } else if (url.includes('.zip')) {
+   extension = '.zip';
+ }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/core/src/Public/Extensions/RunAnywhere`+Models.ts around lines 367 - 398, deleteModel currently only checks for '.gguf' and thus misses files like '.onnx' saved by downloadModel; update deleteModel (which calls ModelRegistry.getModel, FileSystem.deleteModel and ModelRegistry.registerModel) to detect the same extension set used by downloadModel (at minimum '.gguf', '.onnx', '.tar.bz2', '.tar.gz', '.zip') by inspecting modelInfo.downloadURL, construct the correct fileName (e.g., `${modelId}${extension}`) and call FileSystem.deleteModel for each plausible filename variant (with the detected extension and the bare modelId as fallback) before updating ModelInfo (localPath/isDownloaded) and re-registering via ModelRegistry.registerModel.
220-224: ⚠️ Potential issue | 🟠 Major — Fix format detection to handle ONNX and other supported formats

`ModelFormat.ONNX` exists in the enum, but the ternary at lines 220–224 only checks for `.zip` and `.gguf`. Any other URL (including `.onnx`, `.safetensors`, `.bin`) incorrectly falls through to `ModelFormat.GGUF`. The code should detect file extensions more broadly to map them to their correct format enum values, or default to `ModelFormat.Unknown` instead of hardcoding `GGUF` as the fallback.

Current code (lines 220–224)

```ts
format: options.url.includes('.zip')
  ? ModelFormat.Zip
  : options.url.includes('.gguf')
    ? ModelFormat.GGUF
    : ModelFormat.GGUF,
```
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/core/src/Public/Extensions/RunAnywhere`+Models.ts around lines 220 - 224, The format detection for the model (the format property in RunAnywhere+Models.ts) only checks for '.zip' and '.gguf' and falls back to GGUF, causing formats like .onnx/.safetensors/.bin to be misclassified; change the logic to extract the file extension from options.url (lowercased) and map extensions to the correct ModelFormat enum values (e.g., '.zip' → ModelFormat.Zip, '.gguf' → ModelFormat.GGUF, '.onnx' → ModelFormat.ONNX, '.safetensors' → ModelFormat.Safetensors or appropriate enum member, '.bin' → ModelFormat.Bin if present), using a switch or a lookup map, and make the fallback ModelFormat.Unknown instead of ModelFormat.GGUF so unknown extensions are not misclassified. Ensure you update the code that sets the format property (the object/field named format) to use this centralized extension-to-enum mapping.

examples/react-native/RunAnywhereAI/src/navigation/TabNavigator.tsx (1)
1-11: ⚠️ Potential issue | 🟡 Minor — File header comment is stale: still lists 6 tabs without RAG

The docblock enumerates the original 6 tabs but RAG is now Tab 4. Update the list so the header stays in sync with the actual tab set.

Proposed fix

```diff
 /**
  * TabNavigator - Bottom Tab Navigation
  *
- * Reference: iOS ContentView.swift with 6 tabs:
+ * Reference: iOS ContentView.swift with 7 tabs:
  * - Chat (LLM)
  * - STT (Speech-to-Text)
  * - TTS (Text-to-Speech)
  * - Voice (Voice Assistant - STT + LLM + TTS)
+ * - RAG (Retrieval-Augmented Generation)
  * - Vision (VLM only; image generation is Swift sample app only)
  * - Settings (includes Tool Settings)
  */
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/react-native/RunAnywhereAI/src/navigation/TabNavigator.tsx` around lines 1 - 11, Update the stale file header in TabNavigator.tsx so the enumerated tab list matches the current tabs (include RAG as Tab 4 instead of the old Voice/ordering): edit the docblock at the top to list the actual tabs and their purposes (Chat (LLM), STT (Speech-to-Text), TTS (Text-to-Speech), RAG (Retrieval-Augmented Generation), Vision (VLM), Settings), ensuring the order and labels reflect the current TabNavigator component.

sdk/runanywhere-react-native/packages/core/cpp/HybridRunAnywhereCore.cpp (1)
2680-2735: ⚠️ Potential issue | 🟠 Major — Stub implementations for `formatToolsForPrompt`, `buildInitialPrompt`, and `buildFollowupPrompt` silently bypass TypeScript fallback handlers

The review's core concern is correct. These methods return silent sentinel values that the TypeScript layer treats as valid results, preventing the designed error-recovery path from executing.

The TypeScript layer explicitly expects these C++ calls to fail and has catch handlers with fallback implementations:

| Method | C++ stub | TS fallback |
| --- | --- | --- |
| `formatToolsForPrompt` | returns `""` | catches error, returns `toolsJson` raw |
| `buildInitialPrompt` | returns `userPrompt` | catches error, returns `${toolsJson}\n\nUser: ${userPrompt}` |
| `buildFollowupPrompt` | returns `originalPrompt` | catches error, returns template with tool result included |

The C++ stubs bypass these fallbacks entirely. The TS fallbacks are actually superior to the C++ stubs because they preserve tool context and results. Since the file's pattern is to throw `std::runtime_error` for missing/disabled features (20+ examples throughout), throwing here would trigger the TS error handlers and use the better fallback implementations.

The current approach results in silent semantic degradation: tool results are discarded, tool definitions are omitted from prompts, and the LLM never receives the context it needs — all without any signal to callers beyond a dev-only log line.

Throwing `std::runtime_error` aligns with the codebase pattern and allows the designed fallback path to activate.

🛡️ Suggested fix — throw to trigger TypeScript fallbacks

```diff
 std::shared_ptr<Promise<std::string>> HybridRunAnywhereCore::formatToolsForPrompt(
     const std::string& toolsJson,
     const std::string& format
 ) {
     return Promise<std::string>::async([toolsJson, format]() -> std::string {
-        // TODO: Re-enable when commons includes rac_tool_call_* functions
-        LOGW("formatToolsForPrompt: ToolCallingBridge disabled, returning empty string");
-        return "";
+        // TODO: Re-enable when commons includes rac_tool_call_* functions
+        throw std::runtime_error("formatToolsForPrompt: ToolCallingBridge not yet available on this build");
     });
 }
```

Apply the same pattern to `buildInitialPrompt` and `buildFollowupPrompt`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/core/cpp/HybridRunAnywhereCore.cpp` around lines 2680 - 2735, These three stub functions (formatToolsForPrompt, buildInitialPrompt, buildFollowupPrompt) currently return benign sentinel values which bypass TypeScript fallback handlers; modify each to throw a std::runtime_error (matching the file's existing pattern for disabled features) instead of returning "", userPrompt, or originalPrompt so the JS layer catch handlers run and apply the proper fallbacks; locate the async lambdas in HybridRunAnywhereCore::formatToolsForPrompt, ::buildInitialPrompt, and ::buildFollowupPrompt and replace the temporary-return behavior with throwing std::runtime_error containing a short message like "ToolCallingBridge disabled".

sdk/runanywhere-react-native/packages/core/src/Foundation/Security/SecureStorageService.ts (1)
94-94: ⚠️ Potential issue | 🟡 Minor — Non-standard parameter order: consider aligning `store()` with JS ecosystem conventions

The method signature uses `store(value, key)` rather than the conventional `(key, value)` order found in `Map.set`, `localStorage.setItem`, and the native layer itself (`secureStorageStore`). While the current internal callers are consistent with this order, the reversal creates unnecessary cognitive load and makes the API less intuitive for developers familiar with standard key-value patterns.

For simplicity and consistency with the broader ecosystem, consider swapping the parameter order to `store(key, value)` to match both the native interface and standard JS conventions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/core/src/Foundation/Security/SecureStorageService.ts` at line 94, The SecureStorageService.store method currently uses the non-standard parameter order store(value, key); change its signature to store(key: SecureStorageKey | string, value: string) to match JS conventions and the native secureStorageStore API, then update all internal callers and any overloads/usages to pass (key, value) accordingly and adjust any type annotations or tests referencing SecureStorageService.store.

sdk/runanywhere-react-native/packages/llamacpp/android/build.gradle (1)
41-48: ⚠️ Potential issue | 🟠 Major — Hardcoded `arm64-v8a` silently drops x86_64 emulator support and leaves misleading dead configuration

Changing both `ndk.abiFilters` and `cmake.abiFilters` to the literal `'arm64-v8a'` has two compounding effects:

- Emulator builds break: x86_64 emulators (common in CI and on non-Apple-Silicon machines) can no longer compile the JNI layer. This silently fails without a clear error at the Gradle configuration stage.
- `reactNativeArchitectures()` is now decoupled from the actual build: The helper function is still called by `downloadNativeLibs` (line 215) to determine which ABIs to download, so it may attempt to download x86_64 libraries that the NDK build step will never produce. The comment on lines 41–42 explicitly calls out x86_64 as needed for emulators, which directly contradicts the hardcoded value.

If the simplification is intentional for this release, at minimum update the comment and consider whether `downloadNativeLibs` should also be constrained to match.

🔧 Option A — restore dynamic ABI selection

```diff
 ndk {
-    abiFilters 'arm64-v8a'
+    abiFilters reactNativeArchitectures().toArray(new String[0])
 }
 externalNativeBuild {
     cmake {
         cppFlags "-frtti -fexceptions -Wall -fstack-protector-all"
         arguments "-DANDROID_STL=c++_shared",
                   "-DREACT_NATIVE_NITRO_BUILD_DIR=${rootProject.buildDir}"
-        abiFilters 'arm64-v8a'
+        abiFilters reactNativeArchitectures().toArray(new String[0])
     }
 }
```

🔧 Option B — if arm64-only is intentional, update comments and download logic to match

```diff
-// Supported ABIs - arm64-v8a for physical devices, x86_64 for emulators
-// Can be overridden via gradle.properties: reactNativeArchitectures=arm64-v8a
-def reactNativeArchitectures() {
-    def value = rootProject.hasProperty("reactNativeArchitectures")
-        ? rootProject.property("reactNativeArchitectures")
-        : null
-    return value ? value.split(",").collect { it.trim() } : ["arm64-v8a", "x86_64"]
-}
+// NOTE: Build is intentionally constrained to arm64-v8a (physical devices only).
+// x86_64 emulator builds are not supported in this release.
+def supportedAbis() { return ["arm64-v8a"] }
```

And replace all `reactNativeArchitectures()` calls with `supportedAbis()`.

Also applies to: 116-116, 125-125
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/android/build.gradle` around lines 41 - 48, The build script currently hardcodes ABI selections by setting ndk.abiFilters and cmake.abiFilters to the literal 'arm64-v8a', which breaks x86_64 emulator builds and diverges from the reactNativeArchitectures() helper and comments; either restore dynamic ABI selection by using the reactNativeArchitectures() function where ndk.abiFilters and cmake.abiFilters are configured (so the build matches the helper and downloadNativeLibs), or, if arm64-only is intentional, update the comments and change downloadNativeLibs and all calls to reactNativeArchitectures() to a single authoritative supportedAbis() that returns ["arm64-v8a"] so download and build are consistent. Ensure you update references to reactNativeArchitectures(), ndk.abiFilters, cmake.abiFilters, and downloadNativeLibs accordingly.

sdk/runanywhere-react-native/packages/core/src/types/enums.ts (1)
81-95: ⚠️ Potential issue | 🔴 Critical

Add `Embedding` case to the Swift `ModelCategory` enum to maintain SDK parity

The file header states these enums "match the iOS Swift SDK exactly for consistency," but the Swift `ModelCategory` enum only has 7 cases (language, speechRecognition, speechSynthesis, vision, imageGeneration, multimodal, audio) and lacks the `Embedding` case you've added here. The Swift C++ bridge conversions (both `toC()` and `init(from:)`) would also need updating to handle the new case, or they'll silently map it to `.audio`. Verify with the Swift SDK team whether `Embedding` should be added there as well, or roll this change back if not supported by the C++ backend.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/core/src/types/enums.ts` around lines 81 - 95, You added a new ModelCategory value Embedding in ModelCategoryDisplayNames but the iOS Swift enum ModelCategory (and its C++ bridge methods toC() and init(from:)) doesn't include Embedding; update the Swift ModelCategory enum to include embedding (and update the bridge conversions to map the new case) OR remove/rollback the Embedding addition here to keep parity—locate the Swift enum named ModelCategory and its conversion functions toC() and init(from:) and either add the Embedding case and corresponding C++ mapping, or remove the Embedding usage from ModelCategoryDisplayNames to match existing Swift/C++ support.

sdk/runanywhere-react-native/scripts/build-react-native.sh (2)
575-591: ⚠️ Potential issue | 🟠 Major

`librac_commons.so` is not synced to the RAG package.

The comment on line 53 states "librac_commons.so is synced to ALL packages (core, llamacpp, onnx, rag)" but the sync block at lines 587–590 only copies to `llamacpp` and `onnx`, missing `rag`.

Proposed fix

 local CORE_RAC="${CORE_ANDROID_JNILIBS}/${ABI}/librac_commons.so"
 if [[ -f "$CORE_RAC" ]]; then
     cp "$CORE_RAC" "${LLAMACPP_ANDROID_JNILIBS}/${ABI}/librac_commons.so"
     cp "$CORE_RAC" "${ONNX_ANDROID_JNILIBS}/${ABI}/librac_commons.so"
-    log_info "Synced librac_commons.so to llamacpp + onnx packages"
+    mkdir -p "${RAG_ANDROID_JNILIBS}/${ABI}"
+    cp "$CORE_RAC" "${RAG_ANDROID_JNILIBS}/${ABI}/librac_commons.so"
+    log_info "Synced librac_commons.so to llamacpp + onnx + rag packages"
 fi
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/scripts/build-react-native.sh` around lines 575 - 591, The sync block that copies CORE_RAC only copies to LLAMACPP_ANDROID_JNILIBS and ONNX_ANDROID_JNILIBS but omits the RAG package; update the conditional that checks CORE_RAC (local CORE_RAC="${CORE_ANDROID_JNILIBS}/${ABI}/librac_commons.so") to also cp "$CORE_RAC" into the RAG package JNI path (e.g. ${RAG_ANDROID_JNILIBS}/${ABI}/librac_commons.so) and adjust the log_info message to reflect "llamacpp + onnx + rag" so librac_commons.so is truly synced to all packages as intended.
642-668: ⚠️ Potential issue | 🟡 Minor

RAG `.testlocal` marker not managed in `set_mode()`.

`copy_ios_frameworks()` creates a `.testlocal` marker for the RAG package (line 445), but `set_mode()` doesn't create or remove it when toggling between local/remote modes. This means RAG will remain in local mode even after `--remote` is used, or won't have the marker when using `--local` without `--setup`.

Proposed fix

 if [[ "$MODE" == "local" ]]; then
     export RA_TEST_LOCAL=1
     # Create .testlocal markers for iOS
     touch "${RN_SDK_DIR}/packages/core/ios/.testlocal"
     touch "${RN_SDK_DIR}/packages/llamacpp/ios/.testlocal"
     touch "${RN_SDK_DIR}/packages/onnx/ios/.testlocal"
+    touch "${RN_SDK_DIR}/packages/rag/ios/.testlocal"

     log_info "Switched to LOCAL mode"
@@ ...
 else
     unset RA_TEST_LOCAL
     # Remove .testlocal markers
     rm -f "${RN_SDK_DIR}/packages/core/ios/.testlocal"
     rm -f "${RN_SDK_DIR}/packages/llamacpp/ios/.testlocal"
     rm -f "${RN_SDK_DIR}/packages/onnx/ios/.testlocal"
+    rm -f "${RN_SDK_DIR}/packages/rag/ios/.testlocal"
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/scripts/build-react-native.sh` around lines 642 - 668, set_mode() currently creates/removes .testlocal markers for core/llamacpp/onnx but omits the RAG package that copy_ios_frameworks() writes, causing inconsistent local/remote state; update set_mode() to also touch "${RN_SDK_DIR}/packages/rag/ios/.testlocal" when MODE=="local" and remove that same file (rm -f) when switching to remote so RAG's .testlocal marker is managed consistently with core/llamacpp/onnx.

sdk/runanywhere-react-native/packages/onnx/android/build.gradle (1)
115-127: ⚠️ Potential issue | 🟡 Minor

Hardcoded `arm64-v8a` breaks x86_64 emulator support, contradicting the documented behavior on Line 41.

The comment on Lines 41-42 documents "x86_64 for emulators" as supported, and `reactNativeArchitectures()` (Lines 43-48) defaults to `["arm64-v8a", "x86_64"]`. However, Lines 116 and 125 hardcode `abiFilters 'arm64-v8a'` only, blocking x86_64 from being built.

This inconsistency causes the download task (Line 214+) to attempt downloading x86_64 libraries (as seen in the ABI matching regex on Lines 254, 321), but the NDK/CMake filters prevent them from being compiled. The RAG package (`packages/rag/`) demonstrates the correct pattern by using dynamic `abiFilters (*reactNativeArchitectures())`.

Per the learnings, Android outputs must include proper ABI subdirectories (arm64-v8a, x86_64, armeabi-v7a, x86). Use `abiFilters (*reactNativeArchitectures())` to maintain multi-ABI support.
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/onnx/android/build.gradle` around lines 115 - 127, The build.gradle hardcodes abiFilters 'arm64-v8a' in the ndk and externalNativeBuild.cmake blocks which prevents x86_64 emulator builds; replace those hardcoded abiFilters with a dynamic call using reactNativeArchitectures() (e.g., abiFilters (*reactNativeArchitectures())) so ndk and cmake use the same multi-ABI list as reactNativeArchitectures(), ensuring arm64-v8a and x86_64 (and any other configured ABIs) are built and match the download/ABI-matching logic.

sdk/runanywhere-commons/scripts/build-android.sh (1)
630-635: ⚠️ Potential issue | 🟠 Major

`librac_commons.so` is not propagated to the rag/ dist directory.

The propagation loop explicitly names `onnx llamacpp whispercpp tflite` but omits `rag`. Consequently `dist/android/rag/${ABI}/` will be missing `librac_commons.so`, causing a runtime linker failure for any consumer of the RAG package (including the downstream copy in `rebuild-android-ndk26.sh` which sources from `DIST_RAG`).

🐛 Proposed fix

-    for backend in onnx llamacpp whispercpp tflite; do
+    for backend in onnx llamacpp whispercpp tflite rag; do
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/scripts/build-android.sh` around lines 630 - 635, The loop that copies ${RAC_COMMONS_LIB} into backend dist directories (for backend in onnx llamacpp whispercpp tflite) omits the rag backend, so ${DIST_DIR}/rag/${ABI}/ never gets librac_commons.so; update the loop in build-android.sh to include "rag" in the backend list (or add an explicit copy for ${DIST_DIR}/rag/${ABI}/) so that the library is propagated to the rag dist directory using the same variables ABI, DIST_DIR and RAC_COMMONS_LIB.

examples/react-native/RunAnywhereAI/App.tsx (1)
256-256: ⚠️ Potential issue | 🟡 Minor

`console.warn` used for a success path

Line 256 logs a success message (`'All models registered'`) using `console.warn`, which will show up as a warning in developer consoles and log aggregation tools. Use `console.log` instead.

🛠️ Proposed fix

- console.warn('[App] All models registered');
+ console.log('[App] All models registered');
Verify each finding against the current code and only fix it if needed. In `@examples/react-native/RunAnywhereAI/App.tsx` at line 256, The success message currently uses console.warn('[App] All models registered'), which will surface as a warning; replace that call with console.log('[App] All models registered') to indicate a normal informational success path. Locate the console.warn usage in App.tsx (the registration completion point) and change the function call to console.log while keeping the same message text so logs reflect success rather than a warning.

sdk/runanywhere-commons/scripts/build-ios.sh (1)
240-256: ⚠️ Potential issue | 🔴 Critical

Add missing `-DRAC_BACKEND_RAG` CMake flag to iOS build platform functions.

The `build_platform` (lines 240–256) and `build_macos` (lines 197–207) functions lack a `rag` case in their `BUILD_BACKEND` switch statements, and neither case enables the RAG backend via CMake. While the script documents `--backend rag` as a supported option (line 15) and attempts to bundle `RABackendRAG.xcframework` (line 722), the CMake flag `-DRAC_BACKEND_RAG=ON` is never passed. When `--backend rag` is specified, it falls through to the `all|*` case, which only enables LLAMACPP and ONNX, causing the RAG backend to never be built and the framework creation at line 722 to fail.

Proposed fix

 case "$BUILD_BACKEND" in
     llamacpp)
         BACKEND_FLAGS="$BACKEND_FLAGS -DRAC_BACKEND_LLAMACPP=ON -DRAC_BACKEND_ONNX=OFF -DRAC_BACKEND_WHISPERCPP=OFF"
         ;;
     onnx)
         BACKEND_FLAGS="$BACKEND_FLAGS -DRAC_BACKEND_LLAMACPP=OFF -DRAC_BACKEND_ONNX=ON -DRAC_BACKEND_WHISPERCPP=OFF"
         ;;
+    rag)
+        BACKEND_FLAGS="$BACKEND_FLAGS -DRAC_BACKEND_LLAMACPP=ON -DRAC_BACKEND_ONNX=ON -DRAC_BACKEND_RAG=ON -DRAC_BACKEND_WHISPERCPP=OFF"
+        ;;
     all|*)
-        BACKEND_FLAGS="$BACKEND_FLAGS -DRAC_BACKEND_LLAMACPP=ON -DRAC_BACKEND_ONNX=ON -DRAC_BACKEND_WHISPERCPP=OFF"
+        BACKEND_FLAGS="$BACKEND_FLAGS -DRAC_BACKEND_LLAMACPP=ON -DRAC_BACKEND_ONNX=ON -DRAC_BACKEND_RAG=ON -DRAC_BACKEND_WHISPERCPP=OFF"
         ;;
 esac

Apply the same change to `build_macos` (lines 197–207).
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/scripts/build-ios.sh` around lines 240 - 256, The BUILD_BACKEND switch in build_platform (and similarly in build_macos) doesn't handle the "rag" case and never sets the CMake flag for the RAG backend; update the BUILD_BACKEND case branches to include a rag case that appends "-DRAC_BACKEND_RAG=ON -DRAC_BACKEND_WHISPERCPP=OFF" (and ensure other backends set -DRAC_BACKEND_RAG=OFF where appropriate), and also add the same explicit rag case to build_macos so BACKEND_FLAGS passed to CMake includes -DRAC_BACKEND_RAG=ON when BUILD_BACKEND=="rag".

examples/react-native/RunAnywhereAI/android/app/build.gradle (1)
118-118: ⚠️ Potential issue | 🟡 Minor

Duplicate `ndkVersion` declarations — potential conflict.

Line 118 uses `rootProject.ext.ndkVersion` from the root `build.gradle`, while line 143 hardcodes `"26.3.11579264"` inside `defaultConfig`. The inner `ndkVersion` takes precedence, silently overriding the centralized version. Pick one source of truth.

📝 Suggested fix — remove the hardcoded version

- ndkVersion "26.3.11579264"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/react-native/RunAnywhereAI/android/app/build.gradle` at line 118, The project has two ndkVersion declarations (rootProject.ext.ndkVersion and a hardcoded "26.3.11579264" inside defaultConfig) causing the inner value to override the centralized one; remove the hardcoded ndkVersion from the defaultConfig block so the build uses rootProject.ext.ndkVersion as the single source of truth (search for ndkVersion in the defaultConfig section and delete that line).
class OrtStatusGuard {
public:
    explicit OrtStatusGuard(const OrtApi* api) : api_(api), status_(nullptr) {}

    ~OrtStatusGuard() {
        if (status_ && api_) {
            api_->ReleaseStatus(status_);
        }
    }

    OrtStatusGuard(const OrtStatusGuard&) = delete;
    OrtStatusGuard& operator=(const OrtStatusGuard&) = delete;

    // Get address for new status assignment
    // IMPORTANT: Only call this once per ORT API call, or use reset() to properly clean up first
    OrtStatus** get_address() {
        return &status_;
    }

    OrtStatus* get() const { return status_; }
    bool is_error() const { return status_ != nullptr; }
    const char* error_message() const {
        return (status_ && api_) ? api_->GetErrorMessage(status_) : "Unknown error";
    }

    // Reset to new status (releases old status first if present)
    // Use this for sequential ORT calls: status_guard.reset(api->Function(...))
    void reset(OrtStatus* new_status = nullptr) {
        if (status_ && api_) {
            api_->ReleaseStatus(status_);
        }
        status_ = new_status;
    }

private:
    const OrtApi* api_;
    OrtStatus* status_;
};

// RAII guard for OrtValue - automatically releases tensor on scope exit
class OrtValueGuard {
public:
    explicit OrtValueGuard(const OrtApi* api) : api_(api), value_(nullptr) {}

    ~OrtValueGuard() {
        if (value_ && api_) {
            api_->ReleaseValue(value_);
        }
    }

    // Non-copyable
    OrtValueGuard(const OrtValueGuard&) = delete;
    OrtValueGuard& operator=(const OrtValueGuard&) = delete;

    // Movable (for storing in containers)
    OrtValueGuard(OrtValueGuard&& other) noexcept
        : api_(other.api_), value_(other.value_) {
        other.value_ = nullptr;
    }

    OrtValueGuard& operator=(OrtValueGuard&& other) noexcept {
        if (this != &other) {
            if (value_ && api_) {
                api_->ReleaseValue(value_);
            }
            api_ = other.api_;
            value_ = other.value_;
            other.value_ = nullptr;
        }
        return *this;
    }

    OrtValue** ptr() { return &value_; }
    OrtValue* get() const { return value_; }
    OrtValue* release() {
        OrtValue* tmp = value_;
        value_ = nullptr;
        return tmp;
    }

private:
    const OrtApi* api_;
    OrtValue* value_;
};

// RAII guard for OrtMemoryInfo - automatically releases memory info on scope exit
class OrtMemoryInfoGuard {
public:
    explicit OrtMemoryInfoGuard(const OrtApi* api) : api_(api), memory_info_(nullptr) {}

    ~OrtMemoryInfoGuard() {
        if (memory_info_ && api_) {
            api_->ReleaseMemoryInfo(memory_info_);
        }
    }

    // Non-copyable
    OrtMemoryInfoGuard(const OrtMemoryInfoGuard&) = delete;
    OrtMemoryInfoGuard& operator=(const OrtMemoryInfoGuard&) = delete;

    OrtMemoryInfo** ptr() { return &memory_info_; }
    OrtMemoryInfo* get() const { return memory_info_; }

private:
    const OrtApi* api_;
    OrtMemoryInfo* memory_info_;
};

// RAII guard for OrtSessionOptions - automatically releases session options on scope exit
class OrtSessionOptionsGuard {
public:
    explicit OrtSessionOptionsGuard(const OrtApi* api) : api_(api), options_(nullptr) {}

    ~OrtSessionOptionsGuard() {
        if (options_ && api_) {
            api_->ReleaseSessionOptions(options_);
        }
    }

    // Non-copyable (session options are not trivially copyable)
    OrtSessionOptionsGuard(const OrtSessionOptionsGuard&) = delete;
    OrtSessionOptionsGuard& operator=(const OrtSessionOptionsGuard&) = delete;

    // Movable
    OrtSessionOptionsGuard(OrtSessionOptionsGuard&& other) noexcept
        : api_(other.api_), options_(other.options_) {
        other.options_ = nullptr;
    }

    OrtSessionOptionsGuard& operator=(OrtSessionOptionsGuard&& other) noexcept {
        if (this != &other) {
            if (options_ && api_) {
                api_->ReleaseSessionOptions(options_);
            }
            api_ = other.api_;
            options_ = other.options_;
            other.options_ = nullptr;
        }
        return *this;
    }

    OrtSessionOptions** ptr() { return &options_; }
    OrtSessionOptions* get() const { return options_; }
    OrtSessionOptions* release() {
        OrtSessionOptions* tmp = options_;
        options_ = nullptr;
        return tmp;
    }

private:
    const OrtApi* api_;
    OrtSessionOptions* options_;
};
🧩 Analysis chain
🏁 Script executed:
# Check if the file exists and examine its content
cat -n sdk/runanywhere-commons/include/backends/rag/ort_guards.h | head -200

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 5957
🌐 Web query:
ONNX Runtime C API OrtApi struct function pointers Release methods ReleaseStatus ReleaseValue documentation
💡 Result:
Where the OrtApi “Release*” methods are documented
- `OrtApi` struct (Doxygen / official C API docs): the `OrtApi` struct lists the function-pointer entries for C API calls, including `void ReleaseStatus(OrtStatus* input)` and `void ReleaseValue(OrtValue* input)`. Documented on the `OrtApi` struct reference page. [1]
- Header source (`onnxruntime_c_api.h`) (authoritative signatures): the same API is defined in the public header, which is what the Doxygen pages are generated from. [2]
What they mean / when to call them
- `ReleaseStatus`: Every C API call that returns an `OrtStatus*` uses the convention:
  - `NULL` ⇒ success
  - non-`NULL` ⇒ error object you must free with `api->ReleaseStatus(status)` after reading it (e.g., via `api->GetErrorMessage(status)`). [1]
- `ReleaseValue`: Frees an `OrtValue*` you own (tensors, maps, sequences, sparse tensors, outputs from `Run`, etc.). Many creation APIs explicitly say "Must be freed with `ReleaseValue`". [3]
  Important ownership note: for tensors created with `CreateTensorWithDataAsOrtValue`, the underlying `p_data` buffer is owned by the caller, and `ReleaseValue` will not free `p_data`. [3]
Minimal usage pattern (C)
const OrtApi* api = OrtGetApiBase()->GetApi(ORT_API_VERSION);
OrtStatus* st = api->Run(session, NULL,
input_names, input_values, input_count,
output_names, output_count,
output_values);
if (st) {
fprintf(stderr, "ORT error: %s\n", api->GetErrorMessage(st));
api->ReleaseStatus(st);
return;
}
api->ReleaseValue(output_values[0]); // and any other OrtValue* you own

Sources:
[1] ONNX Runtime C API docs: OrtApi struct (includes ReleaseStatus, ReleaseValue) (onnxruntime.ai)
[2] onnxruntime_c_api.h source (public C API header) (onnxruntime.ai)
[3] OrtApi docs describing ReleaseValue requirements + ownership notes for CreateTensorWithDataAsOrtValue (onnxruntime.ai)
Citations:
- 1: https://onnxruntime.ai/docs/api/c/struct_ort_api.html
- 2: https://onnxruntime.ai/docs/api/c/onnxruntime__c__api_8h_source.html?utm_source=openai
- 3: https://onnxruntime.ai/docs/api/c/struct_ort_api.html
Apply rac_ prefix to all public class names to comply with coding guidelines.
The classes OrtStatusGuard, OrtValueGuard, OrtMemoryInfoGuard, and OrtSessionOptionsGuard must be renamed to rac_OrtStatusGuard, rac_OrtValueGuard, rac_OrtMemoryInfoGuard, and rac_OrtSessionOptionsGuard respectively, as required by the guideline that all public symbols in sdk/runanywhere-commons/**/*.h must be prefixed with rac_.
Additionally, consider a generic OrtGuard<T, ReleaseFn> template to reduce duplication across the four classes. All follow the same RAII pattern — store a const OrtApi*, store a T*, and release via api_->ReleaseX(ptr) in the destructor. A single class template parameterized on the resource type and release member-function pointer would eliminate ~120 lines of near-identical code and make it trivial to add future guards.
♻️ Proposed template sketch
// Note: OrtApi's Release* entries are function-pointer struct members, not
// member functions, so bind them with an `auto` non-type template parameter.
template <typename T, auto ReleaseFn>
class rac_OrtGuard {
public:
explicit rac_OrtGuard(const OrtApi* api) : api_(api), ptr_(nullptr) {}
~rac_OrtGuard() { if (ptr_ && api_) (api_->*ReleaseFn)(ptr_); }
rac_OrtGuard(const rac_OrtGuard&) = delete;
rac_OrtGuard& operator=(const rac_OrtGuard&) = delete;
rac_OrtGuard(rac_OrtGuard&& o) noexcept : api_(o.api_), ptr_(o.ptr_) { o.ptr_ = nullptr; }
rac_OrtGuard& operator=(rac_OrtGuard&& o) noexcept {
if (this != &o) {
if (ptr_ && api_) (api_->*ReleaseFn)(ptr_);
api_ = o.api_; ptr_ = o.ptr_; o.ptr_ = nullptr;
}
return *this;
}
T** ptr() { return &ptr_; }
T* get() const { return ptr_; }
T* release() { T* t = ptr_; ptr_ = nullptr; return t; }
private:
const OrtApi* api_;
T* ptr_;
};
// Type aliases
using rac_OrtValueGuard = rac_OrtGuard<OrtValue, &OrtApi::ReleaseValue>;
using rac_OrtMemoryInfoGuard = rac_OrtGuard<OrtMemoryInfo, &OrtApi::ReleaseMemoryInfo>;
using rac_OrtSessionOptionsGuard = rac_OrtGuard<OrtSessionOptions, &OrtApi::ReleaseSessionOptions>;

`rac_OrtStatusGuard` has extra methods (`is_error`, `error_message`, `reset`) so it could remain a standalone class or inherit from the template. This is entirely optional.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-commons/include/backends/rag/ort_guards.h` around lines 9 -
161, Rename the four public RAII classes to add the required rac_ prefix:
OrtStatusGuard -> rac_OrtStatusGuard, OrtValueGuard -> rac_OrtValueGuard,
OrtMemoryInfoGuard -> rac_OrtMemoryInfoGuard, and OrtSessionOptionsGuard ->
rac_OrtSessionOptionsGuard; update all references to these class names in this
header. To reduce duplication, replace the repeated RAII implementations for
value/memoryinfo/sessionoptions with a single template rac_OrtGuard<T,
ReleaseFn> (as sketched in the review) and provide type aliases
rac_OrtValueGuard, rac_OrtMemoryInfoGuard, rac_OrtSessionOptionsGuard using the
appropriate OrtApi release member pointers; keep rac_OrtStatusGuard as either a
standalone class (preserving is_error, error_message, reset, get_address) or
refactor it to inherit/compose the template while retaining its extra methods.
Ensure constructors, destructors, move semantics and ptr()/get()/release() APIs
remain equivalent so call sites are unaffected.
std::vector<OrtValue*> input_tensors;
std::vector<const char*> input_names;
OrtStatus* status = nullptr;

// 1. input_ids: [batch_size, sequence_length]
std::vector<int64_t> current_input_ids;
if (is_first_step) {
    current_input_ids = input_ids;  // Full sequence on first step
} else {
    current_input_ids = {input_ids.back()};  // Only last token
}

std::vector<int64_t> input_ids_shape = {1, static_cast<int64_t>(current_input_ids.size())};
OrtValue* input_ids_tensor = nullptr;
status = cached_api->CreateTensorWithDataAsOrtValue(
    memory_info,
    current_input_ids.data(),
    current_input_ids.size() * sizeof(int64_t),
    input_ids_shape.data(),
    input_ids_shape.size(),
    ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64,
    &input_ids_tensor
);
if (status != nullptr) {
    LOGE("Failed to create input_ids tensor: %s", cached_api->GetErrorMessage(status));
    cached_api->ReleaseStatus(status);
    result.success = false;
    result.stop_reason = "error";
    return result;
}
if (input_ids_tensor == nullptr) {
    LOGE("input_ids_tensor is null after creation");
    result.success = false;
    result.stop_reason = "error";
    return result;
}
input_tensors.push_back(input_ids_tensor);
input_names.push_back("input_ids");

// 2. attention_mask: [batch_size, past_seq_len + current_seq_len]
const size_t total_seq_len = past_seq_len + current_seq_len;
std::vector<int64_t> attention_mask(total_seq_len, 1);
std::vector<int64_t> attention_mask_shape = {1, static_cast<int64_t>(total_seq_len)};

OrtValue* attention_mask_tensor = nullptr;
status = cached_api->CreateTensorWithDataAsOrtValue(
    memory_info,
    attention_mask.data(),
    attention_mask.size() * sizeof(int64_t),
    attention_mask_shape.data(),
    attention_mask_shape.size(),
    ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64,
    &attention_mask_tensor
);
if (status != nullptr) {
    LOGE("Failed to create attention_mask tensor: %s", cached_api->GetErrorMessage(status));
    cached_api->ReleaseStatus(status);
    result.success = false;
    result.stop_reason = "error";
    return result;
}
if (attention_mask_tensor == nullptr) {
    LOGE("attention_mask_tensor is null after creation");
    result.success = false;
    result.stop_reason = "error";
    return result;
}
input_tensors.push_back(attention_mask_tensor);
input_names.push_back("attention_mask");

// 3. position_ids: [batch_size, current_seq_len]
std::vector<int64_t> position_ids(current_seq_len);
for (size_t i = 0; i < current_seq_len; ++i) {
    position_ids[i] = past_seq_len + i;
}
std::vector<int64_t> position_ids_shape = {1, static_cast<int64_t>(current_seq_len)};

OrtValue* position_ids_tensor = nullptr;
status = cached_api->CreateTensorWithDataAsOrtValue(
    memory_info,
    position_ids.data(),
    position_ids.size() * sizeof(int64_t),
    position_ids_shape.data(),
    position_ids_shape.size(),
    ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64,
    &position_ids_tensor
);
if (status != nullptr) {
    LOGE("Failed to create position_ids tensor: %s", cached_api->GetErrorMessage(status));
    cached_api->ReleaseStatus(status);
    result.success = false;
    result.stop_reason = "error";
    return result;
}
if (position_ids_tensor == nullptr) {
    LOGE("position_ids_tensor is null after creation");
    result.success = false;
    result.stop_reason = "error";
    return result;
}
input_tensors.push_back(position_ids_tensor);
input_names.push_back("position_ids");

// 4. past_key_values: [batch_size, num_heads, past_seq_len, head_dim]
std::vector<std::string> kv_names;
kv_names.reserve(num_layers * 2);  // Reserve space to prevent reallocation (2 per layer: key + value)
for (size_t layer = 0; layer < num_layers; ++layer) {
    // past_key
    std::vector<int64_t> kv_shape = {1, static_cast<int64_t>(num_heads),
                                     static_cast<int64_t>(past_seq_len),
                                     static_cast<int64_t>(head_dim)};

    OrtValue* past_key_tensor = nullptr;
    if (past_seq_len == 0) {
        // First step: create empty tensors
        std::vector<float> empty;
        status = cached_api->CreateTensorWithDataAsOrtValue(
            memory_info,
            empty.data(),
            0,
            kv_shape.data(),
            kv_shape.size(),
            ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT,
            &past_key_tensor
        );
        if (status != nullptr) {
            LOGE("Failed to create empty past_key tensor: %s", cached_api->GetErrorMessage(status));
            cached_api->ReleaseStatus(status);
            result.success = false;
            result.stop_reason = "error";
            return result;
        }
    } else {
        status = cached_api->CreateTensorWithDataAsOrtValue(
            memory_info,
            past_keys[layer].data(),
            past_keys[layer].size() * sizeof(float),
            kv_shape.data(),
            kv_shape.size(),
            ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT,
            &past_key_tensor
        );
        if (status != nullptr) {
            LOGE("Failed to create past_key tensor: %s", cached_api->GetErrorMessage(status));
            cached_api->ReleaseStatus(status);
            result.success = false;
            result.stop_reason = "error";
            return result;
        }
    }
    if (past_key_tensor == nullptr) {
        LOGE("past_key_tensor is null after creation");
        result.success = false;
        result.stop_reason = "error";
        return result;
    }
    input_tensors.push_back(past_key_tensor);
    kv_names.push_back("past_key_values." + std::to_string(layer) + ".key");
    input_names.push_back(kv_names.back().c_str());

    // past_value
    OrtValue* past_value_tensor = nullptr;
    if (past_seq_len == 0) {
        std::vector<float> empty;
        status = cached_api->CreateTensorWithDataAsOrtValue(
            memory_info,
            empty.data(),
            0,
            kv_shape.data(),
            kv_shape.size(),
            ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT,
            &past_value_tensor
        );
        if (status != nullptr) {
            LOGE("Failed to create empty past_value tensor: %s", cached_api->GetErrorMessage(status));
            cached_api->ReleaseStatus(status);
            result.success = false;
            result.stop_reason = "error";
            return result;
        }
    } else {
        status = cached_api->CreateTensorWithDataAsOrtValue(
            memory_info,
            past_values[layer].data(),
            past_values[layer].size() * sizeof(float),
            kv_shape.data(),
            kv_shape.size(),
            ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT,
            &past_value_tensor
        );
        if (status != nullptr) {
            LOGE("Failed to create past_value tensor: %s", cached_api->GetErrorMessage(status));
            cached_api->ReleaseStatus(status);
            result.success = false;
            result.stop_reason = "error";
            return result;
        }
    }
    if (past_value_tensor == nullptr) {
        LOGE("past_value_tensor is null after creation");
        result.success = false;
        result.stop_reason = "error";
        return result;
    }
    input_tensors.push_back(past_value_tensor);
    kv_names.push_back("past_key_values." + std::to_string(layer) + ".value");
    input_names.push_back(kv_names.back().c_str());
}
Resource leak: early returns don't clean up already-created OrtValue* tensors.
Every early return between lines 464–648 leaks OrtValue* objects already pushed into input_tensors. For example, if attention_mask creation fails at line 495, input_ids_tensor (already in input_tensors) is never released. The cleanup loop at lines 674–677 only runs on the success/Run-failure path.
A scope guard or RAII wrapper over input_tensors would eliminate all leak paths:
Proposed fix sketch:

```diff
+ // RAII cleanup for input tensors on all exit paths
+ auto cleanup_inputs = [&]() {
+     for (auto* tensor : input_tensors) {
+         if (tensor) cached_api->ReleaseValue(tensor);
+     }
+     input_tensors.clear();
+ };
+
  // ... tensor creation code ...
- if (status != nullptr) {
-     LOGE("Failed to create attention_mask tensor: %s", cached_api->GetErrorMessage(status));
-     cached_api->ReleaseStatus(status);
-     result.success = false;
-     result.stop_reason = "error";
-     return result;
- }
+ if (status != nullptr) {
+     LOGE("Failed to create attention_mask tensor: %s", cached_api->GetErrorMessage(status));
+     cached_api->ReleaseStatus(status);
+     cleanup_inputs();
+     result.success = false;
+     result.stop_reason = "error";
+     return result;
+ }
```

Alternatively, wrap the tensor vector in a small RAII type whose destructor releases all held OrtValue* pointers.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-commons/src/backends/rag/onnx_generator.cpp` around lines 441-648, multiple early-return paths in the block that builds input_tensors (using
CreateTensorWithDataAsOrtValue) leak OrtValue* stored in input_tensors; ensure
all error returns release previously-created tensors. Fix by adding a cleanup
routine (or small RAII wrapper) that iterates input_tensors and calls the ONNX
release function (e.g., cached_api->ReleaseValue or the appropriate
ReleaseOrtValue) for each pointer before every early return, or prefer wrapping
input_tensors in a RAII type whose destructor releases all entries; update the
error branches around CreateTensorWithDataAsOrtValue (for input_ids_tensor,
attention_mask_tensor, position_ids_tensor, past_key_tensor, past_value_tensor)
to invoke that cleanup before returning.
```cpp
// Define f16_native_t based on platform capabilities
// USearch expects this type to be defined when FP16LIB and SIMSIMD are disabled
#if defined(__ARM_ARCH) || defined(__aarch64__) || defined(_M_ARM64)
// Try to use native ARM FP16 if available (device builds)
#if __has_include(<arm_fp16.h>) && (!defined(__APPLE__) || (defined(__APPLE__) && !TARGET_OS_SIMULATOR))
#include <arm_fp16.h>
using f16_native_t = __fp16;
#else
// Fallback for ARM without native FP16 (e.g., iOS Simulator on Apple Silicon)
#include <cstdint>
using f16_native_t = uint16_t;  // Use binary16 representation
#endif
#else
// Non-ARM platforms (x86, x86_64)
#include <cstdint>
using f16_native_t = uint16_t;  // Use binary16 representation
#endif
```
🧩 Analysis chain
🏁 Script executed:

```shell
# Check if the file currently includes TargetConditionals.h
cat -n sdk/runanywhere-commons/src/backends/rag/vector_store_usearch.cpp | head -40
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 1665
🏁 Script executed:

```shell
# Check for TargetConditionals usage in the codebase
rg -n 'TargetConditionals' --type cpp --type h --type hpp
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 101
🏁 Script executed:

```shell
# Check for other usages of TARGET_OS_SIMULATOR in the codebase to see how it's handled
rg -n 'TARGET_OS_SIMULATOR' --type cpp --type h --type hpp -B 2 -A 2
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 101
🌐 Web query:
Apple TargetConditionals.h TARGET_OS_SIMULATOR requirements
💡 Result:
To use TARGET_OS_SIMULATOR you must meet these TargetConditionals.h usage requirements:

- Include the header that defines it: add `#include <TargetConditionals.h>` (or `#import <TargetConditionals.h>`) in any C/C++/Obj-C header/source that references `TARGET_OS_SIMULATOR`. Apple specifically recommends adding this include when target conditionals aren't defined. [1]
- Use `#if`, not `#ifdef`: `TARGET_OS_SIMULATOR` is always defined (to `0` or `1`) in TargetConditionals.h, so you should write `#if TARGET_OS_SIMULATOR … #else … #endif` (not `#ifdef TARGET_OS_SIMULATOR`). [2]
- Prefer `TARGET_OS_SIMULATOR` over deprecated macros: `TARGET_IPHONE_SIMULATOR` is deprecated; use `TARGET_OS_SIMULATOR` instead. [2]
- Swift equivalent (if relevant): use `#if targetEnvironment(simulator)` instead of C macros. [3]
Sources:
[1] Apple Developer Documentation – Identifying and addressing framework module issues (Fix target conditionals: include TargetConditionals.h)
[2] Stack Overflow – Target iPhone Simulator Macro Not Working (include TargetConditionals.h; use #if TARGET_OS_SIMULATOR; notes on deprecation)
[3] Stack Overflow – How to detect if app is being built for device or simulator in Swift (targetEnvironment(simulator))
TARGET_OS_SIMULATOR is undefined without #include <TargetConditionals.h> — simulator builds will incorrectly use __fp16.
On Apple platforms, TARGET_OS_SIMULATOR must be defined by including <TargetConditionals.h>. Without this header, the macro is treated as undefined (evaluating to 0 in preprocessor conditionals), making !TARGET_OS_SIMULATOR always true. This causes iOS Simulator builds on Apple Silicon to incorrectly attempt using __fp16 instead of the uint16_t fallback, resulting in compilation failures or incorrect behavior.
Add #include <TargetConditionals.h> before the platform detection check on line 14:
Proposed fix:

```diff
 #define USEARCH_USE_SIMSIMD 0
 // Define f16_native_t based on platform capabilities
 // USearch expects this type to be defined when FP16LIB and SIMSIMD are disabled
 #if defined(__ARM_ARCH) || defined(__aarch64__) || defined(_M_ARM64)
+#if defined(__APPLE__)
+#include <TargetConditionals.h>
+#endif
 // Try to use native ARM FP16 if available (device builds)
 #if __has_include(<arm_fp16.h>) && (!defined(__APPLE__) || (defined(__APPLE__) && !TARGET_OS_SIMULATOR))
```

🧰 Tools
🪛 Clang (14.0.6)
[error] 24-24: 'cstdint' file not found
(clang-diagnostic-error)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-commons/src/backends/rag/vector_store_usearch.cpp` around lines 10-26, the preprocessor check for TARGET_OS_SIMULATOR is used without
including TargetConditionals.h, causing simulator builds to mis-detect
availability of __fp16; add a platform header include before the ARM branch so
the existing __has_include(<arm_fp16.h>) && (!defined(__APPLE__) ||
(defined(__APPLE__) && !TARGET_OS_SIMULATOR)) logic works correctly — ensure
TargetConditionals.h is included prior to that check so f16_native_t selection
(the __fp16 vs uint16_t fallback) is correct for iOS simulator builds.
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>AvailableLibraries</key>
    <array>
        <dict>
            <key>BinaryPath</key>
            <string>librac_backend_rag.a</string>
            <key>LibraryIdentifier</key>
            <string>ios-arm64_x86_64-simulator</string>
            <key>LibraryPath</key>
            <string>librac_backend_rag.a</string>
            <key>SupportedArchitectures</key>
            <array>
                <string>arm64</string>
                <string>x86_64</string>
            </array>
            <key>SupportedPlatform</key>
            <string>ios</string>
            <key>SupportedPlatformVariant</key>
            <string>simulator</string>
        </dict>
        <dict>
            <key>BinaryPath</key>
            <string>librac_backend_rag.a</string>
            <key>LibraryIdentifier</key>
            <string>ios-arm64</string>
            <key>LibraryPath</key>
            <string>librac_backend_rag.a</string>
            <key>SupportedArchitectures</key>
            <array>
                <string>arm64</string>
            </array>
            <key>SupportedPlatform</key>
            <string>ios</string>
        </dict>
    </array>
    <key>CFBundlePackageType</key>
    <string>XFWK</string>
    <key>XCFrameworkFormatVersion</key>
    <string>1.0</string>
</dict>
</plist>
```
🧩 Analysis chain
🏁 Script executed:
```shell
# Check if the .a files exist in the xcframework directory
find sdk/runanywhere-react-native/packages/rag/ios -name "*.a" -o -name "librac_backend_rag*"
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 56
🏁 Script executed:
```shell
# List the actual directory structure of the xcframework
ls -la sdk/runanywhere-react-native/packages/rag/ios/Libraries/rac_backend_rag.xcframework/
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 244
🏁 Script executed:
```shell
# Check if .a files or xcframework are in .gitignore
find . -name ".gitignore" -exec grep -l "\.a\|xcframework\|librac" {} \;
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 443
🏁 Script executed:
```shell
# Show the full structure including subdirectories of the xcframework
find sdk/runanywhere-react-native/packages/rag/ios/Libraries/rac_backend_rag.xcframework -type f
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 168
🏁 Script executed:
```shell
# Check the relevant gitignore files for .a and xcframework patterns
cat sdk/runanywhere-react-native/.gitignore
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 1547
🏁 Script executed:
```shell
# Also check the root gitignore
cat .gitignore
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 10124
XCFramework is missing required binary slices—only Info.plist is committed.
The Info.plist references librac_backend_rag.a under ios-arm64/ and ios-arm64_x86_64-simulator/ subdirectories, but these .a files do not exist in the repository. The XCFramework is structurally incomplete and will fail to link in Xcode.
Either:

- Commit the complete XCFramework including the `.a` binary slices generated by `xcodebuild -create-xcframework` in `build-ios-libs.sh`, or
- Add the entire `rac_backend_rag.xcframework` directory to `.gitignore` so it is regenerated during the build process instead of being a stale partial artifact.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-react-native/packages/rag/ios/Libraries/rac_backend_rag.xcframework/Info.plist` around lines 1-44, the Info.plist inside rac_backend_rag.xcframework
references librac_backend_rag.a for ios-arm64 and ios-arm64_x86_64-simulator but
the actual .a binary slices are missing, making the XCFramework incomplete; fix
by either committing the full XCFramework (ensure the binary slices produced by
xcodebuild -create-xcframework in build-ios-libs.sh are placed alongside
Info.plist under rac_backend_rag.xcframework) or remove/ignore the partial
rac_backend_rag.xcframework (add the directory to .gitignore) so the framework
is regenerated at build time.
@greptile please do a detailed review

@coderabbitai please do a detailed review

Tip: For best results, initiate chat on the files or code changes.
🎯 PR Scope & Architecture

Concern: Mixed responsibilities

This would make reviews more focused and rollbacks safer.

🔴 Critical Issues

1. Memory Management in RAG Pipeline (
```shell
lipo -create \
    "$SIMULATOR_ARM64_LIB_DIR/librac_backend_rag.a" \
    "$SIMULATOR_X86_LIB_DIR/librac_backend_rag.a" \
    -output "$UNIVERSAL_DIR/librac_backend_rag.a"
xcodebuild -create-xcframework \
    -library "$DEVICE_LIB_DIR/librac_backend_rag.a" \
    -library "$UNIVERSAL_DIR/librac_backend_rag.a" \
    -output "$OUTPUT_DIR/rac_backend_rag.xcframework"
```
Inconsistent approach with build-ios.sh at sdk/runanywhere-commons/scripts/build-ios.sh:426.
The main build script comments that "SIMULATOR already contains universal binary (arm64 + x86_64)" and avoids creating a fat binary with lipo, but this script still uses lipo -create to combine SIMULATORARM64 and SIMULATOR builds.
If SIMULATOR truly contains a universal binary (which is what the updated build-ios.sh assumes), then this lipo step would fail or be redundant. Check whether the SIMULATOR build actually produces a universal binary or if this script needs to be updated to match the approach in build-ios.sh.
Prompt To Fix With AI
This is a comment left during a code review.
Path: sdk/runanywhere-react-native/packages/rag/scripts/build-ios-libs.sh
Line: 89-96
Comment:
Inconsistent approach with `build-ios.sh` at `sdk/runanywhere-commons/scripts/build-ios.sh:426`.
The main build script comments that "SIMULATOR already contains universal binary (arm64 + x86_64)" and avoids creating a fat binary with `lipo`, but this script still uses `lipo -create` to combine SIMULATORARM64 and SIMULATOR builds.
If SIMULATOR truly contains a universal binary (which is what the updated `build-ios.sh` assumes), then this `lipo` step would fail or be redundant. Check whether the SIMULATOR build actually produces a universal binary or if this script needs to be updated to match the approach in `build-ios.sh`.
How can I resolve this? If you propose a fix, please make it concise.
Fixes #349, in which the react-native-ios example app wasn't building. The fix only required updating the build scripts. The docs themselves are still hardcoded, as the previous author left them; this PR makes react-native-ios buildable.
Summary by CodeRabbit
Release Notes
New Features
Improvements
Documentation
Greptile Summary
This PR fixes the React Native iOS example app build issues by updating build scripts and configuration. The changes enable the new RAG (Retrieval-Augmented Generation) package to work on iOS and improve overall build reliability.
Key Changes:
- `@runanywhere/rag` package with iOS build scripts and XCFramework bundling

Issues Found:
build-ios.shandbuild-ios-libs.shregarding simulator binary handling (see inline comment)Confidence Score: 3/5
`build-ios.sh` assumes SIMULATOR produces a universal binary and skips `lipo`, while `build-ios-libs.sh` still uses `lipo -create` to combine SIMULATORARM64 and SIMULATOR. This needs verification before merging.

- `sdk/runanywhere-react-native/packages/rag/scripts/build-ios-libs.sh` - verify the simulator binary approach matches `build-ios.sh`

Important Files Changed
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[React Native iOS App] --> B[Initialize NitroModules Globally]
    B --> C{Check Cached Proxy}
    C -->|Exists| D[Return Cached]
    C -->|Not Exists| E[Call NitroModules.install]
    E --> F[Cache Proxy]
    F --> G[Register Packages]
    G --> H[Register @runanywhere/core]
    G --> I[Register @runanywhere/rag]
    G --> J[Register @runanywhere/onnx]
    G --> K{LlamaCPP Available?}
    K -->|Yes| L[Register @runanywhere/llamacpp]
    K -->|No| M[Skip LlamaCPP]
    N[Build Process] --> O[build-ios.sh]
    O --> P[Build Commons + Backends]
    P --> Q[Create XCFrameworks]
    Q --> R[iOS Device arm64]
    Q --> S[iOS Simulator universal]
    T[RAG Package Build] --> U[build-ios-libs.sh]
    U --> V[Call build-ios.sh]
    V --> W[Create RAG XCFramework]
    W --> X[lipo SIMULATORARM64 + SIMULATOR]
    X --> Y[Bundle in npm package]
    style X fill:#ffcccc
    style S fill:#ccffcc
```

Last reviewed commit: c012679