WIP | Fix parameter piping through sdk #340
shubhammalhotra28 merged 7 commits into RunanywhereAI:main from
Conversation
📝 Walkthrough
Adds persisted generation settings (temperature, maxTokens, optional systemPrompt) across samples and SDKs, propagates them into LLM generation calls, moves LlamaCpp sampler configuration to per-request, replaces many JNI ad-hoc JSON paths with nlohmann::json, and removes UI pitch controls in TTS samples.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant UI as User/UI
    participant SettingsVM as SettingsViewModel / ViewModels
    participant Storage as SharedPrefs / UserDefaults / AsyncStorage
    participant SDK as RunAnywhere SDK (platform bridge)
    participant JNI as JNI/native bridge (nlohmann::json)
    participant Backend as LlamaCpp Backend
    UI->>SettingsVM: Set generation settings (temp, maxTokens, systemPrompt)
    SettingsVM->>Storage: Save generation settings
    Storage-->>SettingsVM: Ack
    UI->>SDK: Request generation (prompt)
    SDK->>SettingsVM: Request current generation options
    SettingsVM->>Storage: Read generation settings
    Storage-->>SettingsVM: Return settings
    SettingsVM-->>SDK: Return LLMGenerationOptions (temp, maxTokens, systemPrompt)
    SDK->>JNI: Serialize request (JSON includes system_prompt)
    JNI->>Backend: Forward parsed request (system_prompt forwarded)
    Backend->>Backend: Build per-request sampler from temperature/top_p/top_k
    Backend-->>JNI: Return generated tokens/text
    JNI-->>SDK: Return result
    SDK-->>UI: Display generated output
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 2 failed (1 warning, 1 inconclusive)
For the example iOS app, removed the TTS pitch slider.
```diff
 // Voice Settings
 @Published var speechRate: Double = 1.0
-@Published var pitch: Double = 1.0
+@Published var pitch: Double = 1.0 // while removed from the UI, the backend still supports pitch, so maintinaining it here.
```

`maintinaining` is misspelled.

```suggestion
@Published var pitch: Double = 1.0 // while removed from the UI, the backend still supports pitch, so maintaining it here.
```
Prompt To Fix With AI
This is a comment left during a code review.
Path: examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/TTSViewModel.swift
Line: 29:29
Comment:
`maintinaining` is mis-spelled
```suggestion
@Published var pitch: Double = 1.0 // while removed from the UI, the backend still supports pitch, so maintaining it here.
```
How can I resolve this? If you propose a fix, please make it concise.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
examples/flutter/RunAnywhereAI/lib/features/models/model_selection_sheet.dart (1)

504-515: ⚠️ Potential issue | 🟡 Minor
`_isLoadingModel` not reset on the success path. In contrast to `_selectSystemTTS` (line 483), the success path here never sets `_isLoadingModel = false` before popping. If the widget isn't immediately disposed after `Navigator.pop` (or if `mounted` becomes `false` before the pop), the loading overlay state is left dangling.

Suggested fix

```diff
 debugPrint('🎯 Model selected: ${model.id}, calling callback to load');
 await widget.onModelSelected(model);
 if (mounted) {
+  setState(() {
+    _isLoadingModel = false;
+  });
   Navigator.pop(context);
 }
```

sdk/runanywhere-commons/src/backends/llamacpp/rac_llm_llamacpp.cpp (1)
155-178: ⚠️ Potential issue | 🔴 Critical
`top_k` and `repetition_penalty` parameters are not propagated from the C API to the backend, and the C struct must be updated first. The backend's `generate_stream()` uses `request.top_k` and `request.repetition_penalty` to configure the sampler, but the C API layer does not copy these from options. More critically, the `rac_llm_options_t` struct itself lacks these fields entirely, preventing callers from setting them at all.

This requires three changes:

- Add `int32_t top_k` and `float repetition_penalty` to `rac_llm_options_t` in `rac_llm_types.h`
- Update `RAC_LLM_OPTIONS_DEFAULT` with appropriate defaults
- Copy these fields in both `rac_llm_llamacpp_generate` (line 155) and `rac_llm_llamacpp_generate_stream` (line 217)

The same issue affects both functions.

Example of the propagation addition needed

```diff
 if (options != nullptr) {
     request.max_tokens = options->max_tokens;
     request.temperature = options->temperature;
     request.top_p = options->top_p;
+    request.top_k = options->top_k;
+    request.repetition_penalty = options->repetition_penalty;
     if (options->system_prompt != nullptr) {
         request.system_prompt = options->system_prompt;
     }
```

examples/flutter/RunAnywhereAI/lib/features/chat/chat_interface_view.dart (1)
135-136: ⚠️ Potential issue | 🟠 Major
`systemPrompt` is not passed to the tool-calling path. When `useToolCalling` is true on line 136, `_generateWithToolCalling` is called with only `maxTokens` and `temperature`. The `systemPrompt` retrieved on line 126 is silently dropped. The method signature (line 175) and `ToolCallingOptions` instantiation (line 204) do not include it, even though `ToolCallingOptions` supports the `systemPrompt` parameter (defined in tool_calling_types.dart:291).

This means users who set a system prompt won't have it applied when tool calling is enabled, creating inconsistency with the non-tool-calling path, which correctly passes `systemPrompt` to `LLMGenerationOptions` on line 142.

Proposed fix — update the method signature and call site:

```diff
 if (useToolCalling) {
-  await _generateWithToolCalling(userMessage, maxTokens, temperature);
+  await _generateWithToolCalling(userMessage, maxTokens, temperature, systemPrompt);
 } else {
```

And update `_generateWithToolCalling`:

```diff
 Future<void> _generateWithToolCalling(
   String prompt,
   int maxTokens,
   double temperature,
+  String? systemPrompt,
 ) async {
   ...
   final result = await RunAnywhereTools.generateWithTools(
     prompt,
     options: ToolCallingOptions(
       maxToolCalls: 3,
       autoExecute: true,
       formatName: format,
       maxTokens: maxTokens,
       temperature: temperature,
+      systemPrompt: systemPrompt,
     ),
   );
```

sdk/runanywhere-kotlin/src/jvmAndroidMain/kotlin/com/runanywhere/sdk/foundation/bridge/extensions/CppBridgeLLM.kt (1)
223-258: ⚠️ Potential issue | 🟠 Major
Move `GenerationConfig` to `commonMain`. This change extends a data structure in `jvmAndroidMain`. Per KMP guidelines, business logic and structures should live in `commonMain`, with platform source sets limited to platform glue. Please relocate `GenerationConfig` (and related config types) to `commonMain` and keep this file platform-only.

As per coding guidelines, "All business logic, protocols, interfaces, and structures MUST be defined in `commonMain/` for Kotlin Multiplatform projects."
🤖 Fix all issues with AI agents
In
`@examples/android/RunAnywhereAI/app/src/main/java/com/runanywhere/runanywhereai/presentation/chat/ChatViewModel.kt`:
- Around line 853-869: The getGenerationOptions function currently logs the full
systemPrompt value which may leak sensitive content; update the Log.i call in
getGenerationOptions to avoid printing the raw systemPrompt and instead log only
its presence or length (e.g., systemPrompt?.length ?: 0) or a redacted
placeholder, keeping maxTokens and temperature as-is; ensure the signature and
returned com.runanywhere.sdk.public.extensions.LLM.LLMGenerationOptions remain
unchanged and replace only the logged systemPrompt usage.
In
`@examples/flutter/RunAnywhereAI/lib/features/settings/combined_settings_view.dart`:
- Around line 582-594: The TextEditingController is being recreated inline in
_buildGenerationSettingsCard causing cursor/focus loss and leakage; fix by
adding a persistent controller field (e.g., late final TextEditingController
_systemPromptController) on the _CombinedSettingsViewState, initialize it in
initState (setting its text from _systemPrompt or calling
_loadGenerationSettings to sync), update _loadGenerationSettings to write the
saved prompt into _systemPromptController.text whenever settings load, use
_systemPromptController (instead of creating a new controller) in the TextField
inside _buildGenerationSettingsCard, handle onChanged by updating _systemPrompt,
and dispose _systemPromptController in dispose.
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel.swift`:
- Around line 327-342: The logger currently interpolates the raw system prompt
(effectiveSystemPrompt) in LLMViewModel which can leak sensitive data; change
the logging at the logger.info call to avoid including the prompt text and
instead log a safe indicator such as whether a system prompt exists and its
length (or a short secure hash like SHA-256) computed from
effectiveSystemPrompt, e.g., log systemPromptPresent=(effectiveSystemPrompt !=
nil) and systemPromptLength=effectiveSystemPrompt?.count or
systemPromptHash=<short-hash>, and apply the same redaction approach to the
other logging site referenced (lines around 359-365) so no raw prompt text is
written to logs.
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/TextToSpeechView.swift`:
- Around line 236-252: Remove the accidental "claud" token and the entire
commented-out Pitch UI block in TextToSpeechView (the TODO for pitch can
remain); specifically delete the stray "{claud" text and the commented VStack
that references viewModel.pitch so the code is clean—keep only the existing TODO
comment about implementing pitch support in TTS.
In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/TTSViewModel.swift`:
- Line 29: Fix the typo in the comment for the published property pitch in
TTSViewModel (the line with "@Published var pitch: Double = 1.0") by changing
"maintinaining" to "maintaining" so the comment reads correctly that the backend
still supports pitch; no code logic changes required—only update the inline
comment text.
In `@Package.swift`:
- Line 43: The constant useLocalBinaries is currently set to true which enables
local Binaries/ paths and will break downstream SPM consumers; change
useLocalBinaries to false so the Package.swift uses the production binary
resolution (disable local Binaries usage), and verify any conditional logic that
references useLocalBinaries (the variable named useLocalBinaries and the code
that selects Binaries/ paths) will select the non-local/remote binary URLs
instead before merging to main.
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp`:
- Around line 548-571: Add top_k (int) and repetition_penalty (float) to the
public rac_llm_options_t struct and ensure they are initialized with sensible
defaults; then modify the C API conversion in rac_llm_llamacpp.cpp to copy
options.top_k -> request.top_k and options.repetition_penalty ->
request.repetition_penalty in both generate() and generate_stream() call paths
so the existing llama_cpp backend code (which reads request.top_k and
request.repetition_penalty in llamacpp_backend.cpp) can honor caller-specified
values. Ensure the field names match exactly (top_k, repetition_penalty) and
preserve existing behavior when options are unset by using the new defaults.
In
`@sdk/runanywhere-kotlin/src/commonMain/kotlin/com/runanywhere/sdk/public/extensions/LLM/LLMTypes.kt`:
- Around line 107-108: The Kotlin LLMGenerationOptions currently claims to
"mirror Swift LLMGenerationOptions exactly" but its defaults for maxTokens and
temperature differ; update the defaults in LLMGenerationOptions so maxTokens =
100 and temperature = 0.8f to match the Swift implementation, ensuring the
property declarations for maxTokens and temperature in that class reflect these
values (or alternatively remove the "mirrors exactly" phrase if you
intentionally want different defaults).
In `@sdk/runanywhere-react-native/packages/core/cpp/HybridRunAnywhereCore.cpp`:
- Around line 1426-1430: In generate(), after calling the LLM and extracting
values from the local llmResult (the rac_llm_result_t returned when using
rac_llm_options_t options), add a call to rac_llm_result_free(&llmResult) before
the function returns to free resources; mirror the cleanup present in
generateStructured() so llmResult is freed after use and before any return path.
🧹 Nitpick comments (10)
examples/android/RunAnywhereAI/app/src/main/java/com/runanywhere/runanywhereai/presentation/tts/TextToSpeechScreen.kt (1)

485-514: Consider removing the dead code instead of commenting it out, or suppress the unused-parameter warnings.
The pitch slider is commented out, but `VoiceSettingsSection` (lines 437-439) still accepts `pitch` and `onPitchChange` parameters that are now unused. The KDoc on line 49 also still references "pitch sliders." Since this is a WIP PR, this is understandable, but before merging to `main`:

- Either remove the commented block and the unused parameters entirely (the code is in version control if needed later).
- Or, if you intend to re-enable it soon, add `@Suppress("UNUSED_PARAMETER")` annotations (as you did for `maxCharacters` on line 367) to `pitch` and `onPitchChange` to keep the build warning-free, and update the KDoc.

examples/flutter/RunAnywhereAI/lib/features/voice/text_to_speech_view.dart (1)

447-463: `_pitch` field (line 55) is now dead code.
With the slider commented out, `_pitch` is always `1.0` and is still passed to `synthesize` on line 189. Consider also commenting out (or removing) the field and the `pitch:` argument to keep things consistent, or add a brief `// TODO` near the field so it's easy to find when pitch support is re-enabled.

sdk/runanywhere-react-native/packages/core/cpp/HybridRunAnywhereCore.cpp (1)
1432-1447: Pre-existing: `rac_llm_result_free` not called in `generate()`.
`generateStructured()` correctly calls `rac_llm_result_free(&llmResult)` on line 1624, but `generate()` does not free `llmResult` after extracting the text. If the C API allocates memory for `llmResult.text`, this is a resource leak. Not introduced by this PR, but worth addressing since you're already modifying this function.

🔧 Proposed fix

```diff
 std::string text = llmResult.text ? llmResult.text : "";
 int tokensUsed = llmResult.completion_tokens;
+rac_llm_result_free(&llmResult);
 return buildJsonObject({
```

sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp (1)

573-579: Logging looks good for observability.
Per-request parameter logging will help diagnose parameter-piping issues. Consider also logging the `repetition_penalty` default when `temperature <= 0` (greedy path), since users may not realize it's being ignored.

sdk/runanywhere-commons/src/backends/llamacpp/rac_llm_llamacpp.cpp (1)
170-178: Parameter logging is helpful, but should include `top_k` and `repetition_penalty` once they're propagated.
Currently the log lines only report `max_tokens`, `temperature`, `top_p`, and `system_prompt`. Once the missing fields are wired through, update the log format to include them for complete diagnostics.

Also applies to: 231-239
sdk/runanywhere-commons/src/jni/runanywhere_commons_jni.cpp (3)
1780-1786: Inconsistent JSON array construction — manual string concatenation remains.
`modelInfoToJson` was migrated to nlohmann::json, but the array assembly in `racModelRegistryGetAll` and `racModelRegistryGetDownloaded` still manually concatenates `"["`, `","`, and `"]"` around the results. Since each element is already a `dump()`'d string, re-parsing it into a `nlohmann::json` array would be wasteful — but the current approach is technically fragile if `modelInfoToJson` ever returns a string with unescaped characters (it won't with nlohmann, but the pattern is inconsistent with the rest of the migration).

Consider having `modelInfoToJson` return a `nlohmann::json` object directly, then build the array with `nlohmann::json::array()` and `push_back`, consistent with how `racModelAssignmentFetch` (line 2049) does it.

Also applies to: 1810-1816
3182-3228: Tool-call JSON response still uses manual string escaping — inconsistent and fragile.
The `racToolCallParse` function hand-rolls JSON escaping for `clean_text` (lines 3189-3199) and does ad-hoc validation of `arguments_json` (lines 3208-3219). This is the exact kind of code the nlohmann migration was meant to replace. The manual escaping misses characters like `\b`, `\f`, and Unicode control characters, which could produce invalid JSON.

Consider migrating this to nlohmann::json for safety and consistency with the rest of the file.
536-553: Extract duplicated LLM config parsing into a helper function.
The exact same config-parsing block is copy-pasted three times: lines 538-553 (`racLlmComponentGenerate`), lines 813-828 (`racLlmComponentGenerateStream`), and lines 926-941 (`racLlmComponentGenerateStreamWithCallback`). This DRY violation makes it easy for future changes to diverge across implementations. `RAC_NULL` is properly defined as `nullptr` in C++ (see `rac_types.h`), so that usage is correct.

Consider extracting the common logic into a static helper:

```cpp
static void parseLlmConfigJson(const char* config, rac_llm_options_t& options,
                               std::string& sys_prompt_storage) {
    if (config == nullptr) return;
    try {
        auto j = nlohmann::json::parse(config);
        options.max_tokens = j.value("max_tokens", 512);
        options.temperature = j.value("temperature", 0.7f);
        options.top_p = j.value("top_p", 1.0f);
        sys_prompt_storage = j.value("system_prompt", std::string(""));
        if (!sys_prompt_storage.empty()) {
            options.system_prompt = sys_prompt_storage.c_str();
        }
    } catch (const nlohmann::json::exception& e) {
        LOGe("Failed to parse LLM config JSON: %s", e.what());
    }
}
```

Then replace each block with:

```cpp
std::string sys_prompt_storage;
parseLlmConfigJson(config, options, sys_prompt_storage);
```

sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+TextGeneration.swift (2)
226-231: Pre-existing: `LLMStreamCallbackContext` is never released on the success or callback-error path.
Line 196 retains the context via `passRetained`, and line 227 correctly releases it on the early-failure path. However, neither the `completeCallback` (line 269) nor the `errorCallback` (line 276) calls `takeRetainedValue()` to balance the retain — they both use `takeUnretainedValue()`. This leaks the context object on every completed stream. Not introduced by this PR, but since you're actively modifying this function, it's a good opportunity to fix it.

🔧 Proposed fix in the callbacks
In the `completeCallback` and `errorCallback` closures, switch to `takeRetainedValue()` to release the context:

```diff
 let completeCallback: CompleteFn = { _, userData in
     guard let userData = userData else { return }
-    let ctx = Unmanaged<LLMStreamCallbackContext>.fromOpaque(userData).takeUnretainedValue()
+    let ctx = Unmanaged<LLMStreamCallbackContext>.fromOpaque(userData).takeRetainedValue()
     ctx.continuation.finish()
     Task { await ctx.collector.markComplete() }
 }

 let errorCallback: ErrorFn = { _, errorMsg, userData in
     guard let userData = userData else { return }
-    let ctx = Unmanaged<LLMStreamCallbackContext>.fromOpaque(userData).takeUnretainedValue()
+    let ctx = Unmanaged<LLMStreamCallbackContext>.fromOpaque(userData).takeRetainedValue()
     let message = errorMsg.map { String(cString: $0) } ?? "Unknown error"
     let error = SDKError.llm(.generationFailed, message)
     ctx.continuation.finish(throwing: error)
     Task { await ctx.collector.markFailed(error) }
 }
```
61-62: Logging may emit unexpectedly verbose output in production.
These `[PARAMS]` log lines fire on every single generation request and include parameter values. This is useful during development but could add noise for SDK consumers in production. Consider gating behind `.debug` level instead of `.info`.

Also applies to: 164-165
Resolved review comments:

- ...nywhereAI/app/src/main/java/com/runanywhere/runanywhereai/presentation/chat/ChatViewModel.kt
- examples/flutter/RunAnywhereAI/lib/features/settings/combined_settings_view.dart (outdated)
- examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel.swift (outdated)
```swift
// TODO: Find a model for TTS that supports pitch, or manually implement a good quality pitch adjustment

// Pitch (not implemented in the current TTS models. Once supported, we can have this back.)
// VStack(alignment: .leading, spacing: 10) {claud
//     HStack {
//         Text("Pitch")
//             .font(.subheadline)
//             .foregroundColor(.secondary)
//         Spacer()
//         Text(String(format: "%.1fx", viewModel.pitch))
//             .font(.system(size: 15, weight: .medium, design: .rounded))
//             .foregroundColor(.primary)
//     }
//     Slider(value: $viewModel.pitch, in: 0.5...2.0, step: 0.1)
//         .tint(AppColors.primaryPurple)
// }
```
Stray text "claud" on line 240 and unnecessary commented-out code block.
Line 240 has {claud appended — looks like an accidental paste. Also, consider removing the commented-out code entirely and keeping only the TODO on line 237. The git history preserves the old implementation if needed later.
Suggested cleanup
```diff
- // TODO: Find a model for TTS that supports pitch, or manually implement a good quality pitch adjustment
-
- // Pitch (not implemented in the current TTS models. Once supported, we can have this back.)
- // VStack(alignment: .leading, spacing: 10) {claud
- // HStack {
- // Text("Pitch")
- // .font(.subheadline)
- // .foregroundColor(.secondary)
- // Spacer()
- // Text(String(format: "%.1fx", viewModel.pitch))
- // .font(.system(size: 15, weight: .medium, design: .rounded))
- // .foregroundColor(.primary)
- // }
- // Slider(value: $viewModel.pitch, in: 0.5...2.0, step: 0.1)
- // .tint(AppColors.primaryPurple)
- // }
+ // TODO: Find a model for TTS that supports pitch, or manually implement a good quality pitch adjustment
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```swift
// TODO: Find a model for TTS that supports pitch, or manually implement a good quality pitch adjustment
```
🤖 Prompt for AI Agents
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/TextToSpeechView.swift`
around lines 236 - 252, Remove the accidental "claud" token and the entire
commented-out Pitch UI block in TextToSpeechView (the TODO for pitch can
remain); specifically delete the stray "{claud" text and the commented VStack
that references viewModel.pitch so the code is clean—keep only the existing TODO
comment about implementing pitch support in TTS.
Resolved review comment: examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/TTSViewModel.swift (outdated)
```kotlin
val maxTokens: Int = 1000,
val temperature: Float = 0.7f,
```
🧩 Analysis chain
🏁 Script executed:

```shell
find . -type f -name "*.swift" | head -20
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 1515

🏁 Script executed:

```shell
find . -type f -name "*LLMTypes*" -o -name "*LLMGenerationOptions*"
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 322

🏁 Script executed:

```shell
rg -l "LLMGenerationOptions" --type swift
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 1219

🏁 Script executed:

```shell
rg -n "maxTokens|temperature" --type swift -C 3 -B 2 -A 2
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 50388
Defaults do not match Swift SDK as claimed.
The LLMGenerationOptions init comment states this "Mirrors Swift LLMGenerationOptions exactly," but the Swift implementation (line 178-179 in sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/LLMTypes.swift) has maxTokens: Int = 100 and temperature: Float = 0.8. The Kotlin version differs with maxTokens = 1000 and temperature = 0.7f. Either align the Kotlin defaults to match Swift (100, 0.8) or remove the "mirrors exactly" claim.
🤖 Prompt for AI Agents
In
`@sdk/runanywhere-kotlin/src/commonMain/kotlin/com/runanywhere/sdk/public/extensions/LLM/LLMTypes.kt`
around lines 107 - 108, The Kotlin LLMGenerationOptions currently claims to
"mirror Swift LLMGenerationOptions exactly" but its defaults for maxTokens and
temperature differ; update the defaults in LLMGenerationOptions so maxTokens =
100 and temperature = 0.8f to match the Swift implementation, ensuring the
property declarations for maxTokens and temperature in that class reflect these
values (or alternatively remove the "mirrors exactly" phrase if you
intentionally want different defaults).
```cpp
rac_llm_options_t options = {};
options.max_tokens = maxTokens;
options.temperature = temperature;
options.top_p = 0.9f;
options.system_prompt = systemPrompt.empty() ? nullptr : systemPrompt.c_str();
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Find header files related to rac_llm
fd "rac_llm" --type f -e h -e hpp | head -20
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 1496

🏁 Script executed:

```shell
#!/bin/bash
# Search for system_prompt field definition
rg "system_prompt" --type c --type cpp --type h -B 2 -A 2 | head -50
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 5416

🏁 Script executed:

```shell
#!/bin/bash
# Look at the context around lines 1426-1430 and also check the generate methods
sed -n '1420,1450p' sdk/runanywhere-react-native/packages/core/cpp/HybridRunAnywhereCore.cpp | cat -n
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 1528

🏁 Script executed:

```shell
#!/bin/bash
# Check for rac_llm_result_free calls in the file
rg "rac_llm_result_free" sdk/runanywhere-react-native/packages/core/cpp/HybridRunAnywhereCore.cpp -B 2 -A 2
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 230

🏁 Script executed:

```shell
#!/bin/bash
# Look at the rac_llm_types.h in the commons directory
cat sdk/runanywhere-commons/include/rac/features/llm/rac_llm_types.h | grep -A 20 "typedef.*rac_llm_options"
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 660

🏁 Script executed:

```shell
#!/bin/bash
# Check the exact context around line 1432 to see if rac_llm_result_free is called
sed -n '1430,1450p' sdk/runanywhere-react-native/packages/core/cpp/HybridRunAnywhereCore.cpp | cat -n
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 978
Add rac_llm_result_free() call in the generate() method.
The system_prompt field in rac_llm_options_t is correctly defined and properly set. However, the generate() method (lines 1432-1449) does not call rac_llm_result_free(&llmResult) after using the result, while generateStructured() does. This creates a resource leak. Add the cleanup call after extracting values from llmResult and before the return statement.
🤖 Prompt for AI Agents
In `@sdk/runanywhere-react-native/packages/core/cpp/HybridRunAnywhereCore.cpp`
around lines 1426 - 1430, In generate(), after calling the LLM and extracting
values from the local llmResult (the rac_llm_result_t returned when using
rac_llm_options_t options), add a call to rac_llm_result_free(&llmResult) before
the function returns to free resources; mirror the cleanup present in
generateStructured() so llmResult is freed after use and before any return path.
addressing the bot issues
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
examples/android/RunAnywhereAI/app/src/main/java/com/runanywhere/runanywhereai/presentation/chat/ChatViewModel.kt (1)
264-270: ⚠️ Potential issue | 🟡 Minor
Tool-calling path ignores user generation settings.
`generateWithToolCalling` hardcodes `temperature = 0.7f` and `maxTokens = 1024`, bypassing the user's persisted preferences and system prompt that `getGenerationOptions()` provides to the other two generation paths. This means changing settings in the UI has no effect when tool calling is active.

Consider reading from `getGenerationOptions()` here as well (or at least forwarding the user's temperature, maxTokens, and systemPrompt into `ToolCallingOptions`).
🤖 Fix all issues with AI agents
In
`@examples/android/RunAnywhereAI/app/src/main/java/com/runanywhere/runanywhereai/presentation/chat/ChatViewModel.kt`:
- Around line 83-85: Replace the hardcoded preference file name
"generation_settings" used when initializing generationPrefs with a shared
constant so both view models reference the same identifier; either create a
PreferencesConstants.kt with a public val (e.g., PREFS_NAME) and import it into
ChatViewModel (used in generationPrefs) and SettingsViewModel, or make
SettingsViewModel.PREFS_NAME public and reference that constant from
ChatViewModel.getSharedPreferences to avoid drift.
🧹 Nitpick comments (2)
examples/flutter/RunAnywhereAI/lib/features/settings/combined_settings_view.dart (1)
506-619: Generation settings card is well-structured.
Sliders, labels, and the system prompt field are cleanly laid out. The `onChanged` on `TextField` (lines 597-598) correctly updates `_systemPrompt` without needing `setState`, since the controller handles display — good.

One thing to consider: if the user modifies sliders/prompt and navigates away without pressing "Save Settings", changes are silently discarded. A minor UX improvement would be to show a discard-confirmation dialog or auto-save, but this is fine for a settings screen in a WIP state.
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel.swift (1)
336-350: Inconsistent indentation in `getGenerationOptions()`.
Lines 336-350 are flush-left while the rest of the function body (lines 325-334) uses standard indentation. This appears to be an accidental formatting regression, likely from a bad merge or paste.
🔧 Proposed indentation fix
```diff
-let systemPromptInfo: String = {
-    guard let prompt = effectiveSystemPrompt else { return "nil" }
-    return "set(\(prompt.count) chars)"
-}()
-
-logger.info(
-    "[PARAMS] App getGenerationOptions: temperature=\(effectiveSettings.temperature), maxTokens=\(effectiveSettings.maxTokens), systemPrompt=\(systemPromptInfo)"
-)
-
-return LLMGenerationOptions(
-    maxTokens: effectiveSettings.maxTokens,
-    temperature: Float(effectiveSettings.temperature),
-    systemPrompt: effectiveSystemPrompt
-)
-}
+        let systemPromptInfo: String = {
+            guard let prompt = effectiveSystemPrompt else { return "nil" }
+            return "set(\(prompt.count) chars)"
+        }()
+
+        logger.info(
+            "[PARAMS] App getGenerationOptions: temperature=\(effectiveSettings.temperature), maxTokens=\(effectiveSettings.maxTokens), systemPrompt=\(systemPromptInfo)"
+        )
+
+        return LLMGenerationOptions(
+            maxTokens: effectiveSettings.maxTokens,
+            temperature: Float(effectiveSettings.temperature),
+            systemPrompt: effectiveSystemPrompt
+        )
+    }
```
```kotlin
private val generationPrefs by lazy {
    getApplication<Application>().getSharedPreferences("generation_settings", android.content.Context.MODE_PRIVATE)
}
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Search for all occurrences of "generation_settings" and related SharedPreferences usage
echo "=== Searching for 'generation_settings' string ==="
rg -n "generation_settings" --type=kt
echo -e "\n=== Searching for SharedPreferences writes in SettingsViewModel ==="
fd -t f "SettingsViewModel.kt" --exec rg -l "SharedPreferences\|putFloat\|putInt\|putString" {} \;
echo -e "\n=== Checking if SettingsViewModel exists and what SharedPreferences keys it uses ==="
fd -t f "SettingsViewModel.kt" -x cat -n {}
```

Repository: RunanywhereAI/runanywhere-sdks

Length of output: 46453
🏁 Script executed:
```bash
#!/bin/bash
# Find ChatViewModel.kt
echo "=== Finding ChatViewModel.kt ==="
fd "ChatViewModel" --type f
echo -e "\n=== Reading ChatViewModel.kt around lines 83-85 ==="
fd "ChatViewModel.kt" --type f -x cat -n {} | head -100
echo -e "\n=== Search for all uses of 'generation_settings' with correct syntax ==="
rg "generation_settings" -t kotlin --no-heading
echo -e "\n=== Check if ChatViewModel imports or uses SettingsViewModel constants ==="
fd "ChatViewModel.kt" --type f -x rg "PREFS_NAME\|generation_settings" -n {}
```

Length of output: 5890
🏁 Script executed:
```bash
#!/bin/bash
# Find what ChatViewModel does with generationPrefs - look for getGenerationOptions
echo "=== Find getGenerationOptions method in ChatViewModel ==="
fd "ChatViewModel.kt" --type f -x sed -n '850,900p' {}
echo -e "\n=== Check if ChatViewModel could use SettingsViewModel.getGenerationSettings() ==="
fd "ChatViewModel.kt" --type f -x grep -n "getGenerationSettings\|generationPrefs" {}
echo -e "\n=== Check full context of generation prefs usage ==="
fd "ChatViewModel.kt" --type f -x sed -n '83,400p' {} | head -200
```

Length of output: 10463
Extract SharedPreferences file name to a shared constant.
The preference file name "generation_settings" is hardcoded as a string literal. If the name defined in SettingsViewModel changes, ChatViewModel will silently use stale preferences. Extract this into a constants file (e.g., PreferencesConstants.kt) and import it in both view models, or make SettingsViewModel.PREFS_NAME public so ChatViewModel can reference it directly.
🤖 Prompt for AI Agents
In
`@examples/android/RunAnywhereAI/app/src/main/java/com/runanywhere/runanywhereai/presentation/chat/ChatViewModel.kt`
around lines 83 - 85, Replace the hardcoded preference file name
"generation_settings" used when initializing generationPrefs with a shared
constant so both view models reference the same identifier; either create a
PreferencesConstants.kt with a public val (e.g., PREFS_NAME) and import it into
ChatViewModel (used in generationPrefs) and SettingsViewModel, or make
SettingsViewModel.PREFS_NAME public and reference that constant from
ChatViewModel.getSharedPreferences to avoid drift.
@coderabbitai please do a detailed review
@greptile please do a detailed review
✅ Actions performed

Full review triggered.
```kotlin
val showRestartDialog: Boolean = false,
// Generation Settings
val temperature: Float = 0.7f,
val maxTokens: Int = 1000,
```
Inconsistent default maxTokens across platforms: Android uses 1000, iOS uses 10000. Consider standardizing.
Prompt To Fix With AI
This is a comment left during a code review.
Path: examples/android/RunAnywhereAI/app/src/main/java/com/runanywhere/runanywhereai/presentation/settings/SettingsViewModel.kt
Line: 54:54
Comment:
Inconsistent default `maxTokens` across platforms: Android uses 1000, iOS uses 10000. Consider standardizing.
How can I resolve this? If you propose a fix, please make it concise.

```diff
@@ -18,6 +18,7 @@ class SettingsViewModel: ObservableObject {
     // Generation Settings
     @Published var temperature: Double = 0.7
     @Published var maxTokens: Int = 10000
```
Inconsistent default maxTokens: iOS uses 10000 while Android, Flutter, and React Native use 1000. Consider standardizing to 1000 for consistency.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Prompt To Fix With AI
This is a comment left during a code review.
Path: examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift
Line: 20:20
Comment:
Inconsistent default `maxTokens`: iOS uses 10000 while Android, Flutter, and React Native use 1000. Consider standardizing to 1000 for consistency.
<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>
How can I resolve this? If you propose a fix, please make it concise.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
examples/flutter/RunAnywhereAI/lib/features/chat/chat_interface_view.dart (1)
124-142: ⚠️ Potential issue | 🟡 Minor — System prompt is not propagated through the tool-calling path.

When `useToolCalling` is true (Line 135), `_generateWithToolCalling` is called with only `maxTokens` and `temperature` — the `systemPrompt` read on Lines 124-126 is silently dropped. If the user configures a system prompt in settings, it won't take effect during tool-calling generation.

Consider passing `systemPrompt` into `_generateWithToolCalling` and incorporating it into `ToolCallingOptions`, or document that system prompts are intentionally unsupported for tool-calling mode.

examples/android/RunAnywhereAI/app/src/main/java/com/runanywhere/runanywhereai/presentation/chat/ChatViewModel.kt (1)
245-270: ⚠️ Potential issue | 🟡 Minor — Use `getGenerationOptions()` for consistency with other generation paths.

`generateWithToolCalling()` hardcodes `temperature = 0.7f` and `maxTokens = 1024` (lines 267-268), while `generateStream()` and `generate()` read from `SharedPreferences` via `getGenerationOptions()`. This means users' custom temperature/maxTokens settings are ignored when tool calling is active.

Apply the same pattern here:

Proposed fix

```diff
+val genOptions = getGenerationOptions()
 val toolOptions = ToolCallingOptions(
     maxToolCalls = 3,
     autoExecute = true,
-    temperature = 0.7f,
-    maxTokens = 1024,
+    temperature = genOptions.temperature,
+    maxTokens = genOptions.maxTokens,
     format = format
 )
```
🤖 Fix all issues with AI agents
In
`@examples/android/RunAnywhereAI/app/src/main/java/com/runanywhere/runanywhereai/presentation/settings/SettingsScreen.kt`:
- Line 757: In the KDoc for SettingsScreen (SettingsScreen.kt) there is a
duplicated asterisk in the list description ("* * Allows users to:"); edit the
KDoc comment so the line reads a single leading list marker ("* Allows users
to:") to remove the extra `* ` and keep the list formatting consistent.
- Around line 185-191: The Slider configuration for max tokens (the Slider
composable using uiState.maxTokens and viewModel.updateMaxTokens) currently uses
valueRange = 50f..4096f and steps = 80 which yields non‑integer 50-token
increments; adjust the range and steps so the interval divides evenly (for
example change valueRange to 50f..4050f and steps to 79 or pick a max that is 50
* N + 50 from 50) so the slider snap points map to exact 50-token multiples and
onValueChange.toInt() produces clean values.
In `@examples/react-native/RunAnywhereAI/src/screens/ChatScreen.tsx`:
- Line 177: Fix the stray asterisk in the JSDoc comment above the reference to
detectToolCallFormat(): change the line that currently reads "* * Matches iOS:
LLMViewModel+ToolCalling.swift detectToolCallFormat()" to "* Matches iOS:
LLMViewModel+ToolCalling.swift detectToolCallFormat()" so the comment is
properly formatted; look for this JSDoc in ChatScreen.tsx near the
detectToolCallFormat() reference to apply the single-character edit.
- Around line 270-282: In getGenerationOptions, guard against parseInt producing
NaN for maxTokens and add error handling around AsyncStorage reads: wrap the
AsyncStorage.getItem calls in a try/catch, return default values
(temperature=0.7, maxTokens=1000, systemPrompt=undefined) on error, and when
parsing maxStr use a Number.isNaN check (similar to temperature) before
accepting parseInt(maxStr, 10) so maxTokens never becomes NaN; refer to
getGenerationOptions, GENERATION_SETTINGS_KEYS, temperature, maxTokens,
systemPrompt, and AsyncStorage to locate and update the code.
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp`:
- Around line 548-571: The sampler initialization code calls
llama_sampler_chain_init(sparams) and then unconditionally uses sampler_ with
llama_sampler_chain_add(...), which can dereference null if init fails; after
calling llama_sampler_chain_init(sparams) check whether sampler_ is non-null and
if it is null, handle the failure cleanly (e.g., log an error via the existing
logger, avoid calling llama_sampler_chain_add, free any previously allocated
state if needed, and return or set an error status from the enclosing function)
so that subsequent calls (llama_sampler_chain_add, llama_sampler_init_* uses)
are only executed when sampler_ is valid.
In `@sdk/runanywhere-commons/src/jni/CMakeLists.txt`:
- Around line 25-39: The CMakeLists is missing the centralized versions include:
add include(LoadVersions) before the FetchContent_* calls so
NLOHMANN_JSON_VERSION is sourced from the shared VERSIONS file; update the area
around the existing NLOHMANN_JSON_VERSION check/FetchContent_Declare and ensure
the include(LoadVersions) appears prior to the
FetchContent_Declare/FetchContent_MakeAvailable lines that reference
nlohmann_json.
In `@sdk/runanywhere-react-native/packages/core/cpp/HybridRunAnywhereCore.cpp`:
- Around line 1419-1424: The code in HybridRunAnywhereCore.cpp is using
extractStringValue to read system_prompt which fails on escaped quotes/newlines;
replace the ad-hoc string extraction by parsing optionsJson with a proper JSON
parser or an existing helper (instead of extractStringValue) so system_prompt
can contain arbitrary escaped characters; update the blocks around
maxTokens/temperature/systemPrompt (the one starting at the current use and the
similar occurrences where extractStringValue is used at the other mentioned
spots) to read system_prompt via the parsed JSON object API (or reuse the
project’s JSON helper) and return the full, untruncated string; ensure you
adjust the calls where extractStringValue is used (and keep
max_tokens/temperature extraction consistent) so all three occurrences are
fixed.
🧹 Nitpick comments (10)
examples/android/RunAnywhereAI/app/src/main/java/com/runanywhere/runanywhereai/presentation/tts/TextToSpeechScreen.kt (1)
485-514: Consider removing unused parameters from `VoiceSettingsSection` or suppressing the warning.

The `pitch` and `onPitchChange` parameters (lines 436-439) are still accepted by `VoiceSettingsSection` and passed from the caller (lines 114-119), but are no longer used since the slider is commented out. If the pitch slider won't return soon, consider cleaning up the signature to avoid dead code. If it's coming back, a `@Suppress("UNUSED_PARAMETER")` annotation (like `maxCharacters` on line 367) would make the intent clearer.

examples/flutter/RunAnywhereAI/lib/features/models/model_selection_sheet.dart (1)
504-531: Delegation pattern looks good; minor robustness note on loading state reset.

The refactored flow cleanly separates selection state (`_viewModel.selectModel`) from the actual loading (callback), and the `try/catch` properly resets state on failure.

One subtle point: on the success path, `_isLoadingModel` is never reset to `false` — you rely on `Navigator.pop` destroying the widget. This works today, but if the sheet's lifecycle ever changes (e.g., it stays mounted after selection), the loading overlay would persist. Consider adding `_isLoadingModel = false` before the `pop` for defensiveness:

Suggested improvement
```diff
 await widget.onModelSelected(model);
 if (mounted) {
+  setState(() {
+    _isLoadingModel = false;
+  });
   Navigator.pop(context);
 }
```

sdk/runanywhere-kotlin/src/jvmAndroidMain/kotlin/com/runanywhere/sdk/foundation/bridge/extensions/CppBridgeLLM.kt (1)
1241-1248: Consider deduplicating `escapeJson` — it's copy-pasted across 7+ bridge files.

The relevant code snippets show identical `escapeJson` implementations in `CppBridgeModelAssignment`, `CppBridgeDevice`, `CppBridgeDownload`, `CppBridgeSTT`, `CppBridgeStorage`, and `CppBridgeTTS`. Extract it into a shared utility (e.g., a top-level function or a `JsonUtils` object in the bridge package) to reduce maintenance burden. Not blocking for this PR.

sdk/runanywhere-kotlin/src/jvmAndroidMain/kotlin/com/runanywhere/sdk/public/extensions/RunAnywhere+TextGeneration.jvmAndroid.kt (1)
104-115: Unstructured `CoroutineScope` leaks if the caller cancels collection.

This is pre-existing and not introduced by this PR, but since you're touching this area: `CoroutineScope(Dispatchers.IO)` on line 105 (and similarly line 155) is not tied to any parent scope or job. If the downstream collector cancels (e.g., the UI disappears), the native `generateStream` call continues running in the background until it naturally completes. Consider using `coroutineScope { ... }` or `callbackFlow` to make the generation structurally cancellable.

examples/react-native/RunAnywhereAI/src/types/settings.ts (1)
126-135: `GENERATION_SETTINGS_KEYS` includes `SYSTEM_PROMPT` but the `GenerationSettings` interface (lines 20-32) lacks a `systemPrompt` field.

The persistence keys and the typed interface are out of sync. Consider adding `systemPrompt?: string` to `GenerationSettings` (and optionally to `DEFAULT_SETTINGS`) so consumers can work with a single, consistent type rather than reading the prompt separately via raw `AsyncStorage` keys.

♻️ Suggested addition to GenerationSettings

```diff
 export interface GenerationSettings {
   /** Temperature (0.0 - 2.0) */
   temperature: number;

   /** Max tokens (500 - 20000) */
   maxTokens: number;

   /** Top P (optional) */
   topP?: number;

   /** Top K (optional) */
   topK?: number;
+
+  /** System prompt (optional) */
+  systemPrompt?: string;
 }
```

examples/react-native/RunAnywhereAI/src/screens/ChatScreen.tsx (1)
546-546: `getGenerationOptions` is missing from the `useCallback` dependency array.

`getGenerationOptions` is defined inside the component body but isn't listed in the `handleSend` dependency array. While it currently only reads from `AsyncStorage` (so staleness isn't a practical issue), the ESLint `react-hooks/exhaustive-deps` rule will flag this. Either wrap `getGenerationOptions` in its own `useCallback`, or extract it outside the component since it doesn't depend on any component state.

examples/android/RunAnywhereAI/app/src/main/java/com/runanywhere/runanywhereai/presentation/settings/SettingsScreen.kt (1)
151-155: `String.format` is locale-sensitive — temperature may display with a comma on some locales.

`String.format("%.1f", ...)` uses the device's default locale, so e.g. a German locale renders `0,7` instead of `0.7`. If a dot separator is intended, use `String.format(java.util.Locale.US, "%.1f", uiState.temperature)`.

examples/android/RunAnywhereAI/app/src/main/java/com/runanywhere/runanywhereai/presentation/settings/SettingsViewModel.kt (2)
393-409: Consider adding basic input validation for generation parameters.

`updateTemperature` and `updateMaxTokens` accept any value without bounds checking. Extreme or negative values (e.g., temperature < 0 or maxTokens ≤ 0) could propagate to the backend and cause unexpected behavior. Even in an example app, a simple clamp is cheap insurance.

♻️ Suggested validation

```diff
 fun updateTemperature(value: Float) {
-    _uiState.update { it.copy(temperature = value) }
+    _uiState.update { it.copy(temperature = value.coerceIn(0f, 2f)) }
 }

 fun updateMaxTokens(value: Int) {
-    _uiState.update { it.copy(maxTokens = value) }
+    _uiState.update { it.copy(maxTokens = value.coerceIn(1, 4096)) }
 }
```
53-54: Default values for `temperature` and `maxTokens` are duplicated in three places.

The defaults `0.7f` and `1000` appear in `SettingsUiState`, `getGenerationSettings()`, and `loadGenerationSettings()`. If these ever need to change, all three must be updated in lockstep. Consider extracting them as companion constants (e.g., `DEFAULT_TEMPERATURE`, `DEFAULT_MAX_TOKENS`).

♻️ Suggested extraction

Add to the companion object:

```kotlin
private const val DEFAULT_TEMPERATURE = 0.7f
private const val DEFAULT_MAX_TOKENS = 1000
```

Then reference these constants in all three locations:

```diff
 // In SettingsUiState
-val temperature: Float = 0.7f,
-val maxTokens: Int = 1000,
+val temperature: Float = SettingsViewModel.DEFAULT_TEMPERATURE,
+val maxTokens: Int = SettingsViewModel.DEFAULT_MAX_TOKENS,
```

```diff
 // In getGenerationSettings()
-val temperature = prefs.getFloat(KEY_TEMPERATURE, 0.7f)
-val maxTokens = prefs.getInt(KEY_MAX_TOKENS, 1000)
+val temperature = prefs.getFloat(KEY_TEMPERATURE, DEFAULT_TEMPERATURE)
+val maxTokens = prefs.getInt(KEY_MAX_TOKENS, DEFAULT_MAX_TOKENS)
```

```diff
 // In loadGenerationSettings()
-val temperature = generationPrefs.getFloat(KEY_TEMPERATURE, 0.7f)
-val maxTokens = generationPrefs.getInt(KEY_MAX_TOKENS, 1000)
+val temperature = generationPrefs.getFloat(KEY_TEMPERATURE, DEFAULT_TEMPERATURE)
+val maxTokens = generationPrefs.getInt(KEY_MAX_TOKENS, DEFAULT_MAX_TOKENS)
```

Also applies to: 187-188, 373-374
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+StructuredOutput.swift (1)
22-50: Consider extracting the duplicated `withCString` system-prompt scoping into a shared helper.

The conditional `withCString` pattern for `system_prompt` lifetime management is now repeated in three places: `generate()`, `createTokenStream()`, and `generateForStructuredOutput()`. A small helper (e.g., `withSystemPromptScope(systemPrompt:options:body:)`) could reduce duplication.

That said, keeping FFI lifetime scoping explicit is a defensible choice — just noting the repetition.
```kotlin
Slider(
    value = uiState.maxTokens.toFloat(),
    onValueChange = { viewModel.updateMaxTokens(it.toInt()) },
    valueRange = 50f..4096f,
    steps = 80, // 50-token increments
    modifier = Modifier.fillMaxWidth(),
)
```
Max Tokens slider steps don't produce clean 50-token increments.
With `valueRange = 50f..4096f` and `steps = 80`, the Slider creates 81 intervals of size (4096 - 50) / 81 ≈ 49.95, so the snap points won't land on round numbers. Users will see values like 99, 149, 199, 249… after `toInt()`.
If clean 50-token increments are desired, consider changing the range to 50f..4050f with steps = 79, or use a different max that's reachable in exact 50-step increments from 50.
Suggested fix (clean 50-token steps)
```diff
 Slider(
     value = uiState.maxTokens.toFloat(),
     onValueChange = { viewModel.updateMaxTokens(it.toInt()) },
-    valueRange = 50f..4096f,
-    steps = 80, // 50-token increments
+    valueRange = 50f..4050f,
+    steps = 79, // 50-token increments: 50, 100, 150, ..., 4050
     modifier = Modifier.fillMaxWidth(),
 )
```
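The snap-point arithmetic can be sanity-checked with a standalone sketch that mirrors Compose's `steps` semantics (with `steps = s`, the slider snaps to the two endpoints plus `s` evenly spaced values in between, i.e. `s + 1` intervals):

```kotlin
// Enumerate the integer values a user would see after toInt() truncation.
fun snapPoints(start: Float, end: Float, steps: Int): List<Int> {
    val intervals = steps + 1
    return (0..intervals).map { i -> (start + i * (end - start) / intervals).toInt() }
}

fun main() {
    // 50f..4096f with steps = 80: interval ≈ 49.95, values drift off round numbers
    println(snapPoints(50f, 4096f, 80).take(4)) // [50, 99, 149, 199]
    // 50f..4050f with steps = 79: interval is exactly 50
    println(snapPoints(50f, 4050f, 79).take(4)) // [50, 100, 150, 200]
}
```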
🤖 Prompt for AI Agents
In
`@examples/android/RunAnywhereAI/app/src/main/java/com/runanywhere/runanywhereai/presentation/settings/SettingsScreen.kt`
around lines 185 - 191, The Slider configuration for max tokens (the Slider
composable using uiState.maxTokens and viewModel.updateMaxTokens) currently uses
valueRange = 50f..4096f and steps = 80 which yields non‑integer 50-token
increments; adjust the range and steps so the interval divides evenly (for
example change valueRange to 50f..4050f and steps to 79 or pick a max that is 50
* N + 50 from 50) so the slider snap points map to exact 50-token multiples and
onValueChange.toInt() produces clean values.
```kotlin
 * Tool Calling Settings Section
 *
 * Allows users to:
 * * Allows users to:
```
Minor KDoc typo: extra *.
Line 757 reads * * Allows users to: — the leading * is duplicated.
```diff
- * * Allows users to:
+ * Allows users to:
```
🤖 Prompt for AI Agents
In
`@examples/android/RunAnywhereAI/app/src/main/java/com/runanywhere/runanywhereai/presentation/settings/SettingsScreen.kt`
at line 757, In the KDoc for SettingsScreen (SettingsScreen.kt) there is a
duplicated asterisk in the list description ("* * Allows users to:"); edit the
KDoc comment so the line reads a single leading list marker ("* Allows users
to:") to remove the extra `* ` and keep the list formatting consistent.
```typescript
 * LFM2-Tool models use Pythonic format, others use JSON format
 *
 * Matches iOS: LLMViewModel+ToolCalling.swift detectToolCallFormat()
 * * Matches iOS: LLMViewModel+ToolCalling.swift detectToolCallFormat()
```
Minor comment typo: * * → *
There's a stray asterisk in the JSDoc comment.
```diff
- * * Matches iOS: LLMViewModel+ToolCalling.swift detectToolCallFormat()
+ * Matches iOS: LLMViewModel+ToolCalling.swift detectToolCallFormat()
```
🤖 Prompt for AI Agents
In `@examples/react-native/RunAnywhereAI/src/screens/ChatScreen.tsx` at line 177,
Fix the stray asterisk in the JSDoc comment above the reference to
detectToolCallFormat(): change the line that currently reads "* * Matches iOS:
LLMViewModel+ToolCalling.swift detectToolCallFormat()" to "* Matches iOS:
LLMViewModel+ToolCalling.swift detectToolCallFormat()" so the comment is
properly formatted; look for this JSDoc in ChatScreen.tsx near the
detectToolCallFormat() reference to apply the single-character edit.
```typescript
const getGenerationOptions = async (): Promise<GenerationOptions> => {
  const tempStr = await AsyncStorage.getItem(GENERATION_SETTINGS_KEYS.TEMPERATURE);
  const maxStr = await AsyncStorage.getItem(GENERATION_SETTINGS_KEYS.MAX_TOKENS);
  const sysStr = await AsyncStorage.getItem(GENERATION_SETTINGS_KEYS.SYSTEM_PROMPT);

  const temperature = tempStr !== null && !Number.isNaN(parseFloat(tempStr)) ? parseFloat(tempStr) : 0.7;
  const maxTokens = maxStr ? parseInt(maxStr, 10) : 1000;
  const systemPrompt = sysStr && sysStr.trim() !== '' ? sysStr : undefined;

  console.log(`[PARAMS] App getGenerationOptions: temperature=${temperature}, maxTokens=${maxTokens}, systemPrompt=${systemPrompt ? `set(${systemPrompt.length} chars)` : 'nil'}`);

  return { temperature, maxTokens, systemPrompt };
};
```
Missing NaN guard for maxTokens and no error handling around AsyncStorage reads.

Two issues in `getGenerationOptions`:

- `maxTokens` can silently become `NaN` — `parseInt` on a non-numeric string returns `NaN`, but there's no guard (unlike `temperature` on line 275, which checks `!Number.isNaN(...)`).
- `AsyncStorage.getItem` can throw (e.g. storage corruption, iOS Keychain issues). An unhandled rejection here will bubble up and fail the entire generation call. Wrap the reads in a try/catch and fall back to defaults.
🐛 Proposed fix
```diff
 const getGenerationOptions = async (): Promise<GenerationOptions> => {
-  const tempStr = await AsyncStorage.getItem(GENERATION_SETTINGS_KEYS.TEMPERATURE);
-  const maxStr = await AsyncStorage.getItem(GENERATION_SETTINGS_KEYS.MAX_TOKENS);
-  const sysStr = await AsyncStorage.getItem(GENERATION_SETTINGS_KEYS.SYSTEM_PROMPT);
-
-  const temperature = tempStr !== null && !Number.isNaN(parseFloat(tempStr)) ? parseFloat(tempStr) : 0.7;
-  const maxTokens = maxStr ? parseInt(maxStr, 10) : 1000;
-  const systemPrompt = sysStr && sysStr.trim() !== '' ? sysStr : undefined;
+  let tempStr: string | null = null;
+  let maxStr: string | null = null;
+  let sysStr: string | null = null;
+  try {
+    tempStr = await AsyncStorage.getItem(GENERATION_SETTINGS_KEYS.TEMPERATURE);
+    maxStr = await AsyncStorage.getItem(GENERATION_SETTINGS_KEYS.MAX_TOKENS);
+    sysStr = await AsyncStorage.getItem(GENERATION_SETTINGS_KEYS.SYSTEM_PROMPT);
+  } catch (e) {
+    console.warn('[PARAMS] Failed to read generation settings from AsyncStorage, using defaults:', e);
+  }
+
+  const parsedTemp = tempStr !== null ? parseFloat(tempStr) : NaN;
+  const temperature = !Number.isNaN(parsedTemp) ? parsedTemp : 0.7;
+  const parsedMax = maxStr !== null ? parseInt(maxStr, 10) : NaN;
+  const maxTokens = !Number.isNaN(parsedMax) ? parsedMax : 1000;
+  const systemPrompt = sysStr && sysStr.trim() !== '' ? sysStr : undefined;
```
🤖 Prompt for AI Agents
In `@examples/react-native/RunAnywhereAI/src/screens/ChatScreen.tsx` around lines
270 - 282, In getGenerationOptions, guard against parseInt producing NaN for
maxTokens and add error handling around AsyncStorage reads: wrap the
AsyncStorage.getItem calls in a try/catch, return default values
(temperature=0.7, maxTokens=1000, systemPrompt=undefined) on error, and when
parsing maxStr use a Number.isNaN check (similar to temperature) before
accepting parseInt(maxStr, 10) so maxTokens never becomes NaN; refer to
getGenerationOptions, GENERATION_SETTINGS_KEYS, temperature, maxTokens,
systemPrompt, and AsyncStorage to locate and update the code.
```cpp
// Configure sampler with request parameters
if (sampler_) {
    llama_sampler_free(sampler_);
}

auto sparams = llama_sampler_chain_default_params();
sparams.no_perf = true;
sampler_ = llama_sampler_chain_init(sparams);

if (request.temperature > 0.0f) {
    // Use default penalties (1.2f repetition) or request params if added later
    llama_sampler_chain_add(sampler_,
        llama_sampler_init_penalties(64, request.repetition_penalty, 0.0f, 0.0f));

    if (request.top_k > 0) {
        llama_sampler_chain_add(sampler_, llama_sampler_init_top_k(request.top_k));
    }

    llama_sampler_chain_add(sampler_, llama_sampler_init_top_p(request.top_p, 1));
    llama_sampler_chain_add(sampler_, llama_sampler_init_temp(request.temperature));
    llama_sampler_chain_add(sampler_, llama_sampler_init_dist(LLAMA_DEFAULT_SEED));
} else {
    llama_sampler_chain_add(sampler_, llama_sampler_init_greedy());
}
```
Guard against sampler init failure before chaining.
If `llama_sampler_chain_init()` fails (OOM, etc.), the subsequent `llama_sampler_chain_add()` calls will dereference a null pointer. Add a null check and bail out cleanly.
🔧 Suggested fix
```diff
 auto sparams = llama_sampler_chain_default_params();
 sparams.no_perf = true;
 sampler_ = llama_sampler_chain_init(sparams);
+if (!sampler_) {
+    LOGE("Failed to initialize sampler chain");
+    llama_batch_free(batch);
+    return false;
+}
```
+ }🤖 Prompt for AI Agents
In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp` around
lines 548 - 571, The sampler initialization code calls
llama_sampler_chain_init(sparams) and then unconditionally uses sampler_ with
llama_sampler_chain_add(...), which can dereference null if init fails; after
calling llama_sampler_chain_init(sparams) check whether sampler_ is non-null and
if it is null, handle the failure cleanly (e.g., log an error via the existing
logger, avoid calling llama_sampler_chain_add, free any previously allocated
state if needed, and return or set an error status from the enclosing function)
so that subsequent calls (llama_sampler_chain_add, llama_sampler_init_* uses)
are only executed when sampler_ is valid.
```cmake
include(FetchContent)

# needed for string parsing
if(NOT DEFINED NLOHMANN_JSON_VERSION)
    set(NLOHMANN_JSON_VERSION "3.11.3")
endif()

FetchContent_Declare(
    nlohmann_json
    GIT_REPOSITORY https://github.com/nlohmann/json.git
    GIT_TAG v${NLOHMANN_JSON_VERSION}
    GIT_SHALLOW TRUE
)
set(JSON_BuildTests OFF CACHE BOOL "" FORCE)
FetchContent_MakeAvailable(nlohmann_json)
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Locate VERSIONS files
echo "=== Locating VERSIONS files ==="
fd -t f VERSIONS
echo -e "\n=== Searching for nlohmann/json version entries in VERSIONS ==="
fd -t f VERSIONS -x cat {} \; 2>/dev/null
echo -e "\n=== Searching for NLOHMANN_JSON_VERSION references across repo ==="
rg -n "NLOHMANN_JSON_VERSION" --max-count=20
```

Length of output: 7194
🏁 Script executed:
```bash
#!/bin/bash
echo "=== Checking if jni/CMakeLists.txt includes LoadVersions ==="
cat -n sdk/runanywhere-commons/src/jni/CMakeLists.txt | head -40
echo -e "\n=== Checking LoadVersions.cmake to understand variable naming ==="
cat sdk/runanywhere-commons/cmake/LoadVersions.cmake | head -80
echo -e "\n=== Checking how other CMakeLists files use LoadVersions ==="
rg -A5 "include.*LoadVersions" sdk/runanywhere-commons/
```

Length of output: 8631
Add `include(LoadVersions)` to source the centralized `NLOHMANN_JSON_VERSION`.

This CMakeLists.txt is the only one in the commons library missing the LoadVersions include. Other backends (onnx, whispercpp) include it before dependency declarations. Add `include(LoadVersions)` near line 25 (after the `cmake_minimum_required` and `project` declarations, before FetchContent) to pull versions from the centralized VERSIONS file. The version is already defined there as 3.11.3.
🤖 Prompt for AI Agents
In `@sdk/runanywhere-commons/src/jni/CMakeLists.txt` around lines 25-39: the CMakeLists is missing the centralized versions include. Add include(LoadVersions) before the FetchContent_* calls so NLOHMANN_JSON_VERSION is sourced from the shared VERSIONS file; update the area around the existing NLOHMANN_JSON_VERSION check/FetchContent_Declare and ensure the include(LoadVersions) appears prior to the FetchContent_Declare/FetchContent_MakeAvailable lines that reference nlohmann_json.
```cpp
std::string systemPrompt;
if (optionsJson.has_value()) {
    maxTokens = extractIntValue(optionsJson.value(), "max_tokens", 256);
    temperature = static_cast<float>(extractDoubleValue(optionsJson.value(), "temperature", 0.7));
    systemPrompt = extractStringValue(optionsJson.value(), "system_prompt", "");
}
```
System prompt parsing will break on escaped quotes/newlines.
extractStringValue() stops at the first `"` and doesn't handle escapes, so realistic system prompts (containing quotes or newlines) will be truncated or dropped. Please parse the options JSON with a real JSON parser (or reuse an existing parser helper) so system_prompt can safely carry arbitrary text.
Also applies to: 1520-1526, 1599-1610
🤖 Prompt for AI Agents
In `@sdk/runanywhere-react-native/packages/core/cpp/HybridRunAnywhereCore.cpp` around lines 1419-1424: the code uses extractStringValue to read system_prompt, which fails on escaped quotes and newlines. Replace the ad-hoc string extraction by parsing optionsJson with a proper JSON parser or an existing helper so system_prompt can contain arbitrary escaped characters. Update the block shown above and the similar occurrences at the other mentioned spots to read system_prompt via the parsed JSON object API (or reuse the project's JSON helper) and return the full, untruncated string, keeping max_tokens/temperature extraction consistent so all three occurrences are fixed.
* Fix parametre piping through sdk
* Kotlin sdk updated, along with example app and cpp jni
* TTS pitch ui removal

  For the example ios app, removed the tts pitch slider
* Flutter sdk + example apps
* React native sdk + example apps
* Fixed typos and Logging
Greptile Overview
Greptile Summary
This PR refactors parameter handling to configure sampling parameters (temperature, top_p, top_k, repetition_penalty) per-request instead of at model load time, and adds system prompt support across all SDK platforms.
Major Changes:

- Moved sampler configuration from load_model() to generate_stream(), rebuilding the sampler with request parameters for each generation call
- Added a system_prompt field throughout the stack (C API → JNI/Swift/Dart/React Native bridges → public SDKs) with proper lifetime management

Technical Details:
- Sampler chain is rebuilt per request inside generate_stream()
- Uses the nlohmann::json library in the JNI layer for safer serialization

Confidence Score: 4/5
Important Files Changed
Sequence Diagram
```mermaid
sequenceDiagram
    participant App as Example App<br/>(iOS/Android/Flutter/RN)
    participant SDK as SDK Layer<br/>(Swift/Kotlin/Dart/JS)
    participant Bridge as Language Bridge<br/>(FFI/JNI/C++)
    participant Backend as LlamaCpp Backend
    participant Sampler as Sampler Chain

    App->>App: User configures settings<br/>(temp, max_tokens, system_prompt)
    App->>SDK: generate(prompt, options)<br/>options.systemPrompt = "..."<br/>options.temperature = 0.7
    SDK->>Bridge: Pass options via FFI/JNI<br/>(withCString / JSON / native struct)
    Bridge->>Backend: rac_llm_component_generate()<br/>rac_llm_options_t
    Backend->>Backend: generate_stream(request)
    Note over Backend,Sampler: 🔄 Per-Request Configuration (NEW)
    Backend->>Sampler: llama_sampler_free(old)
    Backend->>Sampler: llama_sampler_chain_init()
    Backend->>Sampler: add penalties(request.repetition_penalty)
    Backend->>Sampler: add top_k(request.top_k)
    Backend->>Sampler: add top_p(request.top_p)
    Backend->>Sampler: add temp(request.temperature)
    Backend->>Backend: apply_chat_template()<br/>+ request.system_prompt
    Backend->>Backend: llama_decode(prompt)
    loop Token Generation
        Backend->>Sampler: llama_sampler_sample()
        Sampler-->>Backend: next_token
        Backend-->>SDK: stream token
        SDK-->>App: display token
    end
    Backend-->>Bridge: generation complete
    Bridge-->>SDK: result + metrics
    SDK-->>App: LLMGenerationResult
```

Last reviewed commit: fba5ccc