Skip to content

fix: persist /model selection across restarts#707

Open
ilblackdragon wants to merge 4 commits intomainfrom
fix/persist-model-switch
Open

fix: persist /model selection across restarts#707
ilblackdragon wants to merge 4 commits intomainfrom
fix/persist-model-switch

Conversation

@ilblackdragon
Copy link
Member

Summary

  • /model <name> only called set_model() in memory — the choice was lost on restart
  • Now persists to DB settings store (if available) and config.toml (if it exists)
  • Added regression test verifying TOML round-trip for model updates

Test plan

  • Run cargo test --lib toml_selected_model_update_persists — passes
  • Use /model <name> to switch models, restart ironclaw, verify the model persists
  • Verify clippy clean: cargo clippy --lib --all-features

🤖 Generated with Claude Code

The /model command called set_model() on the LLM provider but never
saved the choice to settings, so the model reverted on restart. Now
persists to both the DB settings store and config.toml.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 8, 2026 02:04
@github-actions github-actions bot added scope: agent Agent core (agent loop, router, scheduler) scope: config Configuration size: M 50-199 changed lines risk: medium Business logic, config, or moderate-risk modules labels Mar 8, 2026
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the application's usability by ensuring that user-selected language model choices are no longer ephemeral. By implementing a robust persistence mechanism, the system now remembers the preferred model across sessions, significantly improving the user experience and reducing the need for repeated configuration after restarts.

Highlights

  • Model Selection Persistence: The /model <name> command now correctly persists the selected model across application restarts, addressing a previous issue where the choice was lost.
  • Configuration Storage: Model persistence is achieved by saving the selected model to both the database settings store (if available) and the config.toml file (if it exists).
  • Regression Test Added: A new regression test, toml_selected_model_update_persists, has been introduced to verify the correct round-trip persistence of model updates in the TOML configuration.
Changelog
  • src/agent/commands.rs
    • Modified the /model command handler to invoke a new persist_selected_model function after successfully setting the in-memory model.
    • Added a new asynchronous private function persist_selected_model to handle saving the chosen model to the database and/or config.toml.
  • src/settings.rs
    • Introduced a new unit test toml_selected_model_update_persists to validate that model selections are correctly saved to and loaded from the TOML configuration file.
Activity
  • The pull request description includes a detailed summary of the changes and a test plan for verification.
  • The PR was generated with Claude Code, indicating AI assistance in its creation.
  • The author provided specific instructions for testing, including running a new regression test and verifying model persistence across restarts.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements persistence for the /model selection, ensuring the choice is saved across application restarts to the database and the config.toml file, and includes a new regression test. However, this persistence significantly increases the impact of a lack of authorization on the /model command, which is a highly sensitive operation and requires restriction to authenticated channels (Rule 1). In multi-user deployments, any user can now globally change the agent's model for all other users and across restarts, potentially leading to unexpected costs or service disruption. An authorization check is recommended to restrict this command to administrative interfaces or users. Additionally, there is a potential read-modify-write race condition when updating the config.toml file, which could lead to lost configuration changes under concurrent access, highlighting the need for atomic operations to prevent data loss due to concurrency (Rule 7).

))),
Ok(()) => {
// Persist the model choice so it survives restarts.
self.persist_selected_model(requested).await;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The /model command allows any user to change the active LLM model for the entire agent. This change is now persisted to the database and the config.toml file (via persist_selected_model), meaning it survives restarts and affects all subsequent sessions for all users.

In a multi-user environment, this lacks proper authorization control. An unprivileged user could potentially switch the agent to a much more expensive model (leading to cost escalation) or to an invalid/unsupported model (leading to a denial of service for other users).

Consider adding an authorization check similar to the one used for the /restart command (line 518) to restrict model switching to trusted channels or administrative users.

References
  1. The /model command, now with persistence, acts as a highly sensitive command, similar to /restart, requiring proper authorization to prevent unauthorized changes, cost escalation, or denial of service in multi-user environments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pre-existing concern — the /model command already lacked authorization before this PR. Adding auth is a valid enhancement but is out of scope for this bug fix (persistence only). Happy to file a follow-up issue if desired.

Comment on lines +843 to +858
// 2. Update TOML config file if it exists.
let toml_path = crate::settings::Settings::default_toml_path();
if toml_path.exists() {
match crate::settings::Settings::load_toml(&toml_path) {
Ok(Some(mut settings)) => {
settings.selected_model = Some(model.to_string());
if let Err(e) = settings.save_toml(&toml_path) {
tracing::warn!("Failed to persist model to config.toml: {}", e);
}
}
Ok(None) => {}
Err(e) => {
tracing::warn!("Failed to load config.toml for model persistence: {}", e);
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current read-modify-write pattern on config.toml is susceptible to a race condition. If another process modifies the file between Settings::load_toml and settings.save_toml, those concurrent changes will be lost.

To ensure atomicity, you should acquire an exclusive file lock before reading the file and release it after writing. The fs2 crate is a common choice for handling file locks in a cross-platform way. This would likely involve modifying load_toml and save_toml to operate on a locked file handle.

References
  1. This comment addresses a read-modify-write race condition on a file, which, while not a TOCTOU with read/write locks, shares the core concern of preventing data corruption or loss due to concurrent access, aligning with the principle of ensuring atomicity in critical operations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TOML read-modify-write is a rare user-initiated operation (manual /model command) in a single-process agent. Adding file locking would require a new dependency (fs2) for negligible practical benefit. The DB write is the authoritative persistence path; TOML is a convenience fallback.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Ensures the /model <name> command persists the selected model across restarts by writing the choice to persistent settings backends (DB settings store when available, and config.toml when present), and adds a regression test around TOML persistence.

Changes:

  • Persist /model changes via a new Agent::persist_selected_model() helper (DB + optional TOML update).
  • Update /model command flow to invoke persistence after a successful in-memory model switch.
  • Add a TOML round-trip regression test for selected_model updates.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
src/agent/commands.rs Adds best-effort persistence for /model selections to DB and config.toml.
src/settings.rs Adds a regression test asserting selected_model survives TOML update + reload.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 665 to +673
match self.llm().set_model(requested) {
Ok(()) => Ok(SubmissionResult::response(format!(
"Switched model to: {}",
requested
))),
Ok(()) => {
// Persist the model choice so it survives restarts.
self.persist_selected_model(requested).await;
Ok(SubmissionResult::response(format!(
"Switched model to: {}",
requested
)))
}
Copy link

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/model now awaits persist_selected_model(). Because that does I/O (DB + filesystem), a slow/unavailable store or a slow disk can delay or stall the user-visible command response. Consider making persistence fire-and-forget (e.g., tokio::spawn) and/or adding a short timeout so the model switch remains responsive while still attempting best-effort persistence.

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DB call is async (non-blocking) and the TOML I/O is already in spawn_blocking, so the actual executor blocking is minimal. Making it fire-and-forget via tokio::spawn would require 'static lifetimes for self.store() which isn't ergonomic here. Acceptable latency for a rare manual command.

Comment on lines +843 to +858
// 2. Update TOML config file if it exists.
let toml_path = crate::settings::Settings::default_toml_path();
if toml_path.exists() {
match crate::settings::Settings::load_toml(&toml_path) {
Ok(Some(mut settings)) => {
settings.selected_model = Some(model.to_string());
if let Err(e) = settings.save_toml(&toml_path) {
tracing::warn!("Failed to persist model to config.toml: {}", e);
}
}
Ok(None) => {}
Err(e) => {
tracing::warn!("Failed to load config.toml for model persistence: {}", e);
}
}
}
Copy link

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

persist_selected_model() performs synchronous filesystem operations (Path::exists, Settings::load_toml, save_toml) inside an async function. This can block the Tokio executor thread. Consider moving the TOML read/modify/write into tokio::task::spawn_blocking or switching load_toml/save_toml to tokio::fs-based I/O for this call path.

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 1edbfb3 — TOML file I/O is now wrapped in tokio::task::spawn_blocking.

Comment on lines +846 to +852
match crate::settings::Settings::load_toml(&toml_path) {
Ok(Some(mut settings)) => {
settings.selected_model = Some(model.to_string());
if let Err(e) = settings.save_toml(&toml_path) {
tracing::warn!("Failed to persist model to config.toml: {}", e);
}
}
Copy link

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating config.toml via settings.save_toml() rewrites the entire file, which will reformat and discard any user comments/hand-organization in their existing TOML. If preserving user-edited TOML is a goal, consider a more targeted update (e.g., edit only selected_model in the existing document, or prefer DB persistence and leave TOML untouched unless generated by config init).

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

save_toml() rewriting the file is the existing pattern used by config init and the setup wizard. Changing to a targeted TOML edit would be a larger refactor affecting the shared Settings infrastructure — out of scope for this bug fix.

Comment on lines +1177 to +1198
/// Regression test: /model command must persist selected_model to TOML config.
/// Prior to the fix, `set_model()` only changed the in-memory provider and the
/// choice was lost on restart.
#[test]
fn toml_selected_model_update_persists() {
let dir = tempfile::tempdir().unwrap();
let path = dir.path().join("config.toml");

// Start with a config that has a different model.
let mut settings = Settings::default();
settings.selected_model = Some("old-model".to_string());
settings.save_toml(&path).unwrap();

// Simulate what persist_selected_model does: load, update, save.
let mut loaded = Settings::load_toml(&path).unwrap().unwrap();
loaded.selected_model = Some("new-model".to_string());
loaded.save_toml(&path).unwrap();

// Verify the change survived a reload.
let reloaded = Settings::load_toml(&path).unwrap().unwrap();
assert_eq!(reloaded.selected_model, Some("new-model".to_string()));
}
Copy link

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This regression test only exercises Settings::load_toml/save_toml and would still pass even if /model never called persist_selected_model(). To guard against the reported bug, consider factoring the TOML-update logic into a testable helper (or exposing persist_selected_model behind a smaller unit) and asserting that invoking it updates the on-disk TOML (and ideally the DB setting) as expected.

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constructing a full Agent in a unit test isn't feasible without extensive mocking infrastructure. The test covers the TOML round-trip path which was the specific untested mechanism. The wiring (calling persist_selected_model from /model) is straightforward and visible in the diff.

ilblackdragon and others added 2 commits March 7, 2026 18:20
- Use struct init syntax instead of field reassignment in test (clippy)
- Wrap sync filesystem operations in spawn_blocking to avoid blocking
  the tokio executor

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 8, 2026 02:25
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +861 to +863
})
.await;
}
Copy link

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result of spawn_blocking(...).await is ignored, which can hide panics/cancellation in the blocking task (it returns a Result<_, JoinError>). Consider handling the await result and logging a warning on Err so persistence failures aren’t silently dropped.

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a792f54 — now logs a warning on JoinError instead of silently discarding with let _ =.

Comment on lines +847 to +859
if toml_path.exists() {
match crate::settings::Settings::load_toml(&toml_path) {
Ok(Some(mut settings)) => {
settings.selected_model = Some(model_owned);
if let Err(e) = settings.save_toml(&toml_path) {
tracing::warn!("Failed to persist model to config.toml: {}", e);
}
}
Ok(None) => {}
Err(e) => {
tracing::warn!("Failed to load config.toml for model persistence: {}", e);
}
}
Copy link

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

toml_path.exists() returns false on metadata errors (e.g., permission denied), which would silently skip persistence. Prefer attempting Settings::load_toml(&toml_path) directly and handling Ok(None) vs Err(e) so unexpected I/O errors can be logged.

Suggested change
if toml_path.exists() {
match crate::settings::Settings::load_toml(&toml_path) {
Ok(Some(mut settings)) => {
settings.selected_model = Some(model_owned);
if let Err(e) = settings.save_toml(&toml_path) {
tracing::warn!("Failed to persist model to config.toml: {}", e);
}
}
Ok(None) => {}
Err(e) => {
tracing::warn!("Failed to load config.toml for model persistence: {}", e);
}
}
match crate::settings::Settings::load_toml(&toml_path) {
Ok(Some(mut settings)) => {
settings.selected_model = Some(model_owned);
if let Err(e) = settings.save_toml(&toml_path) {
tracing::warn!("Failed to persist model to config.toml: {}", e);
}
}
Ok(None) => {
// No existing config file; nothing to update.
}
Err(e) => {
tracing::warn!(
"Failed to load config.toml for model persistence: {}",
e
);
}

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a792f54 — removed the .exists() guard. load_toml() already returns Ok(None) for missing files, so we now call it directly and let unexpected I/O errors surface in the Err branch.

- Log warning if spawn_blocking task panics/is cancelled (JoinError)
- Remove toml_path.exists() guard; load_toml already returns Ok(None)
  for missing files, so permission errors are no longer silently skipped

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ilblackdragon ilblackdragon enabled auto-merge (squash) March 8, 2026 02:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor: core 20+ merged PRs risk: medium Business logic, config, or moderate-risk modules scope: agent Agent core (agent loop, router, scheduler) scope: config Configuration size: M 50-199 changed lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants