
feat(llm): switch model profile on user message #2192

Merged: VascoSch92 merged 11 commits into main from command-model on Mar 9, 2026

Conversation


@VascoSch92 VascoSch92 commented Feb 23, 2026

Summary

SDK side of the /model <MODEL_NAME> [prompt] command (ref #2018).

The SDK exposes a method conversation.switch_profile() to switch the LLM. Client applications can use it to implement commands such as /model <MODEL_NAME>, which switches the current LLM profile to the one specified.

Note:

  • I’m not certain this is the best approach, but it’s functional and keeps the LocalConversation class clean.
  • Happy to add an example if needed.
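A minimal sketch of the intended client-side flow. The Conversation class below is a stand-in for the real SDK class, so everything except the switch_profile() call pattern is illustrative:

```python
# Hypothetical stand-in for the SDK's LocalConversation; only the
# switch_profile() call pattern mirrors the PR, everything else is stubbed.
class Conversation:
    def __init__(self, profile: str) -> None:
        self.profile = profile

    def switch_profile(self, profile_name: str) -> None:
        # The real method loads the named profile and re-registers the LLM;
        # here we only record the switch.
        self.profile = profile_name


conversation = Conversation(profile="default")

# A client application would map a "/model fast-profile" user message to:
conversation.switch_profile("fast-profile")
print(conversation.profile)  # prints: fast-profile
```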

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works?
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
  • Is the GitHub CI passing?

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image |
| --- | --- | --- |
| java | amd64, arm64 | eclipse-temurin:17-jdk |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22 |
| golang | amd64, arm64 | golang:1.21-bookworm |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:63649ee-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-63649ee-python \
  ghcr.io/openhands/agent-server:63649ee-python

All tags pushed for this build

ghcr.io/openhands/agent-server:63649ee-golang-amd64
ghcr.io/openhands/agent-server:63649ee-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:63649ee-golang-arm64
ghcr.io/openhands/agent-server:63649ee-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:63649ee-java-amd64
ghcr.io/openhands/agent-server:63649ee-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:63649ee-java-arm64
ghcr.io/openhands/agent-server:63649ee-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:63649ee-python-amd64
ghcr.io/openhands/agent-server:63649ee-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-amd64
ghcr.io/openhands/agent-server:63649ee-python-arm64
ghcr.io/openhands/agent-server:63649ee-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-arm64
ghcr.io/openhands/agent-server:63649ee-golang
ghcr.io/openhands/agent-server:63649ee-java
ghcr.io/openhands/agent-server:63649ee-python

About Multi-Architecture Support

  • Each variant tag (e.g., 63649ee-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 63649ee-python-amd64) are also available if needed


github-actions bot commented Feb 23, 2026

API breakage checks (Griffe)

Result: Passed

Action log

all-hands-bot: this comment was marked as outdated.


github-actions bot commented Feb 23, 2026

Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py | 356 | 21 | 94% | 284, 289, 317, 360, 378, 394, 456, 630–631, 634, 786, 794, 796, 807, 809–811, 836, 998, 1005–1006 |
| TOTAL | 19755 | 5760 | 70% | |


@all-hands-bot all-hands-bot left a comment


🟢 Good taste - Clean, pragmatic implementation

What works well:

  • Simple command syntax that solves a real user need
  • Registry caching pattern (checks cache before disk I/O) - previous performance concern resolved ✅
  • Clear separation: parse → handle → switch
  • Tests verify actual behavior, not mocks
  • No breaking changes (additive only)

Linus would say: "This is how you do it. Straightforward data flow, no special cases, solves the actual problem."

LGTM! 🚀

@VascoSch92 VascoSch92 requested review from enyst and neubig February 23, 2026 21:12
@VascoSch92 VascoSch92 marked this pull request as ready for review February 23, 2026 21:12


@enyst enyst left a comment


Thank you for this!

I think plugging this behavior in the conversation may be easy enough to do (and undo), so IMHO that is fine.

I think maybe we don’t want to send this information to the LLM, it’s just a toggle for the user, isn’t it?

@VascoSch92

> Thank you for this!
>
> I think plugging this behavior in the conversation may be easy enough to do (and undo), so IMHO that is fine.
>
> I think maybe we don’t want to send this information to the LLM, it’s just a toggle for the user, isn’t it?

The most important thing was to make a first attempt, to understand the problem better.

I believe I can make it better :)

I will address your suggestion ;)


@all-hands-bot all-hands-bot left a comment


🟡 Acceptable - Solid architecture with a UX gap

What works well:

  • Clean separation of concerns (SwitchModelHandler is standalone and focused)
  • Registry caching pattern prevents redundant disk I/O (previous concern resolved ✅)
  • Tests verify actual behavior with real code paths (no mock spam)
  • Opt-in feature with no breaking changes
  • Solves a real user need pragmatically

Key issue:
User feedback disappears into logs instead of being shown in the conversation. See inline comments.

Linus would say: "The data flow is clean and the code does what it should, but users are left in the dark. When someone types /model, they expect to see something back. Logger output is not user output."

@VascoSch92

Hey @enyst

I’ve made some changes based on your feedback. The goal was to pollute the LocalConversation class as little as possible, so I extracted the logic into a new class within a separate module.

Additionally, I've addressed your comments and added an example (give it a try, it’s fun!). I also added a flag to enable or disable model switching.

Let me know what you think!

@enyst

enyst commented Feb 24, 2026

> Hey @enyst
>
> I’ve made some changes based on your feedback. The goal was to pollute the LocalConversation class as little as possible, so I extracted the logic into a new class within a separate module.
>
> Additionally, I've addressed your comments and added an example (give it a try, it’s fun!). I also added a flag to enable or disable model switching.
>
> Let me know what you think!

Thank you! Left a few tiny comments

I think maybe it doesn't get saved, so it won't be restored, the initial profile is? As in, restoring a conversation from the saved base_state.json 🤔 It's just a quick question

Just a heads up, I won't be around for a bit, back later (some number of hours)

@VascoSch92

>> Hey @enyst
>> I’ve made some changes based on your feedback. The goal was to pollute the LocalConversation class as little as possible, so I extracted the logic into a new class within a separate module.
>> Additionally, I've addressed your comments and added an example (give it a try, it’s fun!). I also added a flag to enable or disable model switching.
>> Let me know what you think!
>
> Thank you! Left a few tiny comments
>
> I think maybe it doesn't get saved, so it won't be restored, the initial profile is? As in, restoring a conversation from the saved base_state.json 🤔 It's just a quick question
>
> Just a heads up, I won't be around for a bit, back later (some number of hours)

So I ran a small experiment to see what is happening.

If we save a conversation and restore it later, the last-used LLM is used. Which makes sense, right?

@all-hands-bot

[Automatic Post]: It has been a while since there was any activity on this PR. @VascoSch92, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

@VascoSch92

Oh, you're right. Tempted to say "you're absolutely right!" - but then you'd think Sonnet took over this human 😅
😅 🤖 😄

I’ve been catching up on the conversation. We agree that /model is a totally different concept from Skills, Commands, or Hooks; let's call it a control command.

Should we implement this control command in the CLI?

I think this could be a good alternative. Since we have the LLMProfileStore now, I believe all the pieces are in place.

@enyst :-)


@neubig neubig left a comment


This basically looks good to me! Thanks for implementing this.

@neubig

neubig commented Mar 6, 2026

cc @enyst if you still think changes should be made could you list up the changes that need to be made, otherwise approve?


@enyst enyst left a comment


Could we maybe do it like this:

  • SDK does the profile switch
    • SDK exposes a little API for it, e.g. maybe LLMRegistry.switch_profile() or even conversation.switch_profile if we need to place it into conversation (it does a little much though)
  • CLI owns user interaction
    • parses /model as command, calls the method to do stuff

WDYT?
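As a sketch of that split, the CLI-side parsing could look like this (the function name and return shape are hypothetical, not part of the SDK):

```python
def parse_model_command(text: str):
    """Parse a `/model <MODEL_NAME> [prompt]` user message.

    Returns (profile_name, remaining_prompt) when the message is a
    model-switch command, otherwise None. Hypothetical helper: the
    real CLI may tokenize commands differently.
    """
    parts = text.strip().split(maxsplit=2)
    if not parts or parts[0] != "/model" or len(parts) < 2:
        return None
    profile_name = parts[1]
    prompt = parts[2] if len(parts) == 3 else ""
    return profile_name, prompt


# The CLI would then delegate the actual switch to the SDK, e.g.:
#   parsed = parse_model_command(user_message)
#   if parsed:
#       conversation.switch_profile(parsed[0])
print(parse_model_command("/model fast-profile summarize the repo"))
# prints: ('fast-profile', 'summarize the repo')
```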

@enyst

enyst commented Mar 8, 2026

Maybe like this separation: #2336 (comment)

@github-actions

github-actions bot commented Mar 9, 2026

Agent server REST API breakage checks (OpenAPI)

Result: Failed

Log excerpt (first 1000 characters)
{"asctime": "2026-03-09 12:03:30,217", "levelname": "WARNING", "name": "openhands.agent_server.config", "filename": "config.py", "lineno": 173, "message": "\u26a0\ufe0f OH_SECRET_KEY was not defined. Secrets will not be persisted between restarts."}
::error title=openhands-agent-server REST API::Breaking REST API change detected without MINOR version bump (1.12.0 -> 1.12.0).

Breaking REST API changes detected compared to baseline release:
- the 'file' request property type/format changed from 'string'/'' to 'string'/'binary'
/home/runner/work/software-agent-sdk/software-agent-sdk/.venv/lib/python3.13/site-packages/litellm/llms/custom_httpx/async_client_cleanup.py:66: DeprecationWarning: There is no current event loop
  loop = asyncio.get_event_loop()

Action log

@VascoSch92 VascoSch92 requested a review from all-hands-bot March 9, 2026 10:19
@VascoSch92

@enyst

I have rewritten the code to follow the comments. Now it is truly minimalistic, and the responsibility for catching the command is left to the CLI.

Note that I used model_copy twice. I'm forced to do so because the ProfileStore uses the profile name as a primary key, while the LLM registry uses the usage_id. This accounts for the discrepancy.
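The re-keying can be illustrated with a small stand-in. The real LLM is a pydantic model, so the SDK uses llm.model_copy(update={...}) where this sketch uses dataclasses.replace; the field names and values are illustrative:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LLM:
    # Stand-in for the SDK's pydantic LLM model.
    profile_name: str  # primary key in the ProfileStore (on disk)
    usage_id: str      # primary key in the LLMRegistry (in memory)
    model: str


# A profile loaded from the store carries its profile name...
loaded = LLM(profile_name="fast-profile", usage_id="fast-profile", model="gpt-5-mini")

# ...but the registry slot being replaced is keyed by usage_id, so the
# instance is copied with the existing usage_id before registering it.
rekeyed = replace(loaded, usage_id="agent")
print(rekeyed.usage_id, rekeyed.model)  # prints: agent gpt-5-mini
```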

all-hands-bot: this comment was marked as outdated.

@VascoSch92 VascoSch92 requested a review from all-hands-bot March 9, 2026 10:28

@all-hands-bot all-hands-bot left a comment


🟢 Good taste - Clean, pragmatic implementation

Linus would say: "This is exactly how you do it. Straightforward data flow, solves the actual problem, no special cases."

What works well:

  • Simple, focused API that solves a real user need (model switching mid-conversation)
  • Registry caching pattern eliminates redundant disk I/O ✅
  • Duplicate registration fix in _ensure_agent_ready() prevents potential issues ✅
  • Tests verify actual behavior with real code paths, not mock spam ✅
  • No breaking changes (additive only) ✅

Minor observation: Registry operations happen outside the state lock in switch_profile() but inside the lock in _ensure_agent_ready(). This inconsistency is probably fine if the registry is internally thread-safe, but worth noting for future maintainers. Not blocking.

LGTM! 🚀

@enyst

enyst commented Mar 9, 2026

> Note that I used model_copy twice. I'm forced to do so because the ProfileStore uses the profile name as a primary key, while the LLM registry uses the usage_id. This accounts for the discrepancy.

Noted, thank you, will do!

I just want to note quickly what OpenHands-GPT-5.4 just told me in another PR:
(I told it to remove some caching of base url / provider it was trying to add, because LLM instances are supposed to be immutable -ish)

(screenshot omitted)

Yes, usage_id and profile_id are a problem to solve. I would be happy even with the solution I found elsewhere, which was basically to remove random usage_id, restrict it to agent, maybe at most condenser.

 for llm in list(self.agent.get_all_llms()):
-    self.llm_registry.add(llm)
+    if llm.usage_id not in registered:
+        self.llm_registry.add(llm)
Collaborator


Sorry, not about this PR, but this looks curious actually. Are we saying that the agent may have gotten an LLM instance that wasn't registered? Do we know that?

except KeyError:
    new_llm = self._profile_store.load(profile_name)
    new_llm = new_llm.model_copy(update={"usage_id": usage_id})
    self.llm_registry.add(new_llm)
Collaborator


Really only a tiny thought, this code could be out of here and in the registry I think

Not for this PR, just thinking of responsibilities; now the Conversation knows both llm registry and llm profile store and does stuff between them, even though those two are so related they could arguably be the same. Maybe worth thinking about it later, sorry


@enyst enyst left a comment


I'm happy with this solution because it solves a real problem and it's indeed minimal for what it does, thank you!

@enyst

enyst commented Mar 9, 2026

On a side note, as you have seen, I am slowly growing a bit concerned with how we are doing things 😄

If we want to change the underlying design of the codebase on the current statelessness aspect, we can, of course we can, we just could maybe give it some good thought and submit it for discussion. I do think maybe we shouldn't do a lot of "sins" or I know my slow brain is guaranteed to lose track 😢 ... and I also know others may keep theirs for longer, but last I checked we're humans here so... idk if forever! 😹

@VascoSch92 VascoSch92 enabled auto-merge (squash) March 9, 2026 12:03
@VascoSch92 VascoSch92 merged commit 87fb615 into main Mar 9, 2026
22 of 23 checks passed
@VascoSch92 VascoSch92 deleted the command-model branch March 9, 2026 12:05
@VascoSch92 VascoSch92 linked an issue Apr 10, 2026 that may be closed by this pull request

Development

Successfully merging this pull request may close these issues.

Support switching LLM models in the middle of conversation

4 participants