-
Notifications
You must be signed in to change notification settings - Fork 7.1k
mcp-server: make codex-reply resumable across restarts
#8689
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mcp-server: make codex-reply resumable across restarts
#8689
Conversation
|
All contributors have signed the CLA ✍️ ✅ |
|
I have read the CLA Document and I hereby sign the CLA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 38389bca5e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let config = self.config.as_ref().clone(); | ||
| let auth_manager = self.auth_manager.clone(); | ||
| match self | ||
| .conversation_manager | ||
| .resume_conversation_from_rollout(config, rollout_path, auth_manager) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preserve per-session config when resuming from rollout
Resuming a codex-reply session always clones the server’s global config (self.config) before calling resume_conversation_from_rollout, which silently drops any per-session overrides that were used when the conversation was created (e.g., custom cwd, model, approval/sandbox policy supplied via the initial codex tool call). After an MCP server restart, the resumed session can therefore run in the wrong directory or with different policies/models than the original, which changes tool behavior and can lead to commands acting on the wrong workspace. Consider deriving the config for resume from the rollout’s session metadata (or persisting the original config alongside the rollout) so resumed sessions keep their original settings.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a new commit to restore session settings from rollout TurnContext
Adds logic to restore session cwd, model, approval policy, and sandbox policy from the most recent TurnContextItem in rollout history when resuming a session. Updates tests to verify that these settings are correctly restored from the rollout file.
|
Our contribution guidelines clarify that we're currently accepting contributions for bugs fixes, but we're not accepting new features at this time. We need to make sure that all new features compose well with both existing and upcoming features and fit into our roadmap. If you would like to propose a new feature, please file or upvote an enhancement request in the issue tracker. We will generally prioritize new features based on community feedback. |
|
@etraut-openai Totally understand the contribution policy. That said, I think this lands solidly in the “bugfix” category rather than a new feature: today The change I’m proposing doesn’t introduce new user-facing surface area or parameters; it only ensures that when a In my sense, from an MCP consumer standpoint, this is a reliability/correctness fix for an already-supported workflow. |
|
I reviewed #6168 and #5066, and I think these are correctly categorized as enhancement requests rather than bug reports. I understand your perspective, but we're going to pass on this PR. I'd be inclined to make an exception if this were a small and targeted change, but your PR is relatively large and makes some big behavioral modification. It would require a not-insignificant investment of time from someone on the codex team to review and validate the change and work with you on follow-on changes as part of the code review. Given our large backlog of bug reports and feature requests, we need to be disciplined about prioritization and invest in the changes that will have the most impact for the most Codex users. We will eventually get to #6168 and #5066, but there are many higher-priority (more highly upvoted) features and bug fixes ahead of these issues. In the meantime, you might find that non-interactive mode (either with or without the Codex SDK) suits your needs. |
What
This PR adds persistent resume/recovery to the MCP
codex-replytool-call.Previously,
codex-replyonly succeeded if the conversation was still present in the server’s in-memoryConversationManager. After restartingcodex mcp-server, that map is empty andcodex-replyimmediately fails with “Session not found…”.With this change, when a conversation is not in memory,
codex-replywill rehydrate it from the persisted rollout JSONL on disk (underCODEX_HOME/sessions/...) using the providedconversation_id, then continue the conversation normally.Why
This closes the gap described in:
codex-replycannot resume after restarting the MCP server even though the rollout exists andcodex resume <sessionId>works.Resume parity with CLI: This effectively adds “resume” functionality to the MCP server in the same spirit as
codex resume <sessionId>. External integrations can continue a session after a process restart by providing the conversation/session id, and Codex will load the persisted history from disk.How
codex-replyhandler:conversation_manager.get_conversation(conversation_id)codex_core::find_conversation_path_by_id_str(&config.codex_home, <conversation_id>)conversation_manager.resume_conversation_from_rollout(config.clone(), rollout_path, auth_manager.clone())session_configurednotification (so clients can observe the resume handshake)Tests
Adds an integration test that:
conversation_idcodex-replydirectly (no prior in-memory conversation)session_configuredfor that id and the tool call completes successfullyRun locally:
cargo test -p codex-mcp-server --tests