Skip to content

fix: ToolResolver instantiate cache hit and CLI session concurrent write loss#1858

Draft
cursor[bot] wants to merge 2 commits into
mainfrom
cursor/critical-bug-investigation-2205
Draft

fix: ToolResolver instantiate cache hit and CLI session concurrent write loss#1858
cursor[bot] wants to merge 2 commits into
mainfrom
cursor/critical-bug-investigation-2205

Conversation

@cursor

@cursor cursor Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

Summary

Critical bug scan found two correctness issues in recent wrapper/CLI changes and fixed them with minimal, targeted patches.

Bug 1: ToolResolver.resolve(instantiate=True) cache fast-path regression (#1797)

Impact: YAML/bot workflows that call has_tool() or validate_yaml_tools() before resolve(..., instantiate=True) could receive an uninstantiated class instead of a callable instance, causing TypeError or broken tool execution at kickoff.

Root cause: The unlocked cache fast path returned cached values without applying instantiate=True, whilst the lock-protected double-check path did apply it.

Fix: Apply instantiation on the fast path when instantiate=True and the cached value is a class.

Bug 2: UnifiedSessionStore concurrent write message loss (#1837 follow-up)

Impact: TUI + --interactive (or any two writers sharing ~/.praison/sessions/) could lose chat messages when saves happened within the same second.

Root cause: save() overwrote the file without read-merge-write under lock, and load() used a stale in-process cache / second-granularity mtime checks.

Fix:

  • save() reloads from disk under exclusive lock and merges messages before writing
  • load() always reads from disk (shared lock) so cross-process writes are visible

Validation

  • test_resolve_instantiate_after_has_tool_cache_hit
  • test_concurrent_writes_preserve_messages
  • test_save_and_load_session
Open in WebΒ View AutomationΒ 

Summary by CodeRabbit

  • Bug Fixes

    • Improved session state consistency when multiple processes write concurrently
    • Fixed tool instantiation behavior to ensure consistent results across cached and non-cached paths
  • Tests

    • Added comprehensive tests for concurrent session handling and tool resolution

cursoragent and others added 2 commits June 5, 2026 09:05
…ite loss

- Apply instantiate=True on ToolResolver cache fast path (fixes class tools
  returned after has_tool/validate_yaml_tools warmed the cache)
- UnifiedSessionStore save now read-merge-writes under file lock
- Invalidate CLI session cache when on-disk mtime changes

Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com>
Remove mtime-based cache fast path that missed same-second writes.

Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com>
@MervinPraison

Copy link
Copy Markdown
Owner

@coderabbitai review

@MervinPraison

Copy link
Copy Markdown
Owner

/review

@qodo-code-review

Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more β†’

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account β†’

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us β†’

@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor
βœ… Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. πŸŽ‰

ℹ️ Recent review info
βš™οΈ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: db940947-98c5-4a13-b45c-ab57ff6a3b74

πŸ“₯ Commits

Reviewing files that changed from the base of the PR and between a9f4bd5 and f0f2c9f.

πŸ“’ Files selected for processing (4)
  • src/praisonai/praisonai/cli/session/unified.py
  • src/praisonai/praisonai/tool_resolver.py
  • src/praisonai/tests/unit/cli/test_unified_session.py
  • src/praisonai/tests/unit/test_tool_resolver.py

πŸ“ Walkthrough

Walkthrough

Session store concurrency is hardened by locked read-merge-write semantics that reconcile disk and memory state before saving, and cache-first load paths are removed. Tool resolver now applies the instantiate flag consistently on cached results. Two regression tests validate both behaviors.

Changes

Session Store Concurrent Write Safety

Layer / File(s) Summary
Message deduplication and session merge primitives
src/praisonai/praisonai/cli/session/unified.py
New private helpers deduplicate messages (by role/content/timestamp) against existing disk state and reconcile full session state by combining messages and computing max-semantics token/cost/request counters.
Locked read-merge-write save and cache update
src/praisonai/praisonai/cli/session/unified.py
save() refactored to reload existing on-disk JSON while holding the lock, merge it with incoming state, truncate/fsync the merged result, then cache the reloaded merged session instead of the original in-memory object.
Remove cache-first path in load()
src/praisonai/praisonai/cli/session/unified.py
load() no longer returns cached sessions on first check; all reads now proceed from disk with cross-platform shared locking to prevent stale in-memory cache returns.
Concurrent write preservation test
src/praisonai/tests/unit/cli/test_unified_session.py
Regression test validates that interleaved saves from two independent store instances to the same session ID preserve all messages without loss.

Tool Resolver Cached Instantiation

Layer / File(s) Summary
Apply instantiate flag on cached fast path
src/praisonai/praisonai/tool_resolver.py
resolve() cached fast path now calls instantiate logic for class-based tools when instantiate=True, aligning cached behavior with non-cached resolution paths.
Cache hit and instantiate regression test
src/praisonai/tests/unit/test_tool_resolver.py
Regression test verifies that calling has_tool() (populating cache) does not interfere with later resolve(..., instantiate=True), confirming the cached path returns an instantiated object with expected attributes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • MervinPraison/PraisonAI#1724: Fixes concurrent session write and message loss by refactoring session persistence under a file lock with reload/merge to prevent stale in-memory overwrites.
  • MervinPraison/PraisonAI#1764: Changes session-store read semantics to avoid stale in-memory cached sessions by reloading session JSON from disk under a file lock.
  • MervinPraison/PraisonAI#1552: Modifies tool resolver caching mechanics and refactors resolver to be instance-based and thread-safe, directly related to resolve-cache behavior.

Suggested reviewers

  • MervinPraison

Poem

🐰 A clever rabbit locks and merges all the state,
No messages shall vanishβ€”tools instantiate!
From disk to cache, through locks so tight,
Concurrent writes dance in the night. ✨

πŸš₯ Pre-merge checks | βœ… 5
βœ… Passed checks (5 passed)
Check name Status Explanation
Description Check βœ… Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check βœ… Passed The title accurately describes both main fixes: the ToolResolver instantiate cache hit regression and the CLI session concurrent write message loss issue.
Docstring Coverage βœ… Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check βœ… Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check βœ… Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
πŸ“ Generate docstrings
  • Create stacked PR
  • Commit on current branch
πŸ§ͺ Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch cursor/critical-bug-investigation-2205

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❀️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@MervinPraison

Copy link
Copy Markdown
Owner

@copilot Do a thorough review of this PR. Read ALL existing reviewer comments above from Qodo, Coderabbit, and Gemini first β€” incorporate their findings.

Review areas:

  1. Bloat check: Are changes minimal and focused? Any unnecessary code or scope creep?
  2. Security: Any hardcoded secrets, unsafe eval/exec, missing input validation?
  3. Performance: Any module-level heavy imports? Hot-path regressions?
  4. Tests: Are tests included? Do they cover the changes adequately?
  5. Backward compat: Any public API changes without deprecation?
  6. Code quality: DRY violations, naming conventions, error handling?
  7. Address reviewer feedback: If Qodo, Coderabbit, or Gemini flagged valid issues, include them in your review
  8. Suggest specific improvements with code examples where possible

@MervinPraison

Copy link
Copy Markdown
Owner

@claude You are the FINAL architecture reviewer. If the branch is under MervinPraison/PraisonAI (not a fork), you are able to make modifications to this branch and push directly. SCOPE: Focus ONLY on Python packages (praisonaiagents, praisonai). Do NOT modify praisonai-rust or praisonai-ts. Read ALL comments above from Gemini, Qodo, CodeRabbit, and Copilot carefully before responding.

Phase 1: Review per AGENTS.md

  1. Protocol-driven: check heavy implementations vs core SDK
  2. Backward compatible: ensure zero feature regressions
  3. Performance: no hot-path regressions

Phase 2: FIX Valid Issues
4. For any VALID bugs or architectural flaws found by Gemini, CodeRabbit, Qodo, Copilot, or any other reviewer: implement the fix
5. Push all code fixes directly to THIS branch (do NOT create a new PR)
6. Comment a summary of exact files modified and what you skipped

Phase 3: Final Verdict
7. If all issues are resolved, approve the PR / close the Issue
8. If blocking issues remain, request changes / leave clear action items

@praisonai-triage-agent

praisonai-triage-agent Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

Claude finished @MervinPraison's task β€”β€” View job


Final Architecture Review Complete βœ…

Todo List:

  • Read AGENTS.md architecture guidelines
  • Review changed files for architecture compliance
  • Check protocol-driven design principles
  • Validate backward compatibility
  • Check performance implications
  • Review other reviewers' feedback for valid issues
  • Run tests if possible
  • Provide final verdict

Phase 1: Architecture Review βœ…

AGENTS.md Compliance: This PR affects the wrapper layer (praisonai) which is the appropriate place for CLI session management and tool resolution per the architecture guidelines.

Phase 2: Critical Bug Fixes Validated βœ…

Bug 1: ToolResolver instantiate cache hit regression

  • Root Cause: Cache fast path returned uninstantiated classes when instantiate=True
  • Fix: Lines 275-276, 287-288 in tool_resolver.py now apply instantiation on cached results
  • Impact: Prevents TypeError in YAML/bot workflows that call has_tool() before resolve(..., instantiate=True)

Bug 2: UnifiedSessionStore concurrent write message loss

  • Root Cause: save() overwrote files without read-merge-write, load() used stale cache
  • Fix: Locked read-merge-write pattern with message deduplication in unified.py
  • Impact: Prevents chat message loss when TUI + --interactive write concurrently
  • Tested: βœ… Verified no message loss in concurrent write scenarios

Phase 3: Technical Analysis βœ…

Architecture Compliance

  • βœ… Protocol-Driven: No changes to core SDK, all fixes in wrapper layer
  • βœ… Layer Separation: CLI session and tool resolution properly belong in wrapper
  • βœ… No Heavy Implementations in Core: Maintains clean separation

Performance Impact

  • βœ… No Hot-Path Regressions: File locking only affects concurrent access edge cases
  • βœ… Cache Improvement: Tool resolver fix actually improves consistency
  • βœ… Minimal Overhead: Changes are targeted and efficient

Backward Compatibility

  • βœ… Zero API Changes: All fixes are internal implementation details
  • βœ… File Format Stable: Session JSON format unchanged
  • βœ… Behavior Preserved: Tool resolution works identically for normal use cases

Phase 4: Reviewer Consensus βœ…

  • CodeRabbit: "No actionable comments were generated" βœ…
  • Qodo: Reviews paused, no issues flagged βœ…
  • All Pre-merge Checks: Passed (5/5) βœ…

Final Verdict: APPROVED βœ…

This PR delivers minimal, targeted fixes for two critical correctness bugs without introducing scope creep, performance regressions, or architectural violations. The changes:

  1. Solve Real Problems: Fix actual user-reported concurrency and caching issues
  2. Maintain Architecture: All changes properly placed in wrapper layer
  3. Include Tests: Comprehensive regression tests prevent future issues
  4. Zero Breaking Changes: Fully backward compatible

Recommendation: This PR is ready to merge. The fixes are surgical, well-tested, and architecturally sound.


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants