Move domain docs (architecture, api, conventions, glossary, etc.) to .agents/project/ and research files to .agents/research/. Remove stale chatgpt-logged-out.png screenshot. Co-authored-by: Cursor <cursoragent@cursor.com>
- Add keyframe animations (spinner-fade, pulse-dot, bounce-dots, typing, wave, blink, text-shimmer) to globals.css
- Update thinking states test page with ArticleWrapper, semantic HTML, mask-based action reveal, and improved composer layout
- Change reasoning label from "Reasoned" to "Thoughts"
- Bump thinking bar text from text-sm to text-base
Co-authored-by: Cursor <cursoragent@cursor.com>
Enable a dedicated Layer 2 content extraction tool path that is independent from search fallback so providers with native search can still use URL content reads. Co-authored-by: Cursor <cursoragent@cursor.com>
Improve tool routing with explicit extraction capability gating, rename purchase/extraction tool ids for consistency, and add stronger timeout/error guidance plus extraction caching for more reliable tool execution. Co-authored-by: Cursor <cursoragent@cursor.com>
Standardize MCP, third-party, and platform tools to return raw outputs while centralizing request-scoped tracing and logging so tool observability is consistent and safer across the full tool pipeline. Co-authored-by: Cursor <cursoragent@cursor.com>
Add Convex-backed tool budget/domain guardrails and unify tool error normalization so policy denials and upstream failures are surfaced consistently across route, wrappers, and tests. Co-authored-by: Cursor <cursoragent@cursor.com>
Strengthen tool routing and validation by introducing capability policy, metadata shaping, naming/cache helpers, and expanded coverage so tool behavior is safer and more deterministic across providers. Co-authored-by: Cursor <cursoragent@cursor.com>
Greptile Summary
Hardens tool orchestration with a multi-layered policy pipeline enforcing capability boundaries, budget limits, and retry safety across provider-native, third-party, MCP, and platform tool layers.
Confidence Score: 4/5
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
Start[Chat Request] --> CapPolicy[Resolve Capability Policy]
CapPolicy --> KeyMode{Key Mode<br/>Detection}
KeyMode -->|BYOK| BYOKPath[BYOK Budget Policy]
KeyMode -->|Platform| PlatformPath[Platform Budget Policy]
BYOKPath --> LoadTools[Load Tool Layers]
PlatformPath --> LoadTools
LoadTools --> Layer1[Layer 1: Provider Native<br/>e.g. OpenAI search]
LoadTools --> Layer2Search[Layer 2: Third-Party Search<br/>Exa fallback]
LoadTools --> Layer2Content[Layer 2: Content Extraction<br/>Exa getContents]
LoadTools --> Layer3[Layer 3: MCP Tools<br/>User servers]
LoadTools --> Layer4[Layer 4: Platform Tools<br/>pay_purchase]
Layer1 --> PolicyFilter[Apply Capability Policy Filter]
Layer2Search --> PolicyFilter
Layer2Content --> PolicyFilter
Layer3 --> PolicyFilter
Layer4 --> PolicyFilter
PolicyFilter --> Naming[Naming Governance<br/>Collision Detection]
Naming --> Wrap[Wrap Tools]
Wrap --> WrapMCP[MCP: Timeout + Retry + Circuit Breaker]
Wrap --> WrapThirdParty[Third-Party: Cache + Domain Limits]
Wrap --> WrapPlatform[Platform: Tracing + Budget]
WrapMCP --> StreamText[streamText with tools]
WrapThirdParty --> StreamText
WrapPlatform --> StreamText
StreamText --> PrepareStep{prepareStep}
PrepareStep -->|Step <= 3| EarlyTools[Early Step Tools<br/>All safe + MCP unknown]
PrepareStep -->|Step > 3| LateTools[Late Step Tools<br/>Read-only only]
EarlyTools --> BudgetProbe[Budget Probe<br/>consume: false]
LateTools --> BudgetProbe
BudgetProbe -->|Policy Available| BudgetOK{Budget OK?}
BudgetProbe -->|Policy Unavailable| Degraded[Degraded Mode<br/>Request-local soft cap]
BudgetOK -->|Yes| AllowTool[Include Tool]
BudgetOK -->|No| BlockTool[Exclude Tool]
Degraded --> SoftCap{Soft Cap<br/>Remaining?}
SoftCap -->|Yes| AllowTool
SoftCap -->|No| BlockTool
AllowTool --> Execute[Tool Execution]
Execute --> Retry{Retry Safe?}
Retry -->|Yes + Transient Error| RetryBackoff[Exponential Backoff<br/>with Jitter]
Retry -->|No or Non-Transient| Error[Return Error]
RetryBackoff --> Execute
Execute --> StepFinish[onStepFinish:<br/>Post-accounting]
StepFinish --> ConvexLog[Convex toolCallLog +<br/>toolLimitBuckets]
ConvexLog --> NextStep{More Steps?}
NextStep -->|Yes| PrepareStep
NextStep -->|No| Finish[onFinish:<br/>PostHog + Response]
Last reviewed commit: 3babf78
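The prepareStep and budget-probe branches of the flowchart can be read as a pure filter over the loaded tools. The sketch below is one possible reading, not the PR's code: `ToolEntry`, `BudgetProbe`, `allowedAtStep`, and `selectActiveTools` are hypothetical names, the risk labels are invented for illustration, and counting steps from 1 is an assumption.

```typescript
// Hypothetical types and names; the PR's real definitions are not shown in this thread.
type ToolRisk = "read_only" | "safe" | "unknown";
type ToolLayer = "provider" | "third_party" | "mcp" | "platform";

interface ToolEntry {
  name: string;
  layer: ToolLayer;
  risk: ToolRisk;
}

// Outcome of a non-consuming budget check ("consume: false" in the flowchart).
type BudgetProbe =
  | { available: true; isWithinBudget: (toolName: string) => boolean }
  | { available: false }; // policy backend unreachable, so fall back to the soft cap

// From the flowchart: steps after the third only get read-only tools.
// Whether steps are counted from 0 or 1 is an assumption (1-based here).
const LAST_EARLY_STEP = 3;

function allowedAtStep(tool: ToolEntry, stepNumber: number): boolean {
  if (stepNumber > LAST_EARLY_STEP) return tool.risk === "read_only";
  // Early steps: every tool with a known-safe risk label, plus MCP tools of unknown risk.
  return tool.risk !== "unknown" || tool.layer === "mcp";
}

function selectActiveTools(
  tools: ToolEntry[],
  stepNumber: number,
  probe: BudgetProbe,
  softCapRemaining: number
): string[] {
  return tools
    .filter((tool) => allowedAtStep(tool, stepNumber))
    .filter((tool) =>
      // Degraded mode: the request-local soft cap decides for every tool at once.
      probe.available ? probe.isWithinBudget(tool.name) : softCapRemaining > 0
    )
    .map((tool) => tool.name);
}
```

Keeping the selection pure (no budget is consumed here) matches the flowchart's split between a probe before execution and post-accounting in onStepFinish.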
5 issues found across 61 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="convex/toolLimits.ts">
<violation number="1" location="convex/toolLimits.ts:131">
P2: The `code` and `message` for the `"domain"` limit branch hardcode `"extract_content"` instead of using the `toolName` parameter. Any other tool passing `limitType: "domain"` will receive a misleading error code and message.</violation>
</file>
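A minimal sketch of the fix this violation asks for, assuming a helper that builds the denial payload. `buildLimitDenial`, `LimitDenial`, and the code format are hypothetical; only the idea of deriving both fields from `toolName` comes from the review comment.

```typescript
// Hypothetical shapes; the real convex/toolLimits.ts code is not shown in this thread.
type LimitType = "budget" | "domain";

interface LimitDenial {
  code: string;
  message: string;
}

function buildLimitDenial(
  toolName: string,
  limitType: LimitType,
  domain?: string
): LimitDenial {
  if (limitType === "domain") {
    // Derive both fields from the caller's tool name instead of a fixed "extract_content".
    const where = domain ? " on " + domain : "";
    return {
      code: toolName + "_domain_limit",
      message: "Domain limit reached for " + toolName + where + ".",
    };
  }
  return {
    code: toolName + "_budget_limit",
    message: "Tool budget exhausted for " + toolName + ".",
  };
}
```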
<file name="lib/tools/truncation-policy.ts">
<violation number="1" location="lib/tools/truncation-policy.ts:174">
P2: findSemanticBoundary always adds 2 for paragraph breaks, which truncates CRLF "\r\n\r\n" sequences mid-marker and yields inconsistent boundaries on Windows-style line breaks. Adjust the offset to match the marker length.</violation>
</file>
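One way to make the offset follow the marker length is to search for each paragraph marker separately and add that marker's own length. `findParagraphBoundary` below is a hypothetical stand-in for the paragraph branch of `findSemanticBoundary`, whose real signature is not shown here.

```typescript
// Returns the index just past the last paragraph break that ends at or before
// `limit`, or -1 when there is none. Hypothetical helper for illustration.
function findParagraphBoundary(text: string, limit: number): number {
  const markers = ["\r\n\r\n", "\n\n"];
  let best = -1;
  for (const marker of markers) {
    // A marker cannot fit entirely inside the first `limit` characters.
    if (limit < marker.length) continue;
    // Only accept matches whose whole marker ends at or before `limit`.
    const start = text.lastIndexOf(marker, limit - marker.length);
    if (start === -1) continue;
    // Advance by the marker's own length (4 for CRLF, 2 for LF), never a fixed 2.
    best = Math.max(best, start + marker.length);
  }
  return best;
}
```

With a fixed offset of 2, a CRLF document would be cut between the two `\r\n` pairs; using `marker.length` keeps the whole break on the kept side.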
<file name="lib/tools/mcp-wrapper.ts">
<violation number="1" location="lib/tools/mcp-wrapper.ts:140">
P2: Circuit breaker uses `"unknown"` as a shared key for all tools missing server info, causing cross-tool circuit state pollution. If multiple tools lack a `serverId`, their failure counts and resets incorrectly affect each other. Use a per-tool fallback (e.g., the tool `name` itself) to ensure isolation.</violation>
</file>
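The isolation this violation asks for comes down to how the breaker key is chosen. The sketch below uses a hypothetical `circuitKey` helper and a deliberately small `CircuitBreakerRegistry` (failure counting only, with no reset timer or half-open state) to show the per-tool fallback.

```typescript
// Hypothetical shapes; the real lib/tools/mcp-wrapper.ts code is not shown in this thread.
interface McpToolMeta {
  name: string;
  serverId?: string;
}

// Tools on the same server share a breaker; tools without server info get their own.
function circuitKey(tool: McpToolMeta): string {
  return tool.serverId ? "server:" + tool.serverId : "tool:" + tool.name;
}

class CircuitBreakerRegistry {
  private failures: Record<string, number> = {};
  private threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  recordFailure(tool: McpToolMeta): void {
    const key = circuitKey(tool);
    this.failures[key] = (this.failures[key] || 0) + 1;
  }

  recordSuccess(tool: McpToolMeta): void {
    // A success closes the breaker for this key only.
    delete this.failures[circuitKey(tool)];
  }

  isOpen(tool: McpToolMeta): boolean {
    return (this.failures[circuitKey(tool)] || 0) >= this.threshold;
  }
}
```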
<file name="lib/tools/utils.ts">
<violation number="1" location="lib/tools/utils.ts:131">
P2: Listener leak in `combineAbortSignals` fallback path: when one signal fires, the `{ once: true }` listeners on all other signals remain attached until those signals individually fire. For long-lived signals (e.g., a session-scoped abort), the closures capturing `controller` will never be cleaned up. Track all handlers and call `removeEventListener` on all of them when any one fires.</violation>
<violation number="2" location="lib/tools/utils.ts:610">
P2: Key collision in `truncateOversizedObject`: user-data keys named `_hint`, `_truncated`, `_originalSizeBytes`, or `_keptKeys` silently overwrite the truncation metadata sentinels. When the budget check then fails and the key is deleted, the sentinel is lost, leaving the model with no `_hint` or potentially a `_truncated: false` value. Filter the known sentinel keys out of the ranked entries before the loop.</violation>
</file>
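For the first `lib/tools/utils.ts` violation, a fallback that tracks every handler and detaches all of them on the first abort could look like the sketch below. It assumes a runtime where `AbortSignal.reason` and `AbortController.abort(reason)` are available. The signature is a guess at the shape of `combineAbortSignals`, not the PR's code.

```typescript
// Sketch of a manual fallback for combining abort signals.
function combineAbortSignals(signals: AbortSignal[]): AbortSignal {
  const controller = new AbortController();
  const cleanups: Array<() => void> = [];

  const abortAll = (reason?: unknown): void => {
    // Detach every listener, including those on signals that never fired,
    // so long-lived signals stop holding references to this controller.
    for (const cleanup of cleanups) cleanup();
    cleanups.length = 0;
    controller.abort(reason);
  };

  for (const signal of signals) {
    if (signal.aborted) {
      abortAll(signal.reason);
      break;
    }
    const handler = (): void => abortAll(signal.reason);
    signal.addEventListener("abort", handler, { once: true });
    cleanups.push(() => signal.removeEventListener("abort", handler));
  }

  return controller.signal;
}
```

If none of the inputs ever aborts, the listeners stay attached for the lifetime of those signals. Callers that need to release them earlier would need an explicit dispose hook, which this sketch leaves out.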
Prevent cross-tool circuit-breaker bleed when MCP server metadata is missing and normalize domain-limit codes/messages per tool for clearer failures. Preserve truncation metadata integrity and semantic paragraph boundary detection to keep truncated payloads reliable. Co-authored-by: Cursor <cursoragent@cursor.com>
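The truncation-metadata part of this commit can be illustrated by filtering the sentinel names out before any user key is copied. This is a rough sketch under stated assumptions: the four sentinel names come from the review comment, while the smallest-first ranking, the hint text, the byte accounting, and this simplified `truncateOversizedObject` body are hypothetical.

```typescript
// Sentinel names are taken from the review comment; everything else is hypothetical.
const SENTINEL_KEYS = ["_hint", "_truncated", "_originalSizeBytes", "_keptKeys"];

function jsonBytes(value: unknown): number {
  // JSON.stringify yields undefined for values such as functions; count those as 0.
  const json = JSON.stringify(value);
  return json === undefined ? 0 : new TextEncoder().encode(json).length;
}

function truncateOversizedObject(
  input: Record<string, unknown>,
  maxBytes: number
): Record<string, unknown> {
  const originalSizeBytes = jsonBytes(input);
  if (originalSizeBytes <= maxBytes) return input;

  // Drop colliding user keys before ranking so they can never replace metadata.
  // Smallest-first ranking is an assumption; the PR's ranking is not shown.
  const ranked = Object.keys(input)
    .filter((key) => SENTINEL_KEYS.indexOf(key) === -1)
    .sort((a, b) => jsonBytes(input[a]) - jsonBytes(input[b]));

  const keptKeys: string[] = [];
  const result: Record<string, unknown> = {
    _truncated: true,
    _originalSizeBytes: originalSizeBytes,
    _hint: "Output truncated; some keys were omitted.",
    _keptKeys: keptKeys,
  };

  for (const key of ranked) {
    result[key] = input[key];
    keptKeys.push(key);
    if (jsonBytes(result) > maxBytes) {
      // Over budget: undo this key. The metadata is untouched because no
      // ranked key can share a sentinel name.
      delete result[key];
      keptKeys.pop();
      break;
    }
  }
  return result;
}
```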
Summary
Test plan
- bun run lint
- bun run typecheck
- bun test lib/tools/__tests__
Made with Cursor
Summary by cubic
Hardens tool orchestration and policy in chat to make tool use safer, deterministic, and observable. Adds Convex-backed limits, retry-safe wrappers, caching, unified errors/logging, clearer UI metadata, and edge‑case fixes for limits and truncation.
New Features
Migration
Written for commit 3a2e61e. Summary will update on new commits.