mnfst · brunobuddy · Apr 21, 2026 · Apr 22, 2026
diff --git a/.changeset/advertise-honest-context-window.md b/.changeset/advertise-honest-context-window.md
@@ -0,0 +1,5 @@
+---
+"manifest": minor
+---
+
+Advertise an honest context window for `manifest/auto` via a new OpenAI-compatible `GET /v1/models` endpoint. The `context_length` returned is the minimum across every model the agent could be routed to (tier primaries, fallbacks, specificity overrides) — any routed model is guaranteed to accept at least that many tokens, so clients that compact against this value stop overflowing the routed model. Adds a per-agent **Context window** card in Settings to override the computed floor, and a helper endpoint `GET /api/v1/routing/:agentName/context-window` for the dashboard. Addresses #1617, #1612, and #1450.
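
The floor described in this changeset can be sketched as a minimum over every routable model's context window. This is a minimal illustration, not the PR's implementation: the `RoutableModel` shape, the `advertisedContextLength` name, and the model ids are hypothetical, and it assumes that a user-set override simply replaces the computed floor.

```typescript
// Hypothetical shape for illustration; not the actual Manifest types.
interface RoutableModel {
  id: string;
  contextWindow: number; // in tokens
}

/**
 * Returns the context length to advertise for `manifest/auto`.
 * Assumption: a user override, when set, replaces the computed floor.
 */
function advertisedContextLength(
  models: RoutableModel[],
  override: number | null = null,
): number | null {
  if (override !== null) return override;
  if (models.length === 0) return null; // nothing routable, nothing to advertise
  let floor = models[0].contextWindow;
  for (const m of models) floor = Math.min(floor, m.contextWindow);
  return floor;
}

// Tier primaries, fallbacks, and specificity overrides all count.
const routable: RoutableModel[] = [
  { id: 'tier-simple-primary', contextWindow: 128000 },
  { id: 'tier-simple-fallback', contextWindow: 32000 },
  { id: 'coding-override', contextWindow: 200000 },
];

const computedFloor = advertisedContextLength(routable);
const overriddenFloor = advertisedContextLength(routable, 50000);
console.log(computedFloor, overriddenFloor);
```

Taking the minimum is what makes the advertised value safe: a client that compacts to this length cannot overflow whichever model the router ends up choosing.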
diff --git a/.changeset/context-aware-routing.md b/.changeset/context-aware-routing.md
@@ -0,0 +1,5 @@
+---
+"manifest": minor
+---
+
+Context-aware routing: the proxy now estimates token counts up front (js-tiktoken cl100k_base + 10% safety margin) and filters tier candidates by whether their context window can fit `estimatedTokens + max_tokens`. When no model in the scored tier fits, the router escalates upward (simple → standard → complex → reasoning) instead of silently routing to a too-small model. When no model in any tier can fit, the user gets an actionable error that breaks out input vs reserved-output rather than an opaque 400. Emits `X-Manifest-Context-Estimated`, `X-Manifest-Context-Used`, and `X-Manifest-Context-Escalated: <fromTier>-><toTier>` response headers so agents can adapt per-response. Addresses the #1617 RFC (context window awareness for agentic integrations) end-to-end.
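
The fit check and upward escalation described above can be sketched as follows. This is an illustration under stated assumptions, not the code in `routing/resolve/context-fit.ts`: the `Candidate` and `Resolution` types and the `pickFirstFit` and `resolveWithEscalation` names are hypothetical, the token estimate is taken as an input rather than computed, and the sketch assumes escalation only ever looks at the scored tier and the tiers above it.

```typescript
type Tier = 'simple' | 'standard' | 'complex' | 'reasoning';
const TIER_ORDER: Tier[] = ['simple', 'standard', 'complex', 'reasoning'];

// Hypothetical candidate shape: primary first, then fallbacks, in order.
interface Candidate {
  model: string;
  contextWindow: number;
}

type Resolution =
  | { ok: true; tier: Tier; model: string; escalatedFrom?: Tier }
  | { ok: false; reason: 'context_window_exceeded'; needed: number; largestWindow: number };

// First candidate whose window can hold input plus reserved output.
function pickFirstFit(candidates: Candidate[], needed: number): Candidate | undefined {
  return candidates.find((c) => c.contextWindow >= needed);
}

function resolveWithEscalation(
  tiers: Record<Tier, Candidate[]>,
  scoredTier: Tier,
  estimatedTokens: number,
  maxTokens?: number,
): Resolution {
  // Reserved output defaults to 4096 when the request sets no max_tokens.
  const needed = estimatedTokens + (maxTokens ?? 4096);
  const start = TIER_ORDER.indexOf(scoredTier);
  for (const tier of TIER_ORDER.slice(start)) {
    const fit = pickFirstFit(tiers[tier], needed);
    if (fit) {
      return tier === scoredTier
        ? { ok: true, tier, model: fit.model }
        : { ok: true, tier, model: fit.model, escalatedFrom: scoredTier };
    }
  }
  // Nothing fits: report the largest window so the error is actionable.
  let largestWindow = 0;
  for (const tier of TIER_ORDER) {
    for (const c of tiers[tier]) largestWindow = Math.max(largestWindow, c.contextWindow);
  }
  return { ok: false, reason: 'context_window_exceeded', needed, largestWindow };
}

const exampleTiers: Record<Tier, Candidate[]> = {
  simple: [{ model: 'simple-primary', contextWindow: 8000 }],
  standard: [
    { model: 'standard-primary', contextWindow: 32000 },
    { model: 'standard-fallback', contextWindow: 128000 },
  ],
  complex: [],
  reasoning: [{ model: 'reasoning-primary', contextWindow: 200000 }],
};

const escalated = resolveWithEscalation(exampleTiers, 'simple', 30000);
const inTier = resolveWithEscalation(exampleTiers, 'standard', 1000, 1000);
const tooLarge = resolveWithEscalation(exampleTiers, 'simple', 250000);
console.log(escalated, inTier, tooLarge);
```

In the example, a 30,000-token request scored as `simple` needs 34,096 tokens with the default reservation, so it skips the 8,000 and 32,000 windows and lands on the `standard` fallback.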
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -333,6 +333,8 @@ All analytics queries filter by user via `addTenantFilter(qb, userId)` from `que
 | POST | `/api/v1/routing/:agentName/ollama/sync` | Session/API Key | Sync Ollama models |
 | POST | `/api/v1/routing/resolve` | Bearer (mnfst_*) | Model resolution |
 | POST | `/v1/chat/completions` | Bearer (mnfst_*) | LLM proxy (OpenAI-compatible) |
+| GET | `/v1/models` | Bearer (mnfst_*) | OpenAI-compatible model listing; advertises `manifest/auto` with the honest `context_length` floor for the agent (Phase 1, #1617) |
+| GET | `/api/v1/routing/:agentName/context-window` | Session/API Key | Effective advertised context window + whether the user set an override (drives the Settings "Context window" card) |
 | GET | `/api/v1/events` | Session | SSE real-time events |
 | GET | `/api/v1/github/stars` | Public | GitHub star count |
 
@@ -363,7 +365,7 @@ See `packages/backend/.env.example` for all variables. Key ones:
 
 ## Domain Terminology
 
-- **Message**: The primary entity in the system. Every row in `agent_messages` is a Message. The UI labels them "Messages" everywhere. Key routing columns: `routing_tier` (complexity tier used), `routing_reason` (why — `scored`, `specificity`, `heartbeat`, etc.), `specificity_category` (which task-type category, null if complexity-routed).
+- **Message**: The primary entity in the system. Every row in `agent_messages` is a Message. The UI labels them "Messages" everywhere. Key routing columns: `routing_tier` (complexity tier used), `routing_reason` (why — `scored`, `specificity`, `heartbeat`, `size_escalated`, `context_window_exceeded`, …), `specificity_category` (which task-type category, null if complexity-routed).
 - **Tenant**: A user's data boundary. Created from `user.id` on first agent creation.
 - **Agent**: An AI agent owned by a tenant. Has a unique OTLP ingest key.
 
@@ -410,7 +412,9 @@ To add a new font or icon library:
 - **LLM Routing**: Two-layer routing system with provider key management (AES-256-GCM encrypted) and OpenAI-compatible proxy at `/v1/chat/completions`:
   - **Complexity tiers** (always active): 4 tiers (simple/standard/complex/reasoning) based on request content scoring with 23 weighted keyword dimensions.
   - **Specificity routing** (opt-in): 9 task-type categories (coding, web_browsing, data_analysis, image_generation, video_generation, social_media, email_management, calendar_management, trading). When enabled, overrides complexity tiers. Detection uses keyword analysis on the last user message + tool name heuristics. Categories defined in `shared/src/specificity.ts`, keywords in `scoring/keywords.ts`, detection in `scoring/specificity-detector.ts`.
-  - **Resolution order**: specificity check (if any category active) → complexity scoring → tier assignment → provider/model resolution → proxy forward.
+  - **Context-aware size check** (Phase 2 — always active): before forwarding, the proxy estimates tokens for the full request via `routing/proxy/token-estimate.ts` (js-tiktoken `cl100k_base` + 1.1× safety multiplier). `ResolveService.resolveWithSizeCheck` then walks the scored tier's `(primary, ...fallbacks)` in order and picks the first candidate whose `contextWindow >= estimated + reservedOutput` (reservedOutput = `body.max_tokens ?? 4096`). If no model in the scored tier fits, escalate upward (`simple → standard → complex → reasoning`). If *nothing* fits in any tier, ResolveService returns `reason: 'context_window_exceeded'` and ProxyService surfaces a friendly chat-completion response with the estimated token count, largest available window, and a link to the Routing page — no silent truncation. Size-aware candidate filtering is the pure function `findFittingCandidate` in `routing/resolve/context-fit.ts`.
+  - **Resolution order**: specificity check (if any category active) → complexity scoring → tier assignment → **size-aware candidate pick within tier** → **tier escalation if no candidate fits** → provider/model resolution → proxy forward.
+  - **Size signals on the wire**: successful proxy responses emit `X-Manifest-Context-Estimated` and `X-Manifest-Context-Used`; when the size check bumped the tier, an additional `X-Manifest-Context-Escalated: <fromTier>-><toTier>` is emitted. Agents use these to adapt per-response without a second round trip (#1617). Header values are ASCII-only — Node's http layer silently drops headers containing Latin-1-incompatible characters.
 
 ## Providers & Models
 

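On the client side, the `X-Manifest-Context-Escalated: <fromTier>-><toTier>` header documented above could be parsed along these lines. A minimal sketch: the `parseEscalation` helper is hypothetical and not part of this PR, and it treats anything that is not exactly two known tier names joined by `->` as absent.

```typescript
const TIERS = ['simple', 'standard', 'complex', 'reasoning'] as const;
type Tier = (typeof TIERS)[number];

function isTier(value: string | undefined): value is Tier {
  return value !== undefined && (TIERS as readonly string[]).indexOf(value) !== -1;
}

/**
 * Parses the ASCII-only "<fromTier>-><toTier>" header value.
 * Returns null when the header is missing or malformed.
 */
function parseEscalation(
  value: string | null | undefined,
): { from: Tier; to: Tier } | null {
  if (!value) return null;
  const [from, to, ...rest] = value.trim().split('->');
  if (rest.length > 0 || !isTier(from) || !isTier(to)) return null;
  return { from, to };
}

const parsed = parseEscalation('simple->standard');
console.log(parsed);
```

With `fetch`, the raw value would come from `response.headers.get('X-Manifest-Context-Escalated')`, which returns `null` when the size check did not change the tier.
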
diff --git a/package-lock.json b/package-lock.json
diff --git a/packages/backend/package.json b/packages/backend/package.json
@@ -18,7 +18,6 @@
     "migration:create": "typeorm-ts-node-commonjs migration:create"
   },
   "dependencies": {
-    "manifest-shared": "*",
     "@nestjs/cache-manager": "^3.1.0",
     "@nestjs/common": "^11.0.0",
     "@nestjs/config": "^4.0.0",
@@ -38,6 +37,8 @@
     "compression": "^1.8.1",
     "express-rate-limit": "^8.3.1",
     "helmet": "^8.1.0",
+    "js-tiktoken": "^1.0.21",
+    "manifest-shared": "*",
     "pg": "^8.13.0",
     "react": "^19.2.4",
     "react-dom": "^19.2.4",

diff --git a/packages/backend/src/analytics/controllers/agents.controller.spec.ts b/packages/backend/src/analytics/controllers/agents.controller.spec.ts
@@ -19,6 +19,8 @@ describe('AgentsController', () => {
   let mockConfigGet: jest.Mock;
   let mockDeleteAgent: jest.Mock;
   let mockRenameAgent: jest.Mock;
+  let mockUpdateAgentType: jest.Mock;
+  let mockUpdateContextFloorOverride: jest.Mock;
   let mockTenantResolve: jest.Mock;
 
   beforeEach(async () => {
@@ -31,6 +33,8 @@ describe('AgentsController', () => {
     mockConfigGet = jest.fn().mockReturnValue('');
     mockDeleteAgent = jest.fn().mockResolvedValue(undefined);
     mockRenameAgent = jest.fn().mockResolvedValue(undefined);
+    mockUpdateAgentType = jest.fn().mockResolvedValue(undefined);
+    mockUpdateContextFloorOverride = jest.fn().mockResolvedValue(undefined);
     mockTenantResolve = jest.fn().mockResolvedValue('tenant-123');
 
     const module: TestingModule = await Test.createTestingModule({
@@ -46,7 +50,8 @@ describe('AgentsController', () => {
           useValue: {
             deleteAgent: mockDeleteAgent,
             renameAgent: mockRenameAgent,
-            updateAgentType: jest.fn(),
+            updateAgentType: mockUpdateAgentType,
+            updateContextFloorOverride: mockUpdateContextFloorOverride,
           },
         },
         {
@@ -255,7 +260,6 @@ describe('AgentsController', () => {
   });
 
   it('passes category/platform to updateAgentType on PATCH', async () => {
-    const mockUpdateType = jest.fn().mockResolvedValue(undefined);
     const user = { id: 'u1' };
     const result = await controller.updateAgent(user as never, 'bot-1', {
       agent_category: 'app',
@@ -268,6 +272,57 @@ describe('AgentsController', () => {
     });
   });
 
+  /**
+   * context_floor_override PATCH cases: the Settings "Context window" card
+   * writes through this path. Guards against regressions of issues #1617 and
+   * #1612 by making sure a user-entered override is actually persisted (not
+   * dropped because the controller ignores the optional DTO field).
+   */
+  it('calls updateContextFloorOverride with a numeric value on PATCH', async () => {
+    const user = { id: 'u1' };
+    const result = await controller.updateAgent(user as never, 'bot-1', {
+      context_floor_override: 50_000,
+    } as never);
+
+    expect(mockUpdateContextFloorOverride).toHaveBeenCalledWith('u1', 'bot-1', 50_000);
+    expect(result).toEqual({ context_floor_override: 50_000 });
+  });
+
+  it('calls updateContextFloorOverride with null to clear the override', async () => {
+    const user = { id: 'u1' };
+    const result = await controller.updateAgent(user as never, 'bot-1', {
+      context_floor_override: null,
+    } as never);
+
+    expect(mockUpdateContextFloorOverride).toHaveBeenCalledWith('u1', 'bot-1', null);
+    expect(result).toEqual({ context_floor_override: null });
+  });
+
+  it('does not call updateContextFloorOverride when the field is omitted', async () => {
+    // The field is optional — PATCH { agent_category: 'app' } must not wipe
+    // any existing override.
+    const user = { id: 'u1' };
+    await controller.updateAgent(user as never, 'bot-1', {
+      agent_category: 'app',
+    } as never);
+
+    expect(mockUpdateContextFloorOverride).not.toHaveBeenCalled();
+  });
+
+  it('routes context_floor_override to the NEW slug when renaming in the same PATCH', async () => {
+    // Rename-and-set-override in one PATCH is a real path in the Settings
+    // page. After renaming to `new-slug`, the lifecycle call has to target
+    // the new name, not the original.
+    const user = { id: 'u1' };
+    await controller.updateAgent(user as never, 'bot-1', {
+      name: 'New Slug',
+      context_floor_override: 64_000,
+    } as never);
+
+    expect(mockRenameAgent).toHaveBeenCalledWith('u1', 'bot-1', 'new-slug', 'New Slug');
+    expect(mockUpdateContextFloorOverride).toHaveBeenCalledWith('u1', 'new-slug', 64_000);
+  });
+
   it('invalidates agent list cache after successful createAgent', async () => {
     const mockOnboard = jest.fn().mockResolvedValue({
       tenantId: 't1',

diff --git a/packages/backend/src/analytics/controllers/agents.controller.ts b/packages/backend/src/analytics/controllers/agents.controller.ts
@@ -128,6 +128,15 @@ export class AgentsController {
       if (body.agent_platform !== undefined) result['agent_platform'] = body.agent_platform;
     }
 
+    if (body.context_floor_override !== undefined) {
+      await this.lifecycle.updateContextFloorOverride(
+        user.id,
+        body.name ? slugify(body.name)! : agentName,
+        body.context_floor_override,
+      );
+      result['context_floor_override'] = body.context_floor_override;
+    }
+
     await this.cacheManager.del(this.agentListCacheKey(user.id));
     return result;
   }
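
The `!== undefined` guard in this hunk gives the field three states: omitted (leave the stored override alone), `null` (clear it), and a number (set it). A minimal sketch of that distinction, using hypothetical `OverridePatch` and `overrideAction` names that do not appear in the PR:

```typescript
// Hypothetical body shape for illustration; mirrors only the field in question.
interface OverridePatch {
  context_floor_override?: number | null;
}

type OverrideAction =
  | { kind: 'leave' } // field omitted: do not touch the stored value
  | { kind: 'clear' } // explicit null: fall back to the computed floor
  | { kind: 'set'; value: number };

function overrideAction(body: OverridePatch): OverrideAction {
  if (body.context_floor_override === undefined) return { kind: 'leave' };
  if (body.context_floor_override === null) return { kind: 'clear' };
  return { kind: 'set', value: body.context_floor_override };
}

console.log(
  overrideAction({}),
  overrideAction({ context_floor_override: null }),
  overrideAction({ context_floor_override: 50000 }),
);
```

Checking for a truthy value instead would conflate "clear" with "leave", which is the failure mode the spec cases for `null` and for the omitted field are there to catch.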

diff --git a/packages/backend/src/analytics/services/agent-lifecycle.service.spec.ts b/packages/backend/src/analytics/services/agent-lifecycle.service.spec.ts
@@ -154,6 +154,75 @@ describe('AgentLifecycleService', () => {
     });
   });
 
+  describe('updateContextFloorOverride', () => {
+    it('writes the override value on the agents row when the agent exists', async () => {
+      mockAgentGetOne.mockResolvedValueOnce({ id: 'agent-id-1', name: 'my-agent' });
+
+      const mockExecute = jest.fn().mockResolvedValue({});
+      const mockUpdateQb = {
+        update: jest.fn().mockReturnThis(),
+        set: jest.fn().mockReturnThis(),
+        where: jest.fn().mockReturnThis(),
+        execute: mockExecute,
+      };
+
+      mockAgentCreateQueryBuilder
+        .mockReturnValueOnce({
+          select: jest.fn().mockReturnThis(),
+          leftJoin: jest.fn().mockReturnThis(),
+          where: jest.fn().mockReturnThis(),
+          andWhere: jest.fn().mockReturnThis(),
+          orderBy: jest.fn().mockReturnThis(),
+          getOne: mockAgentGetOne,
+          getMany: jest.fn().mockResolvedValue([]),
+        })
+        .mockReturnValueOnce(mockUpdateQb);
+
+      await service.updateContextFloorOverride('test-user', 'my-agent', 50_000);
+
+      expect(mockUpdateQb.update).toHaveBeenCalledWith('agents');
+      expect(mockUpdateQb.set).toHaveBeenCalledWith({ context_floor_override: 50_000 });
+      expect(mockUpdateQb.where).toHaveBeenCalledWith('id = :id', { id: 'agent-id-1' });
+      expect(mockExecute).toHaveBeenCalledTimes(1);
+    });
+
+    it('writes null when clearing the override', async () => {
+      mockAgentGetOne.mockResolvedValueOnce({ id: 'agent-id-1', name: 'my-agent' });
+
+      const mockExecute = jest.fn().mockResolvedValue({});
+      const mockUpdateQb = {
+        update: jest.fn().mockReturnThis(),
+        set: jest.fn().mockReturnThis(),
+        where: jest.fn().mockReturnThis(),
+        execute: mockExecute,
+      };
+
+      mockAgentCreateQueryBuilder
+        .mockReturnValueOnce({
+          select: jest.fn().mockReturnThis(),
+          leftJoin: jest.fn().mockReturnThis(),
+          where: jest.fn().mockReturnThis(),
+          andWhere: jest.fn().mockReturnThis(),
+          orderBy: jest.fn().mockReturnThis(),
+          getOne: mockAgentGetOne,
+          getMany: jest.fn().mockResolvedValue([]),
+        })
+        .mockReturnValueOnce(mockUpdateQb);
+
+      await service.updateContextFloorOverride('test-user', 'my-agent', null);
+
+      expect(mockUpdateQb.set).toHaveBeenCalledWith({ context_floor_override: null });
+    });
+
+    it('throws NotFoundException when the agent does not belong to the user', async () => {
+      mockAgentGetOne.mockResolvedValueOnce(null);
+
+      await expect(
+        service.updateContextFloorOverride('test-user', 'nonexistent', 50_000),
+      ).rejects.toThrow(NotFoundException);
+    });
+  });
+
   describe('renameAgent', () => {
     it('should throw NotFoundException when agent not found', async () => {
       mockAgentGetOne.mockResolvedValueOnce(null);

diff --git a/packages/backend/src/analytics/services/agent-lifecycle.service.ts b/packages/backend/src/analytics/services/agent-lifecycle.service.ts
@@ -55,6 +55,22 @@ export class AgentLifecycleService {
       .execute();
   }
 
+  async updateContextFloorOverride(
+    userId: string,
+    agentName: string,
+    value: number | null,
+  ): Promise<void> {
+    const agent = await this.findAgentByUser(userId, agentName);
+    if (!agent) throw new NotFoundException(`Agent "${agentName}" not found`);
+
+    await this.agentRepo
+      .createQueryBuilder()
+      .update('agents')
+      .set({ context_floor_override: value })
+      .where('id = :id', { id: agent.id })
+      .execute();
+  }
+
   async renameAgent(
     userId: string,
     currentName: string,

diff --git a/packages/backend/src/analytics/services/timeseries-queries.service.spec.ts b/packages/backend/src/analytics/services/timeseries-queries.service.spec.ts
@@ -356,5 +356,35 @@ describe('TimeseriesQueriesService', () => {
       expect(result[0].total_cost).toBe(0);
       expect(result[0].sparkline).toEqual([]);
     });
+
+    /**
+     * The Settings page loads the current override from GET /api/v1/agents
+     * (to prefill the Auto/Custom radio). If this projection ever drops the
+     * field, the Settings page silently reverts to "Auto" on every reload.
+     */
+    it('surfaces a configured context_floor_override on each agent row', async () => {
+      mockGetMany.mockResolvedValueOnce([
+        {
+          name: 'custom-bot',
+          display_name: null,
+          created_at: '2026-02-16',
+          context_floor_override: 50_000,
+        },
+      ]);
+      mockGetRawMany.mockResolvedValueOnce([]).mockResolvedValueOnce([]);
+
+      const result = await service.getAgentList('u1');
+      expect(result[0].context_floor_override).toBe(50_000);
+    });
+
+    it('defaults context_floor_override to null when the agent row has no override', async () => {
+      mockGetMany.mockResolvedValueOnce([
+        { name: 'auto-bot', display_name: null, created_at: '2026-02-16' },
+      ]);
+      mockGetRawMany.mockResolvedValueOnce([]).mockResolvedValueOnce([]);
+
+      const result = await service.getAgentList('u1');
+      expect(result[0].context_floor_override).toBeNull();
+    });
   });
 });
diff --git a/packages/backend/src/analytics/services/timeseries-queries.service.ts b/packages/backend/src/analytics/services/timeseries-queries.service.ts
@@ -230,6 +230,7 @@ export class TimeseriesQueriesService {
         display_name: a.display_name ?? name,
         agent_category: a.agent_category ?? null,
         agent_platform: a.agent_platform ?? null,
+        context_floor_override: a.context_floor_override ?? null,
         message_count: Number(stats?.['message_count'] ?? 0),
         last_active: String(stats?.['last_active'] ?? a.created_at ?? ''),
         total_cost: Number(stats?.['total_cost'] ?? 0),