Commit 6aa4e3b

feat(exa): add $10/month free allowance with credit billing for overages (#2169)
* feat(exa): add $10/month free allowance with credit billing for overages

  Implement per-user monthly Exa usage tracking with a free tier + overage model:
  - First $10/month is free for all authenticated users
  - Beyond $10, usage is charged to the user's (or org's) Kilo credit balance
  - Streaming disabled to guarantee cost tracking via costDollars in JSON responses
  - Two-table storage strategy: pre-aggregated counter (O(1) lookups) + partitioned audit log

* feat(exa): store per-user free allowance on monthly usage row

  Add free_allowance_microdollars column to exa_monthly_usage with lock-in semantics: the allowance is set on the first request of the month (INSERT) and not overwritten on subsequent requests (excluded from ON CONFLICT UPDATE). This enables per-user tiered allowances in the future by modifying a single pure function (getExaFreeAllowanceMicrodollars) without changing the billing flow. The route handler now uses the stored allowance instead of the global constant, and the 402 error message reflects the actual allowance.

* feat(exa): route exa API through global backend and use read replica for reads

* fix(exa): make recomputeBalance account for paid Exa usage

  Add organization_id to exa_monthly_usage so charged amounts are tracked per-context (personal vs org). Both recompute functions now include total_charged_microdollars from exa_monthly_usage in cumulative usage, preventing paid Exa charges from vanishing on balance recomputation. Squashes branch migrations 0077+0078 into a single 0077.

* fix(exa): match readDb arg in getBalanceAndOrgSettings assertions

  The route now passes readDb as a 3rd argument to getBalanceAndOrgSettings, but two toHaveBeenCalledWith assertions only expected 2 arguments. The failing assertion triggers Jest's pretty-printer on the real Drizzle client (a Proxy-backed object), which hangs indefinitely.

* style: format unformatted files

* fix(exa): use per-request exa_usage_log for balance recomputation

  Replace the exa_monthly_usage lump-sum approach with a chronological merge-sort of exa_usage_log records into the usage stream. This gives correct credit-expiration baselines when Exa charges are interleaved with credit grants/expirations.
  - Stop dropping old exa_usage_log partitions (retain indefinitely)
  - Register exa-partition-maintenance cron in vercel.json
  - Promote exa_usage_log insert from fire-and-forget to required
  - Recompute functions merge-sort exa_usage_log with microdollar_usage

* chore(db): remove branch-local migration 0077 before merging main

* chore(db): regenerate exa migration as 0078 after merging main

* Prevent redirects to global app in other envs

* fix(web): use VERCEL_ENV instead of NODE_ENV for production rewrite check

* refactor(exa): use date-fns format instead of hand-rolled date helpers

* Remove stupid comment

* refactor: simplify mergeSortedByCreatedAt to concat+sort

* fix(exa): insert usage log before upserting counter to prevent free-request leak

  If exa_usage_log insert fails (e.g. missing partition), the counter was already incremented but deductFromBalance never ran, giving the user a free paid request with no log row for recompute to recover from. Reordering so the log insert happens first ensures a failed insert leaves no side effects, and any later failure is recoverable.

* docs(exa): warn against inserting into microdollar_usage for personal billing

  Recompute already picks up personal Exa charges from exa_usage_log, so a microdollar_usage row would double-count.

* docs(exa): document that free allowance is intentionally per-user across contexts

  Org usage counts toward the same free tier as personal usage. Once exhausted, the charge goes to whichever context makes the request. This prevents gaming via multiple orgs.

* Remove the exa plans

* chore(db): remove branch-local migration 0078 before merging main

* chore(db): regenerate exa migration as 0078 after merging main
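
The lock-in semantics described above for `free_allowance_microdollars` (set on the first request of the month, never overwritten afterwards) can be illustrated with a small in-memory model. This is only a sketch: the row type, the `recordUsage` function, and the key shape are hypothetical stand-ins for the real `INSERT ... ON CONFLICT DO UPDATE`, which this page does not show.

```typescript
// Hypothetical in-memory stand-in for one exa_monthly_usage row.
type MonthlyUsageRow = {
  totalCostMicrodollars: number;
  freeAllowanceMicrodollars: number;
};

const rows = new Map<string, MonthlyUsageRow>();

// Models an upsert where the allowance column is excluded from the
// conflict-update set: only the first request of the month sets it.
function recordUsage(
  key: string, // assumed key shape, e.g. `${userId}:${month}`
  costMicrodollars: number,
  currentAllowanceMicrodollars: number
): MonthlyUsageRow {
  const existing = rows.get(key);
  if (existing === undefined) {
    // "INSERT" path: lock in whatever allowance applies right now.
    const inserted: MonthlyUsageRow = {
      totalCostMicrodollars: costMicrodollars,
      freeAllowanceMicrodollars: currentAllowanceMicrodollars,
    };
    rows.set(key, inserted);
    return inserted;
  }
  // "ON CONFLICT" path: add to the counter, leave the allowance untouched.
  existing.totalCostMicrodollars += costMicrodollars;
  return existing;
}
```

Under this model, changing the allowance mid-month only affects users whose first request of the month has not happened yet.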
1 parent 5b67fa6 commit 6aa4e3b

17 files changed

+16777
-97
lines changed

.plans/fix-recompute-exa-usage.md

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
# Fix recomputeBalance to account for paid Exa usage

## Problem

`deductFromBalance()` in `exa-usage.ts` mutates the cached balance columns directly:

- **Personal**: increments `kilocode_users.microdollars_used`
- **Org**: calls `ingestOrganizationTokenUsage` which increments `organizations.microdollars_used`

Neither path inserts into `microdollar_usage`. The recompute functions (`recomputeUserBalances.ts:75` and `recomputeOrganizationBalances.ts:57`) rebuild balances exclusively from `microdollar_usage`, so every billed Exa request vanishes the next time recompute runs.

## Approach

Rather than routing Exa through `microdollar_usage` (wrong data shape, pollutes LLM analytics), make the recompute functions also include charged Exa usage from `exa_usage_log`.

`exa_usage_log` already stores `{cost_microdollars, created_at, charged_to_balance, kilo_user_id, organization_id}`, which is exactly the shape the merge-sort algorithm needs (`{cost, created_at}`).

## Changes

### 1. `recomputeUserBalances.ts` — include Exa charged records in `fetchUserBalanceData()`

Add a second query alongside the existing `microdollar_usage` query:

```ts
import { and, asc, eq, isNull } from 'drizzle-orm';
import { exa_usage_log } from '@kilocode/db/schema';

// Personal charges only: billed to balance and not attributed to an org.
const exaUsageRecords = await db
  .select({
    cost: exa_usage_log.cost_microdollars,
    created_at: exa_usage_log.created_at,
  })
  .from(exa_usage_log)
  .where(
    and(
      eq(exa_usage_log.kilo_user_id, userId),
      eq(exa_usage_log.charged_to_balance, true),
      isNull(exa_usage_log.organization_id)
    )
  )
  .orderBy(asc(exa_usage_log.created_at));
```

Then merge-sort the two sorted arrays before returning:

```ts
const usageRecords = mergeSortedByCreatedAt(llmUsageRecords, exaUsageRecords);
return { user, usageRecords, creditTransactions };
```

`computeUserBalanceUpdates` and `applyUserBalanceUpdates` need zero changes because they already work on the generic `{cost, created_at}[]` shape.

Update the docstring postcondition:

```
- microdollars_used = sum(microdollar_usage) + sum(exa charged usage)
```

### 2. `recomputeOrganizationBalances.ts` — same pattern

Add a second query for org Exa charged records:

```ts
const exaUsageRecords = await db
  .select({
    cost: exa_usage_log.cost_microdollars,
    created_at: exa_usage_log.created_at,
  })
  .from(exa_usage_log)
  .where(
    and(
      eq(exa_usage_log.organization_id, args.organizationId),
      eq(exa_usage_log.charged_to_balance, true)
    )
  )
  .orderBy(asc(exa_usage_log.created_at));
```

Same merge-sort before the baseline computation loop.

### 3. Add a shared `mergeSortedByCreatedAt` helper

A small utility (either in a shared module or inline) that merges two sorted `{cost: number, created_at: string}[]` arrays:

```ts
function mergeSortedByCreatedAt(
  a: { cost: number; created_at: string }[],
  b: { cost: number; created_at: string }[]
): { cost: number; created_at: string }[] {
  const result: { cost: number; created_at: string }[] = [];
  let i = 0,
    j = 0;
  // Standard two-pointer merge; on equal timestamps the record from `a` goes first.
  // Timestamps are compared as strings, so both inputs must use the same format.
  while (i < a.length && j < b.length) {
    if (a[i].created_at <= b[j].created_at) result.push(a[i++]);
    else result.push(b[j++]);
  }
  // Drain whichever input still has records left.
  while (i < a.length) result.push(a[i++]);
  while (j < b.length) result.push(b[j++]);
  return result;
}
```

Both recompute files can import this. If we don't want a new file, it can be defined as a local function in each file (they're short enough).
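
The commit message mentions that this helper was later simplified to concat+sort. The final implementation is not shown on this page, so the following is only a sketch of what such a variant could look like; the `UsageRecord` alias and the name `mergeByCreatedAt` are assumptions made for illustration.

```typescript
// Assumed alias for the generic usage shape used by the recompute functions.
type UsageRecord = { cost: number; created_at: string };

// Concatenate both inputs and sort by timestamp. Array.prototype.sort is
// stable (ES2019+), so on equal timestamps records from `a` stay ahead of `b`,
// matching the tie-breaking of the two-pointer merge.
function mergeByCreatedAt(a: UsageRecord[], b: UsageRecord[]): UsageRecord[] {
  return [...a, ...b].sort((x, y) =>
    x.created_at < y.created_at ? -1 : x.created_at > y.created_at ? 1 : 0
  );
}

const llm: UsageRecord[] = [
  { cost: 1, created_at: '2025-01-01T00:00:00Z' },
  { cost: 3, created_at: '2025-01-03T00:00:00Z' },
];
const exa: UsageRecord[] = [{ cost: 2, created_at: '2025-01-02T00:00:00Z' }];
const merged = mergeByCreatedAt(llm, exa);
```

The trade-off is O((n+m) log(n+m)) instead of O(n+m), in exchange for less code and no requirement that the inputs arrive pre-sorted.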

### 4. Tests

**`recomputeUserBalances.test.ts`**:

- Add a test that inserts `exa_usage_log` rows with `charged_to_balance = true` alongside normal `microdollar_usage` rows, then verifies `recomputeUserBalances` includes both in `microdollars_used`.
- Add a pure test for `computeUserBalanceUpdates` with a pre-merged usage array that includes both LLM and Exa records interleaved by time, verifying baselines are computed correctly.

**`recomputeOrganizationBalances.test.ts`**:

- Same pattern: insert Exa charged usage for an org and verify recompute includes it.

## Not in scope

**Reliability of `exa_usage_log` inserts**: The audit log insert is currently fire-and-forget (`try/catch` that swallows errors at `exa-usage.ts:106-118`). If it fails, the balance deduction happens but no log row exists, so recompute would miss it. This is a pre-existing design tradeoff (partition might not exist). The risk is low because:

- Partition maintenance runs monthly and creates partitions ahead of time
- `exa_monthly_usage` (the counter) is always reliably written and could serve as a cross-check

If we want to tighten this later, the options are:

1. Make log inserts required when `charged_to_balance = true` (rethrow on failure)
2. Add a cross-check in recompute: compare `sum(exa_usage_log.cost WHERE charged)` vs `sum(exa_monthly_usage.total_charged)` and log a warning on mismatch
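
Option 2 above could be sketched as a pure function over already-fetched rows. This is a minimal illustration, not part of the commit: `LogRow`, `chargedDrift`, and the idea of passing in the counter total as a plain number are all assumptions.

```typescript
// Assumed minimal shape of an exa_usage_log row for this check.
type LogRow = { cost_microdollars: number; charged_to_balance: boolean };

// Returns counter total minus logged charged total. A positive result means
// the counter recorded charges that have no corresponding log rows.
function chargedDrift(logRows: LogRow[], counterTotalChargedMicrodollars: number): number {
  const loggedCharged = logRows
    .filter(row => row.charged_to_balance)
    .reduce((sum, row) => sum + row.cost_microdollars, 0);
  return counterTotalChargedMicrodollars - loggedCharged;
}

const sampleLog: LogRow[] = [
  { cost_microdollars: 500, charged_to_balance: true },
  { cost_microdollars: 300, charged_to_balance: false }, // covered by the free allowance
  { cost_microdollars: 200, charged_to_balance: true },
];
const drift = chargedDrift(sampleLog, 700);
if (drift !== 0) {
  console.warn(`[recompute] exa charged usage mismatch: ${drift} microdollars`);
}
```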

apps/web/next.config.mjs

Lines changed: 5 additions & 1 deletion
@@ -45,12 +45,16 @@ const nextConfig = {
   // Uses beforeFiles to ensure the rewrite happens BEFORE filesystem routes are checked
   // See: https://nextjs.org/docs/app/api-reference/config/next-config-js/rewrites
   const globalApiRewrites =
-    process.env.GLOBAL_KILO_BACKEND !== 'true'
+    process.env.VERCEL_ENV === 'production' && process.env.GLOBAL_KILO_BACKEND !== 'true'
       ? [
           {
             source: '/api/fim/completions',
             destination: 'https://global-api.kilo.ai/api/fim/completions',
           },
+          {
+            source: '/api/exa/:path*',
+            destination: 'https://global-api.kilo.ai/api/exa/:path*',
+          },
           {
             source: '/api/marketplace/:path*',
             destination: 'https://global-api.kilo.ai/api/marketplace/:path*',
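
The changed condition in this hunk can be read as a small predicate over the environment. The sketch below isolates it for illustration; the function name and the `Env` type are hypothetical and do not appear in `next.config.mjs`.

```typescript
// Hypothetical subset of process.env relevant to the rewrite decision.
type Env = { VERCEL_ENV?: string; GLOBAL_KILO_BACKEND?: string };

// Rewrites to the global backend apply only on a production Vercel deployment
// that is not itself the global backend.
function shouldRewriteToGlobalBackend(env: Env): boolean {
  return env.VERCEL_ENV === 'production' && env.GLOBAL_KILO_BACKEND !== 'true';
}
```

With the previous condition, preview and local environments (where `GLOBAL_KILO_BACKEND` is unset) would also have been rewritten to the production global API, which is what the "Prevent redirects to global app in other envs" entry in the commit message refers to.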
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
import { NextResponse } from 'next/server';
import { captureException } from '@sentry/nextjs';
import { db } from '@/lib/drizzle';
import { CRON_SECRET } from '@/lib/config.server';
import { sql } from 'drizzle-orm';
import { format } from 'date-fns';

if (!CRON_SECRET) {
  throw new Error('CRON_SECRET is not configured in environment variables');
}

/**
 * Exa Usage Log Partition Maintenance
 *
 * Run monthly. Creates partitions for the current month and the next
 * two months (idempotent).
 * Old partitions are retained indefinitely because the recompute balance
 * functions depend on the full exa_usage_log history.
 */
export async function GET(request: Request) {
  const authHeader = request.headers.get('authorization');
  if (authHeader !== `Bearer ${CRON_SECRET}`) {
    return NextResponse.json({ error: 'Unauthorized' }, { status: 401 });
  }

  const now = new Date();
  // Partitions that were processed without error (includes ones that already existed).
  const created: string[] = [];
  const errors: string[] = [];

  // Create partitions for the current month and the next 2 months
  for (let offset = 0; offset <= 2; offset++) {
    const target = new Date(now.getFullYear(), now.getMonth() + offset, 1);
    const nextMonth = new Date(target.getFullYear(), target.getMonth() + 1, 1);
    const name = `exa_usage_log_${format(target, 'yyyy_MM')}`;

    try {
      // The name and bounds are derived only from the server clock, never from request input.
      await db.execute(
        sql.raw(
          `CREATE TABLE IF NOT EXISTS "${name}" PARTITION OF "exa_usage_log" FOR VALUES FROM ('${format(target, 'yyyy-MM-dd')}') TO ('${format(nextMonth, 'yyyy-MM-dd')}')`
        )
      );
      created.push(name);
    } catch (error) {
      const msg = `Failed to create partition ${name}: ${error instanceof Error ? error.message : String(error)}`;
      console.error(`[exa-partition-maintenance] ${msg}`);
      captureException(error, { tags: { source: 'exa-partition-maintenance', partition: name } });
      errors.push(msg);
    }
  }

  console.log(
    `[exa-partition-maintenance] created=[${created.join(', ')}] errors=${errors.length}`
  );

  return NextResponse.json({
    success: errors.length === 0,
    created,
    errors,
  });
}
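
The month arithmetic in the loop above is easy to check in isolation. The following sketch extracts it into a pure function so the year rollover can be verified without a database; `partitionSpecs`, `PartitionSpec`, and `pad2` are hypothetical names, and plain string formatting stands in for the `date-fns` `format` calls used by the route.

```typescript
type PartitionSpec = { name: string; from: string; to: string };

// Zero-pad a month number to two digits.
function pad2(n: number): string {
  return String(n).padStart(2, '0');
}

// Compute partition names and [from, to) bounds for the current month plus
// `monthsAhead` following months, using local time like the route does.
function partitionSpecs(now: Date, monthsAhead: number): PartitionSpec[] {
  const specs: PartitionSpec[] = [];
  for (let offset = 0; offset <= monthsAhead; offset++) {
    // Date normalizes month overflow, so month 12 rolls into the next year.
    const start = new Date(now.getFullYear(), now.getMonth() + offset, 1);
    const end = new Date(start.getFullYear(), start.getMonth() + 1, 1);
    specs.push({
      name: `exa_usage_log_${start.getFullYear()}_${pad2(start.getMonth() + 1)}`,
      from: `${start.getFullYear()}-${pad2(start.getMonth() + 1)}-01`,
      to: `${end.getFullYear()}-${pad2(end.getMonth() + 1)}-01`,
    });
  }
  return specs;
}

// Mid-November 2025: expect November, December, and January 2026.
const specs = partitionSpecs(new Date(2025, 10, 15), 2);
```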
