-
Notifications
You must be signed in to change notification settings - Fork 2.5k
feat: add context window and pricing columns to evals database #7747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Added contextWindow, pricePerMillionInputTokens, and pricePerMillionOutputTokens columns to runs table - Updated OpenRouter models hook to fetch and cache full model data including context and pricing - Enhanced UI to display context window and pricing information in runs list and details pages - Generated database migration for new columns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewing my own code is like debugging in production - technically possible but morally questionable.
|
|
||
| // eslint-disable-next-line @typescript-eslint/no-unused-vars | ||
| export async function createRun({ suite, exercises = [], systemPrompt, timeout, ...values }: CreateRun) { | ||
| export async function createRun({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intentional? The function now accepts additional parameters that aren't part of the CreateRun schema. Consider creating a proper extended type for better type safety:
| export async function createRun({ | |
| type CreateRunWithPricing = CreateRun & { | |
| contextWindow?: number; | |
| pricePerMillionInputTokens?: number; | |
| pricePerMillionOutputTokens?: number; | |
| } | |
| export async function createRun({ | |
| suite, | |
| exercises = [], | |
| timeout, | |
| contextWindow, | |
| pricePerMillionInputTokens, | |
| pricePerMillionOutputTokens, | |
| ...values | |
| }: CreateRunWithPricing) { |
|
|
||
| // Get model details and add to the run | ||
| const modelDetails = getModelDetails(models.data, model) | ||
| if (modelDetails) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When getModelDetails returns null, we silently fall through to the regular createRun without pricing data. Should we log a warning here to help with debugging?
| if (modelDetails) { | |
| if (modelDetails) { | |
| const pricing = getPricingPerMillion(modelDetails.pricing) | |
| const extendedValues = { | |
| ...values, | |
| contextWindow: modelDetails.context_length, | |
| pricePerMillionInputTokens: pricing.input, | |
| pricePerMillionOutputTokens: pricing.output, | |
| } | |
| const { id } = await createRun(extendedValues) | |
| router.push(`/runs/${id}`) | |
| return | |
| } else { | |
| console.warn(`Model details not found for OpenRouter model: ${model}`) | |
| } |
| <div>{formatCurrency(taskMetrics.cost)}</div> | ||
| {(run.pricePerMillionInputTokens || run.pricePerMillionOutputTokens) && ( | ||
| <div className="text-xs text-muted-foreground"> | ||
| ${run.pricePerMillionInputTokens?.toFixed(2) || "?"}/$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pricing display format here shows as "$/$/M" which differs from the format in runs/[id]/run.tsx that shows "Pricing: $ / $ per 1M tokens". Could we standardize the display format across both components for consistency?
| useQuery({ | ||
| queryKey: ["getOpenRouterModels"], | ||
| queryFn: getOpenRouterModels, | ||
| staleTime: 1000 * 60 * 60, // Cache for 1 hour |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 1-hour cache time is hardcoded. Would it be useful to make this configurable for different environments (dev/staging/prod)?
| staleTime: 1000 * 60 * 60, // Cache for 1 hour | |
| const CACHE_DURATION = process.env.OPENROUTER_CACHE_DURATION ? parseInt(process.env.OPENROUTER_CACHE_DURATION) : 1000 * 60 * 60; | |
| export const useOpenRouterModels = () => | |
| useQuery({ | |
| queryKey: ["getOpenRouterModels"], | |
| queryFn: getOpenRouterModels, | |
| staleTime: CACHE_DURATION, // Cache based on environment | |
| gcTime: CACHE_DURATION * 24, // Keep in cache for 24x the cache duration | |
| }) |
Summary
This PR adds database columns to store model context window and pricing information for the web-evals app, as requested via Slack.
Changes
Database Schema
Data Fetching
UI Updates
Data Flow
Notes
Important
Add context window and pricing columns to
runstable, update UI and data fetching to handle new model details.context_window,price_per_million_input_tokens, andprice_per_million_output_tokenscolumns torunstable.0002_little_scarlet_witch.sqland0002_snapshot.jsonto reflect schema changes.openRouterModelSchemainuse-open-router-models.tsto includecontext_lengthandpricing.getModelDetailsandgetPricingPerMillionhelper functions inuse-open-router-models.ts.useOpenRouterModels.run.tsxandhome/run.tsx.new-run.tsxto include context window and pricing when creating a new run.createRunfunction inruns.tsto accept new fieldscontextWindow,pricePerMillionInputTokens, andpricePerMillionOutputTokens.This description was created by
for 028b656. You can customize this summary. It will automatically update as commits are pushed.