feat: cloudflare workers AI + statsig experiments and analytics integration #3182
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,143 @@ | ||
| --- | ||
| title: Cloudflare Workers AI | ||
| sidebar_label: Cloudflare Workers AI | ||
| keywords: | ||
| - owner:tore | ||
| last_update: | ||
| date: 2025-07-03 | ||
| --- | ||
|
|
||
| ## Statsig Cloudflare Workers AI Integration | ||
|
|
||
| By integrating Statsig with Cloudflare Workers AI, you can easily conduct experiments on different prompts and models, and gather real-time analytics on model performance and usage. Statsig provides the tools to dynamically control variations, measure success metrics, and gain insights into your AI deployments at the edge. | ||
|
|
||
| For generic setup of Statsig with Cloudflare Workers (including KV namespace configuration and SDK installation), please refer to our [Cloudflare Workers Integration documentation](/integrations/cloudflare). | ||
|
|
||
| For setting up Workers AI itself, please refer to the [Cloudflare Workers AI documentation](https://developers.cloudflare.com/workers-ai/). | ||
|
|
||
| ### Vision | ||
|
|
||
| When you deploy a Cloudflare Worker running AI code, Statsig can automatically inject lightweight instrumentation to capture inference requests and responses. Statsig tracks the key metadata for each request (model, latency, token usage), and you can log any other fields you find valuable (success rates, user interactions, etc.). | ||
|
|
||
| This integration empowers developers to: | ||
|
|
||
| * **Experimentation:** Easily set up experiments (e.g., prompt “A” vs. prompt “B”, Llama vs. DeepSeek models) and define success metrics (conversion, quality rating, user retention). Statsig dynamically determines which variation each request should use, ensuring statistically valid traffic splits. | ||
| * **Real-time Analytics:** The integrated Statsig SDK sends anonymized event data (model outputs, user interactions, metrics) back to Statsig’s servers in real time. Data is gathered at the edge with minimal overhead, then streamed to Statsig for fast analysis. | ||
|
|
||
| ### Use Case 1: Prompt and/or Model Experiments | ||
|
|
||
| This use case demonstrates how to use Statsig experiments to test different prompts and AI models within your Cloudflare Worker. In this example, the experiment has four groups: a control that uses the default prompt and the Llama model, and three variants that each switch to a different prompt, a different model (DeepSeek, in this case), or both. | ||
|
||
|
|
||
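The four groups described above can be pictured as a 2×2 grid of parameter values. The following is only a sketch: the alternative prompt is hypothetical, the DeepSeek model ID is an assumption that should be checked against the Workers AI model catalog, and the group names are illustrative. In practice these values are configured in the Statsig console rather than in code.

```typescript
// Parameter values each experiment group would return (illustrative only).
interface GroupParams {
  prompt: string;
  model: string;
}

// Default prompt and model, matching the fallbacks used in the Worker code.
const DEFAULT_PROMPT = "What is the origin of the phrase Hello, World";
const LLAMA_MODEL = "@cf/meta/llama-3.1-8b-instruct";

// Hypothetical alternative prompt, not taken from the original experiment.
const VARIANT_PROMPT = "Explain briefly where the phrase Hello, World comes from";
// Assumed DeepSeek model ID; verify the exact name in the Workers AI catalog.
const DEEPSEEK_MODEL = "@cf/deepseek-ai/deepseek-r1-distill-qwen-32b";

// One control plus every other prompt/model combination.
const groups: Record<string, GroupParams> = {
  control: { prompt: DEFAULT_PROMPT, model: LLAMA_MODEL },
  new_prompt: { prompt: VARIANT_PROMPT, model: LLAMA_MODEL },
  new_model: { prompt: DEFAULT_PROMPT, model: DEEPSEEK_MODEL },
  new_prompt_and_model: { prompt: VARIANT_PROMPT, model: DEEPSEEK_MODEL },
};
```

Laying the groups out this way makes it easier to separate the effect of the prompt from the effect of the model when reading results.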
| #### Sample Experiment Setup in Statsig Console | ||
|
|
||
|  | ||
|
|
||
| #### Worker Code for Prompt/Model Experimentation | ||
|
|
||
| ```typescript | ||
| import { CloudflareKVDataAdapter } from 'statsig-node-cloudflare-kv'; | ||
| import Statsig from 'statsig-node'; | ||
| import { StatsigUser } from 'statsig-node'; | ||
|
|
||
| export default { | ||
| async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> { | ||
| await initStatsig(env); | ||
|
|
||
| // Ideally, use a logged-in user ID. This example uses the Cloudflare Ray ID instead. | ||
| const rayID = request.headers.get('cf-ray') || ''; | ||
| const user = { | ||
| userID: rayID, | ||
| }; | ||
|
|
||
| const promptExp = Statsig.getExperimentSync( | ||
| user, | ||
| "workers_ai_experiment", // Name of your experiment in Statsig Console | ||
|
||
| ); | ||
| // fetch the prompt and model to use for this ray ID | ||
| // providing default values in case of failure to initialize statsig from the kv store | ||
| const prompt = promptExp.get("prompt", "What is the origin of the phrase Hello, World"); | ||
|
||
| const model = promptExp.get("model", "@cf/meta/llama-3.1-8b-instruct"); | ||
|
|
||
| const start = performance.now(); | ||
| const response = await env.AI.run(model, { | ||
| prompt, | ||
| }); | ||
| const end = performance.now(); | ||
| const aiInferenceMs = end - start; | ||
|
|
||
| logUsageToStatsig(user, model, response, aiInferenceMs); | ||
| ctx.waitUntil(Statsig.flush(1000)); | ||
| return new Response(JSON.stringify(response.response)); | ||
| }, | ||
| } satisfies ExportedHandler<Env>; | ||
|
|
||
| /** | ||
| * Logs AI model usage and performance metrics to Statsig. | ||
| * @param user The StatsigUser object. | ||
| * @param model The name of the AI model used. | ||
| * @param response The response object from the AI model (expected to contain a 'usage' field). | ||
| * @param aiInferenceMs The time taken for AI inference in milliseconds. | ||
| */ | ||
| function logUsageToStatsig(user: StatsigUser, model: string, response: any, aiInferenceMs?: number) { | ||
| const metadata = { | ||
| ...(response?.usage || {}), | ||
| ai_inference_ms: aiInferenceMs, | ||
| }; | ||
|
|
||
| Statsig.logEvent(user, "cf_ai", model, metadata); | ||
| } | ||
|
|
||
| /** | ||
| * Initializes the Statsig SDK. | ||
| * Make sure you have the right bindings configured for the KV, and a secret for the Statsig API key | ||
| * Refer to https://docs.statsig.com/integrations/cloudflare for more details on integrating Statsig with Cloudflare workers | ||
| * @param env The Workers environment variables. | ||
| */ | ||
| async function initStatsig(env: Env) { | ||
| const dataAdapter = new CloudflareKVDataAdapter(env.STATSIG_KV, 'statsig-YOUR_STATSIG_PROJECT_ID'); // Replace with your actual project ID | ||
| await Statsig.initialize( | ||
| env.STATSIG_SERVER_API_KEY, // Your Statsig secret key | ||
| { | ||
| dataAdapter: dataAdapter, | ||
| postLogsRetryLimit: 0, | ||
| initStrategyForIDLists: 'none', | ||
| initStrategyForIP3Country: 'none', | ||
| disableIdListsSync: true, | ||
| disableRulesetsSync: true, // Optimizations for fast initialization in Cloudflare Workers | ||
| }, | ||
| ); | ||
| } | ||
| ``` | ||
|
|
||
| **Explanation:** | ||
|
|
||
| 1. **`initStatsig(env)`**: This function initializes the Statsig SDK using the `CloudflareKVDataAdapter` to fetch configurations from Cloudflare KV, ensuring low-latency access to your experiment setups. Make sure to replace `'statsig-YOUR_STATSIG_PROJECT_ID'` with your actual Statsig project ID, and configure `STATSIG_KV` as a KV namespace binding and `STATSIG_SERVER_API_KEY` as a secret in your Worker. | ||
| 2. **`Statsig.getExperimentSync(...)`**: This is the core of the experimentation. It retrieves the assigned experiment variant for the current user (based on `rayID`) for the `workers_ai_experiment` experiment. The `get()` method then safely retrieves the `prompt` and `model` parameters defined in your Statsig experiment, falling back to default values if the experiment or parameter is not found. | ||
| 3. **`env.AI.run(model, { prompt })`**: This executes the AI model provided by Cloudflare Workers AI with the dynamically chosen `model` and `prompt`. | ||
| 4. **Latency Measurement**: `performance.now()` is used to capture the start and end times of the AI inference, allowing you to track the `ai_inference_ms` metric. | ||
| 5. **`logUsageToStatsig(...)`**: This function logs a custom event (`cf_ai`) to Statsig. It includes the `model` used as the event value and attaches metadata such as `ai_inference_ms` and any `usage` information (e.g., token counts) returned by the AI model. This data is crucial for analyzing model performance and cost. | ||
|
||
| 6. **`ctx.waitUntil(Statsig.flush(1000))`**: This ensures that all logged events are asynchronously sent to Statsig before the Worker's execution context is terminated, without blocking the response to the user. | ||
|
|
||
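The defaults passed to `get()` cover a missing experiment or parameter, but a mistyped model ID entered in the console would still be passed to `env.AI.run` unchanged. One possible guard, sketched below, is to check the value against an allow-list before using it. `resolveModel`, `ALLOWED_MODELS`, and `FALLBACK_MODEL` are hypothetical names introduced for illustration, and the DeepSeek model ID is an assumption.

```typescript
// Fallback used when the experiment returns nothing usable.
const FALLBACK_MODEL = "@cf/meta/llama-3.1-8b-instruct";

// Models this Worker is willing to run (assumed list for illustration).
const ALLOWED_MODELS: ReadonlySet<string> = new Set([
  FALLBACK_MODEL,
  "@cf/deepseek-ai/deepseek-r1-distill-qwen-32b", // assumed ID, verify in the catalog
]);

/**
 * Returns the candidate model if it is a string on the allow-list,
 * otherwise returns the fallback model.
 */
function resolveModel(
  candidate: unknown,
  allowed: ReadonlySet<string> = ALLOWED_MODELS,
  fallback: string = FALLBACK_MODEL,
): string {
  return typeof candidate === "string" && allowed.has(candidate) ? candidate : fallback;
}

// In the Worker, this would wrap the experiment lookup, for example:
//   const model = resolveModel(promptExp.get("model", FALLBACK_MODEL));
```

The trade-off is that adding a new model to the experiment then also requires a code change, which may or may not be desirable.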
| ### Use Case 2: Model Analytics | ||
|
|
||
| Beyond experiments, the logging mechanism demonstrated above provides valuable insights into your AI model's performance and usage patterns. You could keep the default parameters above and still get insights from the metadata you log to Statsig. | ||
|
|
||
| #### What to track for Model Analytics: | ||
|
|
||
| * **Latency (`ai_inference_ms`):** Crucial for understanding user experience. You can monitor average, P90, P99 latencies in Statsig. | ||
| * **Model Usage (e.g., `prompt_tokens`, `completion_tokens`):** If your AI provider returns token counts, logging these allows you to track cost and efficiency. | ||
| * **Error Rates:** Log events when the AI model returns an error or an unexpected response. | ||
| * **Output Quality (via custom events):** | ||
| * **User Feedback:** If your application allows users to rate the AI's response (e.g., thumbs up/down), log these as Statsig events. | ||
| * **Downstream Metrics:** Track how the AI's output influences key business metrics (e.g., conversion rates if the AI is generating product descriptions, or user engagement if it's a chatbot). | ||
|
|
||
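Error rates and user feedback from the list above can be logged with the same `Statsig.logEvent(user, eventName, value, metadata)` call used for `cf_ai`. The sketch below only builds the event payloads, so it stays independent of the SDK. The event names `cf_ai_error` and `cf_ai_feedback`, the helper names, and the metadata keys other than `ai_inference_ms` are assumptions made for illustration.

```typescript
// Shape of a custom event, mirroring the logEvent arguments after `user`.
interface CustomEvent {
  eventName: string;
  value: string | number;
  metadata: Record<string, string>;
}

/** Builds an event describing a failed inference call. */
function buildErrorEvent(model: string, error: unknown, elapsedMs: number): CustomEvent {
  const message = error instanceof Error ? error.message : String(error);
  return {
    eventName: "cf_ai_error", // hypothetical event name
    value: model,
    metadata: {
      // Truncate so an unexpectedly long message does not bloat the event.
      error_message: message.slice(0, 200),
      ai_inference_ms: elapsedMs.toFixed(1),
    },
  };
}

/** Builds an event describing a thumbs up/down rating for a response. */
function buildFeedbackEvent(model: string, thumbsUp: boolean): CustomEvent {
  return {
    eventName: "cf_ai_feedback", // hypothetical event name
    value: thumbsUp ? 1 : 0,
    metadata: { model },
  };
}

// In the Worker, an event would then be logged along these lines:
//   const e = buildErrorEvent(model, err, performance.now() - start);
//   Statsig.logEvent(user, e.eventName, e.value, e.metadata);
```

Keeping the model in the value or metadata of every event makes it possible to break error and feedback rates down by model, in the same way as latency and token usage.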
| #### How to view Model Analytics in Statsig | ||
|
|
||
| By consistently logging these metrics, you can create custom dashboards in Statsig Pulse to monitor the health and effectiveness of your AI models in real-time. This allows you to identify performance bottlenecks, cost inefficiencies, and areas for improvement. | ||
|
|
||
| ### Example Use Cases enabled by this Integration | ||
|
|
||
| * **Prompt Tuning:** An e-commerce app running on Workers AI tries two different prompt styles for product descriptions. Statsig tracks cart conversion and time on site, revealing which prompt yields higher sales. | ||
| * **Model Selection:** A developer tests a Llama model against a DeepSeek model within Cloudflare Workers AI. Statsig shows which model, combined with specific temperature or frequency penalty values, generates more accurate or user-satisfying results. | ||
| * **Response Latency vs. Quality:** By varying max token length and frequency penalties within an experiment, Statsig helps optimize for speed without sacrificing accuracy, crucial for user-facing chat applications. | ||
| * **Cost Optimization:** Monitor `prompt_tokens` and `completion_tokens` by model and prompt variant to identify the most cost-effective AI configurations. | ||