# openai-caching-proxy-worker

Basic caching proxy for the OpenAI API, deployable as a [Cloudflare Worker](https://workers.cloudflare.com/).

This can help reduce OpenAI costs (and return results faster) by serving cached responses for repeated requests.

It only caches `POST` requests that have a JSON request body, as these tend to be the slowest and are the only ones that cost money (for now).
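
To give a sense of how this works, here is a minimal sketch of the kind of check and cache key such a proxy can use. It is illustrative only and not the worker's actual code:

```ts
// Illustrative sketch (not this repo's actual implementation): only POST
// requests with a JSON body are cacheable, and the cache key is a hash of
// the request path plus body, so identical requests map to the same entry.
async function getCacheKey(request: Request): Promise<string | null> {
  const contentType = request.headers.get('content-type') ?? '';
  if (request.method !== 'POST' || !contentType.includes('application/json')) {
    return null; // not cacheable
  }
  const body = await request.clone().text();
  const data = new TextEncoder().encode(new URL(request.url).pathname + body);
  const digest = await crypto.subtle.digest('SHA-256', data);
  return [...new Uint8Array(digest)]
    .map((b) => b.toString(16).padStart(2, '0'))
    .join('');
}
```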

### Setup

Clone the repo and install dependencies.

You will need to sign up for two services if you haven't already:

- [Cloudflare](https://www.cloudflare.com): The proxy itself is deployed as a Cloudflare Worker.
- [Upstash](https://upstash.com): The cache is stored using Upstash's Worker-compatible Redis-over-HTTP service.

Both Cloudflare and Upstash have generous free plans.

Set up your Redis secrets based on the instructions in `wrangler.toml`.
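
For example, Worker secrets are typically stored with `wrangler secret put`. The variable names below are placeholders for Upstash's REST URL and token; use the exact names listed in `wrangler.toml`:

```
npx wrangler secret put UPSTASH_REDIS_REST_URL
npx wrangler secret put UPSTASH_REDIS_REST_TOKEN
```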

### Usage

Start the proxy server (it runs at http://localhost:8787 by default):

```
yarn start
```

Then, in your [openai/openai-node](https://github.com/openai/openai-node) configuration, pass a `basePath` so that requests go through your proxy rather than directly to OpenAI:

```diff
const { Configuration, OpenAIApi } = require("openai");

const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
+ basePath: 'http://localhost:8787/proxy',
});
const openai = new OpenAIApi(configuration);
```

You can then try a few sample requests. The first will be proxied to OpenAI since no cached response exists for it yet; the second, identical request will return the cached result instead.

```ts
const options = { model: 'text-ada-001', prompt: 'write a poem about computers' };

// This first request is proxied as-is to the OpenAI API, since a cached
// response does not yet exist for it:
const completion = await openai.createCompletion(options);
console.log('completion:', completion);

// This second request uses the same options, so it returns almost instantly
// from the proxy's cache and does not make a request to OpenAI:
const completionCached = await openai.createCompletion(options);
console.log('completionCached:', completionCached);
```

### Specifying a cache TTL

If you don't want to cache results indefinitely, or you don't have an eviction policy set up on your Redis instance, you can specify a TTL in seconds using the `X-Proxy-TTL` header.

```diff
const configuration = new Configuration({
  ...
+ baseOptions: {
+   // In this example, cached responses expire after 24 hours:
+   headers: { 'X-Proxy-TTL': 60 * 60 * 24 }
+ }
});
```

### Refreshing the cache

If you need to force-refresh the cache, you can pass the `X-Proxy-Refresh` header. This fetches a new response from OpenAI and caches it in place of the old one.

```diff
const configuration = new Configuration({
  ...
+ baseOptions: {
+   headers: { 'X-Proxy-Refresh': 'true' }
+ }
});
```
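
If you'd rather refresh the cache for a single call instead of every request, the openai-node methods also accept a per-request Axios config as their last argument. A sketch, assuming the v3.x `openai` client used above:

```ts
// Force-refresh the cache for just this one completion request:
const completionFresh = await openai.createCompletion(
  { model: 'text-ada-001', prompt: 'write a poem about computers' },
  { headers: { 'X-Proxy-Refresh': 'true' } },
);
```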

### Samples

See `/samples/sample-usage.ts` for a full example of how to call this proxy with your `openai` client.