Implement OpenAI passthrough backend #105
Conversation
This commit adds an OpenAI passthrough backend. In order to do this, a
few minor tweaks to the Backend interface were needed. More significantly,
the OpenAI API handling had to be tweaked to allow some additional methods.
The backend operates as a standard backend, but uses a placeholder model
name ("passthrough") to avoid allocating one runner per OpenAI model.
I've added a few more methods (most notably the rest of the chat
completions API and the responses API), but not all methods yet because
many of the multimodal APIs return responses that we can't record.
Signed-off-by: Jacob Howard <[email protected]>
Some example commands to test are below. Model CLI support is pending.
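Not the original test commands, but a minimal sketch of how one might exercise the backend; the base URL and path prefix are assumptions, while the placeholder model name `"passthrough"` comes from the commit message:

```go
// Minimal smoke test for the passthrough backend. The endpoint URL below is
// an assumption for illustration, not confirmed by this PR.
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

func main() {
	body := `{"model": "passthrough", "messages": [{"role": "user", "content": "Hello"}]}`
	resp, err := http.Post(
		"http://localhost:12434/engines/v1/chat/completions", // assumed host/port/path
		"application/json",
		strings.NewReader(body),
	)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status, string(out))
}
```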
@doringeman I'd like to add support for other API endpoints (e.g. audio and images). The code here can handle it, but I don't think we want to record those responses (since they could be big), so I've intentionally avoided registering those endpoints. I'm thinking we only record responses if we're in a text-based mode. WDYT?
@ilopezluna just reminded me that you can already do audio via the completions endpoint (and I'd assume images too), so maybe we should adjust the recorder to avoid capturing that.
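One sketch of how that recorder tweak could look: peek at the request's `modalities` field (part of the public chat completions API) and skip recording when audio is requested. The helper name and package are hypothetical, not this PR's code:

```go
package recorder

import "encoding/json"

// requestsAudio is an illustrative helper: it reports whether a chat
// completions request asks for audio output via the "modalities" field,
// so the recorder can skip capturing the (potentially large) response.
func requestsAudio(requestBody []byte) bool {
	var req struct {
		Modalities []string `json:"modalities"`
	}
	if err := json.Unmarshal(requestBody, &req); err != nil {
		return false
	}
	for _, m := range req.Modalities {
		if m == "audio" {
			return true
		}
	}
	return false
}
```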
```go
// Passthrough returns true if this is a passthrough backend, i.e. a backend
// that acts as a proxy for inference infrastructure that's managed outside
// of the model runner. This also implies that the backend uses external
// model management.
Passthrough() bool
```
We currently use this function to determine whether the backend is "passthrough," and based on that, we perform either A or B. This works perfectly fine for now. However, if we introduce a new type of backend in the future, we might need to add a new isWhatever() function that returns false for all cases except the new one.
I was wondering if it might make sense to use a function that returns a backend type instead, something like Type() BackendType. No need to change anything right now, I’m just sharing the thought in case we end up adding another backend. It might make future refactoring easier.
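A rough sketch of what that could look like; all type and constant names here are hypothetical, not from the PR:

```go
package inference

// BackendType is a hypothetical enumeration replacing the Passthrough() bool
// method; the value names are illustrative.
type BackendType int

const (
	// BackendTypeLocal would cover backends with runner-managed models.
	BackendTypeLocal BackendType = iota
	// BackendTypePassthrough would cover proxy-style backends like this one.
	BackendTypePassthrough
)

type Backend interface {
	// Type replaces Passthrough() bool and leaves room for future kinds.
	Type() BackendType
}
```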
It's a good thought. I'm not sure how many "types" we'll end up with, so I agree, let's wait. I was also thinking maybe backends should support some sort of SupportsMode(mode BackendMode) method to control which APIs get routed to them (see the sketch below); maybe that could be done simultaneously.
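A sketch of that idea under assumed names; the mode constants (other than the passthrough one mentioned in the PR description) and the scheduler helper are illustrative:

```go
package inference

// BackendMode distinguishes the API surfaces a backend can serve. Only a
// passthrough mode exists in this PR; the other constants are assumptions.
type BackendMode int

const (
	BackendModeCompletion BackendMode = iota
	BackendModeEmbedding
	BackendModePassthrough
)

type Backend interface {
	// SupportsMode reports whether this backend can serve the given mode,
	// letting the scheduler decide which APIs get routed to it.
	SupportsMode(mode BackendMode) bool
}

// pickBackend is a hypothetical scheduler helper: route to the first
// backend that supports the requested mode.
func pickBackend(backends []Backend, mode BackendMode) Backend {
	for _, b := range backends {
		if b.SupportsMode(mode) {
			return b
		}
	}
	return nil
}
```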
doringeman
left a comment
LGTM!
I missed the initial discussion about this, but shouldn't we allow configuring an upstream URL other than https://api.openai.com/v1/?
E.g., https://generativelanguage.googleapis.com/v1beta/openai (https://ai.google.dev/gemini-api/docs/openai#rest)
Perhaps via a custom X-Upstream-URL HTTP header.
Of course this would be for later, not for this initial PR.
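Not part of this PR, but a sketch of how that header could be honored using the standard library's reverse proxy; the header name comes from the comment above, everything else is illustrative:

```go
package proxy

import (
	"net/http"
	"net/http/httputil"
	"net/url"
)

// newPassthroughProxy builds a reverse proxy that defaults to the OpenAI API
// but lets a request override the upstream via an X-Upstream-URL header.
// Real code would also validate the override and join URL paths carefully.
func newPassthroughProxy() *httputil.ReverseProxy {
	defaultUpstream, _ := url.Parse("https://api.openai.com/v1/")
	return &httputil.ReverseProxy{
		Director: func(req *http.Request) {
			upstream := defaultUpstream
			if h := req.Header.Get("X-Upstream-URL"); h != "" {
				if u, err := url.Parse(h); err == nil {
					upstream = u
				}
			}
			req.URL.Scheme = upstream.Scheme
			req.URL.Host = upstream.Host
			req.Host = upstream.Host
		},
	}
}
```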
p1-0tr
left a comment
LGTM
Co-authored-by: Ignasi <[email protected]>
Co-authored-by: Dorin-Andrei Geman <[email protected]>
@doringeman it's a good question re: other URLs. I had thought about maybe making this a more general passthrough backend (since there's very little here specific to OpenAI). It should be an easy lift; the most critical part is maybe the reverse proxy director.
Signed-off-by: Jacob Howard <[email protected]>
doringeman
left a comment
Thanks!
Closing since we're going to take a slightly different approach. I'll leave the branch intact for now.
This PR implements our first passthrough backend, in this case going out to OpenAI. This is mostly an exercise in ensuring that these types of backends will fit into our architecture. A few adjustments had to be made, but otherwise things worked pretty well. A few notes:
- The `Backend` interface had to be tweaked slightly
- Added a `BackendMode` concept with a `BackendModePassthrough` type
- The backend uses the placeholder model name `"passthrough"` and mode `"passthrough"`; `ps` and `unload` work
- `context_size` isn't configurable for OpenAI models and runtime flags don't map to any API concept, but other configuration would be easy to add later via the reverse proxy director

This is still in draft pending tests.